[linux-pm] [RFC PATCH 0/4] timers: framework for migration between CPU

Ingo Molnar mingo at elte.hu
Mon Feb 23 02:22:34 PST 2009


* Balbir Singh <balbir at linux.vnet.ibm.com> wrote:

> * Ingo Molnar <mingo at elte.hu> [2009-02-23 10:11:58]:
> 
> > 
> > * Balbir Singh <balbir at linux.vnet.ibm.com> wrote:
> > 
> > > * Ingo Molnar <mingo at elte.hu> [2009-02-20 22:53:18]:
> > > 
> > > > 
> > > > * Arjan van de Ven <arjan at infradead.org> wrote:
> > > > 
> > > > > On Fri, 20 Feb 2009 17:07:37 +0100
> > > > > Ingo Molnar <mingo at elte.hu> wrote:
> > > > > 
> > > > > > 
> > > > > > * Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com> wrote:
> > > > > > 
> > > > > > > > I'd also suggest to not do that rather ugly 
> > > > > > > > enable_timer_migration per-cpu variable, but simply reuse 
> > > > > > > > the existing nohz.load_balancer as a target CPU.
> > > > > > > 
> > > > > > > This is a good idea to automatically bias the timers.  But 
> > > > > > > this nohz.load_balancer is a very fast moving target and we 
> > > > > > > will need some heuristics to estimate overall system idleness 
> > > > > > > before moving the timers.
> > > > > > > 
> > > > > > > I would agree that the power saving load balancer has a good 
> > > > > > > view of the system and can potentially guide the timer biasing 
> > > > > > > framework.
> > > > > > 
> > > > > > Yeah, it's a fast moving target, but it already concentrates 
> > > > > > the load somewhat.
> > > > > > 
> > > > > 
> > > > > I wonder if the real answer for this isn't to have timers be 
> > > > > considered schedulable-entities and have the regular scheduler 
> > > > > decide where they actually run.
> > > > 
> > > > hm, not sure - it's a bit heavy for that.
> > > >
> > > 
> > > I think the basic timer migration policy should exist in user 
> > > space.
> > 
> > I disagree.
> >
> 
> See below
>  
> > > One of the ways of looking at it is, as we begin to 
> > > consolidate, using range timers and migrating all timers to 
> > > a smaller number of CPUs would make a whole lot of sense.
> > > 
> > > As far as the scheduler making those decisions is concerned, 
> > > my concern is that the load balancing is a continuous process 
> > > and timers don't necessarily work that way. I'd put my neck 
> > > out and say that irqbalance, range timers and timer migration 
> > > should all belong to user space. irqbalance and range timers 
> > > do, so should timer migration.
> > 
> > As I said in my first reply, IRQ migration is special because 
> > they are not kernel-internal objects, they come externally so 
> > there's a lot of user-space enumeration, policy and other steps 
> > involved. Furthermore, IRQs are migrated in a 'slow' fashion.
> > 
> > Timers on the other hand are fast entities tied to _tasks_ 
> > primarily, not external entities. 
> 
> Timers are also queued due to external events like interrupts 
> (device drivers tend to set off timers all the time). [...]

That is a silly argument. Tasks are created due to 'external 
events' as well, such as the user hitting a key.

What matters, and what my argument was, is the distinction of 
whether the kernel _generates_ the event. For most IRQ events it 
does not; for the overwhelming majority of timer events it 
consciously generates them. That makes the two cases all that 
much more different.

> [...] I am not fully against what you've said, at some 
> semantic level what you are suggesting is that at a higher 
> level of power saving, when the scheduler balances timers it 
> is doing a form of soft CPU hotplug on the system by migrating 
> timers and tasks away from idle CPUs when the load can be 
> handled by other CPUs. See below as well.
> 
> > Hence they should migrate 
> > according to the CPU where the activities of the system 
> > concentrate - i.e. where tasks are running.
> > 
> > Another thing: do you argue for the existing timer-migration 
> > code we have in mod_timer() to move to user-space too? It isn't a 
> > consistent argument to push 'some' of it to user-space and keep 
> > some of it in kernel-space.
> > 
> 
> No, mod_timer() is fine where it is.

You did not reply to my statement that the argument is a double 
standard. Why do certain migrations belong in the kernel and some not?

> Consider the powertop usage scenario today
> 
> 1. Powertop displays a list of timers and common causes of wakeup
> 2. It recommends policies in user space that can affect power savings
>    a. usb autosuspend
>    b. wireless link management
>    c. disable HAL polling

That's different - those are PowerTop timer event _reduction_ 
policies, not migration policies for existing timers.

> My argument is, why can't we add
> 
>    d. Use range timers
>    e. Consolidate timers
> 
> In the future.
> 
> Even sched_mc=n is set by user space, so really the
> policy is in user space.

That is different again. sched_mc is a broad switch, not a 
dynamic control like the sysfs migration interface that was 
introduced in this patchset, which is what we are discussing.

	Ingo

