[linux-pm] calling runtime PM from system PM methods
Rafael J. Wysocki
rjw at sisk.pl
Sat Jun 11 16:13:38 PDT 2011
On Saturday, June 11, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw at sisk.pl> writes:
>
> [...]
>
> > Whether or not user space has disabled runtime PM _doesn't_ _matter_ for
> > system suspend, because _you_ _can't_ call pm_runtime_suspend(), or
> > pm_runtime_put_sunc(), from a driver's .suspend() callback _anyway_.
> > The reason is that doing that would cause the subsystem's (or power
> > domain's in this case) .runtime_suspend() callback to be invoked and
> > that's incorrect. Namely, it would require the subsystem (power domain)
> > to expect that its .runtime_suspend() would always be executed indirectly
> > as a result of calling its .suspend() (through the driver's callback)
> > and that expectation may or may not be met (depending on the driver's
> > design).
>
> So here's an interesting scenario which I think it triggers the same
> problem as you highlight above.
>
> Assume you have a driver that's using runtime PM on a per-xfer basis.
> Before each xfer, it does a pm_runtime_get_sync(), after each xfer it
> does a pm_runtime_put_sync() (for this example, it's important that it's
> a _put_sync()). The _put_sync() might happen in an ISR,
It can't happen in an ISR, to be precise.
> or possibly in a thread waiting on a completion which is awoken by the ISR,
> etc. etc. (the runtime PM callbacks are IRQ safe, and device is marked as such.)
>
> The driver is in the middle of an xfer and a system suspend request
> happens.
>
> The driver's ->suspend() callback happens, and the driver
>
> - enables/disables wakeups based on device_may_wakeup()
> - prevents future xfers
> - waits for current xfer to finish
>
> As soon as the xfer finishes, the driver gets notified (completion,
> callback, IRQ, whatever) and calls pm_runtime_put_sync(), which triggers
> subsys->runtime_suspend --> driver->runtime_suspend.
>
> While the driver's ->suspend() callback doesn't directly call
> pm_runtime_put_sync(), the act of waiting for the xfer to finish
> causes the subsystem/driver->runtime_suspend callbacks to be called
> during the subsytem/driver->suspend callback, which is the same problem
> as you highlight above.
It's not exactly the same. The difference is that you're talking about race
conditions between runtime PM and system suspend (I kind of know why I wanted
system suspend to block runtime PM now :-)) that may be prevented by
subsystem-level code from happening (by using locking and some flags etc.),
while that code cannot do much if its .runtime_suspend() callback, for example,
is executed directly from the system suspend code path.
> Based on your commit that removed incrementing the usage count across
> suspend[1], you mentioned "we can rely on subsystems and device drivers
> to avoid doing that unnecessarily." The above example shows that this
> type of thing might not be that obvious to detect and thus avoid.
>
> I suspect the solution to the above will be to add back the usage count
> increment across system suspend, but I'm hoping not. IMO, it would be
> more flexible to allow the subsystems to decide. The subsystems could
> provide locking (or manage dev->power.usage_count) themselves if
> necessary. For example, leave it to the subsystem->prepare() to
> pm_runtime_get_noresume() if it wants to avoid the "nesting" of
> callbacks.
I agree.
> A related question: does the pm_wq need to be freezable? From
> Documentation/power/runtime_pm.txt:
>
> * The power management workqueue pm_wq in which bus types and device drivers can
> put their PM-related work items. It is strongly recommended that pm_wq be
> used for queuing all work items related to run-time PM, because this allows
> them to be synchronized with system-wide power transitions (suspend to RAM,
> hibernation and resume from system sleep states). pm_wq is declared in
> include/linux/pm_runtime.h and defined in kernel/power/main.c.
>
> Is "synchronized with system-wide power transistions" correct here?
> Rather than synchronize, using a freezable workqueue actually _prevents_
> runtime PM events (at least async ones.)
>
> Again, proper locking (or management of dev->power.usage_count) at the
> subsystem level would get you the same effect, but still leave
> flexibility to the subsystem/pwr_domain layer.
No, please.
The problem here is that I don't want runtime PM stuff to be called during
the "noirq" stages of system suspend and resume which the freezing of the
workqueue takes care of nicely.
> P.S. the commit below[1] removed the usage count increment/decrement
> across system suspend/resume, but Documentation/power/runtime_pm.txt
> still refers to it. Patch below[2] removes it, ssuming you're
> not planning on adding it back. ;)
No, I'm not. In fact, I'm going to apply your patch. :-)
Thanks,
Rafael
> [1]
> commit e8665002477f0278f84f898145b1f141ba26ee26
> Author: Rafael J. Wysocki <rjw at sisk.pl>
> Date: Sat Feb 12 01:42:41 2011 +0100
>
> PM: Allow pm_runtime_suspend() to succeed during system suspend
>
> The dpm_prepare() function increments the runtime PM reference
> counters of all devices to prevent pm_runtime_suspend() from
> executing subsystem-level callbacks. However, this was supposed to
> guard against a specific race condition that cannot happen, because
> the power management workqueue is freezable, so pm_runtime_suspend()
> can only be called synchronously during system suspend and we can
> rely on subsystems and device drivers to avoid doing that
> unnecessarily.
>
> Make dpm_prepare() drop the runtime PM reference to each device
> after making sure that runtime resume is not pending for it.
>
> Signed-off-by: Rafael J. Wysocki <rjw at sisk.pl>
> Acked-by: Kevin Hilman <khilman at ti.com>
>
> [2]
> From 8968e3e41d785e7e5ce7584d64f6a55b303e7060 Mon Sep 17 00:00:00 2001
> From: Kevin Hilman <khilman at ti.com>
> Date: Fri, 10 Jun 2011 16:05:51 -0700
> Subject: [PATCH] PM / Runtime: update doc: usage count no longer incremented across system PM
>
> commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow
> pm_runtime_suspend() to succeed during system suspend) removed usage
> count increment across system PM.
>
> Update doc to reflect this.
>
> Signed-off-by: Kevin Hilman <khilman at ti.com>
> ---
> Applies on v3.0-rc2
>
> Documentation/power/runtime_pm.txt | 5 -----
> 1 files changed, 0 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
> index 654097b..22accb3 100644
> --- a/Documentation/power/runtime_pm.txt
> +++ b/Documentation/power/runtime_pm.txt
> @@ -566,11 +566,6 @@ to do this is:
> pm_runtime_set_active(dev);
> pm_runtime_enable(dev);
>
> -The PM core always increments the run-time usage counter before calling the
> -->prepare() callback and decrements it after calling the ->complete() callback.
> -Hence disabling run-time PM temporarily like this will not cause any run-time
> -suspend callbacks to be lost.
> -
> 7. Generic subsystem callbacks
>
> Subsystems may wish to conserve code space by using the set of generic power
>
More information about the linux-pm
mailing list