Bug 7570

Summary: S3: fan doesn't work properly after resume
Product: ACPI Reporter: Aleksander Trofimowicz (aatrof)
Component: Power-FanAssignee: Konstantin Karasyov (konstantin.karasyov)
Status: CLOSED CODE_FIX    
Severity: low CC: aatrof, acpi-bugzilla, tommi.kyntola, trenn, tuukka.tolvanen
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.18.2 Subsystem:
Regression: --- Bisected commit-id:
Attachments: dmesg output
fan/thermal resume handlers fix
hopefully more clean fix patch
the same patch against kernel 2.6.20-rc3 (possibly for all kernels since 2.6.19)
the same patch against kernel 2.6.20-rc3 (possibly for all kernels since 2.6.19)
Additional fix for the same issue

Description Aleksander Trofimowicz 2006-11-22 19:42:20 UTC
Most recent kernel where this bug did *NOT* occur: don't know
Distribution: Fedora Core 6
Hardware Environment: hp compaq nx6325
Software Environment:
Problem Description:

After resume a fan C352 is turned off regardless of a temperature at TZ1,
although the one should work if the temperature is above 16 C (that condition is
always met). It isn't critical bug as the other three fans work fine if the TZ1
state changes.

Steps to reproduce:
Put system into S3 state and then back to S0.

Additional info:
Applying patches from bug 5524 and bug 7122 changes nothing.
Comment 1 Aleksander Trofimowicz 2006-11-22 19:46:13 UTC
Created attachment 9596 [details]
dmesg output

Look at a call trace for more hints.
Comment 2 Konstantin Karasyov 2006-12-04 04:50:00 UTC
Hello,

Does suspend to disk (i.e. S4) work as expected with patches from #7122 applied?
Comment 3 Aleksander Trofimowicz 2006-12-04 16:42:53 UTC
Yes, to great extend it does and under certain conditions I am able to restore a
correct behaviour of all fans. Please read bug 7122, comment 61 for more
details. A quite obvious difference is that there is no call trace in dmesg
output in case of S4.
Comment 4 Konstantin Karasyov 2006-12-07 10:29:07 UTC
Created attachment 9757 [details]
fan/thermal resume handlers fix

As mentioned in the last comment from bug #7122, disabling fan resume handler
solves the problem. The reason is the implementation of the fan resume handler:

'force_power_state'flag being set, disables the check if the required power
state is the same as the current one. In that case the list of power resources
being enabled is the same as the list of power resources being disabled, and
follows to consequent enabling and disabling of these resources.
The attached patch should fix the issue in fan resume handler, and also updates
thermal resume handler. It fixed the problem for my nx6125.
Comment 5 Aleksander Trofimowicz 2006-12-10 20:21:38 UTC
So far so good. I can use S3 and S4 states in various combinations under
different cpu workloads without any negative impact on a fan subsystem. Now the
fan works as it is expected.

The only minor 'but' is that I can still notice the very same call trace in
dmesg output produced by the might_sleep() macro.  I'm not in position to decide
whether this is indicative of some bug, or someone simply left a macro call for
debugging purposes. 
Comment 6 Thomas Renninger 2006-12-12 08:54:23 UTC
Sorry I don't get it. Why is this:
+	if (state == ACPI_STATE_D0) {
+		acpi_bus_set_power(device->handle, ACPI_STATE_D3);
+	} else {
+		acpi_bus_set_power(device->handle, ACPI_STATE_D0);
+	}
 	acpi_bus_set_power(device->handle, state);

in acpi_fan_add and acpi_fan_resume?
I understand that it makes sense to either set it to D0 or D3 instead of 
setting the state from bus_get_power(..), to avoid a D1/D2 state the BIOS could 
have set the fan while suspending or whatever might confuse later.
But why again calling acpi_bus_set_power(device->handle, state);?

I also wonder why bus_set_power needs to be called at all?
If state and kernel power variables are out of sync, shouldn't there something 
be like check_power() in power.c, similar as in thermal.c which gets invoked at 
resume (or notification, or whenever out of sync can happen)?
Comment 7 Konstantin Karasyov 2006-12-13 06:45:39 UTC
1. acpi_bus_set_power() has the check if the device already in the target 
state, so we cannot set it to that state directly.
2. acpi_power_transition() function executes _ON methods for power resources 
associated with target device power state, and then executes _OFF methods for 
power resources associated with current device power state, so if we skip the 
check in acpi_bus_set_power(), the power resources associated with the required 
power state will turned off.
3. Sometimes after resume the fan state reported doesn't match with the actual 
device state, so the only way to bring the consistency is to set the device in 
reported state.
4. The acpi_thermal_active() function uses ACPI_STATE_D0/ACPI_STATE_D3 to 
operate the fans, so I assumed that it's ok to use the same technique to bring 
the fans to appropriate state.
I'm agree that it's not very clean and safe approach, maybe we want to try to 
find the lowest power state available?

Comment 8 Konstantin Karasyov 2006-12-13 07:41:35 UTC
Created attachment 9802 [details]
hopefully more clean fix patch

Here is another patch to fix the issue. Works for me, and use only one call to
interpreter.

Alexander, could you try it also?

Thomas, does this patch looks better for you? Please take in account my
comments in the post above.
Comment 9 Aleksander Trofimowicz 2006-12-16 10:50:32 UTC
Works.

Excuse my ingnorance, but what is so dangerous in calling acpi_bus_set_power()
twice in a row? I skimmed through a body of the function and haven't noticed
anything special.
Comment 10 Konstantin Karasyov 2006-12-17 10:12:11 UTC
Thanks for reply.

The code executes much slower in the interpreter, so if there is possibility 
to avoid or reduce it, better to do so.

Well, I'm closing this bug, please re-open, if there are any problems.
Comment 11 Marian Klein 2007-01-07 13:27:07 UTC
Created attachment 10023 [details]
the same patch against kernel 2.6.20-rc3 (possibly for all kernels since 2.6.19)

the same patch against kernel 2.6.20-rc3 ( and possibly for all kernels since
2.6.19)
the patch is on top of patch 10022 from bug 7122
http://bugzilla.kernel.org/attachment.cgi?id=10022&action=view
Comment 12 Marian Klein 2007-01-07 13:28:39 UTC
Created attachment 10024 [details]
the same patch against kernel 2.6.20-rc3 (possibly for all kernels since 2.6.19)

the same patch against kernel 2.6.20-rc3 ( and possibly for all kernels since
2.6.19)
the patch is on top of patch 10022 from bug 7122
http://bugzilla.kernel.org/attachment.cgi?id=10022&action=view
Comment 13 Len Brown 2007-02-15 23:25:29 UTC
patch from comment #8, as refreshed in comment #12 applied to acpi-test
Comment 14 Konstantin Karasyov 2007-02-20 08:39:57 UTC
Created attachment 10472 [details]
Additional fix for the same issue

Here is the update on the issue for the latest kernel.
Comment 15 Len Brown 2007-02-20 23:09:56 UTC
patch from comment #8, as refreshed in comment #12 shipped in Linux-2.6.21-rc1
patch from comment #14 applied to acpi-test
Comment 16 Len Brown 2007-03-10 21:27:29 UTC
patch from comment #14 shipped in 2.6.21-rc3-git6
closed