Bug 58301
Summary: | When resuming after suspend, HP 2510p turns on fan full blast and never turns it off | |
---|---|---|---
Product: | Power Management | Reporter: | Jake Edge (jake)
Component: | Thermal | Assignee: | Zhang Rui (rui.zhang)
Status: | CLOSED MOVED | |
Severity: | normal | CC: | aaron.lu, auxsvr, gojrzan, lenb, me, micgro2, rjw
Priority: | P1 | |
Hardware: | All | |
OS: | Linux | |
Kernel Version: | 3.7-3.9 | Subsystem: |
Regression: | Yes | Bisected commit-id: |

Attachments:
output from grep + sensors after resume
ll /sys/class/thermal/thermal_zone*/
debug patch to check cooling state transition in step_wise governor
dmesg from boot through resume
debug patch to check cooling state transition in step_wise governor - v2
patch: change cooling device state based on cached value instead of real state
Patch 1/4
patch 2/4
patch 3/4
patch 4/4
Description
Jake Edge
2013-05-16 02:55:45 UTC
please attach the output of "ll /sys/class/thermal/thermal_zone*/" when the bug occurs as well.
Created attachment 101741
ll /sys/class/thermal/thermal_zone*/
after the problem occurs (after resume, fan comes on full)
Created attachment 101971
debug patch to check cooling state transition in step_wise governor
please apply this patch and attach the dmesg output after the problem occurs after resume.
Created attachment 101981
dmesg from boot through resume
The patch does not apply to 3.9 (no thermal_core.c), but I applied it to thermal_sys.c successfully. Here is the dmesg from boot through the sleep (around 60s in) and after the resume.
Okay, I think I've found the problem.
First, the ACPI fan driver turns on all fans during suspend, and during resume it updates the power state, i.e. makes sure they are on again.
Here is how the problem occurs: before suspend, the fan is off, so the thermal framework believes the fan is off. After resume, there is a thermal notification and the thermal framework starts to update the thermal zones. But the temperature returned is equal to the last temperature captured, which was before suspend, so the thermal trend is stable and the thermal core keeps the fan state as it is. Without suspend/resume, the thermal framework would keep the fan in the OFF state, but after suspend/resume it keeps the fan in the ON state...
Please apply the refreshed debug patch and the fix patch attached below.
Created attachment 102011
debug patch to check cooling state transition in step_wise governor - v2
please apply this refreshed debug patch instead.
Created attachment 102021
patch: change cooling device state based on cached value instead of real state
please try this patch which I think should fix the problem for you.
Note that I'm still thinking about a proper solution for this problem, so this is probably not the real fix targeted for upstream.
But anyway, please check if it helps or not.
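The failure mode analysed above, and the idea behind the "cached value instead of real state" patch, can be illustrated with a small toy model. This is only a sketch of the behaviour as described in this thread, not the kernel's step_wise governor code: the class and function names (`Fan`, `ToyThermalCore`, `suspend_resume`, `fan_after_resume`) are hypothetical, and the rule "rising temperature turns the fan on, dropping temperature turns it off" is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Fan:
    """Hypothetical stand-in for one ACPI fan cooling device."""
    real_state: int = 0  # what the hardware is doing: 0 = off, 1 = on


@dataclass
class ToyThermalCore:
    """Toy model of the behaviour described in this bug, not kernel code."""
    fan: Fan
    cached_state: int = 0  # last state the framework itself requested
    last_temp: int = 45    # last temperature the framework saw

    def update(self, temp: int, use_cached: bool) -> int:
        # Choose the starting point for the decision: the framework's own
        # cached value, or whatever the device currently reports.
        current = self.cached_state if use_cached else self.fan.real_state
        if temp > self.last_temp:
            target = 1        # rising trend (simplified assumption)
        elif temp < self.last_temp:
            target = 0        # dropping trend (simplified assumption)
        else:
            target = current  # stable trend: keep the state as it is
        self.last_temp = temp
        self.cached_state = target
        self.fan.real_state = target
        return target


def suspend_resume(fan: Fan) -> None:
    # Per the analysis above, the fan driver turns the fan on across
    # suspend/resume without the thermal framework knowing about it.
    fan.real_state = 1


def fan_after_resume(use_cached: bool) -> int:
    fan = Fan(real_state=0)
    core = ToyThermalCore(fan=fan, cached_state=0, last_temp=45)
    suspend_resume(fan)
    # Same temperature as before suspend, so the trend is stable.
    return core.update(temp=45, use_cached=use_cached)


if __name__ == "__main__":
    print("decide from real state:  ", fan_after_resume(use_cached=False))
    print("decide from cached state:", fan_after_resume(use_cached=True))
```

In this model, deciding from the real state leaves the fan on after resume, while deciding from the cached value writes the pre-suspend OFF state back to the device.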
that fixed the problem. multiple suspend/resumes without the fans coming on full blast. also, temp6 is not at 100° as it had been. thanks!
Hi, Jake,
The 4 patches attached below are the ones I proposed to fix the problem for upstream. Can you please try them WITHOUT the patch in comment #7 to see if they help?
Created attachment 102621
Patch 1/4
Created attachment 102631
patch 2/4
Created attachment 102641
patch 3/4
Created attachment 102651
patch 4/4
please apply the patch at https://patchwork.kernel.org/patch/2633071/ on top of the four patches and see if they help.

Built 3.10-rc4, which exhibited the problem (no surprise) ... added the four patches above and the fan was on at a low level right after booting and stayed that way with a basically idle system. After suspend then resume, the fan was on at a higher level, but not at full blast as it has been in the past.
added the patch from patchwork: at boot time, no fans are running. after suspend/resume, the fan comes on briefly (at a low level) and then turns off. Seems like the last patch fixes things reasonably ...

Hi, Jake,
please also try this series on top of a clean upstream kernel, or just pull the thermal -next branch and see if the problem still exists.
https://patchwork.kernel.org/patch/2733361/
https://patchwork.kernel.org/patch/2733371/
https://patchwork.kernel.org/patch/2733391/
https://patchwork.kernel.org/patch/2733401/
https://patchwork.kernel.org/patch/2733411/
https://patchwork.kernel.org/patch/2733421/

I built 3.10-rc7 and confirmed that it still has the problem (no surprise), though it does have the fan on at a low level after booting, but it seems to turn off the fan after a minute or two. I then added these 6 patches, booted, slept, resumed and the fan came on full blast :( So these patches don't fix the problem for me. What info do you need?
thanks

I've experienced this bug since just after switching to 3.7.x. After resume, all Fan-type cooling devices' states were set to 1; however, setting them to 0 caused overheating because the fan never turned on again. Since 3.10.0 (and its RCs) it has got a little better. Most of the cooling devices are properly set after resume, except two, which still put the fan at full speed.
I don't know whether it's relevant, but they have different /sys/devices/virtual/thermal/cooling_deviceX/device/path values than the other five: _TZ_.C3B1 and _TZ_.C3B2, while the five which now work properly have values from _TZ_.C3C8 to _TZ_.C3CC.
Since the sensor labeled temp6 seems to report not the actual temperature but the fan speed, I came up with these values:

coolingdevice | path | temp6_value | cur_state after resume on 3.10
---|---|---|---
0 | \_TZ_.C3B1 | 100 | 1
1 | \_TZ_.C3B2 | 70 | 1
2 | \_TZ_.C3C8 | 100 | 0 or 1 if still needed
3 | \_TZ_.C3C9 | 90 | 0 or 1 if still needed
4 | \_TZ_.C3CA | 70 | 0 or 1 if still needed
5 | \_TZ_.C3CB | 50 | 0 or 1 if still needed
6 | \_TZ_.C3CC | 30 | 0 or 1 if still needed

_TZ_.C3B1 and _TZ_.C3B2 seem to put the fan at the same speeds as _TZ_.C3C8 and _TZ_.C3CA do. I've checked on 3.6.11 and the _TZ_.C3B's are never used, even at 100% cpu load with the machine stuffed under a pillow.

The bug is also in 3.11 rc3

Hi Rui, What's the status of the patches?

Any update on this? I'm forced to stick to an old kernel because of this bug and the patches do not apply to any kernel that openSUSE supports.

Can you please try 3.15-rc4? We've made a change to the ACPI fan driver that may affect this.

Well, it's better, but not completely fixed I would say. I booted 3.15-rc4 and the fan came on at a low level (which it shouldn't in my opinion). But, then I slept it, and resumed -- the fan came back on at the low level and fairly quickly slowed down and stopped. I slept it again and got the same behavior. So I don't think it should come on at boot (and stay on ... I gave it a few minutes but it never shut down). FWIW, it comes on as soon as power is applied, before it even gets out of the BIOS ... so maybe Linux just needs to turn it off at boot time (if it isn't too hot) ...
here is the fan status (which, interestingly doesn't change) and sensors output before and after:

[root@ouzel talks]# cat /sys/bus/acpi/drivers/fan/PNP0C0B\:0?/thermal_cooling/cur_state
0
0
0
0
0
0
1
[root@ouzel talks]# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +25.0°C (crit = +70.0°C)
temp2: +50.0°C (crit = +256.0°C)
temp3: +51.0°C (crit = +110.0°C)
temp4: +39.0°C (crit = +105.0°C)
temp5: +28.1°C (crit = +110.0°C)
temp6: +30.0°C (crit = +110.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +47.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +48.0°C (high = +100.0°C, crit = +100.0°C)

<sleep - resume >

[root@ouzel talks]# cat /sys/bus/acpi/drivers/fan/PNP0C0B\:0?/thermal_cooling/cur_state
0
0
0
0
0
0
1
[root@ouzel talks]# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +25.0°C (crit = +70.0°C)
temp2: +50.0°C (crit = +256.0°C)
temp3: +50.0°C (crit = +110.0°C)
temp4: +38.0°C (crit = +105.0°C)
temp5: +28.2°C (crit = +110.0°C)
temp6: +30.0°C (crit = +110.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +46.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +48.0°C (high = +100.0°C, crit = +100.0°C)

what other info do you need?

Well, so the behavior seems to have changed with respect to the bug subject and description. I wonder if we should continue debugging it here or open a new bug?

I'm open to most anything, though I don't really use that laptop any more (except rarely for scanning, since Fedora seems to have broken that in 19 and 20, sigh, but I digress) ... want me to open a new bug? or is there an existing bug for the fan not being turned off at boot? or we could just drop it ...
thanks,
jake

Well, depending on how much effort you're willing to spend on that. :-) It will involve getting some debug info from that box and trying to figure out what's wrong with it and that may be a couple of things ...

Well, I am certainly willing to do some debugging, so I filed another: bug #75741 ... hopefully tracking this down will help others since I don't use that laptop much any more ...

OK, I'll mark this one as resolved. Hopefully, it won't regress again suspend-wise.
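
For anyone reproducing the before/after comparisons made in this thread, the per-device state can be collected with a short script rather than by hand. The following is a minimal sketch, assuming the standard sysfs layout under /sys/class/thermal (the `type` and `cur_state` files, plus the `device/path` file mentioned in the comments above, which may be absent for non-ACPI devices). The function names `snapshot` and `changed` are made up for illustration.

```python
from pathlib import Path
from typing import Dict, List, Optional

Snapshot = Dict[str, Dict[str, Optional[str]]]


def read_text(path: Path) -> Optional[str]:
    """Return the stripped file contents, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def snapshot(root: str = "/sys/class/thermal") -> Snapshot:
    """Collect type, ACPI path and cur_state for every cooling device."""
    result: Snapshot = {}
    # Note: sorting is lexicographic, so cooling_device10 sorts before
    # cooling_device2.
    for dev in sorted(Path(root).glob("cooling_device*")):
        result[dev.name] = {
            "type": read_text(dev / "type"),
            "path": read_text(dev / "device" / "path"),
            "cur_state": read_text(dev / "cur_state"),
        }
    return result


def changed(before: Snapshot, after: Snapshot) -> List[str]:
    """Names of devices whose cur_state differs between two snapshots."""
    return [
        name
        for name in sorted(before)
        if name in after
        and before[name]["cur_state"] != after[name]["cur_state"]
    ]


if __name__ == "__main__":
    # Print one line per cooling device; prints nothing if none are found.
    for name, info in snapshot().items():
        print(name, info["type"], info["path"], info["cur_state"])
```

Running it once before suspend and once after resume, then comparing the two outputs (or passing two saved snapshots to `changed`), shows which cooling devices were left in a different state.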