Bug 5000
Summary: | fan always on after waking from swsusp | ||
---|---|---|---|
Product: | ACPI | Reporter: | Sanjoy Mahajan (sanjoy) |
Component: | Power-Fan | Assignee: | Konstantin Karasyov (konstantin.karasyov) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, nigel.cunningham, pavel |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.13-rc6 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
syslog msgs with acpi_debug=0x1F
adding suspend/resume functionality
fan implementation cleanup
Suspend/resume support for fan device
suspend/resume functionalty w/ additions from Patrick Mochel
updated patch
debug patch
clean patch |
Description
Sanjoy Mahajan
2005-08-04 23:45:10 UTC
If you build the ACPI fan support as a module and unload it while suspending, does it help at all?
Regards,
Nigel

Created attachment 5517 [details]
syslog msgs with acpi_debug=0x1F
A good idea, which I just tried. Alas, unloading the fan module doesn't help.
I've attached the syslog msgs with acpi_debug=0x1F
Just tried with 2.6.13-rc5-git3 (with unloading and reloading the fan module). Almost the same problem as with -git2: now it comes back with the fan always off. Like before, /proc/acpi/fan/FN*/state says they are all on. 'acpi -t' says thermal zone 2 is doing active cooling:
# acpi -t
Battery 1: charged, 100%
Battery 2: charged, 100%
Thermal 1: ok, 42.0 degrees C
Thermal 2: active[0], 39.0 degrees C
Thermal 3: ok, 33.0 degrees C
Thermal 4: ok, 35.0 degrees C
It should be doing active cooling, but it isn't. So -git3 is quieter but more dangerous than -git2 (which has the fan always on).

You may want to try echoing 0 or 3 (with 'echo -n') into the fan state files to force the fan on or off. Also, the fan module lacks error checking; if an AML call fails, nothing is printed.

(I changed the kernel version to -rc6 since it happens there too.) I tried echo 3 > fan_state_file and that turned off the fan. After that the fan turns off and on automatically as it should. Should I add some debug lines to fan.c to find out what is going on?

Well, fan.c lacks suspend/resume support. Hint hint :-). Put the fan at full speed in the _suspend() hook, and make the hardware put the fan back into a sane state in the _resume() hook.

Sanjoy,
If it is still an issue, could you try the following patches, which add suspend/resume support for ACPI devices? First, apply 'common_suspend_resume.patch' to add suspend/resume functionality to the ACPI subsystem. Next, to add the suspend/resume implementation for the fan, apply 'fan_cleanup.patch' followed by 'fan_suspend_resume.patch'.

Created attachment 8048 [details]
adding suspend/resume functionality
This patch adds suspend/resume functionality to ACPI devices.
Created attachment 8049 [details]
fan implementation cleanup
Cleans up ACPI fan implementation.
Created attachment 8050 [details]
Suspend/resume support for fan device
This patch implements suspend/resume support for fan.
Variable naming is "interesting", but it looks okay to me. Can you post patches to lkml for review?

Created attachment 8086 [details]
suspend/resume functionalty w/ additions from Patrick Mochel
Avoiding sysdevs use, more safe, some clean-ups and debugging.
Applied patches in comment #10 and comment #12 to acpi-test tree. Did not apply clean-up patch in comment #9 -- please e-mail me this cleanup later vs. early 2.6.18.

I just applied the patches (using 2.6.16-rc5). It still has those problems, this time of the dangerous variety (fan doesn't turn on when it should). I hibernated (swsusp) without unloading fan. After resuming I ran a few CPU and disk intensive processes to drive up the temperature, and it got high:
$ acpi -t
Battery 1: charged, 99%
Battery 2: charging, 78%, 00:29:22 until charged
Thermal 1: ok, 89.0 degrees C
Thermal 2: active[0], 62.0 degrees C
Thermal 3: ok, 35.0 degrees C
Thermal 4: ok, 39.0 degrees C
The trip point for thermal 2 is 45 C. Despite what 'acpi' reports in thermal zone 2 ('active[0]'), the fan was not on. I had to turn it on by hand by doing 'echo 0' to the fan's state file.

Does your machine use the "smbus unhide" hack? If so, try disabling it (and open a new bug report).

> does your machine use "smbus unhide" hack?
I don't think so. I don't know what that hack is, but a google search
for it along with 'thinkpad' did not turn up my machine (TP 600X).
The change also seems to confuse S3 sleep/wake, or maybe it's as confused as it ever was. I woke it up from S3 and then noticed that the fan is on even though the thermal system thinks the fan is off:
$ acpi -t
Battery 1: charged, 95%
Battery 2: discharging, 74%, 01:07:19 remaining
Thermal 1: ok, 33.0 degrees C
Thermal 2: ok, 32.0 degrees C /* trip point is 45C */
Thermal 3: ok, 28.0 degrees C
Thermal 4: ok, 30.0 degrees C
But the fan module knows that the fan is on:
# cat /proc/acpi/fan/FN20/*
status: on
I can probably get it into a correct state by changing the THM2 trip point to 27C, so the thermal system will then agree with the actual fan state (i.e. that it's on), then change it back to 45C.

Okay, I did that and the fan is now off in reality and according to 'acpi -t' and according to fan/FN20/.

Created attachment 8362 [details]
updated patch
Sanjoy,
Could you try this patch - it is against 2.6.17
This patch updates thermal resume method to reset fan states.
It worked for me.
> Could you try this patch - it is against 2.6.17
I just tried it with no luck. I hibernated (swsusp) and it came back
with the fan off, but in a state I've never seen before:
$ acpi -t
Thermal 1: active[0], 45.0 degrees C
Thermal 2: active[0], 43.0 degrees C
Thermal 3: active[0], 33.0 degrees C
Thermal 4: active[0], 35.0 degrees C
Usually I see active[0] only for Thermal 2, but the change may be due
to improvements in 2.6.17's ACPI relative to 2.6.16. Either way, the
problem is that the fan is off but the system thinks it's on, so the
fan is unlikely to be turned on. (If the actual temperature drops below
the trip point, then it'll be okay because the assumed and real states
will match again.)
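
The mismatch described above can be illustrated with a toy model. This is only a sketch in Python, not the kernel's code: the names `Fan`, `ThermalZone`, `check` and `resume_reset` are hypothetical, and it assumes the driver only commands the fan when its cached "enabled" flag changes. Under that assumption, a stale flag after resume means no command is ever sent, while a resume step that ignores the cached flag brings the hardware back in line.

```python
class Fan:
    """Stand-in for the physical fan; `running` is the real hardware state."""

    def __init__(self):
        self.running = False


class ThermalZone:
    """Toy thermal zone with one active trip point and a cached fan flag."""

    def __init__(self, fan, trip_c):
        self.fan = fan
        self.trip_c = trip_c
        self.enabled = False  # what the driver believes about the fan

    def check(self, temp_c):
        """Command the fan only when the desired state differs from the cache."""
        want = temp_c >= self.trip_c
        if want != self.enabled:
            self.fan.running = want
            self.enabled = want

    def resume_reset(self, temp_c):
        """Hypothetical resume step: ignore the cache and drive the hardware."""
        want = temp_c >= self.trip_c
        self.fan.running = want
        self.enabled = want


def demo():
    fan = Fan()
    zone = ThermalZone(fan, trip_c=45.0)
    zone.check(50.0)          # above trip point: fan turned on, flag cached
    fan.running = False       # simulate resume: hardware state lost, cache stale
    zone.check(62.0)          # cache already says "on", so nothing is sent
    stale = fan.running       # fan is still off although the zone is "active"
    zone.resume_reset(62.0)   # forced re-evaluation fixes the hardware state
    return stale, fan.running


if __name__ == "__main__":
    print(demo())
```

In this model the manual workarounds reported earlier (writing to the fan's state file, or moving the trip point and back) correspond to forcing a state transition by hand.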
patch versions in comment #13 shipped in 2.6.17-git9

Created attachment 8438 [details]
debug patch
Sanjoy,
Could you try this patch - it adds debug prints to suspend/resume and
updates thermal zone structures during resume routine. It applies over
the patch #8362 (the last patch I've posted).
After trying this patch could you also check 'dmesg' output - it
should contain strings similar to the following:
..........................
!!! 0 active[0]: trip 3282 temp 3212 state disabled
..........................
!!! 1 active[0]: trip 3282 temp 3242 state disabled
!!! 2 active[0]: trip 3282 temp 3282 state enabled
!!! 3 active[0]: trip 3282 temp 3242 state disabled
..........................
Each string shows the info for a particular trip point (trip point
temperature, current temperature, whether this trip point is entered
or not) at different stages.
String starting with '0' - from suspend method,
'1' - before acpi_thermal_active() is called for this trip point,
'2' - after acpi_thermal_active() has exited,
'3' - after acpi_thermal_check() has been called for the whole thermal zone.
If you note some inconsistencies between 'dmesg' and 'acpi -t', fan
behavior, etc, could you post 'dmesg' output and describe the
situation.
The system I'm using for testing has only 1 active trip point, so I
cannot validate all possible situations, so your help would be very
important.
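
To compare the debug strings with 'acpi -t', a small script can pull them out of the 'dmesg' output. The following is a rough sketch, assuming the lines look exactly like the sample above; the function names `parse_debug_lines` and `decikelvin_to_celsius` are made up for illustration. It also assumes the trip and temp values are in tenths of a Kelvin (the unit ACPI uses for temperatures) and that a trip point counts as entered when temp >= trip; neither is confirmed by the patch description.

```python
import re

# Matches lines such as: "!!! 2 active[0]: trip 3282 temp 3282 state enabled"
DEBUG_LINE = re.compile(
    r"!!!\s+(?P<stage>\d+)\s+active\[(?P<index>\d+)\]:\s+"
    r"trip\s+(?P<trip>\d+)\s+temp\s+(?P<temp>\d+)\s+"
    r"state\s+(?P<state>enabled|disabled)"
)

# Stage meanings as described in the comment above.
STAGES = {
    0: "suspend method",
    1: "before acpi_thermal_active()",
    2: "after acpi_thermal_active()",
    3: "after acpi_thermal_check()",
}


def decikelvin_to_celsius(value):
    """Convert tenths of a Kelvin to degrees Celsius (assumed unit)."""
    return (value - 2732) / 10.0


def parse_debug_lines(text):
    """Return one dict per debug line found in `text`."""
    records = []
    for match in DEBUG_LINE.finditer(text):
        trip = int(match.group("trip"))
        temp = int(match.group("temp"))
        state = match.group("state")
        stage = int(match.group("stage"))
        records.append({
            "stage": STAGES.get(stage, "unknown"),
            "index": int(match.group("index")),
            "trip_c": decikelvin_to_celsius(trip),
            "temp_c": decikelvin_to_celsius(temp),
            "state": state,
            # Assumption: "enabled" should hold exactly when temp >= trip.
            "consistent": (temp >= trip) == (state == "enabled"),
        })
    return records
```

Feeding the saved 'dmesg' text to `parse_debug_lines` and looking for records whose `consistent` field is false gives a quick first pass before comparing against 'acpi -t' by hand.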
> updates thermal zone structures during resume routine
That might have fixed it. With the new patch, the fan is behaving fine after resume from swsusp. (I haven't tested it with S3 suspend because the vanilla kernel needs extensive hacks to avoid hanging in _PTS -- a.k.a. bug #5989.)
Now after swsusp resume, when the fan is running only one zone (THM2) is active, which is the usual behavior. And the 'acpi -t' output is consistent with the fan state. All four zones have an active trip point, but I think they turn on the same physical fan, and I've never seen the others on (except when testing the previous patch!).
I'll keep running this kernel and let you know if the fan has any problems.

Marking as RESOLVED since there is a patch under test/review & consideration for pushing upstream.

Created attachment 8493 [details]
clean patch
Here is the clean patch (debug messages removed, fan functionality updated)
against 2.6.17. It replaces patches ##8362, 8438.
Through the wonders of open source, a derivative of the debug patch in comment #21 addressing the thermal.c part of this problem has made it into Linus' tree:
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=bed936f7eab946c60170bc92a1aea597da158e02
Thus, the cleaned up patch in comment #24 that also addresses the fan problem will no longer apply.

As the submitter is satisfied, this bug report is closed.

Konstantin, if there are additional issues that are not addressed by the commit above, they will have to be addressed elsewhere on top of the commit above. |