Bug 9169
| Summary: | Fan not working after resume from hibernate but ok after s2ram - Toshiba Satellite U305-S5077 | | |
|---|---|---|---|
| Product: | Power Management | Reporter: | Romano Giannetti (romano.giannetti) |
| Component: | Hibernation/Suspend | Assignee: | power-management_other |
| Status: | REJECTED INSUFFICIENT_DATA | | |
| Severity: | normal | CC: | acpi-bugzilla, astarikovskiy, romano.giannetti, rui.zhang |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.23 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 7216 | | |
| Attachments: | Syslogs for all the tested cases (reply #10); 2.6.25-rc1 suspend/hibernation bugfix; acpidump | | |
Description
Romano Giannetti
2007-10-15 13:42:24 UTC
As a new data point: the same happens with 2.6.24-rc2. I had to reboot after 5 minutes; the temperature reached 70 C without any fan working. Suspend to RAM is ok, only suspend to disk creates this problem.

Can you check if hibernation in the "shutdown" mode causes this problem too?

Tried it (echo shutdown > /sys/power/disk && echo disk > /sys/power/state). The result is different but still not satisfactory... the fan stays on all the time at the lower speed. Puzzled. As a side note, I had to manually run the 915resolution script and restart X11 after the resume, because the display resumed at the wrong resolution (this laptop needs -p -m in the s2ram configuration).

Is it possible to arrange things in such a way that the ACPI thermal and processor modules won't be loaded by the boot kernel (i.e. the one that loads the hibernation image)?

Hmmm... no idea. I think that the kernel that loads the hibernation image must be the same as the hibernated one; I suspect it would be a massive hack on the boot scripts... I will try to understand a bit better the userspace part of the hibernation/resume procedure in Ubuntu. If you have any pointers, please tell me. Thanks.

On openSUSE that depends on which modules are loaded from the initrd. If the processor and thermal modules are not present in the initrd, they are only loaded afterwards, when the regular root is mounted. However, since resume happens before mounting the root directory, they are not loaded by the boot kernel. Besides, is the system i386 or x86-64?

It is an i386, CoreDuo, running (now) 2.6.24-rc5. Which kind of information do you need? There is more data on the ACPI of this system in the (resolved) bug#9327.

On x86-64 the boot kernel may be different from the hibernated one and that could be helpful. Never mind. Can you test 2.6.24-rc6 with the patches from: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/ applied, please?
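For anyone repeating the two hibernation tests discussed above, here is a minimal sketch of the sysfs writes involved, mirroring the quoted `echo shutdown > /sys/power/disk && echo disk > /sys/power/state`. The helper name `hibernate_steps`, the `dry_run` flag and the restriction to the two modes mentioned in this report are assumptions for illustration, not part of any existing tool. By default it only lists the writes; actually performing them needs root and will hibernate the machine.

```python
from typing import List, Tuple

# Only the two modes tested in this report; the kernel may offer more.
MODES_FROM_THIS_REPORT = ("platform", "shutdown")


def hibernate_steps(mode: str,
                    sysfs_root: str = "/sys/power",
                    dry_run: bool = True) -> List[Tuple[str, str]]:
    """Return the (file, value) writes for one hibernation cycle.

    With dry_run=False the values are really written, in order:
    first the hibernation mode, then the request to hibernate.
    """
    if mode not in MODES_FROM_THIS_REPORT:
        raise ValueError("mode not covered by this sketch: %r" % mode)
    steps = [
        (sysfs_root + "/disk", mode),     # select the hibernation mode
        (sysfs_root + "/state", "disk"),  # start suspend to disk
    ]
    if not dry_run:
        for path, value in steps:
            with open(path, "w") as handle:
                handle.write(value + "\n")
    return steps


if __name__ == "__main__":
    for path, value in hibernate_steps("shutdown"):
        print("echo %s > %s" % (value, path))
```

Keeping the dry run as the default makes it harder to hibernate a box by accident while collecting logs.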
Reply-To: romanol@upcomillas.es

Sent by mail, I hope it will appear on http://bugzilla.kernel.org/show_bug.cgi?id=9169

First of all, sorry for not answering via bugzilla, but I am off-line now and I do not know how to use bugzilla via mail. When back to work I will be swamped by real life, so I have used this bit of time to test your patches from http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/

The behaviour of the laptop is quite complex, so I will explain what happened step by step. I have PM debug on, so I will attach the kernel messages of every step. I compiled the kernel with the attached config.

After boot, the fan works ok. (It goes off when the temperature is under 49 C, runs quietly between 52 C and somewhere around 70 C, and goes to top speed above that. Temperature tops out around 75 C with both cores at 100% compiling kernels.)

I hibernated the computer with the "shutdown" method. On resume the fan stayed on all the time at a quiet level; my suspicion is that it is more or less frozen in the state it had at the moment of hibernation. See syslog-*-shutdown.txt.gz attached.

After that, I hibernated again in platform mode (without rebooting first; I will try next with a reboot and a fresh platform hibernation). Nothing changed (syslog-*-platform-*).

After that, I did a suspend-to-ram cycle (syslog-*-suspend*) and on resume the fan went off and stayed off, even when the temperature reached 80 C. Notice that if I just use suspend to ram from a fresh reboot the fans work perfectly.

The behaviour does not change if I use only the "platform" hibernate method. The log of this hibernation cycle after a fresh reboot is in the file syslog-rc6-test-rafael-only-platform.txt.gz.
Notice that there is a warning from lockdep:

hibernate.sh/7392 is trying to acquire lock:
[ 1459.153489] (events){--..}, at: [cleanup_workqueue_thread+16/112] cleanup_workqueue_thread+0x10/0x70
[ 1459.153499]
[ 1459.153500] but task is already holding lock:
[ 1459.153502] (workqueue_mutex){--..}, at: [workqueue_cpu_callback+235/320] workqueue_cpu_callback+0xeb/0x140
[ 1459.153507]
[ 1459.153508] which lock already depends on the new lock.

but I do not know if it's relevant. To sum up, on this laptop hibernation is a very dangerous thing to do :-).

Thanks again,
Romano

An openSUSE user reports similar symptoms: https://bugzilla.novell.com/show_bug.cgi?id=336538

Created attachment 14449
Syslogs for all the tested cases (reply #10)
I noticed only now that using bugzilla via mail did not attach any file... Here are the files cited in reply #10.
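To make the failure cases in reply #10 easier to compare against logs, here is a rough sketch of the fan behaviour reported after a fresh boot (off below 49 C, quiet from 52 C to somewhere around 70 C, top speed above that). The function names and the exact 70 C cut-off are assumptions; the report only gives approximate figures, and nothing here reflects how the firmware really decides.

```python
from typing import List, Tuple

OFF_BELOW_C = 49   # fan reported off under this temperature
LOW_FROM_C = 52    # fan reported on, quiet, from this temperature
HIGH_ABOVE_C = 70  # "somewhere around 70" in the report; approximate


def expected_fan_level(temp_c: float, previous: str = "off") -> str:
    """Fan level the healthy (fresh boot) system would be expected to show."""
    if temp_c > HIGH_ABOVE_C:
        return "high"
    if temp_c >= LOW_FROM_C:
        return "low"
    if temp_c < OFF_BELOW_C:
        return "off"
    # Between 49 C and 52 C: assume hysteresis, keep running if already on.
    return "low" if previous in ("low", "high") else "off"


def mismatches(samples: List[Tuple[float, str]]) -> List[Tuple[float, str, str]]:
    """Return (temp, observed, expected) for every sample that looks wrong."""
    previous = "off"
    bad = []
    for temp_c, observed in samples:
        expected = expected_fan_level(temp_c, previous)
        if observed != expected:
            bad.append((temp_c, observed, expected))
        previous = expected
    return bad


if __name__ == "__main__":
    # Hand-made samples resembling the s2ram-after-hibernation case.
    print(mismatches([(45, "off"), (60, "low"), (80, "off")]))
```

The sample readings in the usage lines are made up for illustration; real ones would have to be noted by hand, since this laptop exposes no fan state through ACPI.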
(In reply to comment #11)
> an opensuse user reports similar symptoms:
> https://bugzilla.novell.com/show_bug.cgi?id=336538
>

Yes, it's similar. Unfortunately using the shutdown method does not work for me... Following the hint in the openSUSE report, I tried to unload/load the thermal driver when the system failed to start the fan. It didn't work. acpi -V gives for thermal "ok, <temp> degrees C" every time, even on a freshly booted system where the fan is working ok (i.e., I never see the active[0] message cited there).

From the openSUSE report, which is solved now:

I think the active[0] message is related to the particular configuration of my system. In the /proc/acpi/thermal_zone/THRM/trip_points file I can see

critical (S5): 110 C
passive: 105 C: tc1=2 tc2=10 tsp=100 devices=CPU1
active[0]: 60 C: devices=FN00

so I assume that the active[0] message comes from here. On the other hand, I solved the problem by updating to the kernel 2.6.24-rc6-git11-3. The fan worked again flawlessly. Maybe you should try this one. And be careful with udev during the update!!! You can have a look at the openSUSE thread if you want more info. Hope it helps.

Tried with v2.6.24-rc7-71-gfd0b45d, no joy. BTW, my /proc/acpi/fan is empty, and

(0)rukbat:/usr/src/linux-2.6% cat /proc/acpi/thermal_zone/THRM/trip_points
critical (S5): 104 C
passive: 104 C: tc1=2 tc2=3 tsp=40 devices=CPU0

Tried to suspend to disk with 2.6.25-rc1, platform method. Fans are working now. I will try again to rule out user error, but the problem seems fixed.

Created attachment 14840
2.6.25-rc1 suspend/hibernation bugfix
Please test 2.6.25-rc1 with the attached patch that fixes a serious suspend/hibernation bug in it.
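The two trip_points listings quoted earlier differ in one important way: the openSUSE system has an active[0] trip point bound to a fan device (FN00), while this Toshiba only has critical and passive entries. Below is a small sketch for extracting the trip temperatures from such a listing; the function names are hypothetical and the regular expression is only fitted to the two samples shown in this report.

```python
import re
from typing import Dict

# Matches e.g. "critical (S5): 110 C", "passive: 105 C", "active[0]: 60 C".
TRIP_RE = re.compile(r"(critical|passive|active\[\d+\])[^:]*:\s*(\d+)\s*C")


def parse_trip_points(text: str) -> Dict[str, int]:
    """Map trip point name to temperature in degrees C."""
    return {name: int(temp) for name, temp in TRIP_RE.findall(text)}


def has_active_trip(text: str) -> bool:
    """True if the thermal zone lists at least one active (fan) trip point."""
    return any(name.startswith("active") for name in parse_trip_points(text))


if __name__ == "__main__":
    opensuse = ("critical (S5): 110 C\n"
                "passive: 105 C: tc1=2 tc2=10 tsp=100 devices=CPU1\n"
                "active[0]: 60 C: devices=FN00\n")
    toshiba = ("critical (S5): 104 C\n"
               "passive: 104 C: tc1=2 tc2=3 tsp=40 devices=CPU0\n")
    print(parse_trip_points(opensuse), has_active_trip(opensuse))
    print(parse_trip_points(toshiba), has_active_trip(toshiba))
```

If no active trip point is listed, the ACPI thermal zone has no fan to switch, which would fit the empty /proc/acpi/fan reported above.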
Will try... but suspend/hibernation works for me in 2.6.25-rc1 quite well. Or at least it seems to work.

The bug need not affect your box.

Hi, Rafael, what's the status of this bug? :)

I will try again (I do not use hibernation any more, now that suspend works flawlessly for me, and in a few seconds). But as for comment #17, it seems fixed. Will report later today.

No, unfortunately, the bug is still here. Temp at 70 C and fans at low speed, exactly as they were at resume from hibernation (echo disk > /sys/power/state, platform method). I need to reboot. BTW, the resume left my X (intel) in a bad resolution; I had to run 915resolution by hand and switch from X11 to a console and back to get it right. Suspend to RAM continues to work ok.

Still here at 2.6.26-rc3. Exactly as in the previous comment. Rafael, the status is still NEED_INFO, what kind of info is needed? This is quite a dangerous bug.

Status changed to assigned, but I have no idea how to fix it. :-(

Thanks. I am available to test things, should you have an idea. Maybe the ACPI people could suggest something?

Could you please test the patch set from comment #11 to comment #13 in bugzilla #10223?

Will try. Stay tuned.

Nope. After resume, kernel compile: temp reached 76 C and the fan was still at minimum speed.

Hmm, please test the refreshed patch set.

Applied patches from bug#10223, does not compile:

drivers/acpi/sleep/main.c: In function ‘acpi_hibernation_free_nvs_pages’:
drivers/acpi/sleep/main.c:302: warning: passing argument 1 of ‘free_pages’ makes integer from pointer without a cast
drivers/acpi/sleep/main.c: At top level:
drivers/acpi/sleep/main.c:326: error: expected ‘}’ before ‘;’ token
drivers/acpi/sleep/main.c: In function ‘acpi_sleep_init’:
drivers/acpi/sleep/main.c:617: error: incompatible type for argument 1 of ‘register_pm_notifier’
make[3]: *** [drivers/acpi/sleep/main.o] Error 1
make[2]: *** [drivers/acpi/sleep] Error 2
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2

Oops,
wrong patch attached, please try the new one... thanks.

Retested, after the typo correction. It's still bad, but different now. After restore from hibernation, the fan is on at half speed even if the laptop is cold. And if the temperature goes up, the fan does not go to full speed. After that I did a s2ram cycle, and it seems that the fan is working again (at least now, with the laptop cold, it is not running).

Tried one time on 2.6.26-rc9: seems fixed. I restored from hibernation (no previous s2ram cycles) and fans are working. I hope it's not a glitch...

Hi, Romano
How about the current status of this bug? Does the problem still exist with the latest kernel (2.6.26-rc9)?

Hi, Rafael, it seems that the problem is fixed in the latest kernel. Can we close this bug now?

Sure, I see it's been closed already.

Hmmm. Happened again in 2.6.28.2. No fans after resume from disk. A s2ram cycle makes the thing work again. I am reopening this one, but I'm at a loss about it. It seems a real heisenbug, and added to the fact that I almost never use hibernation (which is really slow here) it's very difficult to pin down. So if you feel it's not so important, feel free to re-close it.

Hi, Romano
Do you mean that this issue happened again on kernel 2.6.28.2? Will you please attach the output of acpidump? Thanks.

Yes. It happened again. I didn't test it more (I really rarely use hibernation, STR is much more useful and fast), but if you want I can test it. I will attach acpidump to this message (made with my current kernel, which is 2.6.29-rc5; I did not test fans with hibernate with this kernel, will do it shortly).

Created attachment 20325
acpidump
Tested. Still no fan after resume from hibernation in 2.6.29-rc5. Temperature reached 75 C without any fan triggering. Trying to STR now...

After a STR cycle, the fans are working again. So the only thing needed is doing a STR cycle after the STD one...

Hi, Romano
Thanks for the acpidump. From the acpidump it seems that there is no ACPI fan device ("PNP0C0B"). And the fan device is not controlled by the toshiba_acpi driver, as the following ACPI object is missing:
>\_SB_.VALZ.GHCI
Maybe it is controlled by the BIOS. In that case the OS can do nothing about it.
Hi, Rui
How about rejecting this bug or assigning it to another category? Thanks.

Well, I'm not sure. But obviously the bug is not caused by the ACPI fan driver. Maybe the BIOS, or some platform-specific mechanism. Re-assigning to the suspend/hibernation category.

So it seems that 2.6.26-rc9 is the only kernel you've tried w/o this problem. Can you use git-bisect to see which commit fixes the problem and which commit brings the problem back?

Hi, I'm not sure if a bisect will help. I have the impression that the problem sometimes does not happen, and so the bisection (a long one, by the way, like the one needed here) could easily give wrong results. I have no time now, but if I can find an idle afternoon, I will try.

Hi, Romano, does the problem still exist in the latest kernel? Say 2.6.31.

Will try. I am really not using STD anymore, STR working so well. I'll do a STD before going to work and on resume I'll report back.

Hmmm... cannot report. 2.6.31.4 refuses to STD. In dmesg I have:

[ 4320.348530] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[ 4320.348537] PM: Basic memory bitmaps created
[ 4320.348539] PM: Syncing filesystems ... done.
[ 4320.370248] Freezing user space processes ... (elapsed 0.00 seconds) done.
[ 4320.371461] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[ 4320.371576] PM: Shrinking memory... \
[ 4342.324063] PM: Image restored successfully.
[ 4342.338309] Restarting tasks ... done.
[ 4342.341078] PM: Basic memory bitmaps freed

...STR works like a charm.

Please make sure /sys/power/pm_test==none and /sys/power/disk==platform before STD.

Please re-open it if the problem still exists in 2.6.32.
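The check requested in the last comments (pm_test set to none, disk set to platform) can be sketched as follows. Both sysfs files list their options on one line with the selected one in square brackets; the function names and the sample strings in the usage lines are assumptions for illustration, and the option lists on a given kernel may differ.

```python
import re
from typing import Optional


def selected_option(sysfs_text: str) -> Optional[str]:
    """Return the bracketed (currently selected) option, or None."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def ready_for_std(pm_test_text: str, disk_text: str) -> bool:
    """True if pm_test is 'none' and the hibernation mode is 'platform'."""
    return (selected_option(pm_test_text) == "none"
            and selected_option(disk_text) == "platform")


if __name__ == "__main__":
    # On a real system these strings would be read from
    # /sys/power/pm_test and /sys/power/disk.
    pm_test = "[none] core processors platform devices freezer"
    disk = "[platform] shutdown reboot"
    print(ready_for_std(pm_test, disk))
```

A pm_test value other than none makes the kernel run only part of the hibernation sequence and then resume, so it is worth ruling out before reading anything into a failed STD.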