Bug 9169

Summary: Fan not working after resume from hibernate but ok after s2ram - Toshiba Satellite U305-S5077
Product: Power Management Reporter: Romano Giannetti (romano.giannetti)
Component: Hibernation/Suspend Assignee: power-management_other
Status: REJECTED INSUFFICIENT_DATA    
Severity: normal CC: acpi-bugzilla, astarikovskiy, romano.giannetti, rui.zhang
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.23 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: Syslogs for all the tested cases (reply #10)
2.6.25-rc1 suspend/hibernation bugfix
acpidump

Description Romano Giannetti 2007-10-15 13:42:24 UTC
Most recent kernel where this bug did not occur: unknown
Distribution: Ubuntu Feisty

Hardware Environment: Toshiba Satellite U305-S5077

Software Environment: Feisty with upgraded kernel. See https://www.dea.icai.upcomillas.es/romano/pmwiki/pmwiki.php/Main/ToshibaU305

Problem Description: After resuming from hibernation (suspend-to-disk), fan is not started when temperature goes up. I tried some acpi patches (see: 
http://sourceforge.net/mailarchive/forum.php?thread_name=1192224624.7038.9.camel%40rukbat&forum_name=suspend-devel
) but I had no luck. The patches tested were:
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23/patches/38-ACPI-power-don_t-cache-power-resource-state.patch
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23/patches/39-ACPI-Fan-fan-device-does-not-need-own-structure.patch
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23/patches/40-Fan-Drop-force_power_state-acpi_device-option.patch
If resuming from suspend-to-ram all is working well.

Steps to reproduce: simply choose hibernate from the menu, resume, run some CPU-intensive task, and watch the temperature climb to 80 degrees Celsius without triggering the fan (under normal conditions it starts at 60 degrees Celsius). 

I use vmware and ndiswrapper modules, but I have reproduced the problem without any binary driver loaded.
Comment 1 Romano Giannetti 2007-11-15 01:18:07 UTC
As a new data point: the same happens with 2.6.24-rc2. I had to reboot after 5 minutes; the temperature reached 70C without any fan working. Suspend to RAM is OK; only suspend to disk causes this problem.
Comment 2 Rafael J. Wysocki 2007-12-12 16:31:29 UTC
Can you check if hibernation in the "shutdown" mode causes this problem too?
Comment 3 Romano Giannetti 2007-12-13 01:52:27 UTC
Tried it (echo shutdown > /sys/power/disk && echo disk > /sys/power/state). 
The result is different but still not satisfactory: the fan stays on all the time at low speed. Puzzled. 
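
For reference, the two writes quoted above can be wrapped in a small helper. This is only a minimal sketch, assuming the /sys/power/disk and /sys/power/state files behave as in the command above; the function name hibernate and the sysfs_root parameter are hypothetical, added so the logic can be pointed at a scratch directory instead of the live system.

```python
import os

def hibernate(mode="shutdown", sysfs_root="/sys/power"):
    """Select a hibernation mode, then request suspend-to-disk.

    Mirrors: echo MODE > disk && echo disk > state
    Needs root when sysfs_root is the real /sys/power.
    """
    # Select the hibernation mode first ("shutdown" or "platform" in this thread).
    with open(os.path.join(sysfs_root, "disk"), "w") as f:
        f.write(mode + "\n")
    # Writing "disk" to the state file asks the kernel to hibernate.
    with open(os.path.join(sysfs_root, "state"), "w") as f:
        f.write("disk\n")
```

Calling hibernate("platform") would correspond to the platform method discussed in later comments.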

As a side note, I had to manually run the 915resolution script and restart X11 after the resume, because the display resumed at the wrong resolution (this laptop needs -p -m in the s2ram configuration).

 
Comment 4 Rafael J. Wysocki 2007-12-13 07:35:28 UTC
Is it possible to arrange things in such a way that the ACPI thermal and processor modules won't be loaded by the boot kernel (ie. the one that loads the hibernation image)?
Comment 5 Romano Giannetti 2007-12-13 08:22:10 UTC
Hmmm... no idea. I think that the kernel that loads the hibernation image must be the same as the hibernated one; I suspect it would take a massive hack of the boot scripts... I will try to understand the userspace part of the hibernation/resume procedure in Ubuntu a bit better. If you have any pointers, please tell me.
Thanks.
Comment 6 Rafael J. Wysocki 2007-12-13 08:28:58 UTC
On openSUSE that depends on which modules are loaded from the initrd.  If the processor and thermal modules are not present in the initrd, they are only loaded afterwards, when the regular root is mounted.  However, since resume happens before mounting the root directory, they are not loaded by the boot kernel.
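
To confirm whether the processor and thermal modules are loaded at a given point, one option is to inspect /proc/modules. The sketch below is an illustration only: the function names are hypothetical, and it assumes the usual /proc/modules layout where the module name is the first field of each line. Drivers built into the kernel do not appear in /proc/modules, so a False result only means "not loaded as a module".

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules content."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first field is the module name
    return names

def acpi_modules_loaded(proc_modules_text, wanted=("processor", "thermal")):
    """Map each wanted module name to True/False depending on presence."""
    present = loaded_modules(proc_modules_text)
    return {name: name in present for name in wanted}
```

On a running system this could be used as acpi_modules_loaded(open("/proc/modules").read()).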

Besides, is the system i386 or x86-64?
Comment 7 Romano Giannetti 2007-12-14 01:05:46 UTC
It is an i386, CoreDuo, running (now) 2.6.24-rc5. Which kind of information do you need? There is more data on the ACPI of this system in the (resolved) bug#9327.
Comment 8 Rafael J. Wysocki 2007-12-14 15:32:55 UTC
On x86-64 the boot kernel may be different from the hibernated one and that could be helpful.  Never mind.
Comment 9 Rafael J. Wysocki 2007-12-28 12:42:41 UTC
Can you test 2.6.24-rc6 with the patches from:
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/
applied, please?
Comment 10 Anonymous Emailer 2008-01-09 03:27:16 UTC
Reply-To: romanol@upcomillas.es


Sent by mail, I hope it will appear on
http://bugzilla.kernel.org/show_bug.cgi?id=9169

First of all, sorry for not answering via bugzilla, but I am off-line now
and I do not know how to use bugzilla via mail. When back to work I will
be swamped by real life, so I have used this bit of time to test your
patches from 
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/

The behaviour of the laptop is quite complex, so I will explain what
happened step by step. I have PM debug on, so I will attach the kernel
messages of every step. 

I compiled the kernel with the attached config. After boot, the fan
works OK (it goes off when the temperature is under 49 C, runs quietly
between 52 and somewhere around 70, and runs at top speed above that. The
temperature tops out around 75 C with both cores at 100% compiling kernels). 

I hibernated the computer with the "shutdown" method. What happened on
resume is that the fan stayed on all the time at a quiet level; my
suspicion is that it is sort of frozen in the state it had at the moment
of hibernation. See syslog-*-shutdown.txt.gz attached.

After that, I started again hibernation in platform mode (without
rebooting before; I will try next with a reboot and a fresh platform
hibernation). Nothing changed. (syslog-*-platform-*). 

After that, I did a suspend-to-ram cycle (syslog-*-suspend*) and on
resume, the fan went off and stayed off, even when temperature reached
80 C. Notice that if I just use suspend to ram from a fresh reboot the
fans work perfectly.

The behaviour does not change if I use only "platform" hibernate method.
The log of this hibernation cycle after a fresh reboot is in the file 
syslog-rc6-test-rafael-only-platform.txt.gz.

Notice that there is a warning from lockdep:

hibernate.sh/7392 is trying to acquire lock:
 [ 1459.153489]  (events){--..}, at: [cleanup_workqueue_thread+16/112] cleanup_workqueue_thread+0x10/0x70
 [ 1459.153499] 
 [ 1459.153500] but task is already holding lock:
 [ 1459.153502]  (workqueue_mutex){--..}, at: [workqueue_cpu_callback+235/320] workqueue_cpu_callback+0xeb/0x140
 [ 1459.153507] 
 [ 1459.153508] which lock already depends on the new lock.

but I do not know if it's relevant. 

To sum up, on this laptop hibernation is a very dangerous thing to do :-).
 
Thanks again,
	Romano
Comment 11 Len Brown 2008-01-13 21:18:32 UTC
an opensuse user reports similar symptoms:
https://bugzilla.novell.com/show_bug.cgi?id=336538
Comment 12 Romano Giannetti 2008-01-14 00:40:19 UTC
Created attachment 14449 [details]
Syslogs for all the tested cases (reply #10)

I noticed only now that using bugzilla via mail did not attach any file... Here are the files cited in reply #10.
Comment 13 Romano Giannetti 2008-01-14 00:41:34 UTC
(In reply to comment #11)
> an opensuse user reports similar symptoms:
> https://bugzilla.novell.com/show_bug.cgi?id=336538
> 

Yes, it's similar. Unfortunately using shutdown method does not work for me...
Comment 14 Romano Giannetti 2008-01-14 02:59:36 UTC
Following the hint in the openSUSE report, I tried to unload and reload the thermal driver when the system failed to start the fan. It didn't work. 

acpi -V gives for thermal "ok, <temp> degrees C" every time, even on a freshly booted system where the fan is working OK (i.e., I never see the active[0] message cited there).
Comment 15 Alejandro Vaquero 2008-01-15 03:15:47 UTC
From the opensuse report, which is solved now:

I think the active[0] message is related to the particular configuration of my system. In the /proc/acpi/thermal_zone/THRM/trip_points file I can see

critical (S5):           110 C
passive:                 105 C: tc1=2 tc2=10 tsp=100 devices=CPU1
active[0]:               60 C: devices=FN00

so I assume that the active[0] message comes from here. On the other hand, I solved the problem updating to the kernel 2.6.24-rc6-git11-3. The fan worked again flawlessly. Maybe you should try this one.
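
As an illustration, the trip_points listing above can be parsed to check whether a thermal zone defines any active (fan) trip point. This is a sketch under the assumption that the file follows the layout quoted in this thread; the names TRIP_RE, parse_trip_points and has_active_trip are hypothetical.

```python
import re

# Matches lines such as "critical (S5):  110 C" or "active[0]:  60 C: devices=FN00".
TRIP_RE = re.compile(
    r"^(?P<kind>\w+(?:\[\d+\])?)(?:\s+\(\w+\))?:\s+(?P<temp>\d+)\s+C"
)

def parse_trip_points(text):
    """Return {trip name: temperature in C} from trip_points content."""
    trips = {}
    for line in text.splitlines():
        m = TRIP_RE.match(line.strip())
        if m:
            trips[m.group("kind")] = int(m.group("temp"))
    return trips

def has_active_trip(trips):
    """True if at least one active[N] trip point is defined."""
    return any(kind.startswith("active") for kind in trips)
```

A zone that lists only critical and passive trip points would make has_active_trip return False, meaning the listing names no fan device for that zone.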

And be careful with udev during the update!!!

You can have a look at the openSUSE thread if you want more info

Hope it helps
Comment 16 Romano Giannetti 2008-01-15 06:09:15 UTC
Tried with v2.6.24-rc7-71-gfd0b45d, no joy. 

BTW, my /proc/acpi/fan is empty, and 

(0)rukbat:/usr/src/linux-2.6% cat /proc/acpi/thermal_zone/THRM/trip_points
critical (S5):           104 C
passive:                 104 C: tc1=2 tc2=3 tsp=40 devices=CPU0 
Comment 17 Romano Giannetti 2008-02-14 11:03:21 UTC
Tried to suspend to disk with 2.6.25-rc1, platform method. 
Fans are working now. I will try again to check against user error, but the problem seems fixed. 
Comment 18 Rafael J. Wysocki 2008-02-14 11:33:47 UTC
Created attachment 14840 [details]
2.6.25-rc1 suspend/hibernation bugfix

Please test 2.6.25-rc1 with the attached patch that fixes a serious suspend/hibernation bug in it.
Comment 19 Romano Giannetti 2008-02-14 11:50:20 UTC
Will try... but suspend/hibernation works for me in 2.6.25-rc1 quite well. Or at least it seems to work. 
Comment 20 Rafael J. Wysocki 2008-02-14 11:54:52 UTC
The bug need not affect your box.
Comment 21 Zhang Rui 2008-03-24 00:09:34 UTC
Hi, Rafael,
what is the status of this bug? :)
Comment 22 Romano Giannetti 2008-03-24 02:15:56 UTC
I will try again (I do not use hibernation any more, now that suspend works
flawlessly for me, and in a few seconds). But as for comment #17, it seems fixed. Will report later today.
Comment 23 Romano Giannetti 2008-03-24 02:29:17 UTC
No, unfortunately, the bug is still here. Temp at 70C and the fans at low speed, exactly as they were at resume from hibernation (echo disk > /sys/power/state, platform method). I need to reboot.

BTW, the resume left my X (intel) at a bad resolution; I had to run 915resolution by hand and switch from X11 to a console and back to fix it. 

Suspend to RAM continues to work OK. 
Comment 24 Romano Giannetti 2008-05-20 05:34:02 UTC
Still here at 2.6.26-rc3. Exactly as in the previous comment. 

Rafael, the status is still NEED_INFO, what kind of info is needed? 

This is quite a dangerous bug.
  
Comment 25 Rafael J. Wysocki 2008-05-20 09:30:18 UTC
Status changed to assigned, but I have no idea how to fix it. :-(
Comment 26 Romano Giannetti 2008-05-21 01:12:25 UTC
Thanks. 

I am available to test things, should you have an idea. Maybe the ACPI people could suggest something?
Comment 27 Zhang Rui 2008-06-16 22:22:34 UTC
could you please test the patch set from comment #11 to comment #13 in bugzilla #10223?
Comment 28 Romano Giannetti 2008-06-17 01:07:05 UTC
Will try. Stay tuned.
Comment 29 Romano Giannetti 2008-06-17 01:35:16 UTC
Nope. After resume, during a kernel compile, the temperature reached 76C and the fan was still at minimum speed. 
Comment 30 Zhang Rui 2008-06-19 01:45:54 UTC
hmm, please test the refresh patch set.
Comment 31 Romano Giannetti 2008-06-19 07:05:17 UTC
Applied patches from bug#10223, does not compile:

drivers/acpi/sleep/main.c: In function ‘acpi_hibernation_free_nvs_pages’:
drivers/acpi/sleep/main.c:302: warning: passing argument 1 of ‘free_pages’ makes integer from pointer without a cast
drivers/acpi/sleep/main.c: At top level:
drivers/acpi/sleep/main.c:326: error: expected ‘}’ before ‘;’ token
drivers/acpi/sleep/main.c: In function ‘acpi_sleep_init’:
drivers/acpi/sleep/main.c:617: error: incompatible type for argument 1 of ‘register_pm_notifier’
make[3]: *** [drivers/acpi/sleep/main.o] Error 1
make[2]: *** [drivers/acpi/sleep] Error 2
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2
Comment 32 Zhang Rui 2008-06-19 18:22:22 UTC
oops. wrong patch attached, please try the new one...
thanks.
Comment 33 Romano Giannetti 2008-06-20 02:43:25 UTC
Retested, after the typo correction.

It's still bad, but different now. After restore from hibernation, the fan is on at half speed even if the laptop is cold. And when the temperature goes up, the fan does not go to full speed. 

After that I did an s2ram cycle, and it seems that the fan is working again (at least now, with the laptop cold, it is not running). 

? 
Comment 34 Romano Giannetti 2008-07-07 03:23:44 UTC
Tried one time on 2.6.26-rc9: seems fixed. I restored from hibernation (no previous s2ram cycles) and fans are working. 
I hope it's not a glitch...
Comment 35 ykzhao 2008-07-17 20:38:22 UTC
Hi, Romano
   How about the current status of this bug? 
   Does the problem still exist after the latest kernel(2.6.26-rc9) is used? 
Comment 36 Zhang Rui 2008-08-28 01:33:45 UTC
Hi, Rafael,
seems that the problem is fixed in the latest kernel.
can we close this bug now?
Comment 37 Rafael J. Wysocki 2008-08-28 04:35:36 UTC
Sure, I see it's been closed already.
Comment 38 Romano Giannetti 2009-01-27 00:57:04 UTC
Hmmm. Happened again in 2.6.28.2. No fans after resume from disk. An s2ram cycle 
makes things work again. 

I am reopening this one, but I'm at a loss about it. It seems a real heisenbug, and added to the fact that I almost never use hibernation (which is really slow here) it's very difficult to grasp it. So if you feel it's not so important, feel free to re-close it.
Comment 39 ykzhao 2009-02-22 23:24:57 UTC
Hi, Romano
    Do you mean that this issue happened again on the kernel of 2.6.28.2?
    Will you please attach the output of acpidump?
    Thanks.
Comment 40 Romano Giannetti 2009-02-23 01:02:08 UTC
Yes. It happened again. I didn't test it further (I use hibernation really rarely; STR is much more useful and fast), but if you want I can test it. I will attach acpidump to this message (made with my current kernel, which is 2.6.29-rc5; I did not test fans with hibernate with this kernel, will do it shortly). 
Comment 41 Romano Giannetti 2009-02-23 01:02:51 UTC
Created attachment 20325 [details]
acpidump
Comment 42 Romano Giannetti 2009-02-23 01:17:37 UTC
Tested. Still no fan after resume from hibernation in 2.6.29-rc5. Temperature reached 75C without any fan triggering.
Trying to STR now...
Comment 43 Romano Giannetti 2009-02-23 01:20:34 UTC
After a STR cycle, the fans are working again. So the only thing needed is doing a STR cycle after the STD one... 
Comment 44 ykzhao 2009-02-23 21:44:15 UTC
Hi, Romano
    thanks for the acpidump.
    From the acpidump it seems that there is no ACPI fan device ("PNP0C0B"). Nor is the fan controlled by the toshiba_acpi driver, as the following ACPI object is missing:
   >\_SB_.VALZ.GHCI 
 Maybe it is controlled by the BIOS.
In that case the OS can do nothing about it. 
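
The check described above could be repeated on a disassembled DSDT. The sketch below is only an illustration: it assumes the tables have already been decompiled to ASL text (for example with acpixtract and iasl -d), and it does a plain substring search, so a hit is a hint to look closer rather than proof. The function name scan_dsl and the result keys are hypothetical.

```python
def scan_dsl(dsl_text):
    """Look for the two identifiers mentioned in this comment.

    PNP0C0B is the hardware ID of an ACPI fan device; GHCI is the
    object named above in connection with the toshiba_acpi driver.
    """
    return {
        "acpi_fan_device": "PNP0C0B" in dsl_text,
        "toshiba_ghci_object": "GHCI" in dsl_text,
    }
```

If both values are False, the result agrees with the reading of the acpidump given in this comment.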
    
Hi, Rui 
    How about rejecting this bug or assigning it to another category?
   
    Thanks.
    
Comment 45 Zhang Rui 2009-02-23 22:59:23 UTC
well. I'm not sure.
but obviously the bug is not caused by ACPI fan driver.
Maybe the BIOS, or some platform specific ways.
re-assign to the suspend/hibernation category.
Comment 46 Zhang Rui 2009-03-18 19:04:44 UTC
so it seems that 2.6.26-rc9 is the only kernel you've tried w/o this problem.
can you use git-bisect to see which commit fixes the problem and which commit brings the problem back?
Comment 47 Romano Giannetti 2009-03-19 09:13:33 UTC
Hi,

I'm not sure if a bisect will help. I have the impression that the problem sometimes does not happen, so a bisection (and a long one, by the way, like the one needed here) could easily give wrong results. I have no time now, but if I can find an idle afternoon, I will try.
Comment 48 Zhang Rui 2009-10-15 07:50:05 UTC
Hi, Romano,
does the problem still exist in the latest kernel? say 2.6.31.
Comment 49 Romano Giannetti 2009-10-16 06:34:40 UTC
Will try. I really am not using STD anymore, STR working so well. I'll do a STD before going to work and on resume I'll report back.
Comment 50 Romano Giannetti 2009-10-16 08:25:19 UTC
Hmmm... cannot report. 2.6.31.4 refuses to STD. In dmesg I have:

[ 4320.348530] PM: Marking nosave pages: 000000000009f000 - 0000000000100000
[ 4320.348537] PM: Basic memory bitmaps created
[ 4320.348539] PM: Syncing filesystems ... done.
[ 4320.370248] Freezing user space processes ... (elapsed 0.00 seconds) done.
[ 4320.371461] Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
[ 4320.371576] PM: Shrinking memory... \
[ 4342.324063] PM: Image restored successfully.
[ 4342.338309] Restarting tasks ... done.
[ 4342.341078] PM: Basic memory bitmaps freed

...STR works like a charm.
Comment 51 Zhang Rui 2009-10-20 03:29:39 UTC
please make sure /sys/power/pm_test==none and /sys/power/disk==platform before STD.
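
A quick way to verify both settings before hibernating is sketched below. It assumes that each file lists the available options on one line with the selected one in square brackets (for example "[none] core processors platform devices freezer"); the helper names selected_option and ready_for_std are hypothetical.

```python
def selected_option(sysfs_text):
    """Return the bracketed (currently selected) option, or None."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None

def ready_for_std(pm_test_text, disk_text):
    """True when pm_test is 'none' and the hibernation mode is 'platform'."""
    return (selected_option(pm_test_text) == "none"
            and selected_option(disk_text) == "platform")
```

On the affected machine this would be called as ready_for_std(open("/sys/power/pm_test").read(), open("/sys/power/disk").read()).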
Comment 52 Zhang Rui 2009-12-28 08:22:17 UTC
please re-open it if the problem still exists in 2.6.32.