Bug 42796
Summary: | thermal temperature stops updating after hibernate - HP Presario A900 Notebook PC/30ED | | |
---|---|---|---|
Product: | ACPI | Reporter: | Philip Ashmore (contact) |
Component: | Power-Thermal | Assignee: | Zhang Rui (rui.zhang) |
Status: | CLOSED UNREPRODUCIBLE | | |
Severity: | normal | CC: | acpi-bugzilla, contact, jrnieder, lenb, rui.zhang |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 3.2.0-1-amd64 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments:
output of "acpidump" before hibernate
output of "grep . /sys/class/thermal/*/*" before hibernate
output of "acpidump" after hibernate
output of "grep . /sys/class/thermal/*/*" after hibernate
output of "dmesg" after hibernate
A picture of Trinity/konsole from my desktop after hibernate
/var/log/apt/term.log
/var/log/dpkg.log
customized DSDT
Description
Philip Ashmore
2012-02-18 13:48:02 UTC
Created attachment 72438 [details]
output of "grep . /sys/class/thermal/*/*" before hibernate
Created attachment 72439 [details]
output of "acpidump" after hibernate
Created attachment 72440 [details]
output of "grep . /sys/class/thermal/*/*" after hibernate
Created attachment 72441 [details]
output of "dmesg" after hibernate
(In reply to comment #0)
> This started with
> Presario A975 EM: fan runs at a constant (low) speed after hibernate, until it
> starts to overheat
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=640293
>
> The thermal information from before the hibernate is used but it's not the
> actual temperature.
> Rebooting shows the actual elevated temperature resulting from running both
> cores at 100%.

Can you spell this out for me? What steps should I run to reproduce this, what should I expect to happen, and what will actually happen instead?

The above description suggests that you are saying the temperature information after hibernation is wrong because the temperature can change while the computer is off. But that doesn't make sense, since the sensed temperature is not a static value and is supposed to be always changing (in particular changing after the resume).

Grasping at straws: could you try appending acpi_sleep=nonvs to the kernel command line? It seems to have helped recently on some Asus and Sony systems.

Steps to reproduce:
1. power on pc.
2. hibernate.
3. power on pc (this resumes from hibernate)
4. run some 100% CPU intensive tasks

What should have happened: the thermal information updated to indicate that the CPUs were getting hot, resulting in the fan speeding up.

What actually happened: the thermal information reported was the temperature before hibernation. Running some 100% CPU tasks goes unnoticed until a kind of emergency alarm kicks in and the fan goes straight to its top speed - I haven't waited for this recently as it seems a bad idea to rely on.

A reboot restores the CPU thermal information and shows the CPUs are still a bit hot but the fan behaves normally to cool them down and the temperature falls.

acpi_sleep=nonvs makes no difference.

Does this happen after suspend?

No.

Also I get this pop-up message box.
Sound server fatal error: cpu overload, aborting

please try changing /sys/power/disk to "shutdown" instead of "platform" and report if any difference after the hibernate/resume.

The resulting screen and font corruption prevented me from doing any useful tests after restoring from hibernate.

I managed to get a snapshot of the screen - see attached.

Also, after reboot, the [platform] option was once again selected in /sys/power/disk, if that helps.

Philip

Created attachment 72545 [details]
A picture of Trinity/konsole from my desktop after hibernate
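A side note on the /sys/power/disk test requested above: the kernel lists the supported hibernation modes in that file and puts square brackets around the active one, which is why the reporter saw "[platform]" there. The value is a runtime setting, so seeing it back at the default after a reboot is expected. A minimal sketch for checking the active mode before hibernating follows; the function name and its optional file argument are hypothetical helpers for illustration, not something from this thread.

```shell
#!/bin/sh
# Sketch: report which hibernation mode is currently selected.
# The file lists every supported mode and brackets the active one,
# e.g. "[platform] shutdown reboot".

# active_hibernate_mode [FILE]
# FILE defaults to /sys/power/disk. The argument exists only so the
# function can be tried against a sample file (an assumption of this
# sketch, not part of the original report).
active_hibernate_mode() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "${1:-/sys/power/disk}"
}

# Typical use, as root:
#   echo shutdown > /sys/power/disk
#   active_hibernate_mode          # confirm the change took effect
#   echo disk > /sys/power/state   # hibernate using the selected mode
```

Confirming the mode right before hibernating avoids testing "platform" by accident after a reboot has reset the selection.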
Grasping at straws again: does 3fa016a0b5c5 (drm/i915: suspend fbdev device around suspend/hibernate, 2012-03-28) help?

The good news is that that screen shot found its way to someone who knew what it meant - the screen+font corruption problem is fixed.

You'll have to dumb down what you mean by "3fa016a0b5c5 drm/i915: suspend fbdev device" - what package name(s) does that map to in Debian Wheezy/Sid?

If it helps, dpkg -l '*intel*' gets me
ii libdrm-intel1:amd64 2.4.32-1 Userspace interface to intel-specific kernel DRM services -- runtime
ii xserver-xorg-video-intel 2:2.18.0-1 X.Org X server -- Intel i8xx, i9xx display driver

Sorry about that. I meant to ask if applying the patch with that name to the i915 kernel driver helps. Instructions are at [1], and if you have any questions, please don't hesitate to ask.

Thanks,
Jonathan

[1] http://bugs.debian.org/645547#26

(In reply to comment #15)
> The good news is that that screen shot found its way to someone who knew what
> it meant - the screen+font corruption problem is fixed.

Was that in the kernel or in userspace? Is there a relevant patch that distro people should consider backporting?

I don't know for sure that my screen shot was the trigger for the fix.
All I know is - it's fixed.
Hope it doesn't break again.

As for "Was that in the kernel or in userspace?" you'll have to dumb it down for me.

I started working on the patch issue mentioned in #16, but don't hold your breath - I'm fumbling in the dark.

I would have thought that Intel would be the people with the relevant knowledge and skill set to track this down.

It's sad that the PC vendors don't consider this an issue worthy of their time and resources even though Intel provide open source drivers - strange.

(In reply to comment #18)
> I don't know for sure that my screen shot was the trigger for the fix.
> All I know is - it's fixed.
> Hope it doesn't break again.
>
> As for "Was that in the kernel or in userspace?" you'll have to dumb it down
> for me.

Sorry for the lack of clarity. I meant to ask what components you upgraded in order to get the fix. (/var/log/dpkg.log might help in figuring that out if you remember when the fix happened and when you rebooted.)

> I would have thought that Intel would be the people with the relevant
> knowledge and skill set to track this down.

Part of the beauty of free software is that work can be distributed more widely. ;-) Keith Packard who reviewed the patch works for Intel.

(In reply to comment #14)
> Grasping at straws again: does 3fa016a0b5c5 (drm/i915: suspend fbdev device
> around suspend/hibernate, 2012-03-28) help?

A simpler way to test this guess: if you boot with i915.modeset=0 on the kernel command line and boot in "recovery mode" (i.e., don't start X), do you still get fan control trouble after hibernating with "echo disk >/sys/power/state"?

(In reply to comment #20)
> A simpler way to test this guess: if you boot with i915.modeset=0 on the
> kernel command line and boot in "recovery mode" (i.e., don't start X), do you
> still get fan control trouble after hibernating with
> "echo disk >/sys/power/state"?

For "fan control trouble" please read "incorrect temperature readings". Sorry for the noise.

I'd also suggest trying the test suggested by Len Brown in comment #11:

echo shutdown >/sys/power/disk
echo disk >/sys/power/state

(In reply to comment #22)
> echo shutdown >/sys/power/disk
> echo disk >/sys/power/state

Ah, now that I look more carefully I see that you tried this but I didn't understand the result. You wrote:

> The resulting screen and font corruption prevented me from doing any useful
> tests after restoring from hibernate.
>
> I managed to get a snapshot of the screen - see attached.
>
> Also, after reboot, the [platform] option was once again selected in
> /sys/power/disk,
> if that helps.

Do I understand correctly that the screen and font corruption only occurs in "shutdown" mode and not in "platform" mode? Is the thermal information right or wrong in that state? (It should be possible to check by writing sensor info to a file and then rebooting to read it.)

I've attached /var/log/dpkg.log although it only goes as far back as 2012-03-22.
/var/log/dpkg.log.1 ends at 2012-02-08.
I've also attached /var/log/term.log as it gives the previous versions.

I had a system freeze after (I think) some libdrm updates so I restored from backup and reverted to Squeeze for a few weeks and tried again - that could explain the gap.

I'd be a liar if I said the freeze definitely wasn't after trying hibernate again.

The font/screen corruption problems definitely happen(ed) with Squeeze.

Created attachment 72773 [details]
/var/log/apt/term.log
Created attachment 72774 [details]
/var/log/dpkg.log
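For anyone trying to work out from dpkg.log which upgrade coincided with a change in behaviour, the lines of interest are the "install" and "upgrade" records, which carry a timestamp, the package name, and the old and new versions. The following is only a sketch: the function name and the package pattern in the usage comment are illustrative assumptions, and nothing here is taken from the attached logs.

```shell
#!/bin/sh
# Sketch: list install/upgrade events for packages matching a pattern.
# dpkg.log records look like:
#   DATE TIME upgrade PACKAGE OLD-VERSION NEW-VERSION
# "status ..." lines and other actions are skipped.

# package_changes PATTERN [LOGFILE]
# PATTERN is an awk extended regex matched against the package name.
# LOGFILE defaults to /var/log/dpkg.log.
package_changes() {
    awk -v pat="$1" '
        ($3 == "install" || $3 == "upgrade") && $4 ~ pat {
            print $1, $2, $3, $4, $5, "->", $6
        }
    ' "${2:-/var/log/dpkg.log}"
}

# Example (pattern is a guess at the relevant packages):
#   package_changes '^linux-image|libdrm|xserver-xorg-video-intel'
```

The output still has to be matched against when the machine was actually rebooted into each kernel, which only the person running the machine can know.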
Yeah I meant /var/log/apt/term.log

I ran the following script after booting in single user mode.
I did this by adding

i915.modeset=0 single

to the kernel command line<<EOF
echo shutdown >/sys/power/disk
echo disk >/sys/power/state
pm-hibernate
EOF

The system hibernated ok but when I rebooted it finished reading the hibernate image and did a hard power off.

I booted up again with noresume on the kernel command line to boot normally and add this comment.

(In reply to comment #27)
> I did this by adding
>
> i915.modeset=0 single
>
> to the kernel command line<<EOF
> echo shutdown >/sys/power/disk
> echo disk >/sys/power/state
> pm-hibernate
> EOF
>
> The system hibernated ok but when I rebooted it finished reading the
> hibernate image and did a hard power off.

Thanks for testing. Am I correct in assuming the same thing happens when you try to hibernate in "platform" mode without modesetting enabled, too?

Regarding the package manager logs: I don't think anyone here is going to read them. The only one who has the context that would allow those logs to jog memories about such events as when each symptom appeared and when the machine was rebooted to actually use each kernel is you.

(In reply to comment #28)
> (In reply to comment #27)
> > I did this by adding
> >
> > i915.modeset=0 single
> >
> > to the kernel command line<<EOF
> > echo shutdown >/sys/power/disk
> > echo disk >/sys/power/state
> > pm-hibernate
> > EOF
> >
> > The system hibernated ok but when I rebooted it finished reading the
> > hibernate image and did a hard power off.
>
> Thanks for testing. Am I correct in assuming the same thing happens when you
> try to hibernate in "platform" mode without modesetting enabled, too?

Yep.

> Regarding the package manager logs: I don't think anyone here is going to
> read them. The only one who has the context that would allow those logs to
> jog memories about such events as when each symptom appeared and when the
> machine was rebooted to actually use each kernel is you.

Yeah, I wish I'd tracked it more closely.
It appear(s/ed) to be more related to using swap space, something hibernate also does.
I was focusing on the thermal issue. Once that's fixed I'll make sure to note when problems occur.

Created attachment 87501 [details]
customized DSDT
please apply this customized DSDT.
and then
1. echo 1 > /sys/module/acpi/parameters/aml_debug_output
2. grep . /sys/class/thermal/*/*
3. hibernate
4. resume
5. grep . /sys/class/thermal/*/*
6. dmesg > dmesg.out
and attach the dmesg.out here.
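The capture steps above could be scripted roughly as follows. This is a hedged sketch, not something posted in the thread: the function names, the THERMAL_DIR variable and the output file names other than dmesg.out are hypothetical. It writes to /sys/module/acpi/parameters/aml_debug_output (module parameters are exposed under /sys/module, singular) and assumes hibernation is triggered with "echo disk > /sys/power/state" as used earlier in this report.

```shell
#!/bin/sh
# Sketch: snapshot thermal sysfs attributes before and after hibernate
# and list temperature readings that did not change. Run as root.

# Directory to scan; overridable so the helpers can be tried elsewhere.
THERMAL_DIR=${THERMAL_DIR:-/sys/class/thermal}

# snapshot_thermal OUTFILE
# Same idea as "grep . /sys/class/thermal/*/*": one "path:value" line
# per readable attribute. Unreadable entries are silently skipped.
snapshot_thermal() {
    grep . "$THERMAL_DIR"/*/* > "$1" 2>/dev/null || true
}

# stale_temps BEFORE AFTER
# Print ".../temp:value" lines that are identical in both snapshots.
# An unchanged value is only a hint, not proof, that a reading is stuck.
stale_temps() {
    awk -F: '
        NR == FNR { if ($1 ~ /\/temp$/) seen[$0] = 1; next }
        $0 in seen
    ' "$1" "$2"
}

# capture_run: the full procedure. Not called automatically because it
# hibernates the machine; invoke it by hand once the DSDT is in place.
capture_run() {
    echo 1 > /sys/module/acpi/parameters/aml_debug_output
    snapshot_thermal thermal-before.txt
    echo disk > /sys/power/state    # the write should return after resume
    snapshot_thermal thermal-after.txt
    dmesg > dmesg.out
    stale_temps thermal-before.txt thermal-after.txt
}
```

Putting some CPU load on the machine between resume and the second snapshot makes an unchanged temperature more meaningful, since an idle system may legitimately report the same value twice.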
ping...

Sorry for the delay - I gave the laptop to someone else when I got a new one.

I updated to 3.2.0-04 and lm-sensors (after a fresh Gnome install) and I'm getting readings from acpiz and both cores - they update just fine after hibernate - the fans are working.

Bug closed.
please feel free to re-open it once you can reproduce the problem again.

The original problem was with 3.2.0-1-amd64, and since 3.2.0-4 doesn't show the problem I propose that the problem was with the original kernel.

Sadly no-one with the same make/model could confirm the bug with the old kernel, or that the new kernel fixed it, so it might be more accurate to change UNREPRODUCIBLE to CODE_FIX or whatever indicates that it was fixed in the later kernel version.