Bug 30492
| Summary: | kernel sometimes hangs on hibernation | | |
|---|---|---|---|
| Product: | Power Management | Reporter: | Martin Steigerwald (Martin) |
| Component: | Hibernation/Suspend | Assignee: | power-management_other |
| Status: | CLOSED INSUFFICIENT_DATA | | |
| Severity: | normal | CC: | rjw, rui.zhang |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.38-rc7 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 7216 | | |

Attachments:

PM/Hibernate: Try to avoid crashing the kernel unnecessarily

PM / Hibernate: Alternative method of avoiding crashes

PM / Hibernate: Alternative method of avoiding crashes (v2)

PM / Hibernate: Alternative method of avoiding crashes (v3)

Description
Martin Steigerwald
2011-03-05 14:48:59 UTC
This is all with in-kernel suspend via hibernate script 1.99. I am using the following wrapper script for some of my own stuff:

    shambhala:/etc> cat acpi/hibernate-extra.sh
    #!/bin/sh

    # To be safe, write out all pending changes right at the start
    sync

    # Try to free as many LowMem pages as possible
    # Ext4 apparently puts dir entries into LowMem as well
    # And with too few LowMem pages, hibernation with
    # Radeon DRM KMS does not work.
    #echo 3 > /proc/sys/vm/drop_caches

    # Alternatively build a smaller image, see LKML:
    # Re: does hibernate to disk try hard enough to free memory? (23.2.2011)
    echo 710000000 > /sys/power/image_size

    # Put Network Manager to sleep
    # see /usr/lib/pm-utils/sleep.d/55NetworkManager
    dbus-send --print-reply --system \
        --dest=org.freedesktop.NetworkManager \
        /org/freedesktop/NetworkManager \
        org.freedesktop.NetworkManager.sleep

    # Stop ifplugd
    #/etc/init.d/ifplugd stop
    #ifdown eth0

    # Save system time to the hardware clock
    /etc/init.d/hwclock.sh stop

    # Stop uptimed so that it writes its records
    /etc/init.d/uptimed stop

    # To be safe, write out all pending changes once more here
    sync

    # Good night
    # /etc/acpi/hibernate.sh
    #echo 1 > /sys/power/tuxonice/do_hibernate
    #pm-suspend-hybrid
    #pm-hibernate
    hibernate-disk

    # Start uptimed again. It writes the records again when doing so
    /etc/init.d/uptimed start

    # Write the records right away
    sync

    # Set the hard disk parameters again
    /etc/init.d/hdparm start

    # Set system time from the hardware clock again
    /etc/init.d/hwclock.sh start

    # Wake up Network Manager
    dbus-send --print-reply --system \
        --dest=org.freedesktop.NetworkManager \
        /org/freedesktop/NetworkManager \
        org.freedesktop.NetworkManager.wake

    # Start ifplugd
    #/etc/init.d/ifplugd start

Is the problem reproducible at any reasonable rate?

Well, I thought not, because it could take 10 to 12 days for it to happen. But now it just happened again, with 2.6.38-rc7-g212e349.
It seems to trigger more easily when I keep more applications open prior to hibernation. I worked around it by closing all, or all but one, of my applications prior to initiating hibernation. I could stop using this workaround in order to reproduce this bug more easily, or even start some more applications... But I think I will first test your patch from bug #30482, unless you direct me otherwise.

Please do. The issues are independent of each other anyway.

Created attachment 50202
PM/Hibernate: Try to avoid crashing the kernel unnecessarily
Please check if this patch makes a difference.
If it helps, please see if you're able to trigger the WARN_ON().
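
To check for a triggered WARN_ON() afterwards: it is reported in the kernel log as a line containing "WARNING:", normally followed by a call trace, so it ends up in syslog next to the other kernel messages. A minimal sketch of a search over current and rotated logs could look like the following. The function name find_kernel_warnings is hypothetical, and the /var/log/syslog* location is an assumption that depends on how syslog is set up on the machine.

```shell
#!/bin/sh
# find_kernel_warnings FILE...
# Print kernel "WARNING:" lines from plain or gzip-compressed log files.
# Hypothetical helper for illustration; not part of any package.
find_kernel_warnings() {
    for f in "$@"; do
        # Skip files we cannot read (e.g. logs restricted to root)
        [ -r "$f" ] || continue
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;   # rotated, compressed log
            *)    cat -- "$f" ;;        # current or uncompressed log
        esac
    done | grep 'kernel: .*WARNING:'    # matches with or without printk timestamps
}

# Example (assumed log location):
#   find_kernel_warnings /var/log/syslog*
```

The pattern is deliberately broad, so warnings from unrelated drivers will show up as well and have to be sorted out by looking at the file name and function in each line.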
Trying with patch now. Thanks.

I have not had any hang so far with this patch and the one from bug #30482 applied. But since I had set an image size of 700000000 manually before and still had this hang occasionally, I think the second patch, the one from this bug report, helps. I did not see any warning:

    shambhala:/> zgrep "WARN" /var/log/syslog* | grep kernel | grep -v thinkpad_acpi
    /var/log/syslog.1:Mar 8 23:07:39 shambhala kernel: WARNING: at drivers/ata/libata-core.c:6130 ata_host_detach+0xf6/0x100()

Do I have to grep differently for that? I will try harder, with more applications open and more memory pressure, when I am back in Germany, but until now all seems fine.

The kernel hung again even with both patches applied. This time it was 2.6.38 plus those two patches:

    martin@shambhala:~/Computer/Shambhala/Kernel/2.6.38/linux-2.6.38.y> patch -p1 < ../0002-try-to-avoid-crashing-the-kernel-unnecessarily-bug-30492.patch
    patching file kernel/power/snapshot.c
    Reversed (or previously applied) patch detected! Assume -R? [n] ^C
    -crashing-the-kernel-unnecessarily-bug-30492.patch
    martin@shambhala:~/Computer/Shambhala/Kernel/2.6.38/linux-2.6.38.y> patch -p1 < ../0001-refine-autoestimation-of-image-size-bug-30482.patch
    patching file kernel/power/snapshot.c
    Reversed (or previously applied) patch detected! Assume -R? [n] c^C
    martin@shambhala:~/Computer/Shambhala/Kernel/2.6.38/linux-2.6.38.y> cat /sys/power/image_size
    703320064

I had quite a lot of stuff open, including Iceweasel, KMail and Kontact. It seems that your patch does not yet (completely) fix this issue.
The last entries in syslog are these:

    Mar 17 00:06:00 shambhala kernel: e1000: eth0 NIC Link is Down
    Mar 17 00:06:01 shambhala NetworkManager[2077]: <info> (eth0): carrier now OFF (device state 8, deferring action for 4 seconds)
    Mar 17 00:06:01 shambhala kernel: usb 3-2: USB disconnect, address 2
    Mar 17 00:06:05 shambhala NetworkManager[2077]: <info> (eth0): device state change: 8 -> 2 (reason 40)
    Mar 17 00:06:05 shambhala NetworkManager[2077]: <info> (eth0): deactivating device (reason: 40).
    Mar 17 00:06:06 shambhala NetworkManager[2077]: <info> (eth0): canceled DHCP transaction, DHCP client pid 4669
    Mar 17 00:06:17 shambhala kernel: PM: Marking nosave pages: 000000000009f000 - 0000000000100000
    Mar 17 00:06:17 shambhala kernel: PM: Basic memory bitmaps created
    Mar 17 00:06:18 shambhala kernel: PM: Syncing filesystems ... done.

I have been testing with that patch for quite a while. It hung on hibernation again. Thus the patch does not seem to fix the issue, at least not completely. Or there are several causes for hangs on hibernation and it fixed one, but not the other.

Created attachment 56672
PM / Hibernate: Alternative method of avoiding crashes
Hi,
Please test this patch instead. It should fail hibernation if there's
a problem that would crash the kernel in free_unnecessary_pages().
If that happens, please attach the output of dmesg.
Created attachment 56682
PM / Hibernate: Alternative method of avoiding crashes (v2)
Sorry, please test this one instead.
Created attachment 56772
PM / Hibernate: Alternative method of avoiding crashes (v3)
The previous patch was broken, sorry about that. Please try this one.
Rafael, I have not had any hibernate-related crashes lately, possibly because I closed applications in order for the suspend to succeed. Shall I test this one together with the one from bug 34102?

Yes, please. Anyway, you're the only person known to me who could reproduce this problem, so please keep this patch on top of your kernel just in case the problem triggers.

I am now compiling a kernel with that patch. I had a hang at preallocation when I accidentally hibernated the machine an hour or so ago (I wanted to change the brightness, but hit the wrong key). Hopefully it triggers another time. This was while hibernating with two KDE 4 sessions running, instead of one, which so far has always worked okay.

No, I think it wasn't a hang, because the disk drive was actually still spinning. I think it would have come back after a longer time, because I have now tested with higher reserved_size values and it always came back (from a failed hibernation, but I will detail this in bug 34102).

With the patch from comment #13 I think you'll rather see a failing hibernation than a hang. Anyway, please report if you see any of them. :-)

It's great that the kernel bugzilla is back. Can you please verify if the problem still exists in the latest upstream kernel, with & without the patch in comment #13?

Thanks for this reminder as well. I have not had it again on the ThinkPad T42 I talked about. But since I got that shiny new T520, I have not used the T42 for memory-heavy workloads anymore. Thus I am not sure whether this bug has been resolved or whether I am just not triggering it anymore. Since, for now, I do not intend to run memory-heavy workloads on the T42, I will just close this one as well. Should I ever get this hang again, I can always reopen it.