Beware, this could well be a TuxOnIce issue. I am posting here due to a comment of Raphael in bug #15685 (https://bugzilla.kernel.org/show_bug.cgi?id=15685#c24). Please advise whether to continue doing so. I can report in the TuxOnIce bugtracker as well. I will also write a note to the tuxonice-devel mailing list. I added a CC to Nigel. Please reassign the bug to Nigel as you see fit. Maybe there should be a TuxOnIce component in Power Management to track those bugs for now?

Last kernel version known to work: 2.6.32.8-tp42-toi-3.0.99.49

In some rare cases on resuming I get:

Compress_read returned -22.
Kernel panic - not syncing: Read chunk returned (-22).

I attach a screenshot.

When it happens, it happens consistently: if I try to resume from the same image again, it fails again. I have to press SPACE to remove the image and do a regular boot.

martin@shambhala:~> cat /sys/power/tuxonice/debug_info
TuxOnIce debugging info:
- TuxOnIce core : 3.1
- Kernel Version : 2.6.33.2-tp42-toi-3.1-lowmem-free-991-992-04964-gf00c7ec-dirty
- Compiler vers. : 4.4
- Attempt number : 0
- Parameters : 0 667648 0 0 0 0
- Overall expected compression percentage: 0.
- Checksum method is 'md4'.
  0 pages resaved in atomic copy.
- Compressor is 'lzo'.
- Block I/O active.
- Max outstanding reads 1. Max writes 1.
  Memory_needed: 1024 x (4096 + 200 + 76) = 4476928 bytes.
  Free mem throttle point reached 0.
- Swap Allocator enabled.
  Swap available for image: 732955 pages.
- File Allocator active.
  Storage available for image: 0 pages.
- No I/O speed stats available.
- Extra pages : 0 used/500.
- Result : No hibernation attempts so far.
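As a side note, the -22 in the panic message looks like a negated errno value, which is the usual kernel convention for error returns. The minimal Python sketch below decodes it and re-checks the Memory_needed arithmetic from debug_info. The helper name describe_kernel_error is hypothetical, and whether TuxOnIce really means EINVAL at this point is an assumption, not something the report confirms.

```python
import errno
import os


def describe_kernel_error(ret: int) -> str:
    """Map a negative kernel-style return value to its errno name and text.

    Assumption: the value follows the kernel convention of returning -errno.
    """
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Value taken from "Compress_read returned -22."
print(describe_kernel_error(-22))

# Memory_needed line from debug_info: 1024 x (4096 + 200 + 76)
memory_needed = 1024 * (4096 + 200 + 76)
print(memory_needed)  # matches the 4476928 bytes shown in the report
```

This only tells us which errno number the value maps to; it says nothing about which code path in the read chain produced it.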
Some information on memory layout (after this new boot):

martin@shambhala:~> cat /proc/meminfo
MemTotal: 2072596 kB
MemFree: 67200 kB
Buffers: 374968 kB
Cached: 601312 kB
SwapCached: 0 kB
Active: 927984 kB
Inactive: 712928 kB
Active(anon): 613460 kB
Inactive(anon): 52300 kB
Active(file): 314524 kB
Inactive(file): 660628 kB
Unevictable: 4 kB
Mlocked: 4 kB
HighTotal: 1187144 kB
HighFree: 10540 kB
LowTotal: 885452 kB
LowFree: 56660 kB
SwapTotal: 2931820 kB
SwapFree: 2931820 kB
Dirty: 128 kB
Writeback: 0 kB
AnonPages: 664652 kB
Mapped: 141896 kB
Shmem: 1128 kB
Slab: 142728 kB
SReclaimable: 116476 kB
SUnreclaim: 26252 kB
KernelStack: 2808 kB
PageTables: 7984 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 3968116 kB
Committed_AS: 1744488 kB
VmallocTotal: 122880 kB
VmallocUsed: 16212 kB
VmallocChunk: 100196 kB
DirectMap4k: 294904 kB
DirectMap4M: 614400 kB

That kernel contains two patches by Nigel to improve handling of lowmem pages. I have no idea how they relate to this regression. I will attach them as well. Apart from these patches, the tree is:

git://git.kernel.org/pub/scm/linux/kernel/git/nigelc/tuxonice-2.6.33.git

as of f00c7ecd068a14c9bd2dd1f237aa9a2e6db0c48f.
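For cross-checking the two outputs, here is a small Python sketch that compares the "Swap available for image" page count from debug_info with the swap figures from /proc/meminfo. It assumes 4 KiB pages (consistent with the 4096 in the Memory_needed line); the helper parse_meminfo and the shortened sample text are illustrative only.

```python
PAGE_SIZE_KB = 4  # assumption: 4 KiB pages on this 32-bit x86 machine

# Shortened excerpt of the /proc/meminfo output quoted above.
MEMINFO_SAMPLE = """\
MemTotal: 2072596 kB
SwapTotal: 2931820 kB
SwapFree: 2931820 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each meminfo field name to its value in kB."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


meminfo = parse_meminfo(MEMINFO_SAMPLE)

# "Swap available for image: 732955 pages." from debug_info
swap_pages_available = 732955
swap_kb_available = swap_pages_available * PAGE_SIZE_KB

print(swap_kb_available, meminfo["SwapFree"])
print(swap_kb_available == meminfo["SwapFree"])
```

Under the 4 KiB assumption the two numbers agree, so the whole (unused) swap area was available for the image at that point.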
This is happening on a ThinkPad T42:

martin@shambhala:~> lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 82855PM Processor to I/O Controller [8086:3340] (rev 03)
00:01.0 PCI bridge [0604]: Intel Corporation 82855PM Processor to AGP Controller [8086:3341] (rev 03)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 [8086:24c2] (rev 01)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 [8086:24c4] (rev 01)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 [8086:24c7] (rev 01)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller [8086:24cd] (rev 01)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev 81)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge [8086:24cc] (rev 01)
00:1f.1 IDE interface [0101]: Intel Corporation 82801DBM (ICH4-M) IDE Controller [8086:24ca] (rev 01)
00:1f.3 SMBus [0c05]: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller [8086:24c3] (rev 01)
00:1f.5 Multimedia audio controller [0401]: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller [8086:24c5] (rev 01)
00:1f.6 Modem [0703]: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem Controller [8086:24c6] (rev 01)
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc RV350 [Mobility Radeon 9600 M10] [1002:4e50]
02:00.0 CardBus bridge [0607]: Texas Instruments PCI4520 PC card Cardbus Controller [104c:ac46] (rev 01)
02:00.1 CardBus bridge [0607]: Texas Instruments PCI4520 PC card Cardbus Controller [104c:ac46] (rev 01)
02:01.0 Ethernet controller [0200]: Intel Corporation 82540EP Gigabit Ethernet Controller (Mobile) [8086:101e] (rev 03)
02:02.0 Network controller [0280]: Intel Corporation PRO/Wireless 2200BG [Calexico2] Network Connection [8086:4220] (rev 05)
Created attachment 26178
First of the mentioned patches from Nigel, version 2 of it
Created attachment 26179
Second of the mentioned patches from Nigel

The reason for these patches is that Radeon KMS makes it difficult to free enough low memory pages: http://lists.tuxonice.net/pipermail/tuxonice-devel/2010-April/006075.html
Created attachment 26180
tuxonice config for hibernate script

I am using version 1.99-1.1 of the Debian hibernate package for squeeze/sid, without the text user interface and with the no_console_suspend kernel parameter, in order to capture any kernel messages during snapshot cycles.
It's definitely a TuxOnIce issue - a known one. I just haven't managed to find the time to put into finding the cause (sorry!)
Now I had this on my ThinkPad T23 with:

martin@deepdance:~> cat /proc/version
Linux version 2.6.34.1-tp23-toi-3.1.1.1-04990-g3a7d1f4 (martin@deepdance) (gcc version 4.4.4 (Debian 4.4.4-5) ) #2 PREEMPT Tue Jul 6 20:27:13 CEST 2010

I didn't see this yet on my ThinkPad T42, and with 2.6.33 I never saw it on my T23 as far as I remember.
Hmm, I should add that I tried two more resumes of the same image, and each time the machine just rebooted after having read 150 MB of the caches.
Created attachment 27078
happened again with 2.6.34.1 and tuxonice 3.1.1.1, photo of backtrace

It happened again with that kernel. This time I have a photo of the backtrace. Hope it helps. Thanks.
After 3-4 attempts and about 7 days of uptime it happened again on my T23. It definitely happens more often with 2.6.34 than with 2.6.33. I have now switched the compressor to LZF. Maybe that helps.
I switched compression on my ThinkPad T23 from LZO to LZF. Since then I have not seen the error again, but with only 5 attempts so far I am not sure whether switching to LZF "fixed" it:

deepdance:~> cat /sys/power/tuxonice/debug_info
TuxOnIce debugging info:
- TuxOnIce core : 3.1.1.1
- Kernel Version : 2.6.34.1-tp23-toi-3.1.1.1-04990-g3a7d1f4
- Compiler vers. : 4.4
- Attempt number : 5
- Parameters : 0 667656 0 1 0 0
- Overall expected compression percentage: 0.
- Checksum method is 'md4'.
  0 pages resaved in atomic copy.
- Compressor is 'lzf'.
  Compressed 776593408 bytes into 359897499 (53 percent compression).
- Block I/O active.
- Max outstanding reads 714. Max writes 5.
  Memory_needed: 1024 x (4096 + 200 + 76) = 4476928 bytes.
  Free mem throttle point reached 983.
- Swap Allocator enabled.
  Swap available for image: 229016 pages.
- File Allocator active.
  Storage available for image: 0 pages.
- I/O speed: Write 28 MB/s, Read 33 MB/s.
- Extra pages : 26 used/500.
- Result : Succeeded.

I will keep this one running for at least 10 or 15 attempts; let's see how it goes.
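To make the "53 percent compression" figure easier to read, here is a rough Python sketch that recomputes it from the two byte counts in debug_info. It assumes the percentage is the space saved, truncated to an integer; that reproduces the reported value, but the exact formula TuxOnIce uses is not confirmed here. The function name percent_saved and the 4096-byte page size are assumptions for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as in the Memory_needed line


def percent_saved(uncompressed: int, compressed: int) -> int:
    """Space saved by compression, as an integer percentage (truncated)."""
    return (uncompressed - compressed) * 100 // uncompressed


# "Compressed 776593408 bytes into 359897499" from debug_info
uncompressed = 776593408
compressed = 359897499

print(percent_saved(uncompressed, compressed))

# Image size expressed in pages, for comparison with the swap page counts.
uncompressed_pages = uncompressed // PAGE_SIZE
compressed_pages = -(-compressed // PAGE_SIZE)  # round up to whole pages
print(uncompressed_pages, compressed_pages)
```

With these numbers the compressed image needs far fewer pages than the 229016 pages reported as available swap, so running out of image storage does not look like a factor in this attempt.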
Pedro has some more information on this bug:

Date: Thu, 29 Jul 2010 02:30:37 +0100
Subject: kcryptd oops when resuming with TuxOnIce with KDB oops afterwards
From: Pedro Ribeiro

I hit a bug when resuming with TuxOnIce. At the middle of a resume, it says Compress Read -22 and locks up. I caught the stack trace with kdb and took photos of that. I'm running 2.6.35-rc6 on a Lenovo T400. I have an encrypted LUKS partition (aes-cbc-essiv-128) which contains an LVM2 with my root, swap and home partitions inside.

http://lkml.org/lkml/2010/7/28/478
Today I had this with a Dell Dimension 5100 workstation. I switched this one from lzo to lzf compression as well, as I didn't have another failure on my ThinkPad T23 after the switch. So it might really be something related to lzo compression in combination with TuxOnIce.
I found the cause yesterday - it was a locking issue in TuxOnIce. I'm not sure why it's triggered more easily with LZO, but LZO isn't the cause. The fix hasn't yet been committed to my git trees, but will be soon.
This should be fixed with the current release (3.2-rc2).
Created attachment 29472
error still occurs with 2.6.36-rc3 + TOI 3.2-rc1, photo of backtrace

Hi Nigel, unfortunately this still happens with 2.6.36-rc3-tp42-toi-3.2-rc1-vmembase-0-05032-g60140c1-dirty. In the interest of having a stable kernel I will switch to LZF, as so far it has not happened to me with LZF. In case you have any idea what debug information to gather, please tell me. I am in a hurry, so this time I have not taken the time for additional steps.
It's great that kernel bugzilla is back. Can you please verify whether the problem still exists in the latest upstream kernel?
Thanks for the reminder, Zhang. This is so old and not even related to upstream code, so I am just closing it. I have not used TuxOnIce recently. I think I would like to use it again, and then I can have another look.