Bug 15816

Summary: kernel BUG at kernel/power/snapshot.c:522
Product: Power Management
Reporter: Maciej Rutecki (maciej.rutecki)
Component: Hibernation/Suspend
Assignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX
Severity: normal
CC: bugzilla, chris+kern, florian, gentoo.integer, kokoko3k, maciej.rutecki, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.33.2
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 14885    
Attachments:
  Debug patch
  crash log 2.6.39.1 on hibernate.
  PM / Hibernate: Fix free_unnecessary_pages()
  hibernate log 2.6.39.2 + patch

Description Maciej Rutecki 2010-04-20 19:02:37 UTC
Subject    : kernel BUG at kernel/power/snapshot.c:522
Submitter  : Joerg Platte <jplatte@naasa.net>
Date       : 2010-04-17 14:01
Message-ID : 201004171601.05025.jplatte@naasa.net
References : http://marc.info/?l=linux-kernel&m=127151346415534&w=2

This entry is being used for tracking a regression from 2.6.32.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Joerg Platte 2010-06-13 08:17:57 UTC
The problem can still be triggered with 2.6.34, but less often than with 2.6.33.
Comment 2 Rafael J. Wysocki 2010-09-20 16:22:06 UTC
Fixed by commit 6715045ddc7472a22be5e49d4047d2d89b391f45 .
Comment 3 Rafael J. Wysocki 2010-09-20 16:23:33 UTC
*** Bug 18752 has been marked as a duplicate of this bug. ***
Comment 4 Joerg Platte 2010-09-21 12:16:01 UTC
The bug is still there, even with the latest kernel (2.6.36-rc5). You can get a picture of the error message here (please ignore the dust on my screen ;):
https://ferdi.naasa.net/url/jplatte/IMG_4526.jpg
Please reopen the bug report.
Comment 5 Rafael J. Wysocki 2010-09-21 18:17:27 UTC
OK, so this is a different issue.
Comment 6 Rafael J. Wysocki 2010-09-21 18:23:58 UTC
What happens if you comment out free_unnecessary_pages()
in hibernate_preallocate_memory()?
Comment 7 Rafael J. Wysocki 2010-09-23 23:18:57 UTC
Created attachment 31122 [details]
Debug patch

Regardless of the result of the previous test, please apply this patch and see
what number appears in the "PM: pfn = ... not found in memory bitmap!" message (if the failure is still reproducible).
Comment 8 Joerg Platte 2010-09-24 05:45:05 UTC
The bug is only partly reproducible: it happens at least once a week, but I have not found any way to trigger it with high probability. Currently I'm running a kernel with the first patch applied and I haven't been able to trigger the bug, but I may need more time. I will apply the second patch today or tomorrow. Thank you for your help!
Comment 9 Joerg Platte 2010-10-07 03:38:15 UTC
There are two new problems with kernel 2.6.36-rc5-00086-ga850ea3-dirty and your patch applied.
1. Sometimes the computer fails to suspend and returns to normal operation.
2. Sometimes the display is switched off during suspend and never switched on again to show the percentage of pages written to disk, and the computer does not suspend at all. This bug is similar to the one in the initial bug report, but now I cannot see any error message because the screen stays dark.
Comment 10 Rafael J. Wysocki 2011-01-16 22:30:13 UTC
Is the problem still present in 2.6.37?
Comment 11 Joerg Platte 2011-01-17 18:39:35 UTC
Unfortunately, yes. I last hit the bug this morning, right after reading this mail :) It occurs less frequently than in the past, but it is still there.
Comment 12 Rafael J. Wysocki 2011-01-17 22:27:37 UTC
Thanks for the update.  It seems to require some serious debugging ...
Comment 13 Joerg Platte 2011-04-08 20:14:08 UTC
With 2.6.39-rc2 I'm basically triggering this bug on every hibernate attempt. I'm re-applying your patch to see if it helps to debug the problem... The last time I tried it, it prevented successful hibernates but did not trigger the bug.
Comment 14 Joerg Platte 2011-04-09 12:36:46 UTC
Applying the patch does not help here: it results in a black screen with a blinking hibernate LED during suspend, but the suspend fails and the computer needs to be switched off. Is there anything else I can do to debug the problem?
Comment 15 kokoko3k@gmail.com 2011-05-07 13:25:55 UTC
I didn't try to apply the patch; however, I sometimes get:

[..]
kernel bug at kernel/power/snapshot.c:528
invalid opcode 0000[#1] preempt smp
[..]

This is on 2.6.38.5.
Comment 16 chr() 2011-06-03 19:07:26 UTC
Hi!

I ran into this problem after upgrading from 2.6.38.4 to 2.6.39.

My machine always crashes on hibernate now (MacBookPro, 4GB RAM).

I ran the 2.6.39.1 kernel in VirtualBox, with and without the patch, to get a decent log.

Hibernate and resume work fine unless I increase /sys/power/image_size
and do some memory stress:

[ 1754.708321] PM: Preallocating image memory... 
[ 1851.016955] PM: pfn = 18446744073709551615 not found in memory bitmap!
[ 1851.026724] ------------[ cut here ]------------
[ 1851.035205] kernel BUG at .../kernel/linux-2.6.39.y/kernel/power/snapshot.c:530!
[ 1851.038431] invalid opcode: 0000 [#1] SMP 

(I'll attach the complete log)
(about the SMP message: it crashes with a single (virtual) CPU as well as with several)
(about patch: in this kernel I only patched the message, so this is w/o patch)
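
A note on the pfn value in the log above: 18446744073709551615 is exactly 2^64 - 1, i.e. a 64-bit unsigned integer with every bit set, which is also what -1 becomes when stored in a 64-bit unsigned type. That suggests a sentinel or "nothing found" marker rather than a real page frame number; this interpretation is an assumption and is not stated anywhere in the log itself. A minimal Python sketch of the arithmetic (the names `REPORTED_PFN`, `ALL_ONES_64` and `as_unsigned64` are made up for illustration):

```python
import ctypes

# Value copied from the "PM: pfn = ... not found in memory bitmap!" line.
REPORTED_PFN = 18446744073709551615

# A 64-bit unsigned integer with all bits set.
ALL_ONES_64 = (1 << 64) - 1


def as_unsigned64(value: int) -> int:
    """Reinterpret an integer as a 64-bit unsigned value (wraps modulo 2**64)."""
    return ctypes.c_uint64(value).value


if __name__ == "__main__":
    # Is the reported pfn the all-ones 64-bit pattern?
    print(REPORTED_PFN == ALL_ONES_64)
    # Is it what -1 looks like as a 64-bit unsigned integer?
    print(as_unsigned64(-1) == REPORTED_PFN)
    print(hex(REPORTED_PFN))
```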

I tested the patch. In VirtualBox, I ran a hibernate/resume cycle at least 10 times w/o problems, except that "Preallocating image memory" may take up to 5 minutes, depending on memory usage.

On the real machine (MacBook), 2.6.39 with patch, hibernate worked (once or twice) but lately it hung at "Preallocating ..". I waited 10 minutes but the machine seemed to be bricked.


ps: note for testers

make sure the kernel logging level is 7, otherwise you will just see a black screen and no messages:

echo 7 | sudo tee /proc/sys/kernel/printk
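
To check whether the setting took effect, /proc/sys/kernel/printk can be read back. It holds four integers: the console log level, the default message log level, the minimum console log level and the default console log level; the command above changes the first one. The following is only a sketch: the helper names `parse_printk` and `console_level_at_least` are hypothetical, and the threshold of 7 is taken from the note above.

```python
from pathlib import Path

PRINTK_PATH = Path("/proc/sys/kernel/printk")

# Order of the four values in /proc/sys/kernel/printk.
FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text: str) -> dict:
    """Split the file content into its four named integer fields."""
    values = [int(v) for v in text.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return dict(zip(FIELDS, values))


def console_level_at_least(text: str, wanted: int = 7) -> bool:
    """Return True if the console log level is at least `wanted`."""
    return parse_printk(text)["console_loglevel"] >= wanted


if __name__ == "__main__":
    # Only meaningful on Linux; skipped silently where the file is absent.
    if PRINTK_PATH.exists():
        content = PRINTK_PATH.read_text()
        print(parse_printk(content))
        print("console level >= 7:", console_level_at_least(content))
```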
Comment 17 chr() 2011-06-03 19:09:31 UTC
Created attachment 60672 [details]
crash log 2.6.39.1 on hibernate. 

/sys/power/image_size was 2*factory default
Comment 18 Rafael J. Wysocki 2011-07-06 17:32:23 UTC
Created attachment 64822 [details]
PM / Hibernate: Fix free_unnecessary_pages()

Please test if the attached patch helps (it certainly fixes a bug, but the
question is if that's the relevant one).
Comment 19 chr() 2011-07-07 00:11:41 UTC
Created attachment 64872 [details]
hibernate log 2.6.39.2 + patch

I tested the patch in VirtualBox with at least 10 stress/hibernate/resume cycles.

No BUG() so far. I'm going to install it on my real machine.

However, "PM: Preallocating image memory..." may take a long time, even if I increase image_size to more than the size of RAM:

PM: Allocated 1362836 kbytes in 107.39 seconds (12.69 MB/s)
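
The rate in that line can be reproduced from the two other numbers in it. The sketch below assumes 1 MB = 1000 kbytes, which is what matches the logged figure (with 1024 the result would be about 12.39); the function name `throughput_mb_per_s` is made up for illustration.

```python
def throughput_mb_per_s(kbytes: int, seconds: float) -> float:
    """Convert an amount in kbytes and a duration in seconds to MB/s.

    Assumes 1 MB = 1000 kbytes (decimal units).
    """
    return kbytes / seconds / 1000.0


# Numbers copied from the "PM: Allocated ..." log line.
rate = throughput_mb_per_s(1362836, 107.39)

if __name__ == "__main__":
    print(f"{rate:.2f} MB/s")
```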
Comment 20 Rafael J. Wysocki 2011-07-07 20:56:24 UTC
I'm closing this bug, because the fix is in Linus' tree now.

If the preallocation time is a real issue, please file a separate bug entry
for tracking it (you can assign it to me right away).

Fixed by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4d4cf23cdde2f8f9324f5684a7f349e182039529 .