|Summary:||DMI low-memory-protect quirk causes resume hang on Samsung NC10|
|Product:||Power Management||Reporter:||Patrick Walton (pcwalton)|
|Component:||Hibernation/Suspend||Assignee:||Ingo Molnar (mingo)|
|Bug Depends on:|
|Bug Blocks:||7216, 11808|
lspci for Samsung NC10
Patch from Comment #12
Description Patrick Walton 2009-02-06 18:35:15 UTC
Latest working kernel version: 126.96.36.199
Earliest failing kernel version: 188.8.131.52
Distribution: Arch Linux, all problems reproduced using a mainline kernel
Hardware Environment: Samsung NC10 Netbook, purchased 02/2009
Software Environment: Linux, latest BIOS, triggered suspend in early userspace before modules are loaded
Problem Description: The kernel hangs before the video is turned on (but after the kernel takes control) when resuming from suspend. Using pm_trace produces nothing useful, because the RTC does not get updated at all.
Steps to reproduce: Suspend at any time by using pm-suspend or echo mem. The HDD light will blink when resuming, indicating that the kernel has control, but it will hang.
Comment 1 Patrick Walton 2009-02-06 18:35:58 UTC
Marking as a regression, because it works in 184.108.40.206 and not in 220.127.116.11.
Comment 2 Rafael J. Wysocki 2009-02-07 05:39:25 UTC
Does it also happen if the kernel is booted with init=/bin/bash ?
Comment 3 Patrick Walton 2009-02-07 10:18:57 UTC
Yes, I tried solely booting off the initramfs (without loading modules) and it still resulted in a hang.
Comment 4 Rafael J. Wysocki 2009-02-07 14:59:00 UTC
Unfortunately, I have no idea what's wrong. There were not too many suspend-specific changes between 2.6.27 and 2.6.28. Can you check if the latest stable 2.6.27.y still works for you? Also please attach the output of lspci from your box.
Comment 5 Patrick Walton 2009-02-10 01:40:55 UTC
Created attachment 20180 [details] lspci for Samsung NC10
Comment 6 Patrick Walton 2009-02-10 01:43:04 UTC
I've been triaging it some. Earliest working kernel is 18.104.22.168. The 22.214.171.124 kernel, however, fails in the same way, so it seems that this bug actually appeared in the 2.6.27 series. If nobody has further information I'll continue trying to bisect it down to the commit level if possible. Also I attached the lspci. I believe that there are different revisions of the Samsung NC10 hardware, because not everybody with one has encountered this bug.
Comment 7 Rafael J. Wysocki 2009-02-10 05:39:43 UTC
If 126.96.36.199 works for you and 188.8.131.52 doesn't, it should be possible to use bisection to find the -stable commit that broke things for you.
Comment 8 Patrick Walton 2009-02-12 01:19:14 UTC
Bisection successful. Here's the bad commit:

--------
fb03039affb5a36920abcfb5523c30ca39098498 is first bad commit
commit fb03039affb5a36920abcfb5523c30ca39098498
Author: Philipp Kohlbecher <email@example.com>
Date: Sun Nov 16 12:11:01 2008 +0100

    x86: more general identifier for Phoenix BIOS

    commit 0af40a4b1050c050e62eb1dc30b82d5ab22bf221 upstream.

    Impact: widen the reach of the low-memory-protect DMI quirk
--------

It seems that the kernel's *avoidance* of the first 64k region is causing it to be unable to resume. I don't know why this would be the case.
Comment 9 Rafael J. Wysocki 2009-02-12 13:52:14 UTC
Great, thanks for bisecting it.

Notify-Also : Philipp Kohlbecher <firstname.lastname@example.org>
Notify-Also : Ingo Molnar <email@example.com>
First-Bad-Commit : 0af40a4b1050c050e62eb1dc30b82d5ab22bf221

Ingo, are we going to fix or revert it?
Comment 10 Rafael J. Wysocki 2009-02-25 14:21:39 UTC
On Tuesday 24 February 2009, Patrick Walton wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.27 and 2.6.28.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.27 and 2.6.28. Please verify if it still should
> > be listed and let me know (either way).
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12645
> > Subject : DMI low-memory-protect quirk causes resume hang on Samsung NC10
> > Submitter : Patrick Walton <firstname.lastname@example.org>
> > Date : 2009-02-06 18:35 (18 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0af40a4b1050c050e62eb1dc30b82d5ab22bf221
>
> Yes, it should remain listed, as it's still a problem; there are others
> now experiencing the same issue.
Comment 11 Rafael J. Wysocki 2009-03-02 14:43:48 UTC
First-Bad-Commit : 0af40a4b1050c050e62eb1dc30b82d5ab22bf221
Comment 12 Ingo Molnar 2009-03-15 00:36:29 UTC
> > Yes, it should remain listed, as it's still a problem; there
> > are others now experiencing the same issue.

There's a fix from today that might solve this resume-hang bug. Could people who are affected by this bug please test the latest -tip tree:

  http://people.redhat.com/mingo/tip.git/README

Is the problem fixed? The patch in question is:

  6d7942d: x86: fix 64k corruption-check

Ingo
Comment 13 Rafael J. Wysocki 2009-03-15 03:25:46 UTC
Created attachment 20528 [details]
Patch from Comment #12

x86: fix 64k corruption-check

Impact: fix boot crash

Need to exit early if the addr is far above 64k.

The crash got exposed by:

  78a8b35: x86: make e820_update_range() handle small range update
Comment 14 Zhang Rui 2009-03-18 20:02:01 UTC
Patrick, please try the patch in comment #13 and see if it helps.
Comment 15 Len Brown 2009-04-14 01:36:34 UTC
moving to RESOLVED state since patch is proposed.
Comment 16 Len Brown 2011-07-30 05:53:42 UTC
closed by:

commit 6d7942dc2a70a7e74c352107b150265602671588
Author: Yinghai Lu <email@example.com>
Date: Sat Mar 14 14:32:41 2009 -0700

    x86: fix 64k corruption-check

    Impact: fix boot crash

    Need to exit early if the addr is far above 64k.

    The crash got exposed by:

      78a8b35: x86: make e820_update_range() handle small range update

    Signed-off-by: Yinghai Lu <firstname.lastname@example.org>
    Cc: <email@example.com>
    LKML-Reference: <49BC2279.firstname.lastname@example.org>
    Signed-off-by: Ingo Molnar <email@example.com>