Bug 150021 - kernel panic: "kernel tried to execute NX-protected page" when resuming from hibernate to disk
Status: CLOSED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-25 21:16 UTC by shuzzle
Modified: 2016-08-01 03:31 UTC
CC List: 2 users

See Also:
Kernel Version: 4.6.x
Tree: Mainline
Regression: No


Attachments
last working .config (107.45 KB, application/octet-stream)
2016-07-25 21:16 UTC, shuzzle
git bisect result (1.85 KB, text/plain)
2016-07-25 21:20 UTC, shuzzle
bisect.log (4.17 KB, text/x-log)
2016-07-25 21:21 UTC, shuzzle
Oops picture (4.79 MB, image/png)
2016-07-27 06:45 UTC, shuzzle
bisect.log to comment 11 (3.07 KB, text/x-log)
2016-07-28 08:02 UTC, shuzzle
hibernate-frame-fix.patch (925 bytes, patch)
2016-07-28 12:29 UTC, Josh Poimboeuf

Description shuzzle 2016-07-25 21:16:29 UTC
Created attachment 226381 [details]
last working .config

Overview: 

When commit 13523309495cdbd57a0d344c0d5d574987af007f is applied to my kernel sources, the kernel panics when trying to resume from hibernate to disk.


Steps to Reproduce: 

1. have a working hibernate/resume setup
2. compile 4.6.x kernel
3. boot and hibernate to disk
4. test various kernels using "git bisect".


Actual Results: kernel panics when trying to resume from hibernate to disk.

Expected Results: Resume from hibernate to disk like kernels without commit 13523309495cdbd57a0d344c0d5d574987af007f did.


I attached my working .config of my 4.5.7 kernel.

Any help will be appreciated. Thanks!
Comment 1 shuzzle 2016-07-25 21:20:01 UTC
Created attachment 226391 [details]
git bisect result
Comment 2 shuzzle 2016-07-25 21:21:19 UTC
Created attachment 226401 [details]
bisect.log
Comment 3 Rafael J. Wysocki 2016-07-26 11:24:02 UTC
Can you please try 4.7 and see if the problem is still there and, if so, whether or not reverting commit 13523309495cdbd57a0d344c0d5d574987af007f makes it go away?
Comment 4 shuzzle 2016-07-26 18:36:02 UTC
I experience the same issue with linux-4.7.

I reverted commit 13523309495cdbd57a0d344c0d5d574987af007f and the issue is gone.
Comment 5 Rafael J. Wysocki 2016-07-26 20:06:58 UTC
OK, thanks!

From the bug subject it looks like you are able to see the panic at least.

Would it be possible to capture it somehow or take a photo of it and attach that here?
Comment 6 Rafael J. Wysocki 2016-07-26 23:12:50 UTC
One more question: Do you use suspend to RAM in addition to using hibernation (suspend to disk)?
Comment 7 shuzzle 2016-07-27 06:45:35 UTC
Created attachment 226571 [details]
Oops picture
Comment 8 shuzzle 2016-07-27 12:55:08 UTC
(In reply to Rafael J. Wysocki from comment #6)
> One more question: Do you use suspend to RAM in addition to using
> hibernation (suspend to disk)?

With kernel 4.7, resuming from suspend to RAM works as expected, while resuming from hibernate to disk doesn't.
Comment 9 Rafael J. Wysocki 2016-07-27 21:46:53 UTC
(In reply to shuzzle from comment #8)
> (In reply to Rafael J. Wysocki from comment #6)
> > One more question: Do you use suspend to RAM in addition to using
> > hibernation (suspend to disk)?
> 
> With kernel 4.7, resuming from suspend to RAM works as expected, while
> resuming from hibernate to disk doesn't.

OK, thanks.

To be more precise, the commit you have identified as the culprit only affects suspend to RAM.  It doesn't even modify any code related to suspend to disk, so if you suspended to RAM, resumed and then attempted to suspend to disk, it might have influenced things somehow, but otherwise it is hard to say what's up.

So to clarify, if you boot 4.7 afresh and then suspend to disk, does the resume from it fail?
Comment 10 Rafael J. Wysocki 2016-07-28 00:11:27 UTC
In any case, please check if this patch (should apply on top of 4.7):

https://patchwork.kernel.org/patch/9250263/

makes any difference for you.

In case it doesn't and the CPU in your system is an Intel one, please also test this patch:

https://patchwork.kernel.org/patch/9228667/
Comment 11 shuzzle 2016-07-28 08:00:19 UTC
Well, this is embarrassing.

Since neither of the two patches fixed it and the commit isn't involved in suspend to disk, I decided to 'bisect' again.

So here's what I did:

1.1. Patched the 4.7 kernel with both patches, one patch at a time. No success.

2.1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-stable
2.2. git checkout 13523309495cdbd57a0d344c0d5d574987af007f
2.3. Built the kernel, which successfully resumes from suspend to disk!

So I felt really bad and spent another 2 hours bisecting to find out which commit we really need to talk about. :(

3.1. git bisect start
3.2. git bisect bad master
3.3. Did a few kernel builds, marking each one with 'git bisect good' or 'git bisect bad'.
3.4. git reports commit ef0f3ed5a4acfb24740480bf2e50b178724f094d as the first bad commit.

4.1. git checkout master
4.2. built the kernel which fails to resume from suspend to disk
4.3. git checkout ef0f3ed5a4acfb24740480bf2e50b178724f094d
4.4. built the kernel which fails to resume from suspend to disk
4.5. git checkout ef0f3ed5a4acfb24740480bf2e50b178724f094d~
4.6. built the kernel which SUCCESSFULLY resumes from suspend to disk


So this is very embarrassing to me. I am sorry for blaming the other commit.

Please tell me if I can do anything more to make sure it really is ef0f3ed5a4acfb24740480bf2e50b178724f094d this time.
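
The bisection in steps 3.1 to 3.4 is essentially a binary search over the commit range, and steps 4.3 to 4.6 are a cross-check of the result against its parent. A minimal Python sketch of that idea follows. It assumes a hypothetical linear history with placeholder commit names and a hypothetical `resumes_ok` predicate standing in for "build the kernel, boot, hibernate, resume"; it is an illustration only, not how git implements bisect internally.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first commit for which is_good() is False.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad (a single regression).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid  # regression was introduced after mid
        else:
            hi = mid  # regression was introduced at or before mid
    return commits[hi]


# Hypothetical history: these names are placeholders, not real kernel commits.
history = [f"c{i}" for i in range(16)]
culprit = "c11"  # assumed for the demo


def resumes_ok(commit: str) -> bool:
    # Stand-in for: build the kernel at this commit, hibernate, resume.
    return history.index(commit) < history.index(culprit)


found = first_bad_commit(history, resumes_ok)
parent = history[history.index(found) - 1]

# Cross-check as in steps 4.3 to 4.6: the commit fails, its parent works.
confirmed = (not resumes_ok(found)) and resumes_ok(parent)
print(found, confirmed)
```

The search is only as reliable as each good/bad verdict: one wrong mark sends it into the wrong half of the range, which is why testing the reported commit and its parent directly is a useful final check.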
Comment 12 shuzzle 2016-07-28 08:02:41 UTC
Created attachment 226651 [details]
bisect.log to comment 11
Comment 13 shuzzle 2016-07-28 08:11:39 UTC
(In reply to Rafael J. Wysocki from comment #9)
> So to clarify, if you boot 4.7 afresh and then suspend to disk, does the
> resume from it fail?

Yes.

It doesn't matter whether or not suspend to RAM has been used before suspend to disk.
Comment 14 Josh Poimboeuf 2016-07-28 12:27:43 UTC
Yeah, that commit is at least directly related to suspend to disk, so that seems much more likely.
Comment 15 Josh Poimboeuf 2016-07-28 12:29:21 UTC
Created attachment 226741 [details]
hibernate-frame-fix.patch

Could you please try and see if this patch fixes it?
Comment 16 shuzzle 2016-07-28 13:09:45 UTC
(In reply to Josh Poimboeuf from comment #15)
> Created attachment 226741 [details]
> hibernate-frame-fix.patch
> 
> Could you please try and see if this patch fixes it?

I can confirm this patch fixes the issue I reported. Thanks.
Comment 17 Rafael J. Wysocki 2016-07-28 13:52:46 UTC
Cool!

@Josh: Unless you have submitted this already, can you do that ASAP please?
Comment 18 Josh Poimboeuf 2016-07-28 15:36:03 UTC
Patch posted:

  https://lkml.kernel.org/r/20160728151707.nmtkzri4jtumaq6h@treble
Comment 19 Chen Yu 2016-08-01 03:31:55 UTC
Closed as the patch has been merged:

commit 4ce827b4cc58bec7952591b96cce2b28553e4d5b
Author: Josh Poimboeuf <jpoimboe@redhat.com>
Date:   Thu Jul 28 23:15:21 2016 +0200

    x86/power/64: Fix hibernation return address corruption
