Bug 112761 - Kernel fails to restore from hibernation (works with 3.19 kernel + f5b2831d654167d7 reverted) - Dell Latitude E7440 (Intel i7-4600U)
Status: CLOSED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: Intel Linux
Importance: P1 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-20 23:25 UTC by Robin Obůrka
Modified: 2016-05-18 09:04 UTC

See Also:
Kernel Version: 3.19, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5RC
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Console output during failure (1.09 MB, image/jpeg)
2016-02-20 23:25 UTC, Robin Obůrka

Description Robin Obůrka 2016-02-20 23:25:30 UTC
Created attachment 204121
Console output during failure

Overview
--------

The kernel fails to restore from hibernation. It just freezes without a stack trace or any reasonable error message. The console stops producing output and no disk activity is detected. I'm using a Dell Latitude E7440 with an Intel i7-4600U processor.

Steps to Reproduce
------------------

On the affected machine:

1) Boot kernel 3.19 or newer.
2) Hibernate computer.
3) Try to start it again.
4) Image is loaded but system doesn't start.
5) If it does start, repeat from step 2. Sometimes it works the first time. The second attempt fails every time.

Actual Results
--------------

The system is not loaded.

Expected Results
----------------

System should be restored from hibernation.

Build Date & Hardware
---------------------

Every kernel version after 3.18.x.

Dell Latitude E7440 with Intel i7-4600U processor. BIOS in Legacy mode (no EFI).

Additional Information
----------------------

I have found several similar bug reports but I'm not sure if it is the same issue. I have no stack trace and the versions didn't match.

Everything is OK with kernel 3.18.x and older versions. Kernel 3.19.0 is the first one with this bug. Thanks to no_console_suspend I have some output. (See attachment.)
Comment 1 Chen Yu 2016-02-22 01:39:33 UTC
(In reply to Robin Obůrka from comment #0)

I've seen similar bugs reported that reproduce after the image is loaded. Can you please help confirm whether this is a core kernel issue or a driver issue by doing the following:
1. Compile the kernel with the SATA driver and the USB 3.0 and USB 2.0 modules built in (see Comment 47 at https://bugzilla.kernel.org/show_bug.cgi?id=82291).
2. Always boot the kernel with these GRUB options appended (for both the first kernel and the second kernel used for resume):
init=/bin/bash nomodeset text resume=/dev/sdax no_console_suspend ignore_loglevel
3. Use swapon to enable the swap partition, then run: echo disk > /sys/power/state
4. Resume the kernel.
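
Steps 2 and 3 can be sketched as a small shell helper. This is only an illustration: the function names `resume_cmdline` and `hibernate_plan` are made up here, `/dev/sdaX` is a placeholder for the real swap partition (the thread only gives /dev/sdax), and the init path is a parameter because an initramfs may not ship bash. Both functions only print text and run nothing.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# Print the kernel parameters from step 2.
# $1 = swap partition to resume from, $2 = init binary (defaults to /bin/bash).
resume_cmdline() {
    swap_dev=$1
    init_bin=${2:-/bin/bash}
    echo "init=$init_bin nomodeset text resume=$swap_dev no_console_suspend ignore_loglevel"
}

# Print (do not run) the commands from step 3 so they can be reviewed first.
hibernate_plan() {
    swap_dev=$1
    echo "swapon $swap_dev"
    echo "echo disk > /sys/power/state"
}

# Placeholder device: replace /dev/sdaX with the actual swap partition.
resume_cmdline /dev/sdaX
hibernate_plan /dev/sdaX
```

Printing the commands instead of executing them keeps the sketch safe to run on a working machine. The printed lines would then be typed at the root shell of the test boot.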


Since 3.18.x is OK, can you do a git bisect to find the bad commit in the kernel?
Comment 2 Robin Obůrka 2016-02-25 22:06:05 UTC
> I've seen similar bugs reported that reproduce after the image is loaded.
> Can you please help confirm whether this is a core kernel issue or a driver
> issue by doing the following:
> 1. Compile the kernel with the SATA driver and the USB 3.0 and USB 2.0
> modules built in (see Comment 47 at
> https://bugzilla.kernel.org/show_bug.cgi?id=82291).
> 2. Always boot the kernel with these GRUB options appended (for both the
> first kernel and the second kernel used for resume):
> init=/bin/bash nomodeset text resume=/dev/sdax no_console_suspend ignore_loglevel
> 3. Use swapon to enable the swap partition, then run:
> echo disk > /sys/power/state
> 4. Resume the kernel.

I made several attempts. The kernel fails with "Kernel panic - trying to kill init" every time. Bash is not present in the initramfs, so I'm using /bin/sh. I have no idea why it fails and I couldn't find any help.

> Since 3.18.x is OK, can you do a git bisect to find the bad commit in the
> kernel?

Honestly, I'm desperately trying to avoid it. This is currently my only computer for work. It is a lot of commits, 2 or 3 reboots per try, every few minutes. I'm not sure if I can afford it in the near future.
Comment 3 Logan Gunthorpe 2016-04-25 17:29:27 UTC
This is very similar to my problem. Unfortunately I already submitted another bug report here:

https://bugzilla.kernel.org/show_bug.cgi?id=116941
Comment 4 Zhang Rui 2016-04-26 03:06:46 UTC
(In reply to Robin Obůrka from comment #2)
> I made several attempts. The kernel fails with "Kernel panic - trying to
> kill init" every time. Bash is not present in the initramfs, so I'm using
> /bin/sh. I have no idea why it fails and I couldn't find any help.
> 
You mean the kernel fails right after boot, so you don't even get a chance to try hibernating?
What about single-user mode?

> > Since 3.18.x is OK, can you do a git bisect to find the bad commit in the
> > kernel?
> 
> Honestly, I'm desperately trying to avoid it. This is currently my only
> computer for work. It is a lot of commits, 2 or 3 reboots per try, every few
> minutes. I'm not sure if I can afford it in the near future.

Since you already know that the regression occurs between 3.18 and 3.19, IMO, the best and most efficient way to find the root cause is to use git bisect to find the offending commit.
BTW, does the "nomodeset" boot option help? It worked for a similar problem: https://bugzilla.kernel.org/show_bug.cgi?id=117071
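
On the cost of a bisect: git bisect halves the candidate range at each step, so the number of kernels to build and test grows roughly with the base-2 logarithm of the number of commits between the good and bad tags. A rough sketch of that arithmetic follows. The function name `bisect_steps` is made up for illustration, and the commit count has to be taken from a real kernel tree, since this thread does not state it.

```shell
#!/bin/sh
# Sketch only: estimates how many build/test rounds a bisect needs.
# bisect_steps is a hypothetical helper, not a git command.

# $1 = number of commits between the known-good and known-bad revisions.
# Prints the smallest k such that 2^k >= $1.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# In a kernel checkout the count could come from:
#   git rev-list --count v3.18..v3.19
# and the session itself would be started with:
#   git bisect start
#   git bisect bad v3.19
#   git bisect good v3.18
# 10000 is an arbitrary example value, not the real commit count.
bisect_steps 10000
```

Each round needs a kernel build plus the 2 or 3 reboots mentioned in comment 2, so the estimate gives a feel for the total effort before starting.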
Comment 5 Logan Gunthorpe 2016-05-08 03:10:40 UTC
I suggest you follow the bug report in 

https://bugzilla.kernel.org/show_bug.cgi?id=116941

It's likely a duplicate. I did a similar bisect between 3.18 and 3.19 and found a broken commit. If you try what I've suggested in that report (depending on the kernel version you're currently running) you can hopefully get a working system.
Comment 6 Robin Obůrka 2016-05-08 07:39:28 UTC
> > I made several attempts. The kernel fails with "Kernel panic - trying to
> > kill init" every time. Bash is not present in the initramfs, so I'm using
> > /bin/sh. I have no idea why it fails and I couldn't find any help.
> > 
> You mean the kernel fails right after boot, so you don't even get a chance
> to try hibernating?
> What about single-user mode?

Yes, that is my problem. I don't know what I'm doing wrong.

> > > Since 3.18.x is OK, can you do a git bisect to find the bad commit in the
> > > kernel?
> > 
> > Honestly, I'm desperately trying to avoid it. This is currently my only
> > computer for work. It is a lot of commits, 2 or 3 reboots per try, every
> > few minutes. I'm not sure if I can afford it in the near future.
> 
> Since you already know that the regression occurs between 3.18 and 3.19,
> IMO, the best and most efficient way to find the root cause is to use
> git bisect to find the offending commit.

I absolutely agree. I'm finishing my studies at university and 1) I have really no time, 2) I really need my computer.

I hope I will be able to try it but not in the next 2 months.

> BTW, does the "nomodeset" boot option help? It worked for a similar problem:
> https://bugzilla.kernel.org/show_bug.cgi?id=117071

I wasn't able to check it due to the kernel panic problem.
Comment 7 Robin Obůrka 2016-05-08 09:07:40 UTC
(In reply to Logan Gunthorpe from comment #5)
> I suggest you follow the bug report in 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=116941
> 
> It's likely a duplicate. I did a similar bisect between 3.18 and 3.19 and
> found a broken commit. If you try what I've suggested in that report
> (depending on kernel version you're currently running) you can hopefully get a
> working system.

OK, I can confirm that kernel 3.19 with patch f5b2831d654167d7 reverted works in my case too! It must be the same bug.

Thank you very much for your time. My kernel debugging skills are poor.
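
The test confirmed above (3.19 with f5b2831d654167d7 reverted) can be sketched as follows. It assumes a clone of the mainline kernel tree that has the `v3.19` tag. The function name `revert_plan` and the branch name `hib-revert-test` are invented for illustration. The function only prints the git commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: prints the git commands for preparing a test kernel with the
# suspect commit reverted. Nothing is executed against a repository.

# $1 = base tag, $2 = commit to revert, $3 = branch name (hypothetical default).
revert_plan() {
    base=$1
    commit=$2
    branch=${3:-hib-revert-test}
    echo "git checkout -b $branch $base"
    echo "git revert --no-edit $commit"
}

revert_plan v3.19 f5b2831d654167d7
```

After running the printed commands in the kernel tree, the kernel would be built and installed in the usual way for the distribution, then tested with two hibernate and resume cycles in a row, since the second attempt is the one reported to fail every time.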
Comment 8 Zhang Rui 2016-05-10 07:09:28 UTC

*** This bug has been marked as a duplicate of bug 116941 ***
