Bug 11545

Summary: e1000e: Sometimes causes resume from suspend to RAM to fail
Product: Drivers Reporter: Frans Pop (elendil)
Component: Network Assignee: Jeff Garzik (jgarzik)
Status: CLOSED DUPLICATE    
Severity: normal CC: jbarnes, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27-rc6 Subsystem:
Regression: --- Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    

Description Frans Pop 2008-09-12 02:42:02 UTC
Latest working kernel version: Unknown
Earliest failing kernel version: Unknown; first seen with 2.6.27-rc4
Distribution: Debian
Hardware Environment: HP Compaq 2510p laptop
Software Environment: Debian unstable

Problem Description:
Most of the time the laptop resumes perfectly from suspend, but sometimes (about 1 in 5-10 times) the resume fails fairly early (with the display still off). The only solution at that point is a hard power-off.

I've used pm_trace to try to find out the cause of the failure. That gave:
  Magic number: 0:983:221
  hash matches drivers/base/power/main.c:350
e1000e 0000:00:19.0: hash matches
rtc_cmos 00:06: setting system clock to 2020-12-11 09:12:26 UTC (1607677946)

So it looks like the e1000e device or driver is the cause.
(Line 350 in power/main.c has 'TRACE_RESUME(0);'.)
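
For anyone comparing several captures like the one above, here is a minimal sketch that pulls the interesting fields out of the pm_trace lines in a saved dmesg dump. The function name parse_pm_trace and the returned dictionary keys are made up for illustration; the line formats are simply the ones shown in this report, plus an optional "[  12.345]" timestamp prefix that is assumed, not taken from the report.

```python
import re

# Optional leading whitespace and an optional dmesg timestamp such as "[   12.345678]".
LINE_PREFIX = r"^\s*(?:\[\s*\d+\.\d+\]\s*)?"

MAGIC_RE = re.compile(r"Magic number:\s*(\d+):(\d+):(\d+)")
# e.g. "hash matches drivers/base/power/main.c:350"
SOURCE_RE = re.compile(LINE_PREFIX + r"hash matches\s+(\S+:\d+)\s*$")
# e.g. "e1000e 0000:00:19.0: hash matches"
DEVICE_RE = re.compile(LINE_PREFIX + r"(.+?):\s+hash matches\s*$")

# The capture quoted in this report.
SAMPLE = """\
  Magic number: 0:983:221
  hash matches drivers/base/power/main.c:350
e1000e 0000:00:19.0: hash matches
rtc_cmos 00:06: setting system clock to 2020-12-11 09:12:26 UTC (1607677946)
"""


def parse_pm_trace(log_text):
    """Collect the magic number, source-line matches and device matches."""
    result = {"magic": None, "source_lines": [], "devices": []}
    for line in log_text.splitlines():
        m = MAGIC_RE.search(line)
        if m:
            result["magic"] = tuple(int(g) for g in m.groups())
            continue
        m = SOURCE_RE.match(line)
        if m:
            result["source_lines"].append(m.group(1))
            continue
        m = DEVICE_RE.match(line)
        if m:
            result["devices"].append(m.group(1))
    return result


if __name__ == "__main__":
    print(parse_pm_trace(SAMPLE))
```

Lines that are not pm_trace output (such as the rtc_cmos line) are ignored.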

This was with e1000e compiled into the kernel (as I wanted to use netconsole for another issue), but I've also seen the failure with e1000e modular.
I will now make e1000e modular again and unload it before suspending to see if that makes resume stable.
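
A rough sketch of that test cycle, in case it helps anyone reproduce it: unload the module, suspend to RAM, reload after resume. The helper names suspend_plan and run_plan are hypothetical. The commands assume a standard modprobe and the usual /sys/power/state and /sys/power/pm_trace files, and they need root. By default the script only prints what it would do.

```python
import subprocess

MODULE = "e1000e"  # the driver under suspicion in this report


def suspend_plan(module=MODULE, trace=False):
    """Return the ordered commands for one unload/suspend/reload cycle."""
    plan = [["modprobe", "-r", module]]
    if trace:
        # Arm pm_trace for the next suspend (note that it clobbers the RTC).
        plan.append(["sh", "-c", "echo 1 > /sys/power/pm_trace"])
    plan.append(["sh", "-c", "echo mem > /sys/power/state"])
    # Reached only if the machine actually resumes.
    plan.append(["modprobe", module])
    return plan


def run_plan(plan, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in plan:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_plan(suspend_plan())
```

Calling run_plan(suspend_plan(trace=True), dry_run=False) as root would run the cycle for real with pm_trace armed.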

Note that the eth0 NIC is unused (no cable connected). I use the wireless NIC instead.

Suggestions on how to do further tracing or instrumentation would be very welcome.
Comment 1 Anonymous Emailer 2008-09-12 09:05:30 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 12 Sep 2008 02:42:03 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11545
> 
>            Summary: e1000e: Sometimes causes resume from suspend to RAM to
>                     fail
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.27-rc6
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: jgarzik@pobox.com
>         ReportedBy: elendil@planet.nl
>                 CC: rjw@sisk.pl
> 
> 
> Latest working kernel version: Unknown
> Earliest failing kernel version: Unknown; first seen with 2.6.27-rc4
> Distribution: Debian
> Hardware Environment: HP Compaq 2510p laptop
> Software Environment: Debian unstable
> 
> Problem Description:
> Most of the time the laptop resumes perfectly from suspend, but sometimes
> (about 1 in 5-10 times) it fails to resume fairly early (with the display
> still off). Only solution at that point is a hard power off.
> 
> I've used pm_trace to try to find out the cause of the failure. That gave:
>   Magic number: 0:983:221
>   hash matches drivers/base/power/main.c:350
> e1000e 0000:00:19.0: hash matches
> rtc_cmos 00:06: setting system clock to 2020-12-11 09:12:26 UTC (1607677946)
> 
> So it looks like the e1000e device or driver is the cause.
> (Line 350 in power/main.c has 'TRACE_RESUME(0);'.)
> 
> This was with e1000e compiled into the kernel (as I wanted to use netconsole
> for another issue), but I've also seen the failure with e1000e modular.
> I will now make e1000e modular again and unload it before suspending to
> see if that makes resume stable.
> 
> Note that the eth0 NIC is unused (no cable connected). I use the wireless NIC
> instead.
> 
> Suggestions how to do further tracing or instrumentation would be very
> welcome.
> 
Comment 2 Frans Pop 2008-09-16 10:55:52 UTC
After making sure the e1000e module is unloaded before suspending, I'm still seeing the failure to resume. I've just captured another occurrence, and this time I get:
Magic number: 4:989:661
block ram14: hash matches
acpi PNP0C0A:00: hash matches

PNP0C0A is the battery...
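
Since one capture can list more than one device for the same hash (presumably because different device names can hash to the same value), one way to narrow things down is to tally which devices show up across several failed resumes. A small sketch, with hypothetical helper names, fed with the two captures from this report:

```python
from collections import Counter

# Devices reported as "hash matches" in each failed resume so far.
CAPTURES = [
    ["e1000e 0000:00:19.0"],             # first capture (Description)
    ["block ram14", "acpi PNP0C0A:00"],  # second capture (this comment)
]


def tally_suspects(captures):
    """Count in how many captures each device appears."""
    counts = Counter()
    for devices in captures:
        counts.update(set(devices))  # count a device once per capture
    return counts


def recurring(counts, min_hits=2):
    """Return the devices seen in at least min_hits captures, sorted by name."""
    return sorted(name for name, hits in counts.items() if hits >= min_hits)


if __name__ == "__main__":
    counts = tally_suspects(CAPTURES)
    print(counts)
    print("recurring:", recurring(counts))
```

With only these two captures no device recurs, so neither result singles out one driver yet.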

I'm totally lost here. Any help would be appreciated.
Comment 3 Frans Pop 2008-12-04 03:15:34 UTC
References: http://marc.info/?l=linux-kernel&m=122818451003644&w=4

*** This bug has been marked as a duplicate of bug 12121 ***