Bug 13553

Summary: When NETCONSOLE is enabled in kernel, computer crashes after 120seconds (approx)
Product: Networking
Reporter: David Hill (hilld)
Component: Other
Assignee: Arnaldo Carvalho de Melo (acme)
Status: RESOLVED UNREPRODUCIBLE
Severity: high
CC: hilld
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.29.4, 2.6.30
Subsystem:
Regression: No
Bisected commit-id:

Description David Hill 2009-06-17 01:55:53 UTC

    
Comment 1 David Hill 2009-06-17 01:56:57 UTC
00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:0b.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
00:0b.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 RL/VR AGP
Comment 2 David Hill 2009-06-17 02:55:56 UTC
With NETCONSOLE enabled, if I type:
ethtool -s eth1 speed 100 duplex full autoneg on

the computer freezes with kernel 2.6.29.4 and 2.6.30...

I can reproduce it anytime you want.
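
For anyone trying to reproduce this, the setup might look roughly like the sketch below. The report does not say how netconsole was configured, so every address, port, and MAC here is a placeholder, and `netconsole_param` is a hypothetical helper that only builds the parameter string in the `src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac` layout described in the kernel's netconsole documentation.

```shell
#!/bin/sh
# Build a netconsole= parameter string (hypothetical helper, not from the report).
# Arguments: src_port src_ip dev tgt_port tgt_ip tgt_mac
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example usage (needs root; all values below are assumed placeholders):
#   modprobe netconsole "netconsole=$(netconsole_param 6665 192.168.0.10 eth1 6666 192.168.0.20 00:11:22:33:44:55)"
#   ethtool -s eth1 speed 100 duplex full autoneg on   # the command quoted in this comment
```

Keeping the string construction in a function makes it easy to check the parameter before loading the module on a machine that may hang.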
Comment 3 Andrew Morton 2009-06-23 21:08:06 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 17 Jun 2009 01:55:54 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=13553
> 
>            Summary: When NETCONSOLE is enabled in kernel, computer crashes
>                     after 120seconds (approx)
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.29.4, 2.6.30
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: hilld@binarystorm.net
>         Regression: No
> 
> 

> 00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
> 00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
> 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
> 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
> 00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
> 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
> 00:0b.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
> 00:0b.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
> 00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100
> (rev 08)
> 00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL-8139/8139C/8139C+ (rev 10)
> 01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 RL/VR AGP
> 
> ------- Comment #2 From David Hill 2009-06-17 02:55:56 (-) [reply] -------
> 
> With NETCONSOLE enabled, if I type:
> ethtool -s eth1 speed 100 duplex full autoneg on
> 
> the computer freezes with kernel 2.6.29.4 and 2.6.30...
> 
> I can reproduce it anytime you want.
> 

Interesting.  I wonder what the significance is of the 120 seconds.  I
see no such timers in e100.c.  Does the networking core have timers on
such intervals?
Comment 4 Neil Horman 2009-06-24 01:05:05 UTC
On Tue, Jun 23, 2009 at 02:07:43PM -0700, Andrew Morton wrote:
> Interesting.  I wonder what the significance is of the 120 seconds.  I
> see no such timers in e100.c.  Does the networking core have timers on
> such intervals?
> 
My guess is that the 120 seconds has less to do with the driver and more to do
with some other periodic event in the kernel that triggers a message being
written to the console, which in turn triggers whatever deadlock is being hit
here.  I imagine we could diagnose it pretty quickly if a stack trace or vmcore
could be captured.  David, can you enable the NMI watchdog on this system so
that it panics after a deadlock?  Then, if you could enable a second serial
console or set up kdump to capture a vmcore, we should be able to figure out
what's going on.  My guess is that in the e100 driver we're taking a lock in
the ethtool set path, then calling printk, which winds up recursing into the
driver and trying to take the same lock again.  A stack trace will tell us for
certain.

Regards
Neil
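
As a rough sketch of the first two requests: on x86 kernels of this era the NMI watchdog is normally switched on with a boot parameter such as `nmi_watchdog=1` (IO-APIC) or `nmi_watchdog=2` (local APIC), and a serial console with `console=ttyS0,...`. The exact values depend on the hardware and kernel, so treat the ones below as assumptions and check Documentation/kernel-parameters.txt for your version. `check_debug_cmdline` is a hypothetical helper that only reports whether a command line already carries these parameters.

```shell
#!/bin/sh
# Report whether a kernel command line (default: /proc/cmdline) carries the
# debugging parameters discussed above. Prints one line per parameter.
check_debug_cmdline() {
    cmdline_file="${1:-/proc/cmdline}"
    for want in nmi_watchdog= console=ttyS; do
        if grep -q -- "$want" "$cmdline_file"; then
            echo "present: $want"
        else
            echo "missing: $want"
        fi
    done
}

# Parameters one might append to the boot line (assumed values):
#   nmi_watchdog=1 console=ttyS0,115200n8 console=tty0
```

Running `check_debug_cmdline` with no argument after a reboot confirms that the bootloader actually passed the parameters.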

Comment 5 David Hill 2009-07-16 05:42:22 UTC
Will try that in the next few days... sorry for the delay.  I was on 
vacation for the last 2 weeks and thus, out of town :D



Comment 6 David Hill 2009-07-17 05:55:38 UTC
Hi back,
Look at bug 13219.  I'm not sure the bug is related to NETCONSOLE.
It may be in the NIC drivers, the tools (miidiag/ethtool), or anything
else.
The behavior of the system is random.

I attached the NMI stack trace, but for the kdump I need to read a bit
more about it.  I think I'll need to patch the kernel... will I?

Thanks again,

Dave


Comment 7 Neil Horman 2009-07-17 14:16:07 UTC
On Fri, Jul 17, 2009 at 01:55:44AM -0400, David Hill wrote:
> Hi back,
> Look at bug 13219.  I'm not sure the bug is related to NETCONSOLE.
> It may be with the NIC drivers or the tools miidiag/ethtool or anything  
> else.
> The behavior of the system is random.
>
> I attached the NMI stack trace ... but for the kdump, I need to read a 
> bit more about it and think I'll need to patch the kernel... will I ?
>
> Thanks again,
>
> Dave
>
Neither of the logs you attached in the associated bugs seems to include the NMI
lockup backtrace.  As for kdump, no, you won't need to patch the kernel, but
depending on which kernel you're using, you may need to build it with
CONFIG_CRASH_DUMP and CONFIG_KEXEC turned on.

Neil
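
One way to check this before rebuilding is to look at the kernel's config file. The kdump documentation names the relevant options `CONFIG_KEXEC` and `CONFIG_CRASH_DUMP`. The sketch below assumes the config is available as a plain file (the default path is a common distro location, not something stated in this report), and `check_kdump_config` is a hypothetical helper.

```shell
#!/bin/sh
# Print which kdump-related options are set to "y" in a kernel config file.
# Argument: path to a config file (the default path is an assumption).
check_kdump_config() {
    config_file="${1:-/boot/config-$(uname -r)}"
    for opt in CONFIG_KEXEC CONFIG_CRASH_DUMP; do
        if grep -q "^${opt}=y" "$config_file"; then
            echo "$opt=y"
        else
            echo "$opt is not set"
        fi
    done
}

# Example usage for a self-built kernel tree:
#   check_kdump_config /usr/src/linux/.config
```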

Comment 8 David Hill 2010-02-28 04:55:49 UTC
I forgot this bug existed ... :S   Will try doing this.
Comment 9 David Hill 2010-02-28 05:00:43 UTC
And forget about the timer thing... it crashes only when I disconnect the Ethernet cable (or reset the switch).
Comment 10 David Hill 2010-02-28 05:09:22 UTC
OK, I'm not quite sure what you expect me to try... but I guess I need to recompile my kernel with KEXEC=y (which is already the case) and enable CRASH_DUMP, start the new kernel with kexec, unplug the Ethernet adapter, and attach the dump to this bug report... am I right?

Thank you very much.
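
For reference, the usual kdump flow described in the kernel's kdump documentation is slightly different from "start the new kernel with kexec": the normal kernel boots as usual with memory reserved for a capture kernel, and `kexec -p` only preloads that capture kernel so it takes over on panic. A minimal sketch follows, assuming placeholder paths and a placeholder root device; `kdump_load_cmd` is a hypothetical helper that prints the command rather than running it.

```shell
#!/bin/sh
# Print (not run) the kexec command that would load a capture kernel.
# Arguments: capture kernel image, initrd, root device. All are placeholders.
kdump_load_cmd() {
    printf 'kexec -p %s --initrd=%s --append="root=%s irqpoll maxcpus=1"\n' "$1" "$2" "$3"
}

# Rough sequence (assumed paths and sizes):
#   1. Boot the main kernel with memory reserved, e.g. crashkernel=64M@16M
#   2. As root, run the command printed by:
#        kdump_load_cmd /boot/vmlinuz-kdump /boot/initrd-kdump.img /dev/sda1
#   3. Trigger the hang (here: unplug the cable); on panic the capture kernel boots
#   4. In the capture kernel, save the dump: cp /proc/vmcore /var/crash/vmcore
```

Printing the command first makes it easy to review the paths before preloading anything on a machine that is about to be crashed on purpose.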
Comment 11 David Hill 2011-06-29 00:23:23 UTC
You can close this bug report...
Comment 12 David Hill 2011-06-29 00:23:59 UTC
This is not reproducible and was induced by some other bugs back at that time.