Bug 11289 - r8169 often fails at startup with "NETDEV WATCHDOG: eth0: transmit timed out"
Summary: r8169 often fails at startup with "NETDEV WATCHDOG: eth0: transmit timed out"
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: Francois Romieu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-08 16:42 UTC by Laurence Withers
Modified: 2008-08-10 08:14 UTC
CC List: 0 users

See Also:
Kernel Version: 2.6.26.2
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
lspci -vvvxxxx on failure (191.10 KB, text/plain)
2008-08-08 16:42 UTC, Laurence Withers
lspci -vvvxxxx on success (192.91 KB, text/plain)
2008-08-08 16:43 UTC, Laurence Withers
kernel 2.6.26.2 .config (38.41 KB, text/plain)
2008-08-08 16:43 UTC, Laurence Withers

Description Laurence Withers 2008-08-08 16:42:08 UTC
Often, my Ethernet card fails to work after starting my machine. I'm using the r8169 driver and kernel version 2.6.26.2. I have had this problem with both a DGS-1216T Gigabit switch and an el-cheapo 100Mbit switch.

When it fails, I don't believe it is transmitting any packets at all (no flashing lights on the switch).

On failure, I see the following in dmesg:

NETDEV WATCHDOG: eth0: transmit timed out
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0xa3/0xfb()
Modules linked in: bnep rfcomm l2cap bluetooth snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_event snd_seq_midi_emul snd_seq snd_pcm_oss tun usbhid hid sr_mod cdrom snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device snd_timer uhci_hcd ehci_hcd snd_page_alloc firewire_ohci snd_util_mem snd_hwdep snd usbcore evdev firewire_core crc_itu_t soundcore sg rtc pata_jmicron r8169 thermal processor button
Pid: 0, comm: swapper Not tainted 2.6.26.2 #2

Call Trace:
 <IRQ>  [<ffffffff8022ad09>] warn_on_slowpath+0x51/0x8c
 [<ffffffff804179ae>] printk+0x4e/0x58
 [<ffffffff80233642>] lock_timer_base+0x26/0x4b
 [<ffffffff802337e2>] __mod_timer+0xb0/0xbf
 [<ffffffff803a23ef>] dev_watchdog+0x0/0xfb
 [<ffffffff803a23ef>] dev_watchdog+0x0/0xfb
 [<ffffffff803a2492>] dev_watchdog+0xa3/0xfb
 [<ffffffff80233088>] run_timer_softirq+0x163/0x1dc
 [<ffffffff8023f419>] ktime_get+0xc/0x41
 [<ffffffff8022fb71>] __do_softirq+0x46/0x93
 [<ffffffff8020c41c>] call_softirq+0x1c/0x28
 [<ffffffff8020de88>] do_softirq+0x2c/0x68
 [<ffffffff8022f951>] irq_exit+0x3f/0x95
 [<ffffffff8021889b>] smp_apic_timer_interrupt+0x89/0xa1
 [<ffffffff80211664>] mwait_idle+0x0/0x3f
 [<ffffffff8020bec6>] apic_timer_interrupt+0x66/0x70
 <EOI>  [<ffffffff80209370>] default_idle+0x0/0x3b
 [<ffffffff80218393>] lapic_next_event+0x0/0xa
 [<ffffffff802116a0>] mwait_idle+0x3c/0x3f
 [<ffffffff8020a432>] cpu_idle+0x56/0xb8

---[ end trace a850f6a9616ee5b4 ]---
r8169: eth0: link up
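
For anyone checking whether their own logs show the same symptom, here is a minimal sketch that scans kernel log text for the watchdog message quoted above. The regular expression only matches the message format shown in this report (2.6.26-era kernels); the function name and the sample string are illustrative and not part of any kernel tooling.

```python
import re

# Matches the exact wording seen in this report; other kernel versions
# may word the message differently (assumption: this format only).
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def watchdog_timeouts(log_text):
    """Return the interface names that logged a transmit timeout, in order."""
    return [match.group(1) for match in WATCHDOG_RE.finditer(log_text)]


# Two lines copied from the dmesg excerpt above.
SAMPLE_LOG = (
    "NETDEV WATCHDOG: eth0: transmit timed out\n"
    "r8169: eth0: link up\n"
)

print(watchdog_timeouts(SAMPLE_LOG))
```

In practice the text passed to watchdog_timeouts() would be the saved output of dmesg rather than the embedded sample.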

I have tried both with and without MSI enabled.

I did notice that there is a difference in lspci output depending on whether the card is working or not.

Working:
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
        Subsystem: Giga-byte Technology Device e000
        Flags: bus master, fast devsel, latency 0, IRQ 377
        I/O ports at c000 [size=256]
        Memory at f1110000 (64-bit, prefetchable) [size=4K]
        Memory at f1100000 (64-bit, prefetchable) [size=64K]
        [virtual] Expansion ROM at f1120000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable- Mask- TabSize=2
        Capabilities: [d0] Vital Product Data <?>
        Capabilities: [100] Advanced Error Reporting <?>
        Capabilities: [140] Virtual Channel <?>
        Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
        Kernel driver in use: r8169
        Kernel modules: r8169

Not working:
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: r8169
        Kernel modules: r8169

I will attach lspci -vvvxxxx output as well. Further information can be provided on request, and patches tried.
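
The "rev ff", "prog-if ff" and "Unknown header type 7f" values in the failing output are consistent with every configuration-space read returning 0xff, since lspci masks the multi-function bit off the header-type byte (0xff & 0x7f = 0x7f). The following is a minimal sketch of that check, assuming a raw dump of the device's configuration space is available as bytes. The function names are hypothetical; the offsets are the standard PCI header offsets.

```python
def config_space_looks_dead(config):
    """True if the 64-byte standard PCI header reads back as all 0xff."""
    header = config[:64]
    return len(header) == 64 and all(byte == 0xFF for byte in header)


def describe_header(config):
    """Decode the few header fields that lspci printed in this report."""
    return {
        "vendor_id": int.from_bytes(config[0x00:0x02], "little"),
        "device_id": int.from_bytes(config[0x02:0x04], "little"),
        "revision": config[0x08],
        "prog_if": config[0x09],
        # Bit 7 is the multi-function flag, so it is masked off.
        "header_type": config[0x0E] & 0x7F,
    }


# Synthetic dumps for illustration only (not taken from the attachments).
dead = b"\xff" * 256
alive = bytearray(64)
alive[0x00:0x02] = (0x10EC).to_bytes(2, "little")  # Realtek vendor ID
alive[0x02:0x04] = (0x8168).to_bytes(2, "little")  # RTL8111/8168B device ID
alive[0x08] = 0x02                                 # rev 02, as in the working case

print(config_space_looks_dead(dead), describe_header(dead))
print(config_space_looks_dead(bytes(alive)), describe_header(bytes(alive)))
```

On a live system the bytes could be read from the device's sysfs config file, for example /sys/bus/pci/devices/0000:05:00.0/config. That path assumes PCI domain 0000 and the 05:00.0 address shown above.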
Comment 1 Laurence Withers 2008-08-08 16:42:56 UTC
Created attachment 17149
lspci -vvvxxxx on failure
Comment 2 Laurence Withers 2008-08-08 16:43:15 UTC
Created attachment 17150
lspci -vvvxxxx on success
Comment 3 Laurence Withers 2008-08-08 16:43:42 UTC
Created attachment 17151
kernel 2.6.26.2 .config
Comment 4 Francois Romieu 2008-08-10 05:58:44 UTC
Can you give 2.6.27-rc2 a try and check whether anything behaves better (lspci
output, for one)?

-- 
Ueimor
Comment 5 Laurence Withers 2008-08-10 07:02:02 UTC
Yes, this seems to be much better. Although I have only rebooted a few times, I've not seen the problem once.

Is there a specific commit that fixed this? Or should I try a git-bisect to find one?
Comment 6 Francois Romieu 2008-08-10 08:14:47 UTC
Laurence Withers 2008-08-10 07:02:02:
[...]
> Yes, this seems to be much better. Although I have only rebooted a few times,
> I've not seen the problem once.

> Is there a specific commit that fixed this?

77332894c21165404496c56763d7df6c15c4bb09

I will submit it for 2.6.26-stable.

> Or should I try a git-bisect to find one?

It would be interesting to see what git-bisect finds, but
it is not really needed.

I will close the bug for now. Please open a different bug if you
experience link problems that do not seem related to a broken
state of the PCI configuration registers.

-- 
Ueimor
