Bug 15398
| Summary: | Networking hangs randomly. | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | TAXI (taxi) |
| Component: | Network | Assignee: | Francois Romieu (romieu) |
| Status: | RESOLVED OBSOLETE | | |
| Severity: | normal | CC: | akpm, alan, nissarin, richard, sam, vaka.divya2 |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.33-rc8 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
TAXI
2010-02-25 19:17:30 UTC
Posted with the wrong version (2.5). Will close this and open a new bug report. Sorry for that. Can't find a way to enter the version when opening a new bug.

Yup, sis190 is getting stuck. Is this a regression? Were any earlier kernel versions OK? If so, which version(s)?

It's no regression. It's very hard to say which kernel versions are OK and which are not. This is because of the nature of this bug: it occurs very rarely, and when it does, it's hard to backtrace (remember: there is no info in syslog). I think the bug occurs more often if you unplug the network cable, wait a bit and plug it in again, but I'm not sure about that.

I'm just wondering if it's the same thing I'm experiencing on my box. I've already searched the web, including bugzilla, and I found quite a lot of reports, but this one looks most similar to mine. It's also similar to #15232 and #15139; speaking of which, in the last one there is a patch which adds additional debugging information. Is it possible to adapt this patch for r8169 (is s/e1000/r8169/ enough)?

Mar 15 19:54:45 radscorpion kernel: ------------[ cut here ]------------
Mar 15 19:54:45 radscorpion kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x272/0x280()
Mar 15 19:54:45 radscorpion kernel: Hardware name: GA-MA78G-DS3H
Mar 15 19:54:45 radscorpion kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Mar 15 19:54:45 radscorpion kernel: Modules linked in: tun oss_usb oss_hdaudio osscore bridge stp llc pl2303 usbserial firewire_ohci usb_storage firewire_core i2c_piix4 psmouse
Mar 15 19:54:45 radscorpion kernel: Pid: 0, comm: swapper Not tainted 2.6.33-00159-gd424b92 #6
Mar 15 19:54:45 radscorpion kernel: Call Trace:
Mar 15 19:54:45 radscorpion kernel: <IRQ> [<ffffffff81417d92>] ? dev_watchdog+0x272/0x280
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81417d92>] ? dev_watchdog+0x272/0x280
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8105f084>] ? warn_slowpath_common+0x74/0xd0
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8105f141>] ? warn_slowpath_fmt+0x51/0x60
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81052e20>] ? activate_task+0x40/0x70
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8105a887>] ? try_to_wake_up+0xb7/0x3c0
Mar 15 19:54:45 radscorpion kernel: [<ffffffff811f5041>] ? strlcpy+0x41/0x50
Mar 15 19:54:45 radscorpion kernel: [<ffffffff814047ab>] ? netdev_drivername+0x3b/0x40
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81417d92>] ? dev_watchdog+0x272/0x280
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8105599c>] ? enqueue_task_fair+0x19c/0x1f0
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81417b20>] ? dev_watchdog+0x0/0x280
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81069dfc>] ? run_timer_softirq+0x13c/0x200
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8105a887>] ? try_to_wake_up+0xb7/0x3c0
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8107d6b1>] ? ktime_get+0x61/0xe0
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81064c06>] ? __do_softirq+0xa6/0x130
Mar 15 19:54:45 radscorpion kernel: [<ffffffff810273cc>] ? call_softirq+0x1c/0x30
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8102919d>] ? do_softirq+0x4d/0x80
Mar 15 19:54:45 radscorpion kernel: [<ffffffff810648f5>] ? irq_exit+0x75/0x90
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8103e64c>] ? smp_apic_timer_interrupt+0x6c/0xa0
Mar 15 19:54:45 radscorpion kernel: [<ffffffff81026e93>] ? apic_timer_interrupt+0x13/0x20
Mar 15 19:54:45 radscorpion kernel: <EOI> [<ffffffff8102e4d2>] ? default_idle+0x32/0x40
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8102e738>] ? c1e_idle+0xa8/0x100
Mar 15 19:54:45 radscorpion kernel: [<ffffffff8102579a>] ? cpu_idle+0xaa/0x100
Mar 15 19:54:45 radscorpion kernel: [<ffffffff816f2c55>] ? start_kernel+0x35c/0x41d
Mar 15 19:54:45 radscorpion kernel: [<ffffffff816f2373>] ? x86_64_start_kernel+0xe1/0xf2
Mar 15 19:54:45 radscorpion kernel: ---[ end trace 9b1f3a8aa1ae216e ]---
Mar 15 19:54:45 radscorpion kernel: r8169: eth0: link up

It's quite hard to trigger (rtorrent seems to make it easier to happen). I didn't notice it earlier, but I started to check dmesg output quite often when I started playing with kms. But now that I think about it, I had problems with the network before, e.g. when playing multiplayer games I experienced 4-5s lag spikes (I was able to see opponents' movements but I was unable to move: no tx, rx worked). At that time I thought it was a network-related problem, but if that was caused by this bug, then the problem appeared much earlier (previously I used 2.6.32.7). Too bad that I don't remember when exactly the problems started to show up, and unfortunately I cleaned up the logs recently.

Yeah, I'm getting this too. Also, I have read several reports and we seem to all have one thing in common, and that's rtorrent. It only seems to happen for me when I'm using the box for other things too. E.g. it downloads using rtorrent, and this usually happens when it's both downloading and being used to play files back over the network. It's not the file services, because I've tried NFS, AFP and SMB, all to the same effect.
Apr 1 23:37:14 download-server kernel: [ 718.011280] ------------[ cut here ]------------
Apr 1 23:37:14 download-server kernel: [ 718.011305] WARNING: at /build/buildd/linux-2.6.32/net/sched/sch_generic.c:261 dev_watchdog+0x262/0x270()
Apr 1 23:37:14 download-server kernel: [ 718.011312] Hardware name: EasyNote_MX37-U-004
Apr 1 23:37:14 download-server kernel: [ 718.011318] NETDEV WATCHDOG: eth0 (sis190): transmit queue 0 timed out
Apr 1 23:37:14 download-server kernel: [ 718.011323] Modules linked in: vmnet parport_pc vsock vmci vmmon ppdev snd_hda_codec_realtek nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc arc4 snd_hda_intel snd_hda_codec snd_hwdep ath5k snd_pcm snd_timer fbcon tileblit font bitblit softcursor mac80211 ath snd lp video output soundcore snd_page_alloc vga16fb vgastate sis190 cfg80211 shpchp mii asus_laptop led_class sis_agp parport sata_sis
Apr 1 23:37:14 download-server kernel: [ 718.011425] Pid: 0, comm: swapper Not tainted 2.6.32-17-server #26-Ubuntu
Apr 1 23:37:14 download-server kernel: [ 718.011430] Call Trace:
Apr 1 23:37:14 download-server kernel: [ 718.011435] <IRQ> [<ffffffff81066ceb>] warn_slowpath_common+0x7b/0xc0
Apr 1 23:37:14 download-server kernel: [ 718.011456] [<ffffffff81066d91>] warn_slowpath_fmt+0x41/0x50
Apr 1 23:37:14 download-server kernel: [ 718.011465] [<ffffffff81489842>] dev_watchdog+0x262/0x270
Apr 1 23:37:14 download-server kernel: [ 718.011476] [<ffffffff81080767>] ? insert_work+0x77/0xc0
Apr 1 23:37:14 download-server kernel: [ 718.011488] [<ffffffff810397a9>] ? default_spin_lock_flags+0x9/0x10
Apr 1 23:37:14 download-server kernel: [ 718.011497] [<ffffffff814895e0>] ? dev_watchdog+0x0/0x270
Apr 1 23:37:14 download-server kernel: [ 718.011505] [<ffffffff81077417>] run_timer_softirq+0x197/0x340
Apr 1 23:37:14 download-server kernel: [ 718.011516] [<ffffffff810943a0>] ? tick_sched_timer+0x0/0xc0
Apr 1 23:37:14 download-server kernel: [ 718.011525] [<ffffffff8108f113>] ? ktime_get+0x63/0xe0
Apr 1 23:37:14 download-server kernel: [ 718.011534] [<ffffffff8106e227>] __do_softirq+0xb7/0x1e0
Apr 1 23:37:14 download-server kernel: [ 718.011542] [<ffffffff81093f8a>] ? tick_program_event+0x2a/0x30
Apr 1 23:37:14 download-server kernel: [ 718.011551] [<ffffffff810142ec>] call_softirq+0x1c/0x30
Apr 1 23:37:14 download-server kernel: [ 718.011559] [<ffffffff81015cb5>] do_softirq+0x65/0xa0
Apr 1 23:37:14 download-server kernel: [ 718.011566] [<ffffffff8106e0c5>] irq_exit+0x85/0x90
Apr 1 23:37:14 download-server kernel: [ 718.011576] [<ffffffff8155c021>] smp_apic_timer_interrupt+0x71/0x9c
Apr 1 23:37:14 download-server kernel: [ 718.011584] [<ffffffff81013cb3>] apic_timer_interrupt+0x13/0x20
Apr 1 23:37:14 download-server kernel: [ 718.011589] <EOI> [<ffffffff8130c7ce>] ? acpi_idle_enter_simple+0x117/0x14b
Apr 1 23:37:14 download-server kernel: [ 718.011606] [<ffffffff8130c7c7>] ? acpi_idle_enter_simple+0x110/0x14b
Apr 1 23:37:14 download-server kernel: [ 718.011617] [<ffffffff81448c77>] ? cpuidle_idle_call+0xa7/0x140
Apr 1 23:37:14 download-server kernel: [ 718.011627] [<ffffffff81011e63>] ? cpu_idle+0xb3/0x110
Apr 1 23:37:14 download-server kernel: [ 718.011636] [<ffffffff8154ee91>] ? start_secondary+0xa8/0xaa
Apr 1 23:37:14 download-server kernel: [ 718.011643] ---[ end trace 62bcf8b592c12c43 ]---

The current kernel is the 2.6.32-17-server build from Ubuntu 10.04 BETA.

I'm seeing a similar problem. We have a pair of very busy Squid servers (peak ~15000 TCP connections, ~100Mb/sec) using identical Asus server motherboards. The networking on one box intermittently fails, sometimes after a few hours, sometimes after as much as 7 days.
Here's the relevant info and an extract from syslog / dmesg:

# uname -r
2.6.31.12
# cat /sys/class/net/eth0/device/vendor
0x14e4
# cat /sys/class/net/eth0/device/device
0x1659
{{{
http://www.pcidatabase.com/vendor_details.php?id=767
0x1659
Chip Number: BCM5721
Chip Description: NetXtreme Gigabit Ethernet PCI Express
}}}
# ethtool -i eth0
driver: tg3
version: 3.99
firmware-version: 5721-v3.65
bus-info: 0000:03:00.0
# cat /var/log/syslog
Apr 22 23:49:35 kernel: [623449.988504] ------------[ cut here ]------------
Apr 22 23:49:35 kernel: [623449.988511] WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0x1be/0x1d0()
Apr 22 23:49:35 kernel: [623449.988514] Hardware name: System Product Name
Apr 22 23:49:35 kernel: [623449.988516] NETDEV WATCHDOG: eth0 (tg3): transmit queue 0 timed out
Apr 22 23:49:35 kernel: [623449.988518] Modules linked in: ip_gre e1000 via_rhine tg3 libphy r8169 pcnet32 e100 8139too mii w83627ehf vt8231 via686a hwmon_vid coretemp asus_atk0110 hwmon
Apr 22 23:49:35 kernel: [623449.988533] Pid: 0, comm: swapper Not tainted 2.6.31.12 #2
Apr 22 23:49:35 kernel: [623449.988535] Call Trace:
Apr 22 23:49:35 kernel: [623449.988540] [<c012864e>] ? warn_slowpath_common+0x6e/0xb0
Apr 22 23:49:35 kernel: [623449.988543] [<c034e4fe>] ? dev_watchdog+0x1be/0x1d0
Apr 22 23:49:35 kernel: [623449.988546] [<c01286db>] ? warn_slowpath_fmt+0x2b/0x30
Apr 22 23:49:35 kernel: [623449.988549] [<c034e4fe>] ? dev_watchdog+0x1be/0x1d0
Apr 22 23:49:35 kernel: [623449.988553] [<c011f842>] ? __wake_up+0x42/0x60
Apr 22 23:49:35 kernel: [623449.988557] [<c01376f2>] ? insert_work+0x42/0x50
Apr 22 23:49:35 kernel: [623449.988560] [<c034e340>] ? dev_watchdog+0x0/0x1d0
Apr 22 23:49:35 kernel: [623449.988564] [<c0131149>] ? run_timer_softirq+0xf9/0x1c0
Apr 22 23:49:35 kernel: [623449.988567] [<c012d2c0>] ? __do_softirq+0x80/0x100
Apr 22 23:49:35 kernel: [623449.988570] [<c012d36d>] ? do_softirq+0x2d/0x40
Apr 22 23:49:35 kernel: [623449.988574] [<c0114d94>] ? smp_apic_timer_interrupt+0x54/0x90
Apr 22 23:49:35 kernel: [623449.988577] [<c0103676>] ? apic_timer_interrupt+0x2a/0x30
Apr 22 23:49:35 kernel: [623449.988581] [<c03f00d8>] ? klist_add_before+0x18/0x50
Apr 22 23:49:35 kernel: [623449.988585] [<c0109dc2>] ? mwait_idle+0x42/0x60
Apr 22 23:49:35 kernel: [623449.988587] [<c0101d55>] ? cpu_idle+0x35/0x60
Apr 22 23:49:35 kernel: [623449.988590] ---[ end trace 346a74434bf31555 ]---
Apr 22 23:49:35 kernel: [623449.988592] tg3: eth0: transmit timed out, resetting
Apr 22 23:49:35 kernel: [623449.988596] tg3: DEBUG: MAC_TX_STATUS[0000000f] MAC_RX_STATUS[00000008]
Apr 22 23:49:35 kernel: [623449.988601] tg3: DEBUG: RDMAC_STATUS[00000000] WDMAC_STATUS[00000000]
Apr 22 23:49:35 kernel: [623450.089694] tg3: tg3_stop_block timed out, ofs=1400 enable_bit=2
Apr 22 23:49:35 kernel: [623450.248154] tg3: eth0: Link is down.
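
The traces in this report come from three different drivers (r8169, sis190, tg3), yet all of them contain the same "NETDEV WATCHDOG ... transmit queue N timed out" line. Since the hang is rare and easy to miss, it may help to scan a log for that line and count hits per driver. The following is only a rough Python sketch: the function names are made up for illustration, and the regular expression assumes the message format exactly as it appears in the logs quoted in this report.

```python
import re
from collections import Counter

# Assumption: the watchdog message has the format seen in the traces
# quoted in this report, e.g.
#   NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_timeouts(lines):
    """Return (interface, driver, queue) for each watchdog timeout line."""
    hits = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            hits.append(
                (match.group("iface"), match.group("driver"), int(match.group("queue")))
            )
    return hits


def count_by_driver(lines):
    """Count watchdog timeouts per driver name."""
    return Counter(driver for _, driver, _ in find_watchdog_timeouts(lines))


# Sample lines copied from the logs in this report.
SAMPLE = [
    "Mar 15 19:54:45 radscorpion kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out",
    "Apr 1 23:37:14 download-server kernel: [ 718.011318] NETDEV WATCHDOG: eth0 (sis190): transmit queue 0 timed out",
    "Apr 22 23:49:35 kernel: [623449.988516] NETDEV WATCHDOG: eth0 (tg3): transmit queue 0 timed out",
    "Apr 22 23:49:35 kernel: [623450.248154] tg3: eth0: Link is down.",
]

if __name__ == "__main__":
    for iface, driver, queue in find_watchdog_timeouts(SAMPLE):
        print(f"{iface} driver={driver} queue={queue}")
```

To run it against a real log instead of the sample, pass an open file, e.g. `count_by_driver(open("/var/log/syslog"))`; the log path differs between distributions, so adjust it as needed.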