Most recent kernel where this bug did not occur: 2.4.something
Distribution: openSUSE 10.3
Hardware Environment: Athlon XP 2200+
Software Environment: 2.6.22.12
Problem Description: this bug is being created from a mailing list discussion.

I've got a pair of GiG-E cards that do not work correctly. Everything appears to come up just fine, but sooner or later (typically fairly quickly) the cards weird out and never really come back.

The best info I've got is this:

Nov 10 22:21:19 frank kernel: tg3.c:v3.65 (August 07, 2006)
Nov 10 22:21:19 frank kernel: ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKB] -> GSI 3 (level, low) -> IRQ 3
Nov 10 22:21:19 frank kernel: eth0: Tigon3 [partno(AC91002A1) rev 0105 PHY(5701)] (PCI:33MHz:32-bit) 10/100/1000BaseT Ethernet 00:09:5b:09:b1:69
Nov 10 22:21:19 frank kernel: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
Nov 10 22:21:19 frank kernel: eth0: dma_rwctrl[76ff000f] dma_mask[64-bit]
Nov 10 22:21:19 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:21:19 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:21:19 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:21:19 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:21:20 frank kernel: tg3: eth0: Link is up at 1000 Mbps, full duplex.
Nov 10 22:21:20 frank kernel: tg3: eth0: Flow control is on for TX and on for RX.
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:21:20 frank kernel: ACPI: PCI interrupt for device 0000:00:0b.0 disabled
Nov 10 22:21:20 frank kernel: PCI: Enabling device 0000:00:0b.0 (0100 -> 0102)
Nov 10 22:21:20 frank kernel: ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKB] -> GSI 3 (level, low) -> IRQ 3
Nov 10 22:21:20 frank kernel: eth0: Tigon3 [partno(AC91002A1) rev 0105 PHY(5701)] (PCI:33MHz:32-bit) 10/100/1000BaseT Ethernet 00:09:5b:09:b1:69
Nov 10 22:21:20 frank kernel: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0]
Nov 10 22:21:20 frank kernel: eth0: dma_rwctrl[76ff000f] dma_mask[64-bit]
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:21:20 frank kernel: tg3: eth0: Link is up at 1000 Mbps, full duplex.
Nov 10 22:21:20 frank kernel: tg3: eth0: Flow control is on for TX and on for RX.
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:21:20 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:21:20 frank kernel: tg3: eth0: Link is up at 1000 Mbps, full duplex.
Nov 10 22:21:20 frank kernel: tg3: eth0: Flow control is on for TX and on for RX.
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:24:40 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:41:48 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:48 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:48 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:48 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:48 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:41:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:43:02 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:45:52 frank kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 10 22:45:52 frank kernel: tg3: eth0: transmit timed out, resetting
Nov 10 22:45:52 frank kernel: tg3: tg3_stop_block timed out, ofs=1400 enable_bit=2
Nov 10 22:45:52 frank kernel: tg3: tg3_stop_block timed out, ofs=c00 enable_bit=2
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
Nov 10 22:45:52 frank kernel: PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
Nov 10 22:45:52 frank kernel: tg3: eth0: Link is down.
Nov 10 22:45:56 frank kernel: tg3: eth0: Link is up at 1000 Mbps, full duplex.
Nov 10 22:45:56 frank kernel: tg3: eth0: Flow control is on for TX and on for RX.
Nov 10 22:47:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:47:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:47:49 frank kernel: nfs: server 192.168.2.1 not responding, timed out
Nov 10 22:49:02 frank kernel: nfs: server 192.168.2.1 not responding, timed out
To which Jarek Poplawski responded:

On 13-11-2007 19:57, Jon Nelson wrote:
> I'm not sure if this is the right place,

Me too. Looks more like an acpi or pci problem. Did you try to experiment with something like the pci=noacpi or acpi=off boot parameters? Probably some pointer to your .config and dmesg would be useful too, so taking it to bugzilla and sending the number as a follow-up to this thread should be reasonable.

Btw, I add main kernel to cc.

Regards,
Jarek P.
and Michael Chan chimed in: It looks like the card is being reset periodically. Every time the card gets reset, you'll see those PM messages in the version of the driver you're using. Do you see NETDEV WATCHDOG message as well in the dmesg log?
(In reply to comment #2)
> and Michael Chan chimed in:
>
> It looks like the card is being reset periodically. Every time the card
> gets reset, you'll see those PM messages in the version of the driver
> you're using. Do you see NETDEV WATCHDOG message as well in the dmesg
> log?

and I responded:

Is this what you mean? I pulled this from the quoted text:

Nov 10 22:45:52 frank kernel: NETDEV WATCHDOG: eth0: transmit timed out
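For a longer log, the check Michael asks for can be done mechanically. The following is a minimal sketch, not part of the original exchange: it counts NETDEV WATCHDOG timeouts per interface and config-space restores per PCI device. The patterns are copied from the log lines quoted in this report, and treating each "offset 0" write as one restore pass is an assumption based on the five-line PM blocks seen here; other driver or kernel versions may word these messages differently.

```python
import re

# Patterns taken from the log lines quoted in this report; other driver or
# kernel versions may word these messages differently (assumption).
WATCHDOG = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")
# Assumption: each restore pass writes offsets b, 3, 2, 1, 0 once, so counting
# the "offset 0" line counts one pass per block.
PM_RESTORE = re.compile(r"PM: Writing back config space on device (\S+) at offset 0 ")


def summarize(lines):
    """Count watchdog timeouts per interface and restore passes per PCI device."""
    watchdog, restores = {}, {}
    for line in lines:
        m = WATCHDOG.search(line)
        if m:
            watchdog[m.group(1)] = watchdog.get(m.group(1), 0) + 1
        m = PM_RESTORE.search(line)
        if m:
            restores[m.group(1)] = restores.get(m.group(1), 0) + 1
    return watchdog, restores


sample = [
    "Nov 10 22:45:52 frank kernel: NETDEV WATCHDOG: eth0: transmit timed out",
    "Nov 10 22:45:52 frank kernel: tg3: eth0: transmit timed out, resetting",
    "Nov 10 22:45:52 frank kernel: PM: Writing back config space on device "
    "0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)",
    "Nov 10 22:45:52 frank kernel: PM: Writing back config space on device "
    "0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)",
]
watchdog, restores = summarize(sample)
print(watchdog, restores)
```

To run it against a saved log instead of the sample, pass an open file object (for example `summarize(open("messages"))`, where the file name is hypothetical).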
(In reply to comment #3)

This is not a new problem - these cards have done this or something like it for as long as I've had them*. They work just fine in 100 MBit mode but not in all of my machines, and in none of them at gig-e. I've tried every version of the driver since SUSE 9.1 without much luck (at least as far back as 2.6.9). I'd try a newer driver, esp. if I could make it compile on 2.6.22.12 (I prefer but do not require to stay with the stock distro kernel, modules notwithstanding).

NOTE: to avoid list noise, I can make a bug out of this on bugzilla.kernel.org and we can proceed from there if that is preferred.

[*] Actually, they worked OK in 2.4.something way-back-when but only for short durations at gig-e speeds.
and Jarek Poplawski said:

Why avoid list noise? These lists are made just for this. But, since this case needs a lot of space for your configs, maybe a lot of time, and maybe a few more people to have a look at this as well, bugzilla could be very useful. Of course, like Michael said, it would be better if you could do at least a short test with a version as new as possible.

Regards,
Jarek P.

PS: I forgot to mention: lspci -vv, cat /proc/interrupts and maybe the same for these other, working gig-e cards.
Here now I include the requested information:

tg3.c:v3.77 (May 31, 2007)
ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [LNKB] -> GSI 3 (level, low) -> IRQ 3
eth0: Tigon3 [partno(AC91002A1) rev 0105 PHY(5701)] (PCI:33MHz:32-bit) 10/100/1000Base-T Ethernet 00:09:5b:09:b1:19
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] WireSpeed[1] TSOcap[0]
eth0: dma_rwctrl[76ff000f] dma_mask[64-bit]
eth0 renamed to eth1
udev: renamed network interface eth0 to eth1

... and this is where I bring the I/F up:

PM: Writing back config space on device 0000:00:0b.0 at offset b (was 164514e4, writing 302a1385)
PM: Writing back config space on device 0000:00:0b.0 at offset 3 (was 0, writing 4008)
PM: Writing back config space on device 0000:00:0b.0 at offset 2 (was 2000000, writing 2000015)
PM: Writing back config space on device 0000:00:0b.0 at offset 1 (was 2b00000, writing 2b00106)
PM: Writing back config space on device 0000:00:0b.0 at offset 0 (was 164514e4, writing 3ea173b)
tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
the contents of /proc/interrupts:

           CPU0
  0:      11826    XT-PIC-XT    timer
  1:        756    XT-PIC-XT    i8042
  2:          0    XT-PIC-XT    cascade
  3:         53    XT-PIC-XT    ehci_hcd:usb4, eth1
  5:         24    XT-PIC-XT    ohci_hcd:usb2
  6:          3    XT-PIC-XT    floppy
  7:          0    XT-PIC-XT    parport0
  8:          2    XT-PIC-XT    rtc
  9:          1    XT-PIC-XT    acpi
 10:          0    XT-PIC-XT    SiS SI7012
 11:          0    XT-PIC-XT    ohci_hcd:usb5
 12:          2    XT-PIC-XT    ohci_hcd:usb1, ohci_hcd:usb3
 14:       2220    XT-PIC-XT    libata
 15:         86    XT-PIC-XT    libata
NMI:          0
LOC:          0
ERR:          0
MIS:          0

The output from lspci -vv

Capabilities: [60] Power Management version 2
    Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [44] AGP version 2.0
    Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA- ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
    Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>

and finally the initial portions of the boot messages:

Linux version 2.6.22.12-0.1-default (geeko@buildhost) (gcc version 4.2.1 (SUSE Linux)) #1 SMP 2007/11/06 23:05:18 UTC
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
 BIOS-e820: 000000003fff0000 - 000000003fff8000 (ACPI data)
 BIOS-e820: 000000003fff8000 - 0000000040000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 262128) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 -> 4096
  Normal 4096 -> 229376
  HighMem 229376 -> 262128
early_node_map[1] active PFN ranges
  0: 0 -> 262128
On node 0 totalpages: 262128
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 255 pages used for memmap
  HighMem zone: 32497 pages, LIFO batch:7
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP 000FA310, 0014 (r0 AMI )
ACPI: RSDT 3FFF0000, 0028 (r1 AMIINT SiS735XX 1000 MSFT 100000B)
ACPI: FACP 3FFF0030, 0074 (r1 AMIINT SiS735XX 1000 MSFT 100000B)
ACPI: DSDT 3FFF0100, 3332 (r1 SiS 735 100 MSFT 100000D)
ACPI: FACS 3FFF8000, 0040
ACPI: PM-Timer IO Port: 0x808
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
Built 1 zonelists. Total pages: 260081
Kernel command line: root=/dev/disk/by-id/ata-Maxtor_2F040J0_F14Q19HE-part3 vga=0x31a splash=silent 1
bootsplash: silent mode.
Local APIC disabled by BIOS -- you can enable it with "lapic"
mapped APIC to ffffd000 (0180f000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 1792.455 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1031300k/1048512k available (1825k kernel code, 16492k reserved, 765k data, 228k init, 131008k highmem)
virtual kernel memory layout:
    fixmap : 0xffdf4000 - 0xfffff000 (2092 kB)
    pkmap : 0xff800000 - 0xffc00000 (4096 kB)
    vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
    lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
    .init : 0xc038e000 - 0xc03c7000 ( 228 kB)
    .data : 0xc02c842b - 0xc0387964 ( 765 kB)
    .text : 0xc0100000 - 0xc02c842b (1825 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 3586.55 BogoMIPS (lpj=7173110)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383f9ff c1c3f9ff 00000000 00000000 00000000 00000000 00000000
CPU: CLK_CTL MSR was 6003d22f. Reprogramming to 2003d22f
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 0383f9ff c1c3f9ff 00000000 00000420 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to ffffe000.
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 12k freed
Unpacking initramfs... done
Freeing initrd memory: 4095k freed
ACPI: Core revision 20070126
Parsing all Control Methods:
Table [DSDT](id 0001) - 412 Objects with 41 Devices 143 Methods 18 Regions
tbxface-0587 [00] tb_load_namespace : ACPI Tables successfully acquired
ACPI: setting ELCR to 0200 (from 1c28)
evxfevnt-0091 [00] enable : Transition to ACPI mode successful
CPU0: AMD Athlon(tm) XP 2200+ stepping 01
SMP motherboard not detected.
Local APIC not detected. Using dummy APIC emulation.
Brought up 1 CPUs
Booting paravirtualized kernel on bare hardware
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfdb01, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
evgpeblk-0956 [00] ev_create_gpe_block : GPE 00 to 0F [_GPE] 2 regs on int 0x9
evgpeblk-0956 [00] ev_create_gpe_block : GPE 10 to 1F [_GPE] 2 regs on int 0x9
evgpeblk-1052 [00] ev_initialize_gpe_bloc: Found 9 Wake, Enabled 0 Runtime GPEs in this block
evgpeblk-1052 [00] ev_initialize_gpe_bloc: Found 0 Wake, Enabled 0 Runtime GPEs in this block
ACPI: EC: Look up EC in DSDT
Completing Region/Field/Buffer/Package initialization:.........................................................................
Initialized 14/18 Regions 5/5 Fields 35/35 Buffers 19/30 Packages (421 nodes)
Initializing Device/Processor/Thermal objects by executing _INI methods:.
Executed 1 _INI methods requiring 0 _STA executions (examined 44 objects)
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S4 S5)
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Enabling SiS 96x SMBus.
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs *3 4 5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 10 11 *12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: Power Resource [URP1] (off)
ACPI: Power Resource [URP2] (off)
ACPI: Power Resource [FDDP] (off)
ACPI: Power Resource [LPTP] (off)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 8 devices
ACPI: ACPI bus type pnp unregistered
PnPBIOS: Disabled by ACPI PNP
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
With a VIA Velocity instead (as requested):

           CPU0
  0:     254656    XT-PIC-XT    timer
  1:       6182    XT-PIC-XT    i8042
  2:          0    XT-PIC-XT    cascade
  3:    4760915    XT-PIC-XT    ehci_hcd:usb2, eth0
  5:      34916    XT-PIC-XT    ohci_hcd:usb3
  6:          5    XT-PIC-XT    floppy
  7:          0    XT-PIC-XT    parport0
  8:          2    XT-PIC-XT    rtc
  9:          1    XT-PIC-XT    acpi
 10:       1114    XT-PIC-XT    SiS SI7012
 11:          0    XT-PIC-XT    ohci_hcd:usb5
 12:          0    XT-PIC-XT    ohci_hcd:usb1, ohci_hcd:usb4
 14:      22363    XT-PIC-XT    libata
 15:      20588    XT-PIC-XT    libata
NMI:          0
LOC:          0
ERR:          0
MIS:          0
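In both /proc/interrupts listings the network interface sits on IRQ 3 together with ehci_hcd. The sketch below is one way to pull shared lines out of such a listing so that two machines can be compared quickly. It is an illustration only: it assumes a single CPU column and a one-word interrupt controller name, as in the XT-PIC-XT output from this report, and it says nothing about whether sharing is the cause of the problem.

```python
def shared_irqs(text):
    """Return {irq: [handlers]} for numbered IRQ rows with more than one handler."""
    shared = {}
    for line in text.splitlines():
        fields = line.split()
        # Numbered rows look like "3: 53 XT-PIC-XT ehci_hcd:usb4, eth1".
        # Header and NMI/LOC/ERR/MIS rows are skipped.
        if len(fields) < 4 or not fields[0].rstrip(":").isdigit():
            continue
        irq = int(fields[0].rstrip(":"))
        # Assumption: one CPU column and a one-word controller name, so the
        # handler list starts at the fourth field.
        handlers = [h.strip() for h in " ".join(fields[3:]).split(",")]
        if len(handlers) > 1:
            shared[irq] = handlers
    return shared


sample = """\
           CPU0
  0:      11826    XT-PIC-XT    timer
  3:         53    XT-PIC-XT    ehci_hcd:usb4, eth1
 10:          0    XT-PIC-XT    SiS SI7012
 12:          2    XT-PIC-XT    ohci_hcd:usb1, ohci_hcd:usb3
NMI:          0
"""
result = shared_irqs(sample)
print(result)
```

On a live system the same function could be fed `open("/proc/interrupts").read()`, but multi-CPU machines print one count column per CPU, so the field offset would need adjusting there.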
Reply-To: akpm@linux-foundation.org

On Thu, 15 Nov 2007 18:04:19 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9391
>
> Summary: Netgear GA320T(tg3) strange errors and non-workingness
> Product: Drivers
> Version: 2.5
> KernelVersion: 2.6.22.12 (openSUSE 10.3)
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Network
> AssignedTo: jgarzik@pobox.com
> ReportedBy: jnelson-kernel-bugzilla@jamponi.net
> ...
On 16-11-2007 03:18, Andrew Morton wrote:
> On Thu, 15 Nov 2007 18:04:19 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=9391
>>
>> Summary: Netgear GA320T(tg3) strange errors and non-workingness
>> Product: Drivers
>> Version: 2.5
>> KernelVersion: 2.6.22.12 (openSUSE 10.3)
...

I see we have a new thread for this...

Jon, maybe it's only me, but this really looks strange. I don't know this board, but probably these interrupts shouldn't look like this. It seems there is a problem with detecting your apic. I'm not sure if this is a default SUSE kernel, but maybe it would be better to try some current live-cd distro with good hardware detection. tg3 card & driver seem to have quite good opinions, especially if msi interrupts could be used.

Btw., these logs (dmesg, lspci) could be much longer... (Somebody might even think you have something to hide; it would be better to mask only personal data then.)

Cheers,
Jarek P.
Reply-To: jnelson@jamponi.net

On 11/16/07, Jarek Poplawski <jarkao2@o2.pl> wrote:
> On 16-11-2007 03:18, Andrew Morton wrote:
> > On Thu, 15 Nov 2007 18:04:19 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> >
> >> http://bugzilla.kernel.org/show_bug.cgi?id=9391
> >>
> >> Summary: Netgear GA320T(tg3) strange errors and non-workingness
> >> Product: Drivers
> >> Version: 2.5
> >> KernelVersion: 2.6.22.12 (openSUSE 10.3)
> ...
>
> I see we have a new thread for this...
>
> Jon, maybe it's only me, but this really looks strange. I don't know
> this board, but probably these interrupts shouldn't look like this.
> It seems there is a problem with detecting your apic. I'm not sure
> if this is a default SUSE kernel, but maybe it would be better to
> try some current live-cd distro with good hardware detection. tg3
> card & driver seem to have quite good opinions, especially if msi
> interrupts could be used.
>
> Btw., these logs (dmesg, lspci) could be much longer... (Somebody
> might even think you have something to hide; it would be better to
> mask only personal data then.)

The lspci is exactly as it was output. The dmesg is shortened only slightly. The kernel is the latest available for openSUSE 10.3. No MSI because this is an Athlon XP (read: 32bit, single core, regular old 33MHz, 32bit PCI).

The interrupts look *exactly* the same with or without noapic. This board doesn't have apic as far as I know. It's an ECS K7S5A I think.

Does that help?
Jon Nelson wrote:
> The lspci is exactly as it was output. The dmesg is shortened only
> slightly. The kernel is the latest available for openSUSE 10.3. No MSI
> because this is an Athlon XP (read: 32bit, single core, regular old
> 33MHz, 32bit PCI).

The lspci doesn't look correct to me either. We don't have AGP capability on these network cards. Please provide:

lspci -vvxxx -s0:0:b.0
Jon Nelson wrote, On 11/16/2007 03:08 PM:
...
> The lspci is exactly as it was output. The dmesg is shortened only
> slightly. The kernel is the latest available for openSUSE 10.3. No MSI
> because this is an Athlon XP (read: 32bit, single core, regular old
> 33MHz, 32bit PCI).
>
> The interrupts look *exactly* the same with or without noapic. This
> board doesn't have apic as far as I know. It's an ECS K7S5A I think.
>
> Does that help?

Yes. You might be right with these interrupts. But, dmesg doesn't really show enough. Try to paste this all, at least until the first WATCHDOG time out and card reset.

Thanks,
Jarek P.
Created attachment 13583 [details] dmesg
Created attachment 13584 [details] lspci -vv -xxxx
Three things:
1. if need be, I'm familiar enough with git to pull most anybody's kernel (I currently have Linus' kernel) and try that.
2. I typoed the bug - it's a GA302T.
3. I can report any numbers printed on (either of) the cards. I bought them as a pair.
I think it will help if you try the latest 2.6.23.8 kernel. The tg3 driver in that kernel is a lot newer and it will print some debug information during netdev watchdog.
I have a successfully working GA-302T for the first time since I was using 2.4! Yay! 2.6.24rc3 (g2ffbb837) appears to work just perfectly! No weird messages, and really great performance! I saw just over 145,000 KB/s as reported by iptraf for a /single/ TCP session between an AMD x86-64 dual core 3600+, NVidia MCP55 (2.6.22.12) and an Athlon XP 2200+, Netgear GA-302T (2.6.24rc3). Really fantastic! Should this get closed or what?
Thanks, Jon. If you feel that the problem was resolved then close this bug please.
Apparently later versions of the kernel and/or driver resolve this issue. Really, that is fantastic!