Bug 12432 - sis900 transmit timeouts
Status: REJECTED INVALID
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assigned To: Arnaldo Carvalho de Melo
Reported: 2009-01-11 11:00 UTC by gionnico
Modified: 2009-03-05 05:23 UTC
Kernel Version: 2.6.28
Tree: Mainline
Regression: Yes


Attachments
2.6.28 vanilla kernel dmesg after trying to assign an IP through dhcpcd (15.75 KB, text/plain)
2009-01-11 11:04 UTC, gionnico
diff .config.works .config.noworks (1.42 KB, text/plain)
2009-01-19 10:30 UTC, gionnico

Description gionnico 2009-01-11 11:00:57 UTC
Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.28
Distribution: gentoo
Hardware Environment: i686, sis900 ethernet
Problem Description: I can't reach the machine anymore since I installed 2.6.28 vanilla. It worked in 2.6.27 with the same configuration.

Steps to reproduce:
You need a sis900 Ethernet card. Even with an IP address set, the machine no longer responds on the network.
I tried assigning a dynamic IP with dhcpcd eth0 and I'll post the dmesg.
Comment 1 gionnico 2009-01-11 11:04:07 UTC
Created attachment 19747 [details]
2.6.28 vanilla kernel dmesg after trying to assign an IP through dhcpcd

The system has got a "Silicon Integrated Systems [SiS] SiS900 PCI Fast Ethernet (rev 90)" ethernet controller.

As said, the IP had already been manually set, but I can't reach the machine.

ifconfig (same as ifconfig -a)

eth0      Link encap:Ethernet  HWaddr OK
          inet addr:192.168.OK  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:22 Base address:0xa000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
Comment 2 Daniel Drake 2009-01-11 11:23:02 UTC
Original downstream report:
https://bugs.gentoo.org/show_bug.cgi?id=253646

You can see the transmit timeout errors in the dmesg attachment above.
Comment 3 Anonymous Emailer 2009-01-11 11:40:31 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 11 Jan 2009 11:00:58 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12432
> 
>            Summary: [2.6.28 regression] sis900 fast ethernet breakage
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.28
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: gionnico@email.it
> 
> 
> Latest working kernel version: 2.6.27
> Earliest failing kernel version: 2.6.28
> Distribution: gentoo
> Hardware Environment: i686, sis900 ethernet
> Problem Description: I can't reach the machine anymore since I installed
> 2.6.28 vanilla. It worked in 2.6.27 with the same configuration.
> 
> Steps to reproduce:
> You need a sis900 Ethernet card. Even with an IP address set, the machine no
> longer responds on the network.
> I tried assigning a dynamic IP with dhcpcd eth0 and I'll post the dmesg.

It's a regression and I'm not seeing any likely-looking changes to that
driver in 2.6.28.

Comment 4 Daniel Drake 2009-01-19 05:07:44 UTC
gionnico, given the lack of response here (the bug is not obvious), it would be helpful if you could perform a bisection. This is quite time-consuming (you will have to test about 14 kernels), but it will tell us the exact commit that introduced the bug. If you have the time and patience, the process is described here:

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

Use v2.6.27 as good and v2.6.28 as bad.
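
For a rough sense of where "about 14 kernels" comes from: each bisection round halves the range of suspect commits, so the number of builds grows with the base-2 logarithm of the commit count. The sketch below is only an illustration. The function name and the commit count of 10,000 are assumptions, not figures from this report, and real git history is not a straight line, so git's own estimate may differ by a step or two.

```python
def bisect_steps(commit_count):
    """Worst-case number of kernels to build and test when the range
    of candidate commits is halved on every round.

    Equivalent to ceil(log2(commit_count)), computed with integer
    arithmetic to avoid floating-point rounding.
    """
    if commit_count < 1:
        raise ValueError("commit_count must be at least 1")
    return (commit_count - 1).bit_length()


# Hypothetical number of commits between the good and bad tags.
assumed_commits = 10_000
print(bisect_steps(assumed_commits))
```

Any range between 8,193 and 16,384 commits gives the same answer of 14 rounds, which is why a rough estimate is good enough for planning the work.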
Comment 5 gionnico 2009-01-19 10:29:16 UTC
Uff, you could have told me to use the 2.6.28 config.

I used the 2.6.27 config (accepting the defaults for all new options) and the problem didn't appear.

I'll attach a diff of the two config files. If the changed options shouldn't break my device (meaning there is a bug), I can run git bisect again with the failing config.
Comment 6 gionnico 2009-01-19 10:30:54 UTC
Created attachment 19889 [details]
diff .config.works .config.noworks
Comment 7 Daniel Drake 2009-01-19 10:44:27 UTC
Please only ever use unified diffs, and please attach both config files in full.
Actually, assuming I am reading it right, you have disabled CONFIG_PCI_QUIRKS in your 2.6.28 config. Why? I don't think you want to do this.
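
As a minimal sketch of what a unified diff between two configs looks like, the snippet below uses Python's standard difflib module. The two-line config excerpts are hypothetical stand-ins for the real files (only the file names and the CONFIG_PCI_QUIRKS option come from this report), and config_diff is an illustrative helper, not an existing tool.

```python
import difflib


def config_diff(good, bad):
    """Return the unified diff between two kernel .config texts
    as a list of lines without trailing newlines."""
    return list(difflib.unified_diff(
        good.splitlines(),
        bad.splitlines(),
        fromfile=".config.works",
        tofile=".config.noworks",
        lineterm="",
    ))


# Hypothetical excerpts; real .config files are far longer.
works = "CONFIG_PCI=y\nCONFIG_PCI_QUIRKS=y\n"
noworks = "CONFIG_PCI=y\n# CONFIG_PCI_QUIRKS is not set\n"

for line in config_diff(works, noworks):
    print(line)
```

In unified format each changed option shows up as a "-" line from the working config followed by a "+" line from the failing one, with unchanged neighbours kept as context, which makes a disabled option easy to spot.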
Comment 8 Thomas Mudrunka 2009-03-04 16:14:52 UTC
I've got a similar problem. I am getting a lot of these messages:
harvie@harvie-ntb ~ $ dmesg | grep timeout
eth0: Transmit timeout, status 00000004 00000000 
eth0: Transmit timeout, status 00000004 00000000 
...
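
To put a number on "a lot", the messages can be tallied per interface and status pair. This is a rough sketch: the regular expression is modelled only on the two lines quoted above (plus an optional bracketed timestamp prefix, which is an assumption), count_timeouts is an illustrative helper, and no attempt is made to decode the status words.

```python
import re
from collections import Counter

# Matches e.g. "eth0: Transmit timeout, status 00000004 00000000",
# optionally preceded by a "[  123.456]" style timestamp (assumed).
TIMEOUT_RE = re.compile(
    r"^\s*(?:\[[^\]]*\]\s*)?(\w+): Transmit timeout, "
    r"status ([0-9a-fA-F]{8}) ([0-9a-fA-F]{8})"
)


def count_timeouts(dmesg_text):
    """Count transmit-timeout lines, keyed by
    (interface, first status word, second status word)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


sample = (
    "eth0: Transmit timeout, status 00000004 00000000 \n"
    "eth0: Transmit timeout, status 00000004 00000000 \n"
    "eth0: some unrelated message\n"
)
for key, n in count_timeouts(sample).items():
    print(key, n)
```

Feeding it the full dmesg text instead of the sample would show whether the status words stay constant across all timeouts.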

Pings to the nearest router also have 100x longer latency over the sis900 at 100Mb Tx Full-Duplex (or any slower mode) than over an 802.11g link to the same network.

with this NIC:
harvie@harvie-ntb ~ $ lspci | grep 900
00:04.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 PCI Fast Ethernet (rev 91)

and with this kernel:
harvie@harvie-ntb ~ $ uname -a
Linux harvie-ntb 2.6.28-ARCH #1 SMP PREEMPT Sun Feb 22 11:03:50 UTC 2009 i686...

using the noapic option

I found more people with this problem:
http://www.linuxquestions.org/questions/linux-networking-3/netdev-watchdog-eth0-transmit-timed-out-46634/

Somebody has also written a patch (I'm not sure if it works):
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/4410.html

diff -puN drivers/net/sis900.c~sis900-tx_timeout-fix drivers/net/sis900.c
--- 25/drivers/net/sis900.c~sis900-tx_timeout-fix Mon Oct 20 16:10:04 2003
+++ 25-akpm/drivers/net/sis900.c Mon Oct 20 16:10:11 2003
@@ -1438,7 +1438,7 @@ static void sis900_tx_timeout(struct net
                         pci_unmap_single(sis_priv->pci_dev,
                                 sis_priv->tx_ring[i].bufptr, skb->len,
                                 PCI_DMA_TODEVICE);
-                        dev_kfree_skb(skb);
+                        dev_kfree_skb_irq(skb);
                         sis_priv->tx_skbuff[i] = 0;
                         sis_priv->tx_ring[i].cmdsts = 0;
                         sis_priv->tx_ring[i].bufptr = 0;

_

Hope this will help you.
Comment 9 gionnico 2009-03-05 04:59:17 UTC
(In reply to comment #7)
> Please only ever use unified diffs, and please attach both config files in
> full.
> Actually, assuming I am reading it right, you have disabled CONFIG_PCI_QUIRKS
> in your 2.6.28 config. Why? I don't think you want to do this.
> 

In my case I just needed to enable CONFIG_PCI_QUIRKS.
Comment 10 Daniel Drake 2009-03-05 05:23:22 UTC
OK, closing then. Thomas, feel free to file another bug report for your issue.
