Bug 30952

Summary: asix driver broken since 2.6.35
Product: Drivers Reporter: Bogdan Lipski (lipski.bogdan)
Component: Network Assignee: drivers_network (drivers_network)
Status: RESOLVED DUPLICATE    
Severity: normal CC: 2bitoperations, alan, antpeter, lvml, myxal.mxl
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.4.4 Subsystem:
Regression: Yes Bisected commit-id:

Description Bogdan Lipski 2011-03-12 14:02:34 UTC
Hey,

there are a few reports logged here already, but I am not sure if they are 100% the same; anyway, you can close this one as a duplicate if so.
https://bugzilla.kernel.org/show_bug.cgi?id=16831
https://bugzilla.kernel.org/show_bug.cgi?id=29082

config:
- gentoo x64, fully updated
- tried kernels 2.6.36/2.6.37/2.6.37.3
- network card d-link dub-e100:
asix 5-2:1.0: eth2: register 'asix' at usb-0000:00:1d.7-2, ASIX AX88772 USB 2.0 Ethernet

Bus 005 Device 002: ID 2001:3c05 D-Link Corp. [hex] DUB-E100 Fast Ethernet [asix]
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x2001 D-Link Corp. [hex]
  idProduct          0x3c05 DUB-E100 Fast Ethernet [asix]
  bcdDevice            0.01
  iManufacturer           1 D-Link Corporation
  iProduct                2 DUB-E100
  iSerial                 3 000001
  bNumConfigurations      1

the card is connected directly to a scientific atlanta epc2203 cable/voip modem for internet access. the reason is that the isp locks the connection down to a mac address, and i really don't like calling them in case the motherboard changes etc...


the problem:
1. under 2.6.34 (tried 2.6.34-gentoo) everything seems to be well.
2. patch from 2.6.35/signed off by Jussi Kivilinna "[PATCH] asix: check packet size against mtu+ETH_HLEN instead of ETH_FRAME_LEN"
=> since this patch I am getting a lot of "asix_rx_fixup() Bad RX Length" messages and packets get dropped. Under ifconfig I see the RX error counter increase for every message logged in dmesg; the number of RX errors can easily shoot up to a couple hundred thousand in just a few hours.
3. I tried to revert this patch in 2.6.37.3, as it seems to be just a one-liner, but it looks like other patches applied to asix.c since 2.6.34 also have some influence. When this single patch is reverted, the flood of error messages in dmesg stops and no RX errors are logged under ifconfig. However, it seems that after some time the card starts losing packets again - I will update this report later if I have any further findings.
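
For readers unfamiliar with the length check named in that patch title, here is a minimal userspace sketch of the two limits being compared. ETH_HLEN (14) and ETH_FRAME_LEN (1514) are the values from <linux/if_ether.h>; the helper names rx_len_ok_fixed and rx_len_ok_mtu are hypothetical and are not taken from asix.c, and the real driver code may apply further conditions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values from <linux/if_ether.h>, repeated so the sketch builds in userspace. */
#define ETH_HLEN      14    /* destination MAC + source MAC + EtherType */
#define ETH_FRAME_LEN 1514  /* largest frame without FCS at an MTU of 1500 */

/* Hypothetical helper: accept a frame if it fits a fixed limit
 * (the "ETH_FRAME_LEN" style of check named in the patch title). */
static bool rx_len_ok_fixed(size_t size)
{
    return size <= ETH_FRAME_LEN;
}

/* Hypothetical helper: accept a frame if it fits the configured MTU
 * plus the Ethernet header (the "mtu+ETH_HLEN" style of check). */
static bool rx_len_ok_mtu(size_t size, size_t mtu)
{
    return size <= mtu + ETH_HLEN;
}
```

With an MTU of 1500 both helpers use the same limit of 1514 bytes; they only disagree when the interface MTU differs from 1500.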
Comment 1 Bogdan Lipski 2011-03-12 16:08:06 UTC
workaround suggested in:
https://bugzilla.kernel.org/show_bug.cgi?id=16831
seems to work well for me as well.
Comment 2 Andrew Morton 2011-03-14 21:43:32 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

The reporter says "workaround suggested in:
https://bugzilla.kernel.org/show_bug.cgi?id=16831 seems to work well
for me as well".  That workaround appears to be "dump the kernel driver
and use the driver off the vendor's website".


On Sat, 12 Mar 2011 14:02:37 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=30952
> 
>            Summary: asix driver broken since 2.6.35
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.35+
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org
>         ReportedBy: lipski.bogdan@gmail.com
>         Regression: Yes
> 
Comment 3 David S. Miller 2011-03-14 21:45:38 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Mon, 14 Mar 2011 14:43:09 -0700

> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> The reporter says "workaround suggested in:
> https://bugzilla.kernel.org/show_bug.cgi?id=16831 seems to work well
> for me as well".  That workaround appears to be "dump the kernel driver
> and use the driver off the vendor's website".

This is an issue you brought up with me several months ago.

I started looking into it, but because the vendor driver development
happens in a completely different universe the divergence noise is
substantial and it's a huge effort to consolidate these two drivers.

I think that until the vendor starts to care, nothing is going to
happen to resolve these ASIX driver bugs.
Comment 4 Andrew Morton 2011-03-14 21:50:45 UTC
On Mon, 14 Mar 2011 14:45:53 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Andrew Morton <akpm@linux-foundation.org>
> Date: Mon, 14 Mar 2011 14:43:09 -0700
> 
> > 
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> > 
> > The reporter says "workaround suggested in:
> > https://bugzilla.kernel.org/show_bug.cgi?id=16831 seems to work well
> > for me as well".  That workaround appears to be "dump the kernel driver
> > and use the driver off the vendor's website".
> 
> This is an issue you've brought up to me several months ago.
> 
> I started looking into it, but because the vendor driver development
> happens in a completely different universe the divergence noise is
> substantial and it's a huge effort to consolidate these two drivers.
> 
> I think that until the vendor starts to care, nothing is going to
> happen to resolve these ASIX driver bugs.

Yup.  I suppose an alternative approach might be to feed the current
vendor driver through the drivers/staging process (preferably with
their assistance!) then run with two alternative drivers for a
while and eventually remove the old one.

But that's without having looked at the vendor code.  Is it a Big Mess?
Comment 5 David S. Miller 2011-03-14 21:57:38 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Mon, 14 Mar 2011 14:49:54 -0700

> But that's without having looked at the vendor code.  Is it a Big Mess?

The vendor took the upstream driver at a point a year or two in the
past, then did whatever they wanted with it.

They made no effort to "merge" in changes made to the upstream driver
during this time, so there is serious divergence especially in the PHY
handling which is probably where all the problems are in the upstream
driver.

Even the table of device IDs for probing is ordered and arranged
differently.
Comment 6 Lutz Vieweg 2012-07-01 18:59:13 UTC
Today I made some minor changes to the latest manufacturer's driver (available from http://www.asix.com.tw/FrootAttach/driver/AX88772B_772A_760_772_178_LINUX_Driver_v4.2.0_Source.zip ) so that it compiles against linux-3.4.x. It works well for me now, but I am by no means an expert in network driver programming, so my patch may break things I do not know about. You can find it here: http://www.linlap.com/wiki/asus+ux32vd#asix_usb-to-ethernet_adapter

(BTW: I think this bug is a duplicate of https://bugzilla.kernel.org/show_bug.cgi?id=29082 ).
Comment 7 Lutz Vieweg 2012-07-01 19:04:50 UTC
I noticed that the part of the manufacturer's driver that deals with the incoming packet length is quite different between the AX88772 and AX88772b related functions, while drivers/net/usb/asix.c treats the AX88772 / AX88772a / AX88772b chips all the same.

The bug definitely happens with the AX88772b - I cannot tell about the older chip versions.
Comment 8 Bogdan Lipski 2012-07-10 13:07:01 UTC
Hi Lutz/All,
Thanks very much. I see the patch had some rejects against 3.4.4, but they were easy to fix by hand, so I was able to apply it. I will try your patch in a moment and let you know if I see any problems.

Meanwhile, for the stock asix driver in 3.4.4: I ran it for a few days and didn't see the "asix_rx_fixup() Bad RX Length" errors or packet loss anymore, at least with my DUB-E100, but after 2 days of running it I got this:
Jul  9 10:37:08 gizmo kernel: : asix 2-2:1.0: eth2: Failed to enable software MII access
Jul  9 10:37:18 gizmo kernel: : asix 2-2:1.0: eth2: Failed to enable hardware MII access
and my link (and perhaps the whole system) died. It was brutally rebooted by homies, so I don't have further details; anyway, that is enough reason to still move to the manufacturer's driver plus patch.

Rgds,
B.
Comment 9 Petr 2012-11-07 18:05:55 UTC
I am running Ubuntu:

Linux  3.6.5-030605-generic #201211011211 SMP Thu Nov 1 16:20:33 UTC 2012 i686 i686 i686 GNU/Linux

and I see this bug as well.

syslog:

Nov  7 12:14:10 eee-server kernel: [341873.437662] asix 1-2:1.0: eth1: asix_rx_fixup() Bad RX Length 1514
Nov  7 12:14:10 eee-server kernel: [341873.437662] asix 1-2:1.0: eth1: asix_rx_fixup() Bad Header Length
Nov  7 12:14:10 eee-server kernel: [341873.441254] asix 1-2:1.0: eth1: asix_rx_fixup() Bad RX Length 1514
Nov  7 12:14:10 eee-server kernel: [341873.442730] asix 1-2:1.0: eth1: asix_rx_fixup() Bad Header Length
Nov  7 12:14:19 eee-server kernel: [341882.154238] asix 1-2:1.0: eth1: asix_rx_fixup() Bad RX Length 622
Nov  7 12:14:19 eee-server kernel: [341882.155781] asix 1-2:1.0: eth1: asix_rx_fixup() Bad Header Length
Nov  7 12:14:19 eee-server kernel: [341882.867884] asix 1-2:1.0: eth1: asix_rx_fixup() Bad RX Length 1514
Nov  7 12:14:19 eee-server kernel: [341882.869181] asix 1-2:1.0: eth1: asix_rx_fixup() Bad Header Length
Nov  7 12:14:20 eee-server kernel: [341883.318103] asix 1-2:1.0: eth1: asix_rx_fixup() Bad RX Length 688
Nov  7 12:14:20 eee-server kernel: [341883.319390] asix 1-2:1.0: eth1: asix_rx_fixup() Bad Header Length
Nov  7 12:14:20 eee-server kernel: [341883.534605] asix 1-2:1.0: eth1: asix_rx_fixup() Bad RX Length 1514
Comment 10 Alan 2013-12-23 11:43:19 UTC

*** This bug has been marked as a duplicate of bug 29082 ***