Bug 13830 - 8139too - packets dropped/lost by ethernet stack
Status: NEW
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assigned To: drivers_network@kernel-bugs.osdl.org
URL: https://bugs.launchpad.net/linux/+bug...
Depends on:
Blocks: 56331
Reported: 2009-07-26 10:32 UTC by Neil Wilson
Modified: 2016-03-19 17:23 UTC (History)

See Also:
Kernel Version: 3.16.0
Tree: Mainline
Regression: No


Attachments
Lenovo 3000 N100 with ACPI on (51.02 KB, text/plain)
2009-09-22 18:07 UTC, Neil Wilson
Lenovo 3000 N100 with ACPI off (40.72 KB, text/plain)
2009-09-22 18:09 UTC, Neil Wilson
Lenovo 3000 N100 with ACPI = HT (43.18 KB, text/plain)
2009-09-22 18:10 UTC, Neil Wilson

Description Neil Wilson 2009-07-26 10:32:51 UTC
I'm losing receive packets with the 8139too REALTEK driver when I place the driver under load.

Details:

distribution: Ubuntu Jaunty 9.04, Ubuntu Karmic Alpha-3.

eth0      Link encap:Ethernet  HWaddr 00:1b:38:08:7f:d3  
          inet addr:192.168.2.3  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::21b:38ff:fe08:7fd3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9164 errors:94 dropped:94 overruns:94 frame:0
          TX packets:7497 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:12058463 (12.0 MB)  TX bytes:792762 (792.7 KB)
          Interrupt:21 Base address:0x2000 

Dmesg details

[    4.357046] 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[    4.357092] 8139cp 0000:05:01.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip, use 8139too
[    4.359452] 8139too Fast Ethernet driver 0.9.28
[    4.359513] 8139too 0000:05:01.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
[    4.359527] 8139too 0000:05:01.0: setting latency timer to 64
[    4.360886] eth0: RealTek RTL8139 at 0x2000, 00:1b:38:08:7f:d3, IRQ 21
[    4.360889] eth0:  Identified 8139 chip type 'RTL-8100B/8139D'


The problem appears to go away if I enable just the debug message at the end of the interrupt handler (rtl8139_interrupt), which suggests that either something is coming into the driver faster than it can handle or it is missing a state somewhere.

(I get much better throughput with the debug message on than without it).

Also none of the error counters appear to be incremented by the driver itself - suggesting again that the higher level is expecting something the driver isn't providing.
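
To make the lockstep behaviour of the three counters easy to check, here is a minimal Python sketch (not part of the original report) that parses the RX line from the ifconfig output above. The function name `parse_rx_counters` is hypothetical, and the parser assumes the old-style `name:value` ifconfig format shown in this report.

```python
import re

# RX line copied from the ifconfig output in this report.
RX_LINE = "RX packets:9164 errors:94 dropped:94 overruns:94 frame:0"


def parse_rx_counters(line):
    """Return the name:value pairs of an old-style ifconfig RX/TX line as ints."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


counters = parse_rx_counters(RX_LINE)

# The symptom described here: errors, dropped and overruns all carry the same value.
in_lockstep = counters["errors"] == counters["dropped"] == counters["overruns"]
print(counters, "lockstep:", in_lockstep)
```

Running the same check on successive samples taken under load would show whether the three counters keep rising together.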

Bug 10682 may be related (http://bugzilla.kernel.org/show_bug.cgi?id=10682) but isn't exactly the same.

More details at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/401891
Comment 1 Andrew Morton 2009-07-28 22:27:10 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

Comment 2 Neil Wilson 2009-09-19 06:04:59 UTC
Hi,

This bug is still there in 2.6.31 on a 64 bit platform, and I'd like
to see if it is fixable.

Can anybody give me a clue as to what is likely to cause the parallel
increase of the three error counters? I've not seen that anywhere
else.

If somebody who understands how things are wired up in this area can
give me a few pointers and a bit of mentoring I can have a go at
working out what is wrong here.

Rgs

Neil

Comment 3 Neil Wilson 2009-09-19 08:11:16 UTC
Booting with acpi=off or acpi=ht makes the problem go away. None of the other standard acpi knobs appear to have any effect on the issue.

Looks like an ACPI issue rather than a driver problem at this stage.
Comment 4 Zhang Rui 2009-09-21 01:30:04 UTC
Please attach the full dmesg output with ACPI on and with ACPI off.
Comment 5 Neil Wilson 2009-09-22 18:07:51 UTC
Created attachment 23140 [details]
Lenovo 3000 N100 with ACPI on
Comment 6 Neil Wilson 2009-09-22 18:09:00 UTC
Created attachment 23141 [details]
Lenovo 3000 N100 with ACPI off
Comment 7 Neil Wilson 2009-09-22 18:10:17 UTC
Created attachment 23142 [details]
Lenovo 3000 N100 with ACPI = HT
Comment 8 ykzhao 2009-09-28 10:08:28 UTC
Hi,
   Could you please try each of the following boot options and see whether the issue still exists?
   a. processor.max_cstate=1
   b. idle=poll
   c. nolapic_timer

Thanks.
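
When testing these options one at a time, it helps to confirm which ones the running kernel was actually booted with. The following is a minimal Python sketch, assuming the kernel command line is available as a string (on Linux it can be read from /proc/cmdline). The function name and the sample command line are hypothetical and are not taken from this report.

```python
# The three options suggested in the comment above.
SUGGESTED_OPTIONS = ("processor.max_cstate=1", "idle=poll", "nolapic_timer")


def boot_options_present(cmdline, wanted=SUGGESTED_OPTIONS):
    """Map each wanted option to True/False depending on whether it is on the command line."""
    tokens = cmdline.split()
    return {option: option in tokens for option in wanted}


# Hypothetical command line for illustration only; on a real system use
# open("/proc/cmdline").read() instead.
sample_cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet nolapic_timer"

result = boot_options_present(sample_cmdline)
for option, present in result.items():
    print(option, "present" if present else "absent")
```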
Comment 9 Neil Wilson 2009-09-28 19:41:54 UTC
No network errors with any of those, and the network throughput increased as I went down the list. 'nolapic_timer' was the best.
Comment 10 ykzhao 2009-09-29 05:51:13 UTC
Hi, Neil
    Thanks for the quick response.
    From the tests it seems that this issue is related to deep CPU C-states. When the box is booted with "acpi=ht" or "acpi=off", ACPI is disabled and C-states won't be used. In that case the ethernet works well.
    Likewise, when the system is booted with "nolapic_timer" or "processor.max_cstate=1", the ethernet also works well.
    So please add the boot option "nolapic_timer" on your box.

Another issue is that the latency of exiting from a deep C-state is too high. In that case some ethernet packets may be lost. So it would be better for the 8139 ethernet driver to update its pm_qos requirement.

This bug will be re-assigned to the ethernet driver.

Thanks.
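
The comment above asks for a pm_qos requirement inside the driver. As a userspace analogue only (not the driver change being proposed), the kernel's PM QoS interface also accepts a latency bound from a process through /dev/cpu_dma_latency: the bound is written as a signed 32-bit value in microseconds and stays in force while the file is held open. The Python sketch below assumes root privileges; the function names and the 20 microsecond value are hypothetical choices for illustration.

```python
import os
import struct


def encode_latency(max_latency_us):
    """Pack the latency bound as the signed 32-bit integer the device node expects."""
    return struct.pack("i", max_latency_us)


def request_cpu_dma_latency(max_latency_us, path="/dev/cpu_dma_latency"):
    """Open the PM QoS device node and register a latency bound.

    The request lasts only while the returned file descriptor stays open;
    closing it (or exiting the process) drops the request.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd


# Usage (as root), with an arbitrary 20 microsecond bound:
#   fd = request_cpu_dma_latency(20)
#   ... run the network transfer and watch the RX error counters ...
#   os.close(fd)
```

If the RX errors disappear while such a request is held, that would be consistent with the C-state exit latency explanation given above.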
Comment 11 Neil Wilson 2012-03-31 16:13:15 UTC
This fault still remains in the ethernet driver on version

Linux ubuntu 3.2.0-20-generic #33-Ubuntu SMP Tue Mar 27 16:42:26 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Comment 12 giorge943 2014-09-08 18:35:24 UTC
Still present in 3.13.0-32-generic
Comment 13 vasrg 2014-10-29 08:29:30 UTC
Still present in 3.16.0-23-generic (Lubuntu 14.10 (Utopic Unicorn) x86)
