Bug 198931 - Network connection on r8152 stops with "Tx status -71"
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_network@kernel-bugs.osdl.org
Reported: 2018-02-25 13:03 UTC by Jean-Louis Dupond
Modified: 2020-06-16 15:23 UTC
CC List: 12 users

Kernel Version: 4.16-rc2 (drm-tip)
Tree: Mainline
Regression: No


Attachments
dmesg of Linux 4.17.0 (94.57 KB, text/plain)
2018-10-19 13:24 UTC, RussianNeuroMancer
dmesg of Linux 4.19rc8 (110.52 KB, text/plain)
2018-10-19 13:25 UTC, RussianNeuroMancer
dmesg of Linux 5.4.0 (174.20 KB, text/plain)
2020-03-03 14:13 UTC, RussianNeuroMancer
lsusb output mjanssens wd15 xps9360 (3.98 KB, text/plain)
2020-05-05 14:27 UTC, Michiel Janssens

Description Jean-Louis Dupond 2018-02-25 13:03:56 UTC
Hi All!

I have the following setup:
Precision 5520
Dell WD15 Docking

The WD15 dock has a network port driven by r8152.

Once in a while (about once a week), the connection dies and the kernel prints the following message:
kernel: [ 6164.073282] r8152 4-1.2:1.0 enxa44cc890f4c8: Tx status -71

Simply replugging the dock, or disabling and re-enabling the network interface, fixes the problem.

The question is: why does this happen? :)

Feel free to ask for additional information!
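
For anyone decoding the log line above: the number printed after "Tx status" (or "Rx status") appears to be the negative errno the USB core reported for the transfer. On Linux, 71 is EPROTO and 2 is ENOENT. Below is a minimal sketch of a decoder. The table is hard-coded with the Linux values (errno numbers differ between operating systems), the one-line hints are a loose paraphrase of the kernel's USB error code documentation, and the helper name decode_status is made up for illustration.

```python
import re

# Linux errno values seen in this report, hard-coded so the sketch does not
# depend on the platform it runs on. The hints are paraphrased, not official.
LINUX_ERRNO = {
    2: ("ENOENT", "URB was unlinked (cancelled) before it completed"),
    71: ("EPROTO", "low-level USB protocol error, e.g. no response from the device"),
}

STATUS_RE = re.compile(r"\b(Tx|Rx) status (-?\d+)")


def decode_status(line):
    """Return (direction, errno_name, hint) for an r8152 status line, or None."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    direction, code = m.group(1), abs(int(m.group(2)))
    name, hint = LINUX_ERRNO.get(code, ("unknown", "not in this small table"))
    return direction, name, hint


print(decode_status("r8152 4-1.2:1.0 enxa44cc890f4c8: Tx status -71"))
```

This only names the error. It does not say why the bus reported it, which is the open question in this report.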
Comment 1 RussianNeuroMancer 2018-10-19 13:23:48 UTC
Seems like I have the same issue on Dell Latitude 7285 and HP EliteBook Folio G1 with Belkin USB-C Express Dock 3.1 HD F4U093:

[ 1090.235874] pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
[ 1090.235879] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 1090.235886] pcieport 0000:00:1c.0:   device [8086:9d10] error status/mask=00003000/00002000
[ 1090.235889] pcieport 0000:00:1c.0:    [12] Timeout               
[ 1589.760804] r8152 4-1.1.2:1.0 enx58ef68a8892b: Tx status -2
[ 1594.048998] ------------[ cut here ]------------
[ 1594.049003] NETDEV WATCHDOG: enx58ef68a8892b (r8152): transmit queue 0 timed out
[ 1594.049040] WARNING: CPU: 0 PID: 9 at net/sched/sch_generic.c:461 dev_watchdog+0x221/0x230
[ 1594.049042] Modules linked in: [...]
[ 1594.049294] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.19.0-041900rc8-generic #201810150631
[ 1594.049296] Hardware name: Dell Inc. Latitude 7285/0VVWNX, BIOS 1.2.0 07/09/2018
[ 1594.049303] RIP: 0010:dev_watchdog+0x221/0x230
[ 1594.049307] Code: 00 49 63 4e e0 eb 92 4c 89 ef c6 05 26 ff f5 00 01 e8 c3 b6 fc ff 89 d9 4c 89 ee 48 c7 c7 08 2d 7b 82 48 89 c2 e8 61 26 7b ff <0f> 0b eb c0 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48
[ 1594.049310] RSP: 0018:ffffc2438192bd70 EFLAGS: 00010282
[ 1594.049314] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[ 1594.049317] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffffa0341e216420
[ 1594.049320] RBP: ffffc2438192bda0 R08: 0000000000000001 R09: 0000000000000511
[ 1594.049322] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000001
[ 1594.049325] R13: ffffa033fac37000 R14: ffffa033fac374c0 R15: ffffa034166cf680
[ 1594.049329] FS:  0000000000000000(0000) GS:ffffa0341e200000(0000) knlGS:0000000000000000
[ 1594.049332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1594.049335] CR2: 00007f26eec71020 CR3: 000000038a20a005 CR4: 00000000003606f0
[ 1594.049340] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1594.049342] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1594.049344] Call Trace:
[ 1594.049356]  ? pfifo_fast_change_tx_queue_len+0x2e0/0x2e0
[ 1594.049363]  call_timer_fn+0x30/0x130
[ 1594.049371]  run_timer_softirq+0x3ea/0x420
[ 1594.049376]  ? __switch_to_asm+0x34/0x70
[ 1594.049381]  ? __switch_to+0xad/0x500
[ 1594.049385]  ? __switch_to_asm+0x40/0x70
[ 1594.049388]  ? __switch_to_asm+0x34/0x70
[ 1594.049392]  ? __switch_to_asm+0x40/0x70
[ 1594.049397]  __do_softirq+0xdc/0x2b5
[ 1594.049403]  run_ksoftirqd+0x2b/0x40
[ 1594.049410]  smpboot_thread_fn+0xd0/0x170
[ 1594.049416]  kthread+0x120/0x140
[ 1594.049421]  ? sort_range+0x30/0x30
[ 1594.049426]  ? kthread_bind+0x40/0x40
[ 1594.049431]  ret_from_fork+0x35/0x40
[ 1594.049435] ---[ end trace 3fcb83dc58402212 ]---
[ 1594.049468] r8152 4-1.1.2:1.0 enx58ef68a8892b: Tx timeout
[ 1599.172616] r8152 4-1.1.2:1.0 enx58ef68a8892b: Tx timeout
[ 1604.288619] r8152 4-1.1.2:1.0 enx58ef68a8892b: Tx timeout
[ 1610.176579] r8152 4-1.1.2:1.0 enx58ef68a8892b: Tx timeout

Full logs:
Comment 2 RussianNeuroMancer 2018-10-19 13:24:37 UTC
Created attachment 279099 [details]
dmesg of Linux 4.17.0
Comment 3 RussianNeuroMancer 2018-10-19 13:25:08 UTC
Created attachment 279101 [details]
dmesg of Linux 4.19rc8
Comment 4 RussianNeuroMancer 2018-11-28 11:43:09 UTC
Jean-Louis, can you please verify whether the issue is still reproducible for you on Linux 4.20rc4? For me, at least with one dock (Belkin USB-C Express Dock 3.1 HD F4U093) and one device (HP Elite x2 1013 G3), this issue is no longer reproducible. I will verify other laptops with Linux 4.20 later.
Comment 5 Jean-Louis Dupond 2018-12-04 12:14:35 UTC
I haven't seen this in the last few months. Running Ubuntu 18.10 with 4.18.0-11-generic.
Comment 6 Konstantin Sobolev 2019-12-04 23:22:05 UTC
I have a very similar setup: Dell Precision 7540 with a WD19DC dock that has an RTL8153 adapter. It crashes periodically with similar symptoms; my current kernel is 5.4.1.

[76658.437411] ------------[ cut here ]------------
[76658.437412] NETDEV WATCHDOG: enp57s0u2u4 (r8152): transmit queue 0 timed out
[76658.437421] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x21f/0x230
[76658.437421] Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi r8152 mii tun md4 cifs dm_zero fuse raid10 raid1 raid0 dm_raid raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq dm_crypt dm_mirror dm_region_hash dm_log dm_mod dax ohci_pci ohci_hcd uhci_hcd ehci_pci ehci_hcd mousedev hid_multitouch dell_rbtn input_leds dell_laptop dell_wmi dell_smbios i2c_designware_platform atkbd rtsx_pci_sdmmc mmc_core mei_hdcp i2c_designware_core dell_wmi_descriptor intel_wmi_thunderbolt wmi_bmof intel_rapl_msr libps2 dcdbas dell_smm_hwmon btusb btrtl btbcm uvcvideo x86_pkg_temp_thermal videobuf2_vmalloc btintel intel_powerclamp videobuf2_memops coretemp videobuf2_v4l2 ucsi_acpi bluetooth processor_thermal_device videodev intel_lpss_pci typec_ucsi mei_me i2c_i801 rtsx_pci intel_soc_dts_iosf ecdh_generic intel_lpss mei mfd_core ecc videobuf2_common intel_rapl_common intel_pch_thermal typec wmi i8042 int3403_thermal int3400_thermal i2c_hid dell_smo8800
[76658.437439]  int340x_thermal_zone serio acpi_thermal_rel intel_pmc_core evdev i915
[76658.437441] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            5.4.1-gentoo #6
[76658.437442] Hardware name: Dell Inc. Precision 7540/0CYJDT, BIOS 1.4.0 09/23/2019
[76658.437443] RIP: 0010:dev_watchdog+0x21f/0x230
[76658.437444] Code: 85 c0 75 e8 eb a8 4c 89 ef c6 05 5d 62 b3 00 01 e8 e6 c8 fc ff 44 89 e1 4c 89 ee 48 c7 c7 48 c1 49 9f 48 89 c2 e8 ea 11 8a ff <0f> 0b eb 89 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 c7 47 08 00
[76658.437444] RSP: 0018:ffffb139401b8e80 EFLAGS: 00010282
[76658.437445] RAX: 0000000000000000 RBX: ffff9f890e1a2a00 RCX: 00000000000011d4
[76658.437445] RDX: 0000000000000001 RSI: 0000000000000086 RDI: ffffffffa39e53ac
[76658.437446] RBP: ffff9f8914cc7440 R08: 0000000000000001 R09: 00000000000011d4
[76658.437446] R10: 0000000000028978 R11: 0000000000000001 R12: 0000000000000000
[76658.437447] R13: ffff9f8914cc7000 R14: ffff9f8914cc7440 R15: 0000000000000001
[76658.437447] FS:  0000000000000000(0000) GS:ffff9f891c080000(0000) knlGS:0000000000000000
[76658.437448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[76658.437448] CR2: 00007fc9000030b8 CR3: 00000009c4384006 CR4: 00000000003606e0
[76658.437448] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[76658.437449] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[76658.437449] Call Trace:
[76658.437450]  <IRQ>
[76658.437452]  ? qdisc_put_unlocked+0x30/0x30
[76658.437454]  call_timer_fn+0x26/0x120
[76658.437454]  run_timer_softirq+0x17d/0x470
[76658.437456]  ? enqueue_hrtimer+0x31/0x80
[76658.437457]  ? __hrtimer_run_queues+0x11b/0x260
[76658.437458]  __do_softirq+0xd6/0x2ba
[76658.437460]  irq_exit+0x9b/0xa0
[76658.437461]  smp_apic_timer_interrupt+0x5b/0x110
[76658.437462]  apic_timer_interrupt+0xf/0x20
[76658.437462]  </IRQ>
[76658.437464] RIP: 0010:cpuidle_enter_state+0xa8/0x400
[76658.437464] Code: c5 0f 1f 44 00 00 31 ff e8 85 fb 9b ff 80 7c 24 0b 00 74 12 9c 58 f6 c4 02 0f 85 2d 03 00 00 31 ff e8 7c d1 a0 ff fb 45 85 e4 <0f> 88 6c 02 00 00 4c 2b 2c 24 49 63 cc 48 8d 04 49 48 c1 e0 05 8b
[76658.437465] RSP: 0018:ffffb139400dfe70 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[76658.437465] RAX: ffff9f891c0a7bc0 RBX: ffffffff9f6a1ce0 RCX: 000045b86eed58ba
[76658.437466] RDX: 000045b86efc9af4 RSI: 000045b86eed58ba RDI: 0000000000000000
[76658.437466] RBP: ffffd1393fab4a10 R08: 000045b86eed58d6 R09: 00000000000001bf
[76658.437466] R10: ffff9f891c0a6c20 R11: ffff9f891c0a6c00 R12: 0000000000000002
[76658.437467] R13: 000045b86eed58d6 R14: 0000000000000002 R15: ffff9f8915f5c740
[76658.437468]  cpuidle_enter+0x24/0x40
[76658.437470]  do_idle+0x1bf/0x230
[76658.437471]  cpu_startup_entry+0x14/0x20
[76658.437472]  start_secondary+0x131/0x160
[76658.437473]  secondary_startup_64+0xa4/0xb0
[76658.437474] ---[ end trace 907e490a0cd3c160 ]---
[76658.437476] r8152 4-2.4:1.0 enp57s0u2u4: Tx timeout
[76659.788078] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=83035 end=83036) time 243 us, min 1431, max 1439, scanline start 1421, end 1443
[76660.958672] r8152 4-2.4:1.0 enp57s0u2u4: Tx status -2
[76660.958758] r8152 4-2.4:1.0 enp57s0u2u4: Tx status -2
[76660.958848] r8152 4-2.4:1.0 enp57s0u2u4: Tx status -2
[76660.958940] r8152 4-2.4:1.0 enp57s0u2u4: Tx status -2
Comment 7 Peter 2020-01-27 13:20:07 UTC
I have a similar setup and a similar problem:
Setup: Lenovo Thinkpad t480, Think Pad USB-C Dock 40A90090EU [1], Ubuntu 16.04, Kernel  4.15.0-74-generic #83~16.04.1-Ubuntu

Network connection is periodically crashing.
Dmesg shows `r8152 4-1.1:1.0 enxe04f43991e1c: Rx status -71` in that case. 
I noticed that this seems to depend on how heavily the network connection is used. E.g. if I compile a lot using icecream to distribute compilation jobs, it seems to be a lot less stable. 

Using `rmmod r8152 && modprobe r8152` fixes the problem temporarily. 


[1] https://support.lenovo.com/de/de/accessories/acc100348
Comment 8 RussianNeuroMancer 2020-01-27 14:00:29 UTC
@Peter check Comment 4
Re-test on newer kernel (you can take it from mainline PPA).
Comment 9 Timur Kristóf 2020-03-02 12:08:11 UTC
This still happens to me on 5.5.6-201.fc31.x86_64. My dmesg is full of these messages:

[12696.189484] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12702.333456] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12707.965422] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12713.085385] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12718.205360] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12724.349321] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12729.981295] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12735.101256] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12740.221235] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12746.365199] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12751.997171] r8152 6-1:1.0 enp10s0u1: Tx timeout
[12757.117155] r8152 6-1:1.0 enp10s0u1: Tx timeout
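
The spacing of these messages can be checked mechanically. Below is a minimal sketch that pulls the timestamps out of "Tx timeout" lines in the dmesg format shown above and returns the gaps between them. The function name timeout_gaps is made up for illustration, and the regular expression assumes the bracketed-seconds timestamp format used in this comment (not the journald format that appears elsewhere in this report).

```python
import re

# Matches lines like "[12696.189484] r8152 6-1:1.0 enp10s0u1: Tx timeout".
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+r8152\s+\S+\s+\S+:\s+Tx timeout")


def timeout_gaps(lines):
    """Return the seconds between consecutive 'Tx timeout' messages."""
    stamps = [float(m.group(1)) for m in map(LINE_RE.search, lines) if m]
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


sample = [
    "[12696.189484] r8152 6-1:1.0 enp10s0u1: Tx timeout",
    "[12702.333456] r8152 6-1:1.0 enp10s0u1: Tx timeout",
    "[12707.965422] r8152 6-1:1.0 enp10s0u1: Tx timeout",
]
print(timeout_gaps(sample))
```

On the lines quoted here the gaps come out at roughly five to six seconds each, which looks like a timer firing repeatedly on a queue that never recovers rather than isolated hiccups.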
Comment 10 RussianNeuroMancer 2020-03-03 08:06:00 UTC
Timur, are you using the same docking station as Jean-Louis, or a different one?
Comment 11 Timur Kristóf 2020-03-03 12:47:07 UTC
RussianNeuroMancer, I use a Dell XPS 13 9370 with a Lenovo ThinkPad branded Thunderbolt 3 dock. The model number is DBB9003L1. (The dock is not mine, I'm just borrowing it from a colleague for a week.) I think these docks mostly use the same hardware under the hood. I think I've also seen a Fedora bug report about the same issue with the Dell TB16 here: https://bugzilla.redhat.com/show_bug.cgi?id=1460789
Comment 12 RussianNeuroMancer 2020-03-03 13:01:35 UTC
I see. By the way, since my Comment 4 I was able to reproduce this issue again. This time with Linux 5.4 on Dell Venue 8 Pro 5855 and Dell WD15 Dock.
Comment 13 RussianNeuroMancer 2020-03-03 14:13:28 UTC
Created attachment 287779 [details]
dmesg of Linux 5.4.0
Comment 14 BniceJada 2020-03-05 14:03:27 UTC
Same problem here. Dell Latitude 7480 (BIOS 1.16.1) with WD15 dock (Port Controller on v1.1.8). I am using 5.5.7-zen1-1-zen (but the same problem also occurred with the standard Arch kernel).

It has not occurred with 5.4.2.arch1-1, but it definitely occurred with 5.4.5.arch1-1 (I was on holiday in between, and the trouble started afterwards).

--
Mar 05 13:42:34 hostname kernel: ------------[ cut here ]------------
Mar 05 13:42:34 hostname kernel: NETDEV WATCHDOG: enp59s0u1u2 (r8152): transmit queue 0 timed out
Mar 05 13:42:34 hostname kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x268/0x270
Mar 05 13:42:34 hostname kernel: Modules linked in: md4 nls_utf8 cifs dns_resolver fscache libdes rfcomm ip6t_REJECT nf_reject_ipv6 xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_na>
Mar 05 13:42:34 hostname kernel:  coretemp snd_hda_codec_generic ledtrig_audio kvm_intel snd_pcm_dmaengine snd_hda_intel dell_wmi_descriptor dcdbas snd_intel_dspcfg dell_smm_hwmon snd_hda_codec kvm cfg80211 snd_hda_core snd_hwdep snd_pcm e1000e fuse irqbypass i>
Mar 05 13:42:34 hostname kernel:  libps2 aesni_intel crypto_simd cryptd glue_helper xhci_pci xhci_hcd rtsx_pci i8042 serio i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm intel_agp intel_gtt agpgart btrfs blake2b_generic libcr>
Mar 05 13:42:34 hostname kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.5.7-zen1-1-zen #1
Mar 05 13:42:34 hostname kernel: Hardware name: Dell Inc. Latitude 7480/00F6D3, BIOS 1.16.1 10/03/2019
Mar 05 13:42:34 hostname kernel: RIP: 0010:dev_watchdog+0x268/0x270
Mar 05 13:42:34 hostname kernel: Code: 47 9c 69 ff eb 8a 4c 89 f7 c6 05 dc 05 db 00 01 e8 0d fa f9 ff 44 89 e9 4c 89 f6 48 c7 c7 d0 2a 5a 8e 48 89 c2 e8 0f 92 73 ff <0f> 0b e9 68 ff ff ff 90 0f 1f 44 00 00 48 c7 47 08 00 00 00 00 48
Mar 05 13:42:34 hostname kernel: RSP: 0018:ffffb39300164e60 EFLAGS: 00010286
Mar 05 13:42:34 hostname kernel: RAX: 0000000000000000 RBX: ffff8cdc200b2000 RCX: 0000000000000000
Mar 05 13:42:34 hostname kernel: RDX: 0000000000000103 RSI: 00000000000000f6 RDI: 00000000ffffffff
Mar 05 13:42:34 hostname kernel: RBP: ffff8cdc0e5bf45c R08: 0000000000000515 R09: 0000000000000003
Mar 05 13:42:34 hostname kernel: R10: 0000000000000001 R11: 0000000000003c00 R12: ffff8cdc0e5bf480
Mar 05 13:42:34 hostname kernel: R13: 0000000000000000 R14: ffff8cdc0e5bf000 R15: ffff8cdc200b2080
Mar 05 13:42:34 hostname kernel: FS:  0000000000000000(0000) GS:ffff8cdc26500000(0000) knlGS:0000000000000000
Mar 05 13:42:34 hostname kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 05 13:42:34 hostname kernel: CR2: 00007fe15a0d3000 CR3: 000000019f20a001 CR4: 00000000003606e0
Mar 05 13:42:34 hostname kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 05 13:42:34 hostname kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 05 13:42:34 hostname kernel: Call Trace:
Mar 05 13:42:34 hostname kernel:  <IRQ>
Mar 05 13:42:34 hostname kernel:  ? qdisc_put_unlocked+0x30/0x30
Mar 05 13:42:34 hostname kernel:  ? qdisc_put_unlocked+0x30/0x30
Mar 05 13:42:34 hostname kernel:  call_timer_fn+0x2d/0x150
Mar 05 13:42:34 hostname kernel:  ? qdisc_put_unlocked+0x30/0x30
Mar 05 13:42:34 hostname kernel:  run_timer_softirq+0xaec/0xce0
Mar 05 13:42:34 hostname kernel:  __do_softirq+0x111/0x374
Mar 05 13:42:34 hostname kernel:  ? hrtimer_interrupt+0x235/0x3e0
Mar 05 13:42:34 hostname kernel:  irq_exit+0xc9/0x120
Mar 05 13:42:34 hostname kernel:  smp_apic_timer_interrupt+0xa6/0x1a0
Mar 05 13:42:34 hostname kernel:  apic_timer_interrupt+0xf/0x20
Mar 05 13:42:34 hostname kernel:  </IRQ>
Mar 05 13:42:34 hostname kernel: RIP: 0010:cpuidle_enter_state+0xc9/0x850
Mar 05 13:42:34 hostname kernel: Code: e8 8c b0 85 ff 80 7c 24 0f 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 00 06 00 00 31 ff e8 3e 09 8d ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 1f 04 00 00 49 63 d4 4c 2b 6c 24 10 48 8d 04 52 48
Mar 05 13:42:34 hostname kernel: RSP: 0018:ffffb393000dbe50 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Mar 05 13:42:34 hostname kernel: RAX: ffff8cdc26500000 RBX: ffff8cdc26537800 RCX: 000000000000001f
Mar 05 13:42:34 hostname kernel: RDX: 0000000000000000 RSI: 000000002f32988b RDI: 0000000000000000
Mar 05 13:42:34 hostname kernel: RBP: ffffffff8e8bea60 R08: 00000a3f27ed44df R09: 00000a3f251f7ba7
Mar 05 13:42:34 hostname kernel: R10: 0000000000000007 R11: 0000000000000007 R12: 0000000000000008
Mar 05 13:42:34 hostname kernel: R13: 00000a3f27ed44df R14: 0000000000000008 R15: ffff8cdc22a98000
Mar 05 13:42:34 hostname kernel:  cpuidle_enter+0x29/0x40
Mar 05 13:42:34 hostname kernel:  do_idle+0x20c/0x2c0
Mar 05 13:42:34 hostname kernel:  cpu_startup_entry+0x19/0x20
Mar 05 13:42:34 hostname kernel:  start_secondary+0x1c6/0x220
Mar 05 13:42:34 hostname kernel:  secondary_startup_64+0xb6/0xc0
Mar 05 13:42:34 hostname kernel: ---[ end trace 358d3d81e0691439 ]---
Mar 05 13:42:34 hostname kernel: r8152 4-1.2:1.0 enp59s0u1u2: Tx timeout
Mar 05 13:42:40 hostname kernel: r8152 4-1.2:1.0 enp59s0u1u2: Tx timeout
Mar 05 13:42:46 hostname kernel: r8152 4-1.2:1.0 enp59s0u1u2: Tx timeout
Mar 05 13:42:51 hostname kernel: r8152 4-1.2:1.0 enp59s0u1u2: Tx timeout
Mar 05 13:42:56 hostname kernel: r8152 4-1.2:1.0 enp59s0u1u2: Tx timeout
Comment 15 Jamin W. Collins 2020-03-08 20:43:20 UTC
I've been encountering this problem with every relatively recent (4.9+) kernel, and possibly older ones as well.

System: Lenovo W530

USB adapter: 
Cable Matters 3 Port USB 3.0 Hub with Ethernet (USB Hub with Ethernet, Gigabit Ethernet USB Hub ) Supporting 10/100/1000 Mbps Ethernet Network in Black
https://smile.amazon.com/gp/product/B01J6583NK/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter

I've encountered the problem with Arch's main linux kernels and their LTS builds.

The interface seems to have trouble once it is put under any sort of load (30% or more utilization) on the host system. Removing and reloading the module can sometimes temporarily improve things, but (from what I've seen) the issue always returns within a few minutes to an hour.
Comment 17 RussianNeuroMancer 2020-03-09 18:26:12 UTC
So if it's the same issue, there are at least several workarounds: 

"usbcore.quirks=0bda:8153:k" kernel boot option (from first askubuntu link)
Install tlp and change USB_BLACKLIST option in /etc/default/tlp to "0bda:8153" (from second askubuntu link)
Patch /drivers/usb/core/quirks.c with following line (mentioned in tlp bugreport)
{ USB_DEVICE(0x0bda, 0x8153), .driver_info = USB_QUIRK_NO_LPM },

Unfortunately, this week I don't have access to a Dell WD15 docking station. Can anyone else try at least the first or second workaround?
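
To make the first workaround less error-prone to type, here is a minimal sketch that builds the boot option from one or more vendor:product IDs. It relies on the flag letter "k" standing for USB_QUIRK_NO_LPM in the usbcore.quirks syntax, which is how the kernel parameter documentation describes it. The function name no_lpm_quirks is made up for illustration.

```python
import re

# Four hex digits, a colon, four hex digits, e.g. "0bda:8153".
ID_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$")


def no_lpm_quirks(ids):
    """Build a usbcore.quirks= boot option that sets the 'k' (NO_LPM) flag."""
    if not ids:
        raise ValueError("at least one vendor:product id is required")
    for usb_id in ids:
        if not ID_RE.match(usb_id):
            raise ValueError(f"not a vendor:product id: {usb_id!r}")
    # Entries are comma-separated; each one is VID:PID:Flags.
    return "usbcore.quirks=" + ",".join(f"{usb_id.lower()}:k" for usb_id in ids)


print(no_lpm_quirks(["0bda:8153"]))
```

The resulting string goes on the kernel command line. How to add it there depends on the bootloader and distribution, which this sketch does not cover.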
Comment 18 BniceJada 2020-03-12 05:47:10 UTC
(In reply to RussianNeuroMancer from comment #17)
> So if it's the same issue, there are at least several workarounds: 
> 
> "usbcore.quirks=0bda:8153:k" kernel boot option (from first askubuntu link)
> Install tlp and change USB_BLACKLIST option in /etc/default/tlp to
> "0bda:8153" (from second askubuntu link)
> Patch /drivers/usb/core/quirks.c with following line (mentioned in tlp
> bugreport)
> { USB_DEVICE(0x0bda, 0x8153), .driver_info = USB_QUIRK_NO_LPM },

I can confirm that adding "0bda:8153" to USB_BLACKLIST in my tlp.conf seems to work fine for me. Prior to this change I lost the network connection each night, and now I have had an uninterrupted connection for the last two nights (three days).
Comment 19 Peter 2020-03-13 10:20:51 UTC
(In reply to RussianNeuroMancer from comment #17)

> "usbcore.quirks=0bda:8153:k" kernel boot option (from first askubuntu link)

I can confirm that adding "usbcore.quirks=0bda:8153:k" to kernel boot options worked for me.
Comment 20 Hans de Goede 2020-03-13 11:04:25 UTC
So reading through this bug report, the solution, or at least a workaround, would seem to be to add USB_QUIRK_NO_LPM entries for the troublesome RTL8152 / RTL8153 based ethernet adapters to drivers/usb/core/quirks.c. There actually already is at least one line in there for a dock with an RTL8153 NIC:

        /* Microsoft Surface Dock Ethernet (RTL8153 GigE) */
        { USB_DEVICE(0x045e, 0x07c6), .driver_info = USB_QUIRK_NO_LPM },

There is mention of several docks here; but upon checking various logs, they all seem to use the generic realtek usb-id for the RTL8153 GigE NIC.

So it seems that the solution is adding the following lines to drivers/usb/core/quirks.c:

        /* Generic RTL8153 GigE adapters */
        { USB_DEVICE(0x0bda, 0x8153), .driver_info = USB_QUIRK_NO_LPM },

I will submit a patch upstream for this.
Comment 21 Peter 2020-03-13 11:37:49 UTC
Unfortunately I was too enthusiastic about this. 
I wrote my comment after 1 day of working and 1 night of downloading huge amounts of data without problems. But after that, using the icecream distributed compiler daemon again crashed my connection. 

So it seems to be better but not solved for me.
Comment 22 Hans de Goede 2020-03-13 12:04:16 UTC
(In reply to Peter from comment #21)
> Unfortunately I was too enthusiastic about this. 
> I wrote my comment after 1 day of working and 1 night of downloading huge
> amounts of data without problems. But after that, using the icecream
> distributed compiler daemon again crashed my connection. 
> 
> So it seems to be better but not solved for me.

I'm sorry to hear that the issue is not 100% resolved. Still, I've found enough other bug reports where people are having success with this option when used with an RTL8153 device that I believe it is worthwhile to submit a patch for this upstream; see e.g.:
https://bugzilla.redhat.com/show_bug.cgi?id=1713657
Comment 23 RussianNeuroMancer 2020-03-13 13:29:52 UTC
> https://bugzilla.redhat.com/show_bug.cgi?id=1713657

I wonder why blacklist in tlp didn't help him, but usbcore.quirks does.
Comment 24 RussianNeuroMancer 2020-03-13 14:21:22 UTC
> But after that using icecream distributed compiler daemon again crashed my
> connection. 

> So it seems to be better but not solved for me.

Try this:

1. remove lines 737-737 here https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/net/usb/cdc_ether.c#L733

2. remove lines 6900 and 6901 here https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/net/usb/r8152.c#L6900

Back in the Linux 4.18/4.19 days that allowed me to work around a similar issue on HP Elite x2 1013 G3 and Belkin USB-C Express Dock 3.1 HD.
Comment 25 Hans de Goede 2020-03-13 14:29:54 UTC
(In reply to RussianNeuroMancer from comment #24)
> Try this:
> 
> 1. remove lines 737-737 here
> https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/net/usb/cdc_ether.c#L733
> 
> 2. remove lines 6900 and 6901 here
> https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/net/usb/r8152.c#L6900
> 
> Back in the Linux 4.18/4.19 days that allowed me to work around a similar issue on
> HP Elite x2 1013 G3 and Belkin USB-C Express Dock 3.1 HD.

Hmm, so in essence that swaps the driver which is specifically made for the RTL8153 with the generic USB ethernet class driver.

Although it might be interesting to try that, there are known issues with it.

E.g. with a Lenovo Thunderbolt 3 gen 2 dock, when the laptop is turned off while connected to the dock, most of the dock is turned off, but the ethernet card still has power (for wake-on-LAN, I guess). When using the cdc_ether driver, the RTL8153 NIC will then start spamming the network as fast as it can after the laptop has been turned off, which in my case made my entire (wired) home network unusable (*).

So I actually sent a patch upstream doing the opposite: adding the Lenovo-specific USB IDs for the RTL8153 to the blacklist in cdc_ether and to the white/device-id list in r8152.c, which solved the dock jamming my wired network after the laptop turned off.

*) I'm using a cheap unmanaged switch; a better switch may have kept the network at least somewhat usable.
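
Since the suggestion above comes down to which driver ends up bound to the adapter, it can help to check that before and after such a change. Below is a minimal sketch, assuming the usual sysfs layout in which /sys/class/net/<iface>/device/driver is a symlink to the bound driver. The function name bound_driver and its sysfs_root parameter (there only so the code can be tried against a fake directory tree) are made up for illustration; the interface name is one that appears in this report.

```python
import os


def bound_driver(iface, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a network interface.

    Returns None if the interface or its driver link does not exist.
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    if not os.path.islink(link):
        return None
    # The link target ends in .../drivers/<name>, so the last path
    # component is the driver name (e.g. "r8152" or "cdc_ether").
    return os.path.basename(os.readlink(link))


print(bound_driver("enp10s0u1"))
```

On a machine without that interface this simply prints None.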
Comment 26 Marcus Sundman 2020-04-30 02:11:35 UTC
(In reply to RussianNeuroMancer from comment #24)
> > But after that using icecream distributed compiler daemon again crashed my
> > connection. 
> 
> > So it seems to be better but not solved for me.
> 
> Try this:
> 
> 1. remove lines 737-737 here
> https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/net/usb/cdc_ether.c#L733
> 
> 2. remove lines 6900 and 6901 here
> https://github.com/torvalds/linux/blob/0d81a3f29c0afb18ba2b1275dcccf21e0dd4da38/drivers/net/usb/r8152.c#L6900
> 
> Back in the Linux 4.18/4.19 days that allowed me to work around a similar issue on
> HP Elite x2 1013 G3 and Belkin USB-C Express Dock 3.1 HD.

My 0bda:8153 also stops working with the cdc_ether driver (without it saying anything in syslog).
Blacklisting 0bda:8153 in TLP didn't work.
Adding 0bda:8153 quirks kernel parameter didn't work.
Using the newest r8152.53.56-2.12.0 driver from realtek didn't work.
Comment 27 Michiel Janssens 2020-05-01 08:38:23 UTC
Similar issues here with a Dell dock WD15, connected from Dell XPS 13 9360.
For several months now, the wired connection from the dock (0bda:8153) has been dying without the network stack being notified. A reboot after this waits endlessly for services to stop. Sometimes the GNOME GUI locks up shortly after I log back in to the system and find the connection dead. I have to do REISUB to get the system working again.
The issue doesn't appear while working on the system; it mostly happens when leaving it running by itself for a while. I haven't found a way to actually trigger it.

At the moment I'm running openSUSE Tumbleweed with kernel 5.6.6, and the issue is still happening.
I tried quirks, but with no result.
For several days I've been testing with usbcore.autosuspend=-1 and have left the system running for longer periods. The issue hasn't happened so far.

Side note:
Commit 75d7676ead19b1fbb5e0ee934c9ccddcb666b68c doesn't seem to have fixed the message "Tx status -71" from the original bug reporter. (Tx timeout, in my case)
That still happens once in a while.
Comment 28 Marcus Sundman 2020-05-01 21:42:27 UTC
The usbcore.autosuspend=-1 kernel parameter doesn't resolve it for me.
Also, I can trigger the problem in seconds, simply by reading at gigabit speeds.
Comment 29 Hans de Goede 2020-05-04 13:05:08 UTC
At least for those seeing issues with Dell's WD15 dock, I think that trying something similar to this quirk might help:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b63e48fb50e1ca71db301ca9082befa6f16c55c4

To try this, first do:

lsusb -t

To find the Bus and Dev number of any USB hub(s) inside the dock.

Then do:

lsusb

And lookup the same Bus and Dev number to get the vendor- and product-id used for the hub, e.g. 0bda:0487

Then try booting with this added to your kernel commandline:

usbcore.quirks=0bda:0487:k

Replacing the 0bda:0487 with the <vend>:<prod> ids for your hub (from the lsusb output). If you want to try this on more than one USB device, you can specify the NO_LPM quirk for multiple USB devices like this:

usbcore.quirks=0bda:0487:k,0bda:0488:k

Please give this a try and see if that helps. Also note that the same thing can be used to set the NO_LPM quirk on the USB ethernet-chip itself if it has a different USB-id which is not yet in the kernel's quirks list.
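
The lookup step described above (going from a Bus and Dev number to a vendor- and product-id) can be scripted. Below is a minimal sketch that parses plain lsusb output into a table keyed by (bus, device). The function name parse_lsusb is made up for illustration. The first sample line is taken from an earlier comment in this report; the second, including its device name, is purely hypothetical and only reuses the example id 0bda:0487 from above.

```python
import re

# Plain `lsusb` prints one line per device, e.g.
# "Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp. ..."
LSUSB_RE = re.compile(
    r"^Bus (\d{3}) Device (\d{3}): ID ([0-9a-f]{4}):([0-9a-f]{4})\s*(.*)$"
)


def parse_lsusb(text):
    """Map (bus, device) numbers to 'vendor:product' ids from lsusb output."""
    ids = {}
    for line in text.splitlines():
        m = LSUSB_RE.match(line.strip())
        if m:
            bus, dev, vendor, product, _name = m.groups()
            ids[(int(bus), int(dev))] = f"{vendor}:{product}"
    return ids


sample = """\
Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter
Bus 004 Device 002: ID 0bda:0487 Hypothetical Dock Hub
"""
ids = parse_lsusb(sample)
# Suppose `lsusb -t` showed the dock's hub as Bus 4, Dev 2:
print("usbcore.quirks=" + ids[(4, 2)] + ":k")
```

Finding which Bus/Dev pair is the hub still has to be done by reading the lsusb -t tree; this sketch only automates the second lookup.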
Comment 30 Marcus Sundman 2020-05-04 22:41:26 UTC
(In reply to Hans de Goede from comment #29)
> Replacing the 0bda:0487 with the <vend>:<prod> ids for your hub (from the
> lsusb output). If you want to try this on more than one USB device, you can
> specify the NO_LPM quirk for multiple USB devices like this:
> 
> usbcore.quirks=0bda:0487:k,0bda:0488:k

This didn't work.

I have 3 devices:
Bus 003 Device 009: ID 0bda:8153 Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter
Bus 003 Device 008: ID 0bda:0411 Realtek Semiconductor Corp. 4-Port USB 3.0 Hub
Bus 002 Device 006: ID 0bda:5411 Realtek Semiconductor Corp. 4-Port USB 2.0 Hub

I added these kernel params:
usbcore.quirks=0bda:8153:k,0bda:5411:k,0bda:0411:k usbcore.autosuspend=-1

Still fails with either
Rx status -71
or
Tx status -71
after reading at 50 MB/s over the network for a minute or a few.

> Please give this a try and see if that helps. Also note that the same thing
> can be used to set the NO_LPM quirk on the USB ethernet-chip itself if it
> has a different USB-id which is not yet in the kernel's quirks list.

I'm not sure how to do that. As far as I can tell my ethernet chip is at 0bda:8153 (which in my case is at usb@3:1.4, which maps to device 9, which maps to 0bda:8153).
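
When several parameters are combined like this, it is easy to lose track of which ones the running kernel actually received. Below is a minimal sketch that extracts the two usbcore settings discussed in this report from a kernel command line string. The function name usb_workarounds is made up for illustration, and the sketch assumes the VID:PID:Flags entry format used above; malformed entries are skipped rather than reported.

```python
def usb_workarounds(cmdline):
    """Extract the usbcore.* workarounds discussed here from a kernel command line."""
    no_lpm, autosuspend = [], None
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "usbcore.quirks":
            for entry in value.split(","):
                parts = entry.split(":")
                if len(parts) != 3:
                    continue  # not VID:PID:Flags, ignore it
                vendor, product, flags = parts
                if "k" in flags:  # 'k' is the NO_LPM flag letter
                    no_lpm.append(f"{vendor}:{product}")
        elif key == "usbcore.autosuspend":
            autosuspend = int(value)
    return {"no_lpm": no_lpm, "autosuspend": autosuspend}


sample = "quiet usbcore.quirks=0bda:8153:k,0bda:5411:k,0bda:0411:k usbcore.autosuspend=-1"
print(usb_workarounds(sample))
```

On a running system the string to feed in would be the contents of /proc/cmdline.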
Comment 31 Hans de Goede 2020-05-05 13:11:56 UTC
@Marcus Sundman, right you have already set the flag for your ethernet usb controller by adding the 0bda:8153:k part to the quirks. So it seems that at least for you setting the NO_LPM flag does not help.

Does your dock have updateable firmware? If so, you may want to try updating it. The first-generation Thunderbolt docks from all vendors were notoriously buggy, and they all need the latest firmware to work at least somewhat reliably. Getting the latest firmware is also strongly advised for people using Windows, since there really were quite a few issues with these devices which are fixed with firmware updates.

Yes, I wrote "somewhat reliably": the best fix for Thunderbolt dock issues is often getting a second-generation or newer dock :(
Comment 32 Michiel Janssens 2020-05-05 14:23:03 UTC
(In reply to Hans de Goede from comment #29)
Thanks for posting this instruction!
I already had seen the commit for the WD19, but it wasn't clear how I should investigate that on my system.
The WD15 dock adds 2 USB buses, each with a Microchip USB hub.
I removed the usbcore.autosuspend=-1 parameter and will test for several days with usbcore.quirks=0424:5537:k, which is the hub which has the 0bda:8153 as child.
I will add attachments with my lsusb output.
Comment 33 Michiel Janssens 2020-05-05 14:27:36 UTC
Created attachment 288917 [details]
lsusb output mjanssens wd15 xps9360
Comment 34 Michiel Janssens 2020-05-06 10:41:55 UTC
It didn't take days to get results.

Just the hub where 0bda:8153 is child
usbcore.quirks=0424:5537:k
result: 0bda:8153 dies after a while, without log entry, needed REISUB to reboot

Both hubs which are added by connecting WD15
usbcore.quirks=0424:5537:k,0424:2137:k
result: 0bda:8153 dies after a while, without log entry, needed REISUB to reboot

So I'm back to using usbcore.autosuspend=-1.
Please advise if I missed something (or incorrect dev id) I could test.
Comment 35 Hans de Goede 2020-05-06 10:43:36 UTC
(In reply to Michiel Janssens from comment #34)
> It didn't take days to get results.
> 
> Just the hub where 0bda:8153 is child
> usbcore.quirks=0424:5537:k
> result: 0bda:8153 dies after a while, without log entry, needed REISUB to
> reboot
> 
> Both hubs which are added by connecting WD15
> usbcore.quirks=0424:5537:k,0424:2137:k
> result: 0bda:8153 dies after a while, without log entry, needed REISUB to
> reboot
> 
> So I'm back to using usbcore.autosuspend=-1.
> Please advise if I missed something (or incorrect dev id) I could test.

There have been a lot of firmware updates for the wd15, do you have these all applied?
Comment 36 Michiel Janssens 2020-05-06 11:32:54 UTC
(In reply to Hans de Goede from comment #35)

> There have been a lot of firmware updates for the wd15, do you have these
> all applied?

Good catch, I'm on 1.0.4 according to fwupdmgr. The latest is 1.0.6 on the Dell site.
Unfortunately the WD15 does not (yet) appear to be fully supported via fwupdmgr, so Windows is the only option, sigh. I will try to update, test again and report.
The BIOS is current, by the way.
Comment 37 Marcus Sundman 2020-05-07 00:11:04 UTC
(In reply to Hans de Goede from comment #31)
> @Marcus Sundman, right you have already set the flag for your ethernet usb
> controller by adding the 0bda:8153:k part to the quirks. So it seems that at
> least for you setting the NO_LPM flag does not help.
I also tried without usbcore.autosuspend=-1 but that also didn't help.

> Does your dock have updateable firmware? If so you may want to try to update
> the firmware.
It's a LogiLink UA0173A, and it doesn't seem to have any firmware available (only newer drivers, which I already tried).
Comment 38 RussianNeuroMancer 2020-05-07 03:49:50 UTC
Just for the record, I was able to reproduce this issue even on a NanoPi-M1 (Allwinner H3) with Linux 5.4.32 attached to a Belkin USB-C Express Dock 3.1 HD F4U093 (I did this for convenience, just to quickly get a working keyboard and mouse without moving their cables from the dock to the board). The quirk has been included in 5.4 since 5.4.28, so it was already applied. Unfortunately, I didn't expect this issue to be reproducible with the NanoPi-M1 board, so I didn't save the lsusb -t output before/after this happened.
Comment 39 Michiel Janssens 2020-05-07 10:30:09 UTC
(In reply to Michiel Janssens from comment #36)

I ran several updaters from Dell under Windows. My WD15 firmware components (4 of them) were already current; apparently the main version is some sort of wrapper. So no updates to the BIOS or WD15 firmware are possible.
At the moment I run kernel 5.6.8, so I ran all tests again:
- with or without usbcore.quirks=0424:5537:k,0424:2137:k the nic dies after a while
- with usbcore.autosuspend=-1 the nic remains alive
Comment 40 Marcus Sundman 2020-06-13 02:01:35 UTC
Still the same problem with Realtek's new driver, r8152.53.56-2.13.0, on Ubuntu's 5.4.0-37-generic with usbcore.autosuspend=-1.

It fails with 'Tx status -71' or 'Rx status -71':
> net_ratelimit: 22 callbacks suppressed
> r8152 3-2.4:1.0 enx00e04d6aeb98: Tx status -71
> r8152 3-2.4:1.0 enx00e04d6aeb98: Tx status -71
> r8152 3-2.4:1.0 enx00e04d6aeb98: Tx status -71
> ...

But sometimes that quickly turns into this:
> xhci_hcd 0000:03:00.0: WARN: TRB error for slot 3 ep 3 on endpoint
> r8152 3-2.4:1.0 enx00e04d6aeb98: Tx status -84
> xhci_hcd 0000:03:00.0: WARN waiting for error on ep to be cleared
> r8152 3-2.4:1.0 enx00e04d6aeb98: failed tx_urb -22
> xhci_hcd 0000:03:00.0: WARN waiting for error on ep to be cleared
> r8152 3-2.4:1.0 enx00e04d6aeb98: failed tx_urb -22
> xhci_hcd 0000:03:00.0: WARN waiting for error on ep to be cleared
> r8152 3-2.4:1.0 enx00e04d6aeb98: failed tx_urb -22
> ...

I've also tried adjusting the NIC's Rx ring size from 100 to 20 or 2000, but I still get the same crash seconds after starting a gigabit-speed download.
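
The status numbers in the log lines above are negated Linux errno values. As a reading aid, here is a minimal Python sketch that maps them to their symbolic names. The table is hardcoded with the Linux values for the three codes seen here (22, 71, 84), and the helper name decode_status is hypothetical. It only labels the codes and does not diagnose their cause.

```python
import re

# Linux errno values for the codes in the log lines above. They are
# hardcoded so the sketch does not depend on the local platform's table.
LINUX_ERRNO = {22: "EINVAL", 71: "EPROTO", 84: "EILSEQ"}

# Matches the three message forms quoted in this comment.
STATUS_RE = re.compile(r"(Tx status|Rx status|failed tx_urb) (-\d+)")


def decode_status(line):
    """Return (message, errno name) for an r8152 status line, else None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = -int(match.group(2))
    return match.group(1), LINUX_ERRNO.get(code, f"errno {code}")


print(decode_status("r8152 3-2.4:1.0 enx00e04d6aeb98: Tx status -71"))
print(decode_status("r8152 3-2.4:1.0 enx00e04d6aeb98: Tx status -84"))
print(decode_status("r8152 3-2.4:1.0 enx00e04d6aeb98: failed tx_urb -22"))
```

So the "Tx status -71" in the bug title is EPROTO, and the follow-up messages are EILSEQ and EINVAL.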
Comment 41 Nikolay Kichukov 2020-06-16 15:23:32 UTC
GNU/Gentoo, 64bit here, kernel 5.7.2, same problem on Lenovo 40AS USB-C dock:

None of the suggested "workarounds" helped, here is the lsusb tree:

/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 10000M
    |__ Port 2: Dev 2, If 0, Class=Hub, Driver=hub/4p, 10000M
        |__ Port 1: Dev 4, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
        |__ Port 3: Dev 3, If 0, Class=Hub, Driver=hub/4p, 10000M

And the patch applied to the kernel (the IDs differ):
cat /etc/portage/patches/sys-kernel/gentoo-sources-5.7.2/lenovo-usbc-dock-rtl-ethernet-quirk.patch 
--- a/drivers/usb/core/quirks.c	2020-06-01 01:49:15.000000000 +0200
+++ b/drivers/usb/core/quirks.c	2020-06-15 12:01:39.028377907 +0200
@@ -384,6 +384,11 @@
 	/* Generic RTL8153 based ethernet adapters */
 	{ USB_DEVICE(0x0bda, 0x8153), .driver_info = USB_QUIRK_NO_LPM },
 
+	/* Lenovo USB-C Ethernet RTL8153 based ethernet adapters */
+       { USB_DEVICE(0x1d6b, 0x0003), .driver_info = USB_QUIRK_NO_LPM },
+	{ USB_DEVICE(0x17ef, 0xa391), .driver_info = USB_QUIRK_NO_LPM },
+	{ USB_DEVICE(0x17ef, 0xa387), .driver_info = USB_QUIRK_NO_LPM },
+
 	/* Action Semiconductor flash disk */
 	{ USB_DEVICE(0x10d6, 0x2200), .driver_info =
 			USB_QUIRK_STRING_FETCH_255 },

and booting with:
usbcore.quirks=17ef:a387:k,17ef:a391:k,1d6b:0003:k

or 

usbcore.autosuspend=-1

does not help.

The same problem happens on laptops running Windows that are connected to these Lenovo docks.
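
To find which hubs sit above the r8152 device in an lsusb tree like the one above, something along the following lines may help. It is a rough Python sketch that parses `lsusb -t` text and prints the sysfs-style port path (bus-port.port) for each device bound to a given driver. The function name paths_for_driver is hypothetical, and the parser assumes the output layout shown in this comment.

```python
import re

# One regex for both root-hub lines ("/:  Bus 04.Port 1: ...") and
# child lines ("    |__ Port 2: ..."), as laid out in the tree above.
LINE_RE = re.compile(
    r"^(?P<indent>\s*)(?:/:\s+Bus (?P<bus>\d+)\.|\|__ )"
    r"Port (?P<port>\d+): Dev (?P<dev>\d+),.*Driver=(?P<driver>[^,]+)"
)


def paths_for_driver(tree_text, driver):
    """Return sysfs-style paths (e.g. '4-2.1') of devices using `driver`."""
    bus = None
    stack = []  # (indent, port) pairs from the root hub down to this line
    found = []
    for line in tree_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        if match.group("bus"):
            # A root hub line starts a new bus; reset the port chain.
            bus = int(match.group("bus"))
            stack = []
            continue
        indent = len(match.group("indent"))
        # Drop siblings and deeper entries before adding this port.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, int(match.group("port"))))
        if match.group("driver") == driver:
            found.append(f"{bus}-" + ".".join(str(p) for _, p in stack))
    return found


SAMPLE = """\
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 10000M
    |__ Port 2: Dev 2, If 0, Class=Hub, Driver=hub/4p, 10000M
        |__ Port 1: Dev 4, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
        |__ Port 3: Dev 3, If 0, Class=Hub, Driver=hub/4p, 10000M
"""

print(paths_for_driver(SAMPLE, "r8152"))
```

The resulting path uses the same naming as the device prefix in the kernel log lines quoted earlier in this thread (for example "3-2.4").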
