Bug 200977
| Summary: | Daily crash with r8152 driver requiring reboot | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | mcarans |
| Component: | Network | Assignee: | drivers_network (drivers_network) |
| Status: | NEW | | |
| Severity: | high | CC: | bavay, cornelius.riemenschneider, david, erki, frank-w, harald.rudell, hr.mitrev, igor.dejanovic, ikovnatsky, jade, loulou, marctraider, maregt0, Martin, matt, moritz.kernel, pdecat, rjloura, Russ.Dill, russianneuromancer, sgh, shahar.evron, siltaar, silvia, suhn, trourance, v.a.fedorov |
| Priority: | P1 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | 4.15.0-33-generic | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | rtnetlink r8152 report | | |
Description
mcarans
2018-08-30 12:19:17 UTC
I found this bug report on openSUSE that looks like it could be related: https://bugzilla.opensuse.org/show_bug.cgi?id=1083866. That report says the problems only started with the 4.15 kernel, but I have had problems with previous kernel versions. The reporter of that bug has the same laptop as I do (Dell XPS 13) but a hub from a different manufacturer.

Created attachment 278209 [details]
rtnetlink r8152 report

Having a similar issue with a USB-C attached Ethernet controller. See the attachment for details.

I originally created the bug report on openSUSE Tumbleweed as linked in https://bugzilla.opensuse.org/show_bug.cgi?id=1083866 with no solution to this day. The hang still occurs with the most recent kernel 4.18.7 and can only be prevented by disabling power-saving features for that device:

=== In my case ===
- In powertop, switching "Autosuspend for USB device USB 10/100/1000 LAN [Realtek]" from the default "GOOD" to "BAD".
- Equivalent of "echo 'on' > '/sys/bus/usb/devices/4-1.4/power/control'"
=== Adjust according to your device ===

I have tried to bisect the problem repeatedly, but ended up nowhere, or in a different place every single time, depending on how you look at it. The fact that the issue arises with pre-4.15 kernels on other distributions might have something to do with that. However, 4.14 worked fine for me. The bisected kernels themselves behaved rather strangely: sometimes running for 24 hours without hanging, sometimes hanging within minutes after a reboot, sometimes with and sometimes without hung_task output. Advice on how to proceed would be appreciated.

I have had to create udev rules both to turn off USB autosuspend for the device and also to turn off Turbo Mode of the CPU:

ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="0bda", ATTR{idProduct}=="8153", TEST=="power/control", ATTR{power/control}="on"
KERNEL=="cpu",RUN+="/bin/sh -c 'echo -n 1 > /sys/devices/system/cpu/intel_pstate/no_turbo'"

For some reason my attempts to find existing bug reports failed :/ So I reported the (probably same) issue here again: https://bugzilla.kernel.org/show_bug.cgi?id=205155

(In reply to rans from comment #4)
> I have had to create udev rules both to turn off usb autosuspend for the
> device and also turn off Turbo Mode of the CPU:

How did you know what to do (what to turn off)? (autosuspend I can imagine, but why the Turbo Mode of the CPU? and how did you guess that? ;))

I found it from an answer on stackoverflow as I recall.
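
The sysfs workaround quoted above hard-codes one bus path (4-1.4). As a rough illustration, the same write can be applied to every device that matches a vendor/product ID. This is only a sketch under assumptions: the function name `disable_usb_autosuspend` and its optional third argument (an alternative sysfs directory, handy for dry runs) are made up here, and 0bda:8153 is simply the ID pair from the udev rule in this thread. Note that several later comments report that disabling autosuspend did not stop the hang for them.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool mentioned in this thread):
# write "on" to power/control for every USB device whose idVendor and
# idProduct match, which disables runtime autosuspend for that device.
# Usage: disable_usb_autosuspend VENDOR PRODUCT [SYSFS_USB_DIR]
disable_usb_autosuspend() {
    vendor=$1
    product=$2
    root=${3:-/sys/bus/usb/devices}
    for dev in "$root"/*; do
        # Interface entries such as 4-1.4:1.0 have no idVendor file; skip them.
        [ -r "$dev/idVendor" ] && [ -r "$dev/idProduct" ] || continue
        [ "$(cat "$dev/idVendor")" = "$vendor" ] || continue
        [ "$(cat "$dev/idProduct")" = "$product" ] || continue
        [ -e "$dev/power/control" ] || continue
        echo on > "$dev/power/control" && echo "autosuspend disabled: $dev"
    done
}
```

Run as root, `disable_usb_autosuspend 0bda 8153` would touch only the matching adapter. The setting does not survive a replug or reboot, which is why the thread uses a udev rule instead.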

BTW, if you need to reset your USB device (in particular the network), you may find this script I wrote helpful: https://askubuntu.com/questions/645/how-do-you-reset-a-usb-device-from-the-command-line/988297#988297

Any new info here? I still have this problem on Ubuntu 18.04 with Ubuntu kernel 5.3.0-26.

I found out that udev can trigger this bug (to reproduce it):

udevadm info --attribute-walk /sys/class/net/ethx

Testing on another device with a self-compiled 5.4.12 works, so it seems to be fixed in mainline. Any idea which commit fixed it?

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/usb/r8152.c?h=v5.4.15
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/net/core/rtnetlink.c?h=v5.4.15

do not seem to contain a fix for this, so I guess it's in another file...

The udev rule (disable pm and CPU turbo mode) does not help in my case.

This bug still exists in version 5.6.3 and is very annoying because it freezes the kernel.

I'm using Linux version 5.6.3-arch1-1 (linux@archlinux) (gcc version 9.3.0 (Arch Linux 9.3.0-1)) #1 SMP PREEMPT Wed, 08 Apr 2020 07:47:16 +000

Logs from journalctl:

avr 14 18:11:58 archfg kernel: INFO: task kworker/4:8:39290 blocked for more than 122 seconds.
avr 14 18:11:58 archfg kernel: Tainted: G U OE 5.6.3-arch1-1 #1
avr 14 18:11:58 archfg kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
avr 14 18:11:58 archfg kernel: kworker/4:8 D 0 39290 2 0x80004080
avr 14 18:11:58 archfg kernel: Workqueue: events rtl_work_func_t [r8152]
avr 14 18:11:58 archfg kernel: Call Trace:
avr 14 18:11:58 archfg kernel: ? __schedule+0x2e8/0x7a0
avr 14 18:11:58 archfg kernel: schedule+0x46/0xf0
avr 14 18:11:58 archfg kernel: rpm_resume+0x18b/0x790
avr 14 18:11:58 archfg kernel: ? wait_woken+0x70/0x70
avr 14 18:11:58 archfg kernel: rpm_resume+0x302/0x790
avr 14 18:11:58 archfg kernel: __pm_runtime_resume+0x3b/0x60
avr 14 18:11:58 archfg kernel: usb_autopm_get_interface+0x18/0x50
avr 14 18:11:58 archfg kernel: rtl_work_func_t+0x6b/0x2c0 [r8152]
avr 14 18:11:58 archfg kernel: ? __schedule+0x2f0/0x7a0
avr 14 18:11:58 archfg kernel: process_one_work+0x1da/0x3d0
avr 14 18:11:58 archfg kernel: worker_thread+0x4a/0x3d0
avr 14 18:11:58 archfg kernel: kthread+0xfb/0x130
avr 14 18:11:58 archfg kernel: ? process_one_work+0x3d0/0x3d0
avr 14 18:11:58 archfg kernel: ? kthread_park+0x90/0x90
avr 14 18:11:58 archfg kernel: ret_from_fork+0x35/0x40

The udev rule (disable pm) does not help in my case. Any idea how to bypass that issue?

BTW, it seems that this issue is closely related to the following: https://bugzilla.kernel.org/show_bug.cgi?id=198931

Reproduced on 5.7.12-arch1-1. The udev rule from above to turn off pm did not successfully work around this issue, and it happened after 12 hours of leaving my computer on my desk, idle.

Similar on the 5.8.9 (xanmod2) kernel on Ubuntu 19. My USB/Ethernet adapter is based on a Realtek RTL8153 (USB ID: 0bda:8153, identified as rtl8153a-3 v2 by the kernel). It worked fine with the Ubuntu stock kernel (i.e. 5.3.0) but crashes on 5.8 kernels. It seems to be related to power-saving features (as hunted here), since it only happens after leaving the computer idle for a while. In such a case, trying to unload the module leads to blocked tasks.

From the kernel, how my device is identified:

[ 14.602199] usbcore: registered new interface driver r8152
[ 14.763963] r8152 2-2:1.0: load rtl8153a-3 v2 02/07/20 successfully
[ 14.787971] r8152 2-2:1.0 eth0: v1.11.11
[ 14.789991] r8152 2-2:1.0 enxXXX: renamed from eth0
[ 18.386984] r8152 2-2:1.0 enxXXX: carrier on

After the USB port went into powersave mode and came back:

Sep 17 08:40:35 XXX kernel: [39162.976536] usb 2-2: USB disconnect, device number 2
Sep 17 08:40:35 XXX kernel: [39162.976710] r8152 2-2:1.0 enx00e07cc8a856: Stop submitting intr, status -108
Sep 17 08:40:35 XXX kernel: [39162.976725] xhci_hcd 0000:00:14.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.

As unplugging/replugging the adapter does not allow restarting the network, I tried unloading the module and got this:

Sep 17 08:43:19 XXX kernel: [39327.121200] INFO: task NetworkManager:1077 blocked for more than 122 seconds.
Sep 17 08:43:19 XXX kernel: [39327.121210] Tainted: G OE 5.8.9-xanmod2 #0~git20200916.c1b6438
Sep 17 08:43:19 XXX kernel: [39327.121214] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 17 08:43:19 XXX kernel: [39327.121219] NetworkManager D 0 1077 1 0x00000000
Sep 17 08:43:19 XXX kernel: [39327.121225] Call Trace:
Sep 17 08:43:19 XXX kernel: [39327.121241] __schedule+0x36d/0xbe0
Sep 17 08:43:19 XXX kernel: [39327.121252] schedule_preempt_disabled+0x6a/0xe0
Sep 17 08:43:19 XXX kernel: [39327.121260] __mutex_lock.constprop.0+0x16f/0x4f0
Sep 17 08:43:19 XXX kernel: [39327.121268] rtnetlink_rcv_msg+0xe3/0x380
Sep 17 08:43:19 XXX kernel: [39327.121274] ? rtnl_calcit.isra.0+0x120/0x120
Sep 17 08:43:19 XXX kernel: [39327.121281] netlink_rcv_skb+0x47/0x110
Sep 17 08:43:19 XXX kernel: [39327.121287] netlink_unicast+0x26b/0x400
Sep 17 08:43:19 XXX kernel: [39327.121294] netlink_sendmsg+0x243/0x480
Sep 17 08:43:19 XXX kernel: [39327.121301] sock_sendmsg+0x5e/0x60
Sep 17 08:43:19 XXX kernel: [39327.121306] ___sys_sendmsg+0x320/0x3e0
Sep 17 08:43:19 XXX kernel: [39327.121313] ? __check_object_size+0x46/0x147
Sep 17 08:43:19 XXX kernel: [39327.121320] ? import_iovec+0x37/0xe0
Sep 17 08:43:19 XXX kernel: [39327.121325] ? ___sys_recvmsg+0x12d/0x1d0
Sep 17 08:43:19 XXX kernel: [39327.121332] ? ep_poll+0x835/0x9e0
Sep 17 08:43:19 XXX kernel: [39327.121338] ? __wake_up_locked_key+0x48/0x80
Sep 17 08:43:19 XXX kernel: [39327.121345] __sys_sendmsg+0x98/0xf0
Sep 17 08:43:19 XXX kernel: [39327.121352] do_syscall_64+0x52/0xd0
Sep 17 08:43:19 XXX kernel: [39327.121357] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Sep 17 08:43:19 XXX kernel: [39327.121362] RIP: 0033:0x7fbe967d32ad
Sep 17 08:43:19 XXX kernel: [39327.121364] Code: Bad RIP value.
Sep 17 08:43:19 XXX kernel: [39327.121367] RSP: 002b:00007ffe0ca522e0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
Sep 17 08:43:19 XXX kernel: [39327.121371] RAX: ffffffffffffffda RBX: 000055cb5ab5d380 RCX: 00007fbe967d32ad
Sep 17 08:43:19 XXX kernel: [39327.121373] RDX: 0000000000000000 RSI: 00007ffe0ca52330 RDI: 000000000000000c
Sep 17 08:43:19 XXX kernel: [39327.121375] RBP: 00007ffe0ca52330 R08: 0000000000000000 R09: 0000000000000000
Sep 17 08:43:19 XXX kernel: [39327.121377] R10: 000055cb5ab30010 R11: 0000000000000293 R12: 000055cb5ab5d380
Sep 17 08:43:19 XXX kernel: [39327.121379] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

On the other hand, as I use TLP for power management, I could blacklist the USB ID so it does not get power savings. This seems to do the trick, so far no crashes...

Sorry, I've had another freeze, the blacklisting in TLP was not enough...

[21386.367851] INFO: task kworker/2:0:30730 blocked for more than 122 seconds.
[21386.367859] Tainted: G OE 5.8.9-xanmod2 #0~git20200916.c1b6438
[21386.367862] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[21386.367866] kworker/2:0 D 0 30730 2 0x00004000
[21386.367882] Workqueue: events rtl_work_func_t [r8152]
[21386.367884] Call Trace:
[21386.367896] __schedule+0x36d/0xbe0
[21386.367904] schedule+0x5f/0xd0
[21386.367911] rpm_resume+0x17b/0x960
[21386.367917] ? wait_woken+0x80/0x80
[21386.367923] rpm_resume+0x2f6/0x960
[21386.367929] ? newidle_balance+0xe4/0x3f0
[21386.367934] ? dequeue_entity+0xce/0x540
[21386.367940] __pm_runtime_resume+0x3b/0x60
[21386.367946] usb_autopm_get_interface+0x18/0x50
[21386.367954] rtl_work_func_t+0x69/0x2a0 [r8152]
[21386.367960] ? __schedule+0x375/0xbe0
[21386.367964] process_one_work+0x1da/0x3d0
[21386.367969] worker_thread+0x4d/0x460
[21386.367972] ? process_one_work+0x3d0/0x3d0
[21386.367977] kthread+0x17f/0x1b0
[21386.367982] ? __kthread_init_worker+0x50/0x50
[21386.367989] ret_from_fork+0x22/0x30

On kernel 5.9.0 (mainline), I am also experiencing these hangs, usually once a day. I have a Lenovo USB-C Dock connected to my T14s laptop. I am happy to provide more information if needed.

Oct 22 13:00:08 archlinux kernel: INFO: task kworker/6:2:9255 blocked for more than 122 seconds.
Oct 22 13:00:08 archlinux kernel: Tainted: G W 5.9.0-1-mainline #1
Oct 22 13:00:08 archlinux kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 13:00:08 archlinux kernel: task:kworker/6:2 state:D stack: 0 pid: 9255 ppid: 2 flags:0x00004080
Oct 22 13:00:08 archlinux kernel: Workqueue: events rtl_work_func_t [r8152]
Oct 22 13:00:08 archlinux kernel: Call Trace:
Oct 22 13:00:08 archlinux kernel: __schedule+0x292/0x830
Oct 22 13:00:08 archlinux kernel: schedule+0x46/0xf0
Oct 22 13:00:08 archlinux kernel: rpm_resume+0x189/0x820
Oct 22 13:00:08 archlinux kernel: ? wait_woken+0x80/0x80
Oct 22 13:00:08 archlinux kernel: rpm_resume+0x2fe/0x820
Oct 22 13:00:08 archlinux kernel: __pm_runtime_resume+0x3b/0x60
Oct 22 13:00:08 archlinux kernel: usb_autopm_get_interface+0x18/0x50
Oct 22 13:00:08 archlinux kernel: rtl8152_set_mac_address+0x61/0x1c0 [r8152]
Oct 22 13:00:08 archlinux kernel: set_ethernet_addr.isra.0+0x83/0x90 [r8152]
Oct 22 13:00:08 archlinux kernel: rtl8152_reset_resume+0x48/0x60 [r8152]
Oct 22 13:00:08 archlinux kernel: usb_resume_interface.part.0.isra.0+0x3a/0xb0
Oct 22 13:00:08 archlinux kernel: usb_resume_both+0x103/0x180
Oct 22 13:00:08 archlinux kernel: ? usb_runtime_suspend+0x70/0x70
Oct 22 13:00:08 archlinux kernel: __rpm_callback+0x7b/0x130
Oct 22 13:00:08 archlinux kernel: rpm_callback+0x4f/0x70
Oct 22 13:00:08 archlinux kernel: ? usb_runtime_suspend+0x70/0x70
Oct 22 13:00:08 archlinux kernel: rpm_resume+0x5d7/0x820
Oct 22 13:00:08 archlinux kernel: rpm_resume+0x2fe/0x820
Oct 22 13:00:08 archlinux kernel: __pm_runtime_resume+0x3b/0x60
Oct 22 13:00:08 archlinux kernel: usb_autopm_get_interface+0x18/0x50
Oct 22 13:00:08 archlinux kernel: rtl_work_func_t+0x69/0x2d0 [r8152]
Oct 22 13:00:08 archlinux kernel: ? __schedule+0x29a/0x830
Oct 22 13:00:08 archlinux kernel: process_one_work+0x1da/0x3d0
Oct 22 13:00:08 archlinux kernel: worker_thread+0x4d/0x3d0
Oct 22 13:00:08 archlinux kernel: ? rescuer_thread+0x410/0x410
Oct 22 13:00:08 archlinux kernel: kthread+0x142/0x160
Oct 22 13:00:08 archlinux kernel: ? __kthread_bind_mask+0x60/0x60
Oct 22 13:00:08 archlinux kernel: ret_from_fork+0x22/0x30
Oct 22 13:00:08 archlinux kernel: INFO: task kworker/6:1:17664 blocked for more than 122 seconds.
Oct 22 13:00:08 archlinux kernel: Tainted: G W 5.9.0-1-mainline #1
Oct 22 13:00:08 archlinux kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 13:00:08 archlinux kernel: task:kworker/6:1 state:D stack: 0 pid:17664 ppid: 2 flags:0x00004080
Oct 22 13:00:08 archlinux kernel: Workqueue: events_long rtl_hw_phy_work_func_t [r8152]
Oct 22 13:00:08 archlinux kernel: Call Trace:
Oct 22 13:00:08 archlinux kernel: __schedule+0x292/0x830
Oct 22 13:00:08 archlinux kernel: schedule+0x46/0xf0
Oct 22 13:00:08 archlinux kernel: rpm_resume+0x189/0x820
Oct 22 13:00:08 archlinux kernel: ? wait_woken+0x80/0x80
Oct 22 13:00:08 archlinux kernel: rpm_resume+0x2fe/0x820
Oct 22 13:00:08 archlinux kernel: __pm_runtime_resume+0x3b/0x60
Oct 22 13:00:08 archlinux kernel: usb_autopm_get_interface+0x18/0x50
Oct 22 13:00:08 archlinux kernel: rtl_hw_phy_work_func_t+0x5e/0x5b0 [r8152]
Oct 22 13:00:08 archlinux kernel: ? _raw_spin_unlock_irq+0x1d/0x30
Oct 22 13:00:08 archlinux kernel: ? finish_task_switch+0x80/0x270
Oct 22 13:00:08 archlinux kernel: ? __switch_to_asm+0x36/0x70
Oct 22 13:00:08 archlinux kernel: process_one_work+0x1da/0x3d0
Oct 22 13:00:08 archlinux kernel: worker_thread+0x4d/0x3d0
Oct 22 13:00:08 archlinux kernel: ? rescuer_thread+0x410/0x410
Oct 22 13:00:08 archlinux kernel: kthread+0x142/0x160
Oct 22 13:00:08 archlinux kernel: ? __kthread_bind_mask+0x60/0x60
Oct 22 13:00:08 archlinux kernel: ret_from_fork+0x22/0x30

I can confirm this on the 5.9 xanmod kernel and the 5.9 archlinux kernel. I have these freezes regularly, even leading to DE freezes and broken kworker tasks. Before I didn't experience them (post 5.9).

I also noticed it was power management, so I changed the udev rule as well, but it doesn't work:

ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="0bda", ATTR{idProduct}=="8153", TEST=="power/control", ATTR{power/control}="on"

(Good to see it's being suggested here as well, but unfortunately it doesn't fix this bug.)

It's so bad I ordered another Ethernet USB adapter and made sure it's not the same chip (rtl8152).

[10695.345916] INFO: task wpa_supplicant:1520 blocked for more than 122 seconds.
[10695.345917] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.345917] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [10695.345918] task:wpa_supplicant state:D stack: 0 pid: 1520 ppid: 1 flags:0x00000080 [10695.345920] Call Trace: [10695.345925] __schedule+0x421/0xbf0 [10695.345928] schedule_preempt_disabled+0x65/0xe0 [10695.345929] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.345946] nl80211_pre_doit+0xd3/0x1a0 [cfg80211] [10695.345949] genl_rcv_msg+0x17a/0x2f0 [10695.345951] ? genl_family_rcv_msg_attrs_parse.isra.0+0xd0/0xd0 [10695.345951] netlink_rcv_skb+0x42/0x100 [10695.345953] genl_rcv+0x1f/0x30 [10695.345954] netlink_unicast+0x260/0x400 [10695.345955] netlink_sendmsg+0x23d/0x470 [10695.345957] sock_sendmsg+0x59/0x60 [10695.345958] ___sys_sendmsg+0x330/0x400 [10695.345960] ? __check_object_size+0x42/0x13b [10695.345963] ? _copy_to_user+0x22/0x30 [10695.345964] ? unix_ioctl+0xee/0x220 [10695.345966] ? sock_do_ioctl+0x37/0x130 [10695.345968] ? __cgroup_bpf_run_filter_setsockopt+0xc5/0x2d0 [10695.345969] __sys_sendmsg+0x93/0xe0 [10695.345971] do_syscall_64+0x33/0x80 [10695.345973] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.345974] RIP: 0033:0x7fe2b243a7e7 [10695.345975] Code: Bad RIP value. [10695.345975] RSP: 002b:00007ffd1fb0b928 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [10695.345976] RAX: ffffffffffffffda RBX: 0000558a53a537c0 RCX: 00007fe2b243a7e7 [10695.345977] RDX: 0000000000000000 RSI: 00007ffd1fb0b960 RDI: 0000000000000006 [10695.345977] RBP: 0000558a53a536d0 R08: 0000000000000004 R09: 0000558a53a49010 [10695.345978] R10: 00007ffd1fb0ba34 R11: 0000000000000246 R12: 0000558a53a8dc10 [10695.345978] R13: 00007ffd1fb0b960 R14: 00007ffd1fb0ba34 R15: 0000558a53a90150 [10695.345993] INFO: task Qt bearer threa:3018 blocked for more than 122 seconds. [10695.345993] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.345993] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.345994] task:Qt bearer threa state:D stack: 0 pid: 3018 ppid: 1 flags:0x00000080 [10695.345994] Call Trace: [10695.345996] __schedule+0x421/0xbf0 [10695.345997] schedule_preempt_disabled+0x65/0xe0 [10695.345998] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.345999] ? __netlink_lookup+0xb3/0x110 [10695.346000] __netlink_dump_start+0xbe/0x2d0 [10695.346002] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346002] rtnetlink_rcv_msg+0x28f/0x390 [10695.346004] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346004] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346005] netlink_rcv_skb+0x42/0x100 [10695.346006] netlink_unicast+0x260/0x400 [10695.346007] netlink_sendmsg+0x23d/0x470 [10695.346008] sock_sendmsg+0x59/0x60 [10695.346009] __sys_sendto+0x14d/0x190 [10695.346011] ? kmem_cache_alloc+0x3ca/0x470 [10695.346013] ? __audit_syscall_exit+0x2e6/0x340 [10695.346014] __x64_sys_sendto+0x20/0x30 [10695.346015] do_syscall_64+0x33/0x80 [10695.346016] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346016] RIP: 0033:0x7f925ecf177c [10695.346017] Code: Bad RIP value. [10695.346017] RSP: 002b:00007f91d5f1bd60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346018] RAX: ffffffffffffffda RBX: 0000000000000032 RCX: 00007f925ecf177c [10695.346018] RDX: 0000000000000020 RSI: 00007f91d5f1be00 RDI: 0000000000000032 [10695.346018] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346019] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f91d5f1bec8 [10695.346019] R13: 000055b6ca681010 R14: 00007f91d013dbc8 R15: 0000000000000000 [10695.346059] INFO: task Qt bearer threa:2622 blocked for more than 122 seconds. [10695.346059] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346060] task:Qt bearer threa state:D stack: 0 pid: 2622 ppid: 2242 flags:0x00000080 [10695.346061] Call Trace: [10695.346062] __schedule+0x421/0xbf0 [10695.346063] schedule_preempt_disabled+0x65/0xe0 [10695.346064] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346065] ? __netlink_lookup+0xb3/0x110 [10695.346066] __netlink_dump_start+0xbe/0x2d0 [10695.346067] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346067] rtnetlink_rcv_msg+0x28f/0x390 [10695.346068] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346069] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346070] netlink_rcv_skb+0x42/0x100 [10695.346070] netlink_unicast+0x260/0x400 [10695.346071] netlink_sendmsg+0x23d/0x470 [10695.346072] sock_sendmsg+0x59/0x60 [10695.346073] __sys_sendto+0x14d/0x190 [10695.346074] ? kmem_cache_alloc+0x3ca/0x470 [10695.346076] ? __audit_syscall_exit+0x2e6/0x340 [10695.346077] __x64_sys_sendto+0x20/0x30 [10695.346077] do_syscall_64+0x33/0x80 [10695.346078] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346079] RIP: 0033:0x7f3e53c1c77c [10695.346079] Code: Bad RIP value. [10695.346079] RSP: 002b:00007f3e3fffdd60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346080] RAX: ffffffffffffffda RBX: 0000000000000012 RCX: 00007f3e53c1c77c [10695.346080] RDX: 0000000000000020 RSI: 00007f3e3fffde00 RDI: 0000000000000012 [10695.346081] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346081] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3e3fffdec8 [10695.346081] R13: 0000557f15b7b170 R14: 00007f3e38024598 R15: 0000000000000000 [10695.346083] INFO: task Qt bearer threa:2618 blocked for more than 122 seconds. [10695.346083] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346083] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346084] task:Qt bearer threa state:D stack: 0 pid: 2618 ppid: 2242 flags:0x00000080 [10695.346084] Call Trace: [10695.346086] __schedule+0x421/0xbf0 [10695.346087] schedule_preempt_disabled+0x65/0xe0 [10695.346088] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346089] ? __netlink_lookup+0xb3/0x110 [10695.346089] __netlink_dump_start+0xbe/0x2d0 [10695.346090] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346091] rtnetlink_rcv_msg+0x28f/0x390 [10695.346092] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346093] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346093] netlink_rcv_skb+0x42/0x100 [10695.346094] netlink_unicast+0x260/0x400 [10695.346095] netlink_sendmsg+0x23d/0x470 [10695.346096] sock_sendmsg+0x59/0x60 [10695.346097] __sys_sendto+0x14d/0x190 [10695.346098] ? kmem_cache_alloc+0x3ca/0x470 [10695.346099] ? __audit_syscall_exit+0x2e6/0x340 [10695.346100] __x64_sys_sendto+0x20/0x30 [10695.346101] do_syscall_64+0x33/0x80 [10695.346102] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346102] RIP: 0033:0x7f9a5860377c [10695.346102] Code: Bad RIP value. [10695.346103] RSP: 002b:00007f9a50a40d60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346103] RAX: ffffffffffffffda RBX: 0000000000000012 RCX: 00007f9a5860377c [10695.346104] RDX: 0000000000000020 RSI: 00007f9a50a40e00 RDI: 0000000000000012 [10695.346104] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346104] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9a50a40ec8 [10695.346105] R13: 000055c45d221f40 R14: 00007f9a3c0231f8 R15: 0000000000000000 [10695.346106] INFO: task Qt bearer threa:2599 blocked for more than 122 seconds. [10695.346107] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346107] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346107] task:Qt bearer threa state:D stack: 0 pid: 2599 ppid: 2242 flags:0x00000080 [10695.346108] Call Trace: [10695.346109] __schedule+0x421/0xbf0 [10695.346110] schedule_preempt_disabled+0x65/0xe0 [10695.346111] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346112] ? __netlink_lookup+0xb3/0x110 [10695.346113] __netlink_dump_start+0xbe/0x2d0 [10695.346114] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346114] rtnetlink_rcv_msg+0x28f/0x390 [10695.346115] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346116] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346117] netlink_rcv_skb+0x42/0x100 [10695.346117] netlink_unicast+0x260/0x400 [10695.346118] netlink_sendmsg+0x23d/0x470 [10695.346119] sock_sendmsg+0x59/0x60 [10695.346120] __sys_sendto+0x14d/0x190 [10695.346121] ? kmem_cache_alloc+0x3ca/0x470 [10695.346122] ? __audit_syscall_exit+0x2e6/0x340 [10695.346123] __x64_sys_sendto+0x20/0x30 [10695.346124] do_syscall_64+0x33/0x80 [10695.346125] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346125] RIP: 0033:0x7fb7ccd0477c [10695.346126] Code: Bad RIP value. [10695.346126] RSP: 002b:00007fb7c5141d60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346127] RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00007fb7ccd0477c [10695.346127] RDX: 0000000000000020 RSI: 00007fb7c5141e00 RDI: 000000000000000e [10695.346127] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346128] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb7c5141ec8 [10695.346128] R13: 0000560c32306460 R14: 00007fb7b0023728 R15: 0000000000000000 [10695.346131] INFO: task Qt bearer threa:2619 blocked for more than 122 seconds. [10695.346132] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346132] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346132] task:Qt bearer threa state:D stack: 0 pid: 2619 ppid: 2242 flags:0x00000080 [10695.346133] Call Trace: [10695.346134] __schedule+0x421/0xbf0 [10695.346135] schedule_preempt_disabled+0x65/0xe0 [10695.346136] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346137] ? __netlink_lookup+0xb3/0x110 [10695.346138] __netlink_dump_start+0xbe/0x2d0 [10695.346138] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346139] rtnetlink_rcv_msg+0x28f/0x390 [10695.346140] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346141] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346141] netlink_rcv_skb+0x42/0x100 [10695.346142] netlink_unicast+0x260/0x400 [10695.346143] netlink_sendmsg+0x23d/0x470 [10695.346144] sock_sendmsg+0x59/0x60 [10695.346145] __sys_sendto+0x14d/0x190 [10695.346146] ? kmem_cache_alloc+0x3ca/0x470 [10695.346147] ? __audit_syscall_exit+0x2e6/0x340 [10695.346148] __x64_sys_sendto+0x20/0x30 [10695.346149] do_syscall_64+0x33/0x80 [10695.346150] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346150] RIP: 0033:0x7f205d33d77c [10695.346150] Code: Bad RIP value. [10695.346151] RSP: 002b:00007f2051e6ad60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346151] RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00007f205d33d77c [10695.346152] RDX: 0000000000000020 RSI: 00007f2051e6ae00 RDI: 000000000000000e [10695.346152] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346152] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2051e6aec8 [10695.346153] R13: 00005585c2703570 R14: 00007f2048023238 R15: 0000000000000000 [10695.346155] INFO: task Qt bearer threa:2689 blocked for more than 122 seconds. [10695.346156] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346156] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346156] task:Qt bearer threa state:D stack: 0 pid: 2689 ppid: 2242 flags:0x00000080 [10695.346157] Call Trace: [10695.346158] __schedule+0x421/0xbf0 [10695.346159] schedule_preempt_disabled+0x65/0xe0 [10695.346160] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346161] ? __netlink_lookup+0xb3/0x110 [10695.346161] __netlink_dump_start+0xbe/0x2d0 [10695.346162] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346163] rtnetlink_rcv_msg+0x28f/0x390 [10695.346164] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346165] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346165] netlink_rcv_skb+0x42/0x100 [10695.346166] netlink_unicast+0x260/0x400 [10695.346167] netlink_sendmsg+0x23d/0x470 [10695.346168] sock_sendmsg+0x59/0x60 [10695.346169] __sys_sendto+0x14d/0x190 [10695.346169] ? kmem_cache_alloc+0x3ca/0x470 [10695.346171] ? __audit_syscall_exit+0x2e6/0x340 [10695.346172] __x64_sys_sendto+0x20/0x30 [10695.346172] do_syscall_64+0x33/0x80 [10695.346173] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346174] RIP: 0033:0x7fd5f37e577c [10695.346174] Code: Bad RIP value. [10695.346174] RSP: 002b:00007fd5d8a588e0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346175] RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fd5f37e577c [10695.346175] RDX: 0000000000000020 RSI: 00007fd5d8a58980 RDI: 000000000000001c [10695.346175] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346176] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd5d8a58a48 [10695.346176] R13: 000055f1843c4cb0 R14: 00007fd5d0022e88 R15: 0000000000000000 [10695.346178] INFO: task Qt bearer threa:2707 blocked for more than 122 seconds. [10695.346178] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346178] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346178] task:Qt bearer threa state:D stack: 0 pid: 2707 ppid: 2242 flags:0x00000080 [10695.346179] Call Trace: [10695.346180] __schedule+0x421/0xbf0 [10695.346181] schedule_preempt_disabled+0x65/0xe0 [10695.346182] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346183] ? __netlink_lookup+0xb3/0x110 [10695.346184] __netlink_dump_start+0xbe/0x2d0 [10695.346185] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346185] rtnetlink_rcv_msg+0x28f/0x390 [10695.346186] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346187] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346188] netlink_rcv_skb+0x42/0x100 [10695.346188] netlink_unicast+0x260/0x400 [10695.346189] netlink_sendmsg+0x23d/0x470 [10695.346190] sock_sendmsg+0x59/0x60 [10695.346194] __sys_sendto+0x14d/0x190 [10695.346194] ? kmem_cache_alloc+0x3ca/0x470 [10695.346196] ? __audit_syscall_exit+0x2e6/0x340 [10695.346197] __x64_sys_sendto+0x20/0x30 [10695.346197] do_syscall_64+0x33/0x80 [10695.346198] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346199] RIP: 0033:0x7fb71194c77c [10695.346199] Code: Bad RIP value. [10695.346199] RSP: 002b:00007fb6f65eb8e0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346200] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 00007fb71194c77c [10695.346200] RDX: 0000000000000020 RSI: 00007fb6f65eb980 RDI: 000000000000001e [10695.346200] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346201] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb6f65eba48 [10695.346201] R13: 00005569f0cee640 R14: 00007fb6ec022bd8 R15: 0000000000000000 [10695.346204] INFO: task Qt bearer threa:2624 blocked for more than 122 seconds. [10695.346204] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346204] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346205] task:Qt bearer threa state:D stack: 0 pid: 2624 ppid: 2242 flags:0x00000080 [10695.346205] Call Trace: [10695.346207] __schedule+0x421/0xbf0 [10695.346208] schedule_preempt_disabled+0x65/0xe0 [10695.346209] __mutex_lock.constprop.0+0x16a/0x4e0 [10695.346209] ? __netlink_lookup+0xb3/0x110 [10695.346210] __netlink_dump_start+0xbe/0x2d0 [10695.346211] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346212] rtnetlink_rcv_msg+0x28f/0x390 [10695.346213] ? rtnl_fill_ifinfo+0x1250/0x1250 [10695.346214] ? rtnl_calcit.isra.0+0x120/0x120 [10695.346214] netlink_rcv_skb+0x42/0x100 [10695.346215] netlink_unicast+0x260/0x400 [10695.346216] netlink_sendmsg+0x23d/0x470 [10695.346217] sock_sendmsg+0x59/0x60 [10695.346218] __sys_sendto+0x14d/0x190 [10695.346218] ? kmem_cache_alloc+0x3ca/0x470 [10695.346220] ? __audit_syscall_exit+0x2e6/0x340 [10695.346221] __x64_sys_sendto+0x20/0x30 [10695.346222] do_syscall_64+0x33/0x80 [10695.346223] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [10695.346223] RIP: 0033:0x7ff007e7077c [10695.346223] Code: Bad RIP value. [10695.346224] RSP: 002b:00007feff3ffdd60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [10695.346224] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007ff007e7077c [10695.346225] RDX: 0000000000000020 RSI: 00007feff3ffde00 RDI: 0000000000000010 [10695.346225] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [10695.346225] R10: 0000000000000000 R11: 0000000000000246 R12: 00007feff3ffdec8 [10695.346226] R13: 0000558857abbde0 R14: 00007fefec023d28 R15: 0000000000000000 [10695.346228] INFO: task Qt bearer threa:2613 blocked for more than 122 seconds. [10695.346228] Tainted: P OE 5.9.1-xanmod1-1 #1 [10695.346229] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[10695.346229] task:Qt bearer threa state:D stack: 0 pid: 2613 ppid: 2242 flags:0x00000080
[10695.346229] Call Trace:
[10695.346231] __schedule+0x421/0xbf0
[10695.346232] schedule_preempt_disabled+0x65/0xe0
[10695.346233] __mutex_lock.constprop.0+0x16a/0x4e0
[10695.346234] ? __netlink_lookup+0xb3/0x110
[10695.346234] __netlink_dump_start+0xbe/0x2d0
[10695.346235] ? rtnl_fill_ifinfo+0x1250/0x1250
[10695.346236] rtnetlink_rcv_msg+0x28f/0x390
[10695.346237] ? rtnl_fill_ifinfo+0x1250/0x1250
[10695.346237] ? rtnl_calcit.isra.0+0x120/0x120
[10695.346238] netlink_rcv_skb+0x42/0x100
[10695.346239] netlink_unicast+0x260/0x400
[10695.346240] netlink_sendmsg+0x23d/0x470
[10695.346241] sock_sendmsg+0x59/0x60
[10695.346242] __sys_sendto+0x14d/0x190
[10695.346243] ? kmem_cache_alloc+0x3ca/0x470
[10695.346244] ? __audit_syscall_exit+0x2e6/0x340
[10695.346245] __x64_sys_sendto+0x20/0x30
[10695.346246] do_syscall_64+0x33/0x80
[10695.346247] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[10695.346247] RIP: 0033:0x7efc7386977c
[10695.346247] Code: Bad RIP value.
[10695.346248] RSP: 002b:00007efc5f7fcd60 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[10695.346248] RAX: ffffffffffffffda RBX: 0000000000000011 RCX: 00007efc7386977c
[10695.346249] RDX: 0000000000000020 RSI: 00007efc5f7fce00 RDI: 0000000000000011
[10695.346249] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[10695.346249] R10: 0000000000000000 R11: 0000000000000246 R12: 00007efc5f7fcec8
[10695.346250] R13: 00005639fd9561a0 R14: 00007efc50023a18 R15: 0000000000000000

Kernel.org Bugzilla – Bug 200977 Daily crash with r8152 driver requiring reboot Last modified: 2020-11-03 02:21:03 UTC

FWIW, not all devices using the r8152 driver seem affected by the crash. I was facing the crash on ArchLinux with a TP-Link UE330 adapter (which is also a 3-port USB hub) for several weeks, then switched to the Dell USB 3.0 adapter that was provided with my Dell XPS 13, and the issues are gone.
Here are a few more details about the two adapters: TP-Link UE330 (driver crashes daily) ==================================== Manufacturer link: https://www.tp-link.com/us/home-networking/usb-adapter/ue330/ lsusb output: Bus 002 Device 011: ID 2357:0601 TP-Link UE300 10/100/1000 LAN (ethernet mode) [Realtek RTL8153] dmesg output on connection: [67391.806666] usb 1-2: new high-speed USB device number 9 using xhci_hcd [67391.937643] usb 1-2: New USB device found, idVendor=2109, idProduct=2812, bcdDevice=90.90 [67391.937649] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [67391.937653] usb 1-2: Product: USB2.0 Hub [67391.937657] usb 1-2: Manufacturer: VIA Labs, Inc. [67391.938816] hub 1-2:1.0: USB hub found [67391.939129] hub 1-2:1.0: 4 ports detected [67392.199873] usb 2-2: new SuperSpeed Gen 1 USB device number 10 using xhci_hcd [67392.451388] usb 2-2: New USB device found, idVendor=2109, idProduct=0812, bcdDevice=90.91 [67392.451390] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [67392.451391] usb 2-2: Product: USB3.0 Hub [67392.451392] usb 2-2: Manufacturer: VIA Labs, Inc. 
[67392.453648] hub 2-2:1.0: USB hub found [67392.453888] hub 2-2:1.0: 4 ports detected [67393.182805] usb 2-2.3: new SuperSpeed Gen 1 USB device number 11 using xhci_hcd [67393.195489] usb 2-2.3: New USB device found, idVendor=2357, idProduct=0601, bcdDevice=30.00 [67393.195494] usb 2-2.3: New USB device strings: Mfr=1, Product=2, SerialNumber=6 [67393.195497] usb 2-2.3: Product: USB 10/100/1000 LAN [67393.195500] usb 2-2.3: Manufacturer: TP-LINK [67393.195503] usb 2-2.3: SerialNumber: 000001000000 [67393.268035] usb 2-2.3: reset SuperSpeed Gen 1 USB device number 11 using xhci_hcd [67393.305565] r8152 2-2.3:1.0: load rtl8153a-3 v2 02/07/20 successfully [67393.333721] r8152 2-2.3:1.0 eth0: v1.11.11 [67394.095205] r8152 2-2.3:1.0 enp0s20f0u2u3: renamed from eth0 [67396.402944] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s20f0u2u3: link becomes ready [67396.403218] r8152 2-2.3:1.0 enp0s20f0u2u3: carrier on Dell (driver works flawlessly) ============================== Manufacturer link: https://www.dell.com/en-us/shop/accessories/apd/443-BBBD lsusb output: Bus 002 Device 002: ID 0bda:8153 Realtek Semiconductor Corp. 
RTL8153 Gigabit Ethernet Adapter dmesg output on connection: [67411.254489] usb 2-2: new SuperSpeed Gen 1 USB device number 12 using xhci_hcd [67411.266955] usb 2-2: New USB device found, idVendor=0bda, idProduct=8153, bcdDevice=30.00 [67411.266957] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=6 [67411.266959] usb 2-2: Product: USB 10/100/1000 LAN [67411.266961] usb 2-2: Manufacturer: Realtek [67411.266962] usb 2-2: SerialNumber: 000001000000 [67411.384731] usb 2-2: reset SuperSpeed Gen 1 USB device number 12 using xhci_hcd [67411.419736] r8152 2-2:1.0: load rtl8153a-3 v2 02/07/20 successfully [67411.448526] r8152 2-2:1.0 eth0: v1.11.11 [67412.205872] r8152 2-2:1.0 enp0s20f0u2: renamed from eth0 [67414.435174] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s20f0u2: link becomes ready [67414.438885] r8152 2-2:1.0 enp0s20f0u2: carrier on

This issue appeared to go away for me after I manually compiled the latest Realtek drivers against kernel 4.19.155.

Nvm, still doing it. Maybe something interesting here is that, of the three identical adapters, it's always the same single one that does this. The rest seem to keep working fine.

What is obvious from Patrick Decat's contribution is that the difference between the working adapter and the problematic one is the USB hub in the problematic one. I have two different r8152 adapters, each with a 3-port USB hub, and face the problem with both of them:
- Uni (UNIEHUB02)
- Techole (24041)

Same problem here with 5.9.8. Note that in my case, the following two messages always appear before the crash: [240360.160631] usb 4-2.1.2: reset SuperSpeed Gen 1 USB device number 7 using xhci_hcd [240360.190095] r8152 4-2.1.2:1.0 enx3ce1a1c0a2a8: Invalid header when reading pass-thru MAC addr [240576.999077] INFO: task NetworkManager:1417 blocked for more than 120 seconds. [240576.999085] Tainted: P S OE 5.9.8-050908-generic #202011101634 [240576.999088] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[240576.999092] task:NetworkManager state:D stack: 0 pid: 1417 ppid: 1 flags:0x00000000 [240576.999097] Call Trace: [240576.999111] __schedule+0x1fe/0x5e0 [240576.999118] schedule+0x55/0xc0 [240576.999124] schedule_preempt_disabled+0xe/0x10 [240576.999128] __mutex_lock.constprop.0+0x14a/0x490 [240576.999135] ? kmalloc_large_node+0x83/0x90 [240576.999139] __mutex_lock_slowpath+0x13/0x20 [240576.999142] mutex_lock+0x34/0x40 [240576.999148] rtnl_lock+0x15/0x20

I have the same problem on Ubuntu 20.04 with 5.4.0-54-generic #60-Ubuntu SMP Fri Nov 6 10:37:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux; the machine is a Dell XPS as well. It starts manifesting when I connect a Lenovo-made USB-C "Travel Hub" (LO1UD014-CS-R) that has a USB, HDMI and Ethernet port, and which I guess runs on the r8152. Incidentally, it seems not to crash if I just plug in the hub (r8152 module is loaded) but do not connect it to the network, or at least it crashes much less frequently.

I followed the lead that this crash always happens after the USB device gets reset, so I tried setting the 'avoid_reset_quirk' attribute, which in my case goes like:

echo "1" > /sys/bus/usb/devices/4-2.1.2/avoid_reset_quirk

Instead of only resetting, it now disconnects and reconnects the device (probably when coming out of suspend, but I haven't verified that). So far this has happened twice, with no crashes.
dmesg now shows: [85067.650516] usb 4-2.1.2: USB disconnect, device number 7 [85067.796406] usb 4-2.1.2: new SuperSpeed Gen 1 USB device number 8 using xhci_hcd [85067.817097] usb 4-2.1.2: New USB device found, idVendor=17ef, idProduct=3082, bcdDevice=31.01 [85067.817098] usb 4-2.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=6 [85067.817099] usb 4-2.1.2: Product: ThinkPad TBT 3 Dock [85067.817099] usb 4-2.1.2: Manufacturer: Realtek [85067.817100] usb 4-2.1.2: SerialNumber: 101C0A2A8 [85067.996852] usb 4-2.1.2: reset SuperSpeed Gen 1 USB device number 8 using xhci_hcd [85068.032488] r8152 4-2.1.2:1.0 (unnamed net_device) (uninitialized): Invalid header when reading pass-thru MAC addr [85068.049003] r8152 4-2.1.2:1.0: load rtl8153b-2 v1 10/23/19 successfully [85068.081728] r8152 4-2.1.2:1.0 eth0: v1.11.11 [85068.705681] r8152 4-2.1.2:1.0 enx3ce1a1c0a2a8: renamed from eth0

This is hardly ideal, but at least now I don't need to start my day by rebooting the system. I don't know if this can even be called a workaround.

All my issues with the USB adapters have disappeared since I started using these kernel parameters:

usbcore.old_scheme_first=1 usbcore.use_both_schemes=0 usbcore.autosuspend=-1 pci_aspm=off

Not sure which of them, or which combination of them, actually does it. Been running for a week 24/7 with them now.

I can confirm that I've been running for a few days now with no crashes (after previously seeing crashes within 2-4 hours of working), with the following kernel parameters based on @Emtee's suggestion:

usbcore.autosuspend=-1 pci_aspm=off

So that narrows down the list, at least for my case. Next week I will try to remove one of them and see what happens.
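A minimal sketch for confirming that the two parameters reported above are really present on the running kernel's command line, since a bootloader edit that was never applied is easy to miss. The helper name `has_param` is made up for this example; `/proc/cmdline` is the standard place to read the booted command line.

```shell
#!/bin/sh
# has_param CMDLINE PARAM
# Prints "present" if PARAM appears as a whole word in CMDLINE, else "missing".
# (has_param is a hypothetical helper name, not part of any tool.)
has_param() {
    case " $1 " in
        *" $2 "*) echo present ;;
        *) echo missing; return 1 ;;
    esac
}

# Read the command line the running kernel was booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

# Report on the two parameters suggested in the comments above.
for p in usbcore.autosuspend=-1 pci_aspm=off; do
    printf '%s: %s\n' "$p" "$(has_param "$cmdline" "$p" || true)"
done
```

How the parameters get onto the command line depends on the distribution's bootloader setup, so that step is left out here.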
I get this with WPA3-Personal and NETGEAR A6210, the only usb device that can do SAE I had a usb hub with 10 Gb/s usb-a, but removed that and switched computers, with vanilla usb3: still had the problem — appears on 5 GHz 40 MHz carrier (faster) — does not appear on 2.4 GHz 802.11g (slower) so it is a timing problem. my trace has similar locations but is not identical when it happens, the computer has to be shut off. It is not recoverable processes can not be killed with -9, but if you’re in tmux, you can start a new shell Dec 11 00:21:09 c34x kernel: [ 3988.736261] Call Trace: Dec 11 00:21:09 c34x kernel: [ 3988.736263] __schedule+0x2e3/0x740 Dec 11 00:21:09 c34x kernel: [ 3988.736265] schedule+0x42/0xb0 Dec 11 00:21:09 c34x kernel: [ 3988.736266] schedule_preempt_disabled+0xe/0x10 Dec 11 00:21:09 c34x kernel: [ 3988.736268] __mutex_lock.isra.0+0x182/0x4f0 Dec 11 00:21:09 c34x kernel: [ 3988.736270] ? poll_freewait+0x8a/0xa0 Dec 11 00:21:09 c34x kernel: [ 3988.736271] ? do_select+0x6ee/0x770 Dec 11 00:21:09 c34x kernel: [ 3988.736274] __mutex_lock_slowpath+0x13/0x20 Dec 11 00:21:09 c34x kernel: [ 3988.736275] mutex_lock+0x2e/0x40 Dec 11 00:21:09 c34x kernel: [ 3988.736277] rtnl_lock+0x15/0x20 Dec 11 00:21:09 c34x kernel: [ 3988.736297] nl80211_pre_doit+0x104/0x1b0 [cfg80211] Dec 11 00:21:09 c34x kernel: [ 3988.736300] genl_family_rcv_msg+0x1a5/0x470 Dec 11 00:21:09 c34x kernel: [ 3988.736302] genl_rcv_msg+0x4c/0xa0 Dec 11 00:21:09 c34x kernel: [ 3988.736303] ? _cond_resched+0x19/0x30 Dec 11 00:21:09 c34x kernel: [ 3988.736304] ? 
genl_family_rcv_msg+0x470/0x470 Dec 11 00:21:09 c34x kernel: [ 3988.736305] netlink_rcv_skb+0x50/0x120 Dec 11 00:21:09 c34x kernel: [ 3988.736307] genl_rcv+0x29/0x40 Dec 11 00:21:09 c34x kernel: [ 3988.736308] netlink_unicast+0x187/0x220 Dec 11 00:21:09 c34x kernel: [ 3988.736309] netlink_sendmsg+0x222/0x3e0 Dec 11 00:21:09 c34x kernel: [ 3988.736312] sock_sendmsg+0x65/0x70 Dec 11 00:21:09 c34x kernel: [ 3988.736313] ____sys_sendmsg+0x212/0x280 Dec 11 00:21:09 c34x kernel: [ 3988.736315] ___sys_sendmsg+0x88/0xd0 Dec 11 00:21:09 c34x kernel: [ 3988.736316] ? ___sys_recvmsg+0x88/0xc0 Dec 11 00:21:09 c34x kernel: [ 3988.736318] ? __sys_recvfrom+0x10c/0x1d0 Dec 11 00:21:09 c34x kernel: [ 3988.736321] ? __cgroup_bpf_run_filter_setsockopt+0xae/0x2d0 Dec 11 00:21:09 c34x kernel: [ 3988.736322] ? _cond_resched+0x19/0x30 Dec 11 00:21:09 c34x kernel: [ 3988.736324] ? aa_sk_perm+0x43/0x170 Dec 11 00:21:09 c34x kernel: [ 3988.736326] __sys_sendmsg+0x5c/0xa0 Dec 11 00:21:09 c34x kernel: [ 3988.736328] __x64_sys_sendmsg+0x1f/0x30 Dec 11 00:21:09 c34x kernel: [ 3988.736331] do_syscall_64+0x57/0x190 Dec 11 00:21:09 c34x kernel: [ 3988.736332] entry_SYSCALL_64_after_hwframe+0x44/0xa9 I get 5 each time, both supplicant and hostapd are -9 survivors when it happens, ip a hangs The system cannot shut down so long power button press It is not specific to a certain network adapter There does not have to be a usb-hub it is something more generic like the supplicant or usb3 communications as usual, faster is more likely to crash It so happens that the first, newer more sophisticated computer, the crash magnet, had a 0bda:8153 Realtek Semiconductor Corp. 
RTL8153 Gigabit Ethernet Adapter operating at 5 Gb/s usb side and 100 Mb/s Ethernet side on a 10 Gb/s host port with adapter plugged into a 10 Gb/s hub port.
In the preceding days it was also successfully operated on the same host port via a 480 Mb/s hub.
It seems that if the Realtek is used at 100 Mb/s rather than 1 Gb/s Ethernet side, it does not crash.
Further research has found that whenever ieee80211n is enabled it crashes.
That means no ieee80211ac or ieee80211ax, which require that feature: https://w1.fi/cgit/hostap/plain/hostapd/hostapd.conf
But with the 54 Mb/s 2.4 GHz teenager from 2003, it works.

The canary is:

tail --follow=name --retry /var/log/syslog | grep "Call Trace"

Whenever that command outputs something, long-press on power button is the answer:
- network devices are hogged by processes that can’t be killed
- network stack operations will hang
- no new ssh connections can be established

Save yourself some time and hit the power button right away.
Figure out what it is you cannot do, and don’t do that again.

Hi, FWIW, yesterday I had another variant of the r8152 crash, for the first time since early November, with my Dell XPS 9350 after having switched from a TP-Link UE330 to a Dell USB 3.0 Ethernet adapter (see #c19): https://bugzilla.kernel.org/show_bug.cgi?id=198931#c59

I was experiencing this on Manjaro with kernels 5.10.2 and 5.9.11, using an RTL8153 10/100/1000 Ethernet adapter built into a USB-C connected Philips 272B7QUPBEB 27" QHD LCD monitor. The workaround was indeed to set the usbcore.autosuspend=-1 pci_aspm=off kernel params. I reported this to the Linux USB mailing list at https://marc.info/?l=linux-usb&m=160995334424921&w=2 and have been asked to provide kernel logs with dynamic debugging enabled for the module usbcore. (Un)fortunately I've not been able to reproduce the issue again, even with those kernel params removed.
If someone following this bug is still seeing this problem, perhaps you can reply to the thread at https://marc.info/?l=linux-usb&m=161001241812767&w=2 with those logs? There are instructions at https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html. Assuming your kernel has CONFIG_DYNAMIC_DEBUG set (check whether debugfs is mounted, usually at /sys/kernel/debug), I got the extra logging enabled with the command:

sudo sh -c "echo 'module usbcore +p' > /sys/kernel/debug/dynamic_debug/control"

(In reply to Shahar Evron from comment #28)
> I can confirm that I've been running for a few days now with no crashes
> (after previously seeing crashes within 2-4 hours of working), with the
> following kernel parameters based on @Emtee's suggestion:
>
> usbcore.autosuspend=-1 pci_aspm=off
>
> So that narrows down the list, at least for my case. Next week I will try to
> remove one of them and see what happens.

I've so far had stability with a much smaller hammer:

# /etc/udev/rules.d/10-wd15-docking-station-debug.rules
ACTION=="add", SUBSYSTEM=="usb", ATTRS{idVendor}=="0bda", ATTR{idProduct}=="8153", ATTR{product}=="USB 10/100/1000 LAN", TEST=="power/control", ATTR{power/control}:="on"

22 days uptime so far.

I think I have the same issue. It started to happen some weeks ago. I will post a log when it happens next time.

(In reply to Russ Dill from comment #35)
> I've so far had stability with a much smaller hammer:
>
> # /etc/udev/rules.d/10-wd15-docking-station-debug.rules
> ACTION=="add", SUBSYSTEM=="usb", ATTRS{idVendor}=="0bda",
> ATTR{idProduct}=="8153", ATTR{product}=="USB 10/100/1000 LAN",
> TEST=="power/control", ATTR{power/control}:="on"
>
> 22 days uptime so far

Thank you, Russ.
I came across this issue as well with a ThinkPad T14 AMD and

Bus 005 Device 004: ID 17ef:a387 Lenovo USB-C Dock Ethernet

I am now trying:

% /etc/udev/rules.d# cat 10-fix-r8152-crashes.rules
# https://bugzilla.kernel.org/show_bug.cgi?id=200977#c35
ACTION=="add", SUBSYSTEM=="usb", ATTRS{idVendor}=="17ef", ATTR{idProduct}=="a387", ATTR{product}=="USB 10/100/1000 LAN", TEST=="power/control", ATTR{power/control}:="on"

Hopefully it will help.

Maybe this one is better:

ACTION=="add", SUBSYSTEM=="usb", ATTRS{idVendor}=="17ef", ATTR{idProduct}=="a387", ATTR{product}=="Lenovo USB-C Dock Ethernet", TEST=="power/control", ATTR{power/control}:="on"

Dunno whether the product name is actually required.

This issue appears to be discussed on Lenovo forums as well: https://forums.lenovo.com/t5/Other-Linux-Discussions/ThinkPad-USB-C-Dock-Gen-2-kernel-driver-r8152-system-becomes-unresponsive/m-p/5080714

I had the same problem with https://www.itpro.com/software/video-conferencing/357295/starleaf-huddle-review-a-star-performer and at first I thought it was related to the HDMI port on the hub, which was not supported by my USB-C port: https://forums.linuxmint.com/viewtopic.php?t=349374 Unfortunately it seems to be the r8152 driver. I have compiled this driver, https://github.com/wget/realtek-r8152-linux, but the problem seems to persist. Any ideas as to what I can try next? Thanks!

The work-around I tried previously did not work. It still crashed… or… well, in my case it might have been two problems in one. After resume from hibernation on my ThinkPad T14 AMD Gen 1, the kernel is often enough in a state where there seems to be some corruption of kernel memory, so I am not completely sure what comes from what. However, my current work-around is: I do not use the Lenovo USB-C dock with that ethernet chip.

Maybe this can help: I ran into the same / very similar problem and had posted this new issue.
There it states "This has been fixed in linux 5.13.9.": https://bugzilla.kernel.org/show_bug.cgi?id=205155#c28 I cannot test this anymore though.

I am now running kernel 5.14 and the issue seems to be gone. Thanks for the help! My distro is based on Ubuntu Focal (20.04) and there's a problem with libc if you want the latest kernel. The problem can be overcome if you add this lovely ppa, https://launchpad.net/~tuxinvader/+archive/ubuntu/lts-mainline, and install the meta package linux-generic-5.14.

Kernel: 5.14.10-1-MANJARO
Issue still there.
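For anyone comparing notes, a small sketch for checking whether the running kernel is at least 5.13.9, the version named in the linked comment on bug 205155. The function name `version_at_least` is invented for this example, and it relies on `sort -V` (version sort) being available, as it is in GNU coreutils. Given the 5.14.10 report just above, a newer version should be read as "has whatever landed in 5.13.9", not as proof the hang is gone.

```shell
#!/bin/sh
# version_at_least CURRENT MINIMUM
# Succeeds (exit 0) when CURRENT is the same as or newer than MINIMUM.
# (version_at_least is a hypothetical helper name.)
version_at_least() {
    # Drop any distro suffix such as "-1-MANJARO" or "-33-generic".
    current=${1%%-*}
    # The lower of the two versions sorts first under version sort.
    lowest=$(printf '%s\n%s\n' "$current" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

if version_at_least "$(uname -r)" 5.13.9; then
    echo "kernel is 5.13.9 or newer"
else
    echo "kernel is older than 5.13.9"
fi
```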