Bug 205155 - [r8152 !?] network parts of kernel hang completely (mutex lockup)
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 high
Assignee: drivers_network@kernel-bugs.osdl.org
Duplicates: 205557

Reported: 2019-10-10 15:54 UTC by Tormen
Modified: 2022-02-02 16:44 UTC

Kernel Version: 5.3.7
Regression: No


Attachments
2019-09-14 journalctl.log (258.35 KB, text/plain)
2019-10-10 15:57 UTC, Tormen
2019-10-10 journalctl -b -1 (after crash) shortened, look for <REMOVED LINES> (1.21 MB, text/plain)
2019-10-10 18:06 UTC, Tormen
2019-10-10 2nd journalctl -b log showing kernel hang but at different WD15 (901.02 KB, text/plain)
2019-10-10 20:47 UTC, Tormen
2019-09-16 journalctl log -- another case I had recorded (1.42 MB, text/plain)
2019-10-10 20:48 UTC, Tormen
2019-10-11 journalctl log (516.62 KB, text/plain)
2019-10-11 08:43 UTC, Tormen
Screenshot of system shutdown (142.88 KB, image/jpeg)
2019-10-13 15:42 UTC, Tormen
journalctl of problem even with supposed fix of linked issue (1.87 MB, text/plain)
2019-11-10 11:30 UTC, Tormen

Description Tormen 2019-10-10 15:54:02 UTC
For a few weeks now, across several kernel versions, I have regularly experienced a hang of all network-related parts of the kernel, which blocks every GUI process that needs network access.

The only way out is a reboot, and even that does not work without the help of SysRq.

The problem only occurs when using one of several Dell USB-C docking stations with a built-in Realtek network card, which uses the r8152 driver.

From what I can tell, this seems to be the problem:

Sep 14 10:50:20 huit kernel: INFO: task kworker/8:1:245 blocked for more than 122 seconds.
Sep 14 10:50:20 huit kernel:       Tainted: G           OE     5.2.14-arch2-1-ARCH #1
Sep 14 10:50:20 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 14 10:50:20 huit kernel: kworker/8:1     D    0   245      2 0x80004000
Sep 14 10:50:20 huit kernel: Workqueue: events rtl_work_func_t [r8152]
Sep 14 10:50:20 huit kernel: Call Trace:
Sep 14 10:50:20 huit kernel:  ? __schedule+0x27f/0x6d0
Sep 14 10:50:20 huit kernel:  schedule+0x43/0xd0
Sep 14 10:50:20 huit kernel:  rpm_resume+0x18b/0x790
Sep 14 10:50:20 huit kernel:  ? wait_woken+0x70/0x70
Sep 14 10:50:20 huit kernel:  rpm_resume+0x302/0x790
Sep 14 10:50:20 huit kernel:  __pm_runtime_resume+0x3b/0x60
Sep 14 10:50:20 huit kernel:  usb_autopm_get_interface+0x18/0x50
Sep 14 10:50:20 huit kernel:  rtl_work_func_t+0x6b/0x290 [r8152]
Sep 14 10:50:20 huit kernel:  ? __schedule+0x287/0x6d0
Sep 14 10:50:20 huit kernel:  process_one_work+0x1d1/0x3e0
Sep 14 10:50:20 huit kernel:  worker_thread+0x4a/0x3d0
Sep 14 10:50:20 huit kernel:  kthread+0xfb/0x130
Sep 14 10:50:20 huit kernel:  ? process_one_work+0x3e0/0x3e0
Sep 14 10:50:20 huit kernel:  ? kthread_park+0x80/0x80
Sep 14 10:50:20 huit kernel:  ret_from_fork+0x35/0x40


Please advise if I can collect further input.
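
For anyone who wants to compare their own logs against the trace above, here is a minimal sketch that pulls hung-task reports out of `journalctl -k` style output and flags the ones whose stack contains a frame from a given module. The function names (`find_hung_tasks`, `involves_module`) are made up for illustration, and the regular expressions only assume the line formats visible in this report.

```python
import re

# Header printed by the kernel's hung-task detector, e.g.
# "INFO: task kworker/8:1:245 blocked for more than 122 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# A stack frame such as "rtl_work_func_t+0x6b/0x290 [r8152]".
FRAME_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?"
)


def find_hung_tasks(lines):
    """Group each hung-task header with the stack frames logged after it."""
    reports = []
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            reports.append({
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "secs": int(header.group("secs")),
                "frames": [],
            })
            continue
        frame = FRAME_RE.search(line)
        if frame and reports:
            # Frames are attributed to the most recent header seen.
            reports[-1]["frames"].append((frame.group("sym"), frame.group("mod")))
    return reports


def involves_module(report, module):
    """True if any frame of the report belongs to the given kernel module."""
    return any(mod == module for _sym, mod in report["frames"])


# A few lines copied from the trace above.
sample = [
    "Sep 14 10:50:20 huit kernel: INFO: task kworker/8:1:245 blocked for more than 122 seconds.",
    "Sep 14 10:50:20 huit kernel:  rpm_resume+0x18b/0x790",
    "Sep 14 10:50:20 huit kernel:  usb_autopm_get_interface+0x18/0x50",
    "Sep 14 10:50:20 huit kernel:  rtl_work_func_t+0x6b/0x290 [r8152]",
]
for report in find_hung_tasks(sample):
    print(report["comm"], report["pid"], involves_module(report, "r8152"))
```

Tasks that merely wait on a mutex (NetworkManager, Qt threads) will not show an r8152 frame themselves, so the flag is only a hint for finding the task that is stuck inside the driver.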
Comment 1 Tormen 2019-10-10 15:57:34 UTC
Created attachment 285449 [details]
2019-09-14 journalctl.log
Comment 2 Tormen 2019-10-10 18:06:57 UTC
Created attachment 285453 [details]
2019-10-10 journalctl -b -1 (after crash) shortened, look for <REMOVED LINES>

In this case the problem starts with:

Oct 10 16:56:15 huit kernel: INFO: task NetworkManager:139485 blocked for more than 122 seconds.
Oct 10 16:56:15 huit kernel:       Tainted: G        W  OE     5.2.14-arch2-1-ARCH #1
Oct 10 16:56:15 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 10 16:56:15 huit kernel: NetworkManager  D    0 139485      1 0x00000080
Oct 10 16:56:15 huit kernel: Call Trace:
Oct 10 16:56:15 huit kernel:  ? __schedule+0x27f/0x6d0
Oct 10 16:56:15 huit kernel:  schedule+0x43/0xd0
Oct 10 16:56:15 huit kernel:  schedule_preempt_disabled+0x14/0x20
Oct 10 16:56:15 huit kernel:  __mutex_lock.isra.0+0x27d/0x530
Oct 10 16:56:15 huit kernel:  ? security_capable+0x40/0x60
Oct 10 16:56:15 huit kernel:  rtnetlink_rcv_msg+0xf1/0x3c0
Oct 10 16:56:15 huit kernel:  ? rtnl_calcit.isra.0+0x120/0x120
Oct 10 16:56:15 huit kernel:  netlink_rcv_skb+0x75/0x140
Oct 10 16:56:15 huit kernel:  netlink_unicast+0x177/0x1f0
Oct 10 16:56:15 huit kernel:  netlink_sendmsg+0x1fe/0x3c0
Oct 10 16:56:15 huit kernel:  sock_sendmsg+0x4f/0x60
Oct 10 16:56:15 huit kernel:  ___sys_sendmsg+0x304/0x370
Oct 10 16:56:15 huit kernel:  ? ___sys_recvmsg+0x17b/0x200
Oct 10 16:56:15 huit kernel:  __sys_sendmsg+0x81/0xd0
Oct 10 16:56:15 huit kernel:  do_syscall_64+0x5f/0x1d0
Oct 10 16:56:15 huit kernel:  ? page_fault+0x8/0x30
Oct 10 16:56:15 huit kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 10 16:56:15 huit kernel: RIP: 0033:0x7f2a5f39c5d5
Oct 10 16:56:15 huit kernel: Code: Bad RIP value.
Oct 10 16:56:15 huit kernel: RSP: 002b:00007fffea750d40 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
Oct 10 16:56:15 huit kernel: RAX: ffffffffffffffda RBX: 00000000000b0f64 RCX: 00007f2a5f39c5d5
Oct 10 16:56:15 huit kernel: RDX: 0000000000000000 RSI: 00007fffea750d80 RDI: 000000000000000c
Oct 10 16:56:15 huit kernel: RBP: 000056061843e090 R08: 0000000000000000 R09: 0000000000000000
Oct 10 16:56:15 huit kernel: R10: 0000000000000006 R11: 0000000000000293 R12: 0000000000000000
Oct 10 16:56:15 huit kernel: R13: 00007fffea750eb8 R14: 00007fffea750eb4 R15: 0000000000000000
Oct 10 16:56:15 huit kernel: INFO: task Qt bearer threa:363470 blocked for more than 122 seconds.
Oct 10 16:56:15 huit kernel:       Tainted: G        W  OE     5.2.14-arch2-1-ARCH #1
Oct 10 16:56:15 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 10 16:56:15 huit kernel: Qt bearer threa D    0 363470      1 0x00000080
Oct 10 16:56:15 huit kernel: Call Trace:


What is odd though - preceding this kernel problem - is that there were a LOT of USB device disconnects and reconnects:

Oct 10 09:02:20 huit kernel: usb 2-5.2: Dell TB16 Dock, disable RX aggregation
...
Oct 10 09:02:21 huit kernel: usb 2-5.2: Dell TB16 Dock, disable RX aggregation
...
Oct 10 09:02:22 huit kernel: usb 2-5.2: Dell TB16 Dock, disable RX aggregation
...
the last one was at:
...
Oct 10 15:08:28 huit kernel: usb 2-5.2: Dell TB16 Dock, disable RX aggregation

(so nearly 2h before the crash)

But I did NOT unplug and replug the Dock every second!! ;)


And these repeating kernel messages also always included the network card (built into the Dell TB16 Dock):

Oct 10 09:02:22 huit kernel: usb 2-5.2: new SuperSpeed Gen 1 USB device number 87 using xhci_hcd
Oct 10 09:02:22 huit kernel: usb 2-5.2: New USB device found, idVendor=0bda, idProduct=8153, bcdDevice=30.11
Oct 10 09:02:22 huit kernel: usb 2-5.2: New USB device strings: Mfr=1, Product=2, SerialNumber=6
Oct 10 09:02:22 huit kernel: usb 2-5.2: Product: USB 10/100/1000 LAN
Oct 10 09:02:22 huit kernel: usb 2-5.2: Manufacturer: Realtek
Oct 10 09:02:22 huit kernel: usb 2-5.2: SerialNumber: 000002000000
Oct 10 09:02:22 huit kernel: usb 2-5.2: reset SuperSpeed Gen 1 USB device number 87 using xhci_hcd
Oct 10 09:02:22 huit kernel: usb 2-5.2: Dell TB16 Dock, disable RX aggregation
Oct 10 09:02:22 huit kernel: r8152 2-5.2:1.0 (unnamed net_device) (uninitialized): Using pass-thru MAC addr e4:b9:7a:dd:41:87
Oct 10 09:02:22 huit NetworkManager[139485]: <info>  [1570690942.7467] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/29931)
Oct 10 09:02:22 huit kernel: r8152 2-5.2:1.0 eth0: v1.09.9
Oct 10 09:02:22 huit mtp-probe[1800533]: checking bus 2, device 87: "/sys/devices/pci0000:00/0000:00:14.0/usb2/2-5/2-5.2"
Oct 10 09:02:22 huit mtp-probe[1800533]: bus: 2, device: 87 was not an MTP device
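
To put a number on the reconnects, a rough sketch like the following could count how often each USB port logs a "new ... USB device" line. The helper name `count_enumerations` is hypothetical, and the pattern only assumes the message format shown in the log lines above.

```python
import re
from collections import Counter

# Matches e.g. "usb 2-5.2: new SuperSpeed Gen 1 USB device number 87 using xhci_hcd"
NEW_DEV_RE = re.compile(
    r"\busb (?P<port>\d+-[\d.]+): new .* USB device number (?P<num>\d+)"
)


def count_enumerations(lines):
    """Count 'new ... USB device' events per port and track the highest device number."""
    counts = Counter()
    highest = {}
    for line in lines:
        m = NEW_DEV_RE.search(line)
        if not m:
            continue  # "reset ..." lines and unrelated messages are ignored
        port, num = m.group("port"), int(m.group("num"))
        counts[port] += 1
        highest[port] = max(num, highest.get(port, 0))
    return counts, highest


# Two lines copied from the log above: one enumeration, one reset.
sample = [
    "Oct 10 09:02:22 huit kernel: usb 2-5.2: new SuperSpeed Gen 1 USB device number 87 using xhci_hcd",
    "Oct 10 09:02:22 huit kernel: usb 2-5.2: reset SuperSpeed Gen 1 USB device number 87 using xhci_hcd",
]
counts, highest = count_enumerations(sample)
print(dict(counts), highest)
```

Run over a whole journal, a port with a count in the hundreds would confirm that the dock's network adapter keeps dropping off the bus and coming back.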
Comment 3 Tormen 2019-10-10 18:09:31 UTC
Probably not important, but please note that I own multiple (at least 2) docking stations, and they are NOT *DELL TB16* as reported by the kernel, but *DELL WD15*.

The former is a Thunderbolt one and the latter should be USB-C only I believe.
Comment 4 Tormen 2019-10-10 18:21:41 UTC
If this helps I can also provide more journalctl.logs

E.g. this one:

Aug 28 12:19:05 huit dbus-launch[1008]: See https://wayland.freedesktop.org/libinput/doc/1.14.0/touchpad-jumping-cursors.html for details
Aug 28 12:19:16 huit kernel: INFO: task ThreadPoolForeg:6590 blocked for more than 122 seconds.
Aug 28 12:19:16 huit kernel:       Tainted: G           OE     5.2.9-arch1-1-ARCH #1
Aug 28 12:19:16 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 28 12:19:16 huit kernel: ThreadPoolForeg D    0  6590  28203 0x00000080
Aug 28 12:19:16 huit kernel: Call Trace:
Aug 28 12:19:16 huit kernel:  ? __schedule+0x27f/0x6d0
Aug 28 12:19:16 huit kernel:  schedule+0x3d/0xc0
Aug 28 12:19:16 huit kernel:  schedule_preempt_disabled+0x14/0x20
Aug 28 12:19:16 huit kernel:  __mutex_lock.isra.0+0x27d/0x530
Aug 28 12:19:16 huit kernel:  __netlink_dump_start+0x54/0x1e0
Aug 28 12:19:16 huit kernel:  ? nla_put_ifalias+0xa0/0xa0
Aug 28 12:19:16 huit kernel:  rtnetlink_rcv_msg+0x296/0x3c0
Aug 28 12:19:16 huit kernel:  ? nla_put_ifalias+0xa0/0xa0
Aug 28 12:19:16 huit kernel:  ? rtnl_calcit.isra.0+0x120/0x120
Aug 28 12:19:16 huit kernel:  netlink_rcv_skb+0x75/0x140
Aug 28 12:19:16 huit kernel:  netlink_unicast+0x177/0x1f0
Aug 28 12:19:16 huit kernel:  netlink_sendmsg+0x1fe/0x3c0
Aug 28 12:19:16 huit kernel:  sock_sendmsg+0x4f/0x60
Aug 28 12:19:16 huit kernel:  __sys_sendto+0x120/0x190
Aug 28 12:19:16 huit kernel:  __x64_sys_sendto+0x25/0x30
Aug 28 12:19:16 huit kernel:  do_syscall_64+0x5f/0x1d0
Aug 28 12:19:16 huit kernel:  ? page_fault+0x8/0x30
Aug 28 12:19:16 huit kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 28 12:19:16 huit kernel: RIP: 0033:0x7f02b985f6a4
Aug 28 12:19:16 huit kernel: Code: Bad RIP value.
Aug 28 12:19:16 huit kernel: RSP: 002b:00007f02a7d94b30 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
Aug 28 12:19:16 huit kernel: RAX: ffffffffffffffda RBX: 00007f02a7d95c50 RCX: 00007f02b985f6a4
Aug 28 12:19:16 huit kernel: RDX: 0000000000000014 RSI: 00007f02a7d95c50 RDI: 0000000000000032
Aug 28 12:19:16 huit kernel: RBP: 0000000000000000 R08: 00007f02a7d95bf4 R09: 000000000000000c
Aug 28 12:19:16 huit kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007f02a7d95bf4
Aug 28 12:19:16 huit kernel: R13: 00007f02a7d96200 R14: 00007f02a7d96308 R15: 0000000000000032
Aug 28 12:19:16 huit kernel: INFO: task Qt bearer threa:14720 blocked for more than 122 seconds.
Aug 28 12:19:16 huit kernel:       Tainted: G           OE     5.2.9-arch1-1-ARCH #1
Aug 28 12:19:16 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 28 12:19:16 huit kernel: Qt bearer threa D    0 14720      1 0x00000080
Aug 28 12:19:16 huit kernel: Call Trace:
...
Comment 5 Tormen 2019-10-10 18:25:29 UTC
Or this one:

Sep 04 10:19:38 huit dbus-launch[1124]:  ]
Sep 04 10:19:57 huit kernel: INFO: task Qt bearer threa:29521 blocked for more than 122 seconds.
Sep 04 10:19:57 huit kernel:       Tainted: G        W  OE     5.2.11-arch1-1-ARCH #1
Sep 04 10:19:57 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 04 10:19:57 huit kernel: Qt bearer threa D    0 29521      1 0x00000080
Sep 04 10:19:57 huit kernel: Call Trace:
Sep 04 10:19:57 huit kernel:  ? __schedule+0x27f/0x6d0
Sep 04 10:19:57 huit kernel:  schedule+0x3d/0xc0
Sep 04 10:19:57 huit kernel:  schedule_preempt_disabled+0x14/0x20
Sep 04 10:19:57 huit kernel:  __mutex_lock.isra.0+0x27d/0x530
Sep 04 10:19:57 huit kernel:  __netlink_dump_start+0x54/0x1e0
Sep 04 10:19:57 huit kernel:  ? rtnl_fill_ifinfo+0x1020/0x1020
Sep 04 10:19:57 huit kernel:  rtnetlink_rcv_msg+0x296/0x3c0
Sep 04 10:19:57 huit kernel:  ? rtnl_fill_ifinfo+0x1020/0x1020
Sep 04 10:19:57 huit kernel:  ? rtnl_calcit.isra.0+0x120/0x120
Sep 04 10:19:57 huit kernel:  netlink_rcv_skb+0x75/0x140
Sep 04 10:19:57 huit kernel:  netlink_unicast+0x177/0x1f0
Sep 04 10:19:57 huit kernel:  netlink_sendmsg+0x1fe/0x3c0
Sep 04 10:19:57 huit kernel:  sock_sendmsg+0x4f/0x60
Sep 04 10:19:57 huit kernel:  __sys_sendto+0x120/0x190
Sep 04 10:19:57 huit kernel:  __x64_sys_sendto+0x25/0x30
Sep 04 10:19:57 huit kernel:  do_syscall_64+0x5f/0x1d0
Sep 04 10:19:57 huit kernel:  ? prepare_exit_to_usermode+0x85/0xb0
Sep 04 10:19:57 huit kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Sep 04 10:19:57 huit kernel: RIP: 0033:0x7f73191ff6a4
Sep 04 10:19:57 huit kernel: Code: Bad RIP value.
Sep 04 10:19:57 huit kernel: RSP: 002b:00007f7308e30ba0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
Sep 04 10:19:57 huit kernel: RAX: ffffffffffffffda RBX: 00007f7308e31d30 RCX: 00007f73191ff6a4
Sep 04 10:19:57 huit kernel: RDX: 0000000000000014 RSI: 00007f7308e31c70 RDI: 000000000000001e
Sep 04 10:19:57 huit kernel: RBP: 0000000000000000 R08: 00007f7308e31c30 R09: 000000000000000c
Sep 04 10:19:57 huit kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007f7308e31c30
Sep 04 10:19:57 huit kernel: R13: 00007f7308e31c70 R14: 000000000800ea30 R15: 00007f7308e30be0
Sep 04 10:19:57 huit kernel: INFO: task Thread (pooled):30239 blocked for more than 122 seconds.
Sep 04 10:19:57 huit kernel:       Tainted: G        W  OE     5.2.11-arch1-1-ARCH #1
Sep 04 10:19:57 huit kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 04 10:19:57 huit kernel: Thread (pooled) D    0 30239      1 0x00000080
Sep 04 10:19:57 huit kernel: Call Trace:
Sep 04 10:19:57 huit kernel:  ? __schedule+0x27f/0x6d0

...

You get the idea so I'll stop here. 

Please let me know if you need different / more debug output.
Comment 6 Tormen 2019-10-10 20:47:12 UTC
Created attachment 285455 [details]
2019-10-10 2nd journalctl -b log showing kernel hang but at different WD15

The first journalctl log from today (2019-10-10) was when plugged to WD15 at work, this one now happened at home.
Comment 7 Tormen 2019-10-10 20:48:51 UTC
Created attachment 285457 [details]
2019-09-16 journalctl log -- another case I had recorded
Comment 8 Tormen 2019-10-11 08:43:16 UTC
Created attachment 285459 [details]
2019-10-11 journalctl log

I noticed something interesting this time (and I believe this was always the case, I just had not seen and understood the pattern yet):

Wireless was off (via NetworkManager and consequently also via rfkill!) and I was connected through the Realtek wired network adapter in the Dell WD15 USB-C dock at home.

All of a sudden my machine lost internet. I first noticed it in Firefox: pages would not load anymore.
Then I opened a shell and pinged 1.1.1.1: timeout, no response.
My wife beside me (with another Dell laptop on another WD15 docking station, but under Windows) still had internet, no problem.

And then I remembered that the kernel messages in the logs always appeared after some hang checker noticed that parts of the kernel were dead / locked / unresponsive.

So I waited and TADA... 2 minutes later the usual hang-check message came.

Maybe these observations can help understand / narrow down the problem?

What I am still trying to remember is whether this ever worked. I think it did work at the beginning, which would indicate a regression... but I really can't say for sure :((
Comment 9 Tormen 2019-10-13 15:42:25 UTC
Created attachment 285485 [details]
Screenshot of system shutdown

Maybe this can help as well:

Screenshot of system shutdown:
systemd unable to end network related user-space processes (last line).

And the only way I found to get past this blocker in the shutdown process is to use SysRQ + R E I S U B.

Waiting or even forcing systemd to shutdown by pressing 7x Ctrl+Alt+Del within 2 seconds does not work.
Comment 10 Tormen 2019-10-16 14:35:08 UTC
I removed the [r8152] because it /seems/ to me like this is linked, but maybe it's not.

Yesterday it happened again. And there was also a kernel trace from iwlwifi hours before the hang-check timer alerted about the problem of this ticket again.

But again the r8152 seems to be involved ... and/or the DELL docking station.
As I was not sure what to leave out, here is the full kernel log:

Showing first the iwlwifi kernel Call Trace and then the hang-check related call traces:

https://0x0.st/zxYB.txt
Comment 11 Tormen 2019-10-16 22:24:56 UTC
Sorry: 
This bug has been happening for several months already, and I just remembered that for several weeks in summer I was so annoyed by it that I blacklisted the r8152 module. I am pretty sure that the bug did NOT happen during that time!

It's just annoying because then I also don't get the 1000Mbit of the wired network connection of the docking station :/
Comment 12 Tormen 2019-11-10 11:19:01 UTC
Update:

Even after following the advice from the linked ticket (by adding this to udev rules):

ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="0bda", ATTR{idProduct}=="8153", TEST=="power/control", ATTR{power/control}="on"
# Disable CPU turbo:
KERNEL=="cpu",RUN+="/bin/sh -c 'echo -n 1 > /sys/devices/system/cpu/intel_pstate/no_turbo'"

I got the same problem again!!
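
To double-check that the udev rule actually took effect, the `power/control` attribute can be read back from sysfs. Below is a minimal sketch, assuming the usual `/sys/bus/usb/devices` layout; the function name `runtime_pm_settings` is made up for illustration, and the vendor/product IDs are the ones from the rule above.

```python
from pathlib import Path


def runtime_pm_settings(vendor="0bda", product="8153",
                        root=Path("/sys/bus/usb/devices")):
    """Map each matching USB device name to the content of its power/control file."""
    results = {}
    if not root.is_dir():
        return results
    for dev in sorted(root.glob("*")):
        try:
            if (dev / "idVendor").read_text().strip() != vendor:
                continue
            if (dev / "idProduct").read_text().strip() != product:
                continue
            results[dev.name] = (dev / "power" / "control").read_text().strip()
        except OSError:
            # Interface entries such as "2-5.2:1.0" have no idVendor file.
            continue
    return results


# With the rule applied, each listed device should report "on" instead of "auto".
print(runtime_pm_settings())
```

A value of `on` only means runtime autosuspend is disabled for that device; as this comment shows, the hang can apparently still occur in that state.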
Comment 13 Tormen 2019-11-10 11:20:18 UTC
Is anyone actually reading this?

I thought quality of drivers should matter in linux ?

This driver seems buggy.

Why does no one seem to care?
Comment 14 Tormen 2019-11-10 11:30:08 UTC
Created attachment 285845 [details]
journalctl of problem even with supposed fix of linked issue

journalctl of problem even with supposed fix of linked issue

complete journalctl but without "sudo", "dbus-launch", "thunar" targets ;)
Comment 15 Adam Mizerski 2019-11-21 14:21:08 UTC
I have the same problem. I have an "i-tec USB 3.0 Metal Gigabit Ethernet Adapter", which uses this chip. I have network lockups with both drivers: the one provided with the kernel and the one downloaded from Realtek. Also, network speed over USB3 is much worse than over USB2.

I tried discussing the issue on openSUSE Support mailing list, but without success (https://lists.opensuse.org/opensuse-support/2019-11/msg00057.html).
Comment 16 Ronan Pigott 2019-12-02 23:27:35 UTC
I can also reproduce this bug on 5.4.1.

5.4 includes some recent patches to r8152, but it doesn't seem that this lockup is fixed.
Comment 17 Tormen 2019-12-05 10:18:16 UTC
*** Bug 205557 has been marked as a duplicate of this bug. ***
Comment 18 Adam Mizerski 2019-12-15 20:43:27 UTC
Question to all affected: What distro are you using? What kernel?

Recently I was talking to a friend about this issue. He showed me his adapter and told me he had no problems running it with Ubuntu. I checked that it's the same chip.

I'm using openSUSE Tumbleweed. I installed kernel-vanilla a few hours ago and plugged the adapter in. So far it's good. I'll do some more testing, and if it turns out to be reliable I'll file a bug report on the openSUSE Bugzilla that the default kernel is broken.
Comment 19 Tormen 2019-12-16 15:31:26 UTC
Hi @Adam,

I am using Arch Linux with the Arch Linux 64-bit kernel.
Currently:
Linux huit 5.3.7-arch1-1-ARCH #1 SMP PREEMPT Fri Oct 18 00:17:03 UTC 2019 x86_64 GNU/Linux

The problem reoccurred today.

Tormen
Comment 20 Adam Mizerski 2019-12-17 07:39:18 UTC
I got a lockup with vanilla kernel 5.3.12 too.
Comment 21 François Guerraz 2020-04-23 08:29:03 UTC
I can confirm the issue on 5.6.6 (vanilla) with a Dell DA200 on a Dell XPS 9350. The kernel locks up after a minute or two.
Comment 22 gökçe 2020-10-17 14:14:57 UTC
Like Tormen's comment #0, I get a similar error with 5.8.14-arch1-1 *without* mutex_lockup:

...
rpm_resume+0x189/0x820
? wait_woken+0x80/0x80
rpm_resume+0x2fe/0x820
__pm_runtime_resume+0x3b/0x60
usb_autopm_get_interface+0x18/0x50
rtl8152_get_link_ksettings+0x27/0x80 [r8152]
ethtool_get_settings+0xa7/0x1e0
dev_ethtool+0x109c/0x2a90
...

In my case the problem happens sporadically, last time it happened 2 days after boot.
Comment 23 Emil Jacobsen 2020-11-03 20:09:39 UTC
I have the same problem, it's very frustrating. The system freezes a few times per day. I'm using Debian (Buster) on a Dell XPS 13, and a USB hub with ethernet port by Satechi. I'm really new to this, let me know if I can help with more information, or otherwise. Thanks.
Comment 24 Daniel Martí 2020-11-11 09:06:47 UTC
I thought I was going crazy - my system locks up with a usb-c ethernet dongle too. Arch Linux with 5.9.6, and a Thinkpad X1 Carbon 6th gen (2018) with a Syncwire USB-C to Ethernet adapter (B074C76S4H).

I'm also new here, so I'm happy to follow instructions to help debug this.
Comment 25 Igor 2020-11-25 17:29:55 UTC
I also appear to be having the same issue. It started after a recent reboot into kernel `5.4.0-54-generic #60-Ubuntu SMP`; the previous kernel was likely Ubuntu's `5.4.0-42-generic`, so the issue may have been introduced somewhere in between. A hard reboot seems to be the only fix, since even `sudo` does not work (probably hanging on hostname resolution?).
Comment 26 jwpevans 2021-02-03 12:42:48 UTC
I've also experienced this issue on kernel version `5.4.0-65-generic`. The trigger for me appears to be a USB reset - see the following `syslog` extract:

```
Feb  3 12:14:52 my-machine kernel: [  470.526316] usb 2-2.3: reset SuperSpeed Gen 1 USB device number 3 using xhci_hcd
Feb  3 12:14:52 my-machine kernel: [  471.045687] usb 2-2.3.4: reset SuperSpeed Gen 1 USB device number 4 using xhci_hcd
Feb  3 12:17:06 my-machine kernel: [  604.924779] INFO: task kworker/3:2:127 blocked for more than 120 seconds.
Feb  3 12:17:06 my-machine kernel: [  604.924782]       Not tainted 5.4.0-65-generic #73-Ubuntu
Feb  3 12:17:06 my-machine kernel: [  604.924783] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb  3 12:17:06 my-machine kernel: [  604.924784] kworker/3:2     D    0   127      2 0x80004000
Feb  3 12:17:06 my-machine kernel: [  604.924791] Workqueue: events_long rtl_hw_phy_work_func_t [r8152]
Feb  3 12:17:06 my-machine kernel: [  604.924792] Call Trace:
Feb  3 12:17:06 my-machine kernel: [  604.924797]  __schedule+0x2e3/0x740
Feb  3 12:17:06 my-machine kernel: [  604.924798]  schedule+0x42/0xb0
Feb  3 12:17:06 my-machine kernel: [  604.924800]  rpm_resume+0x174/0x780
Feb  3 12:17:06 my-machine kernel: [  604.924802]  ? __switch_to_asm+0x40/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924804]  ? wait_woken+0x80/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924806]  rpm_resume+0x31d/0x780
Feb  3 12:17:06 my-machine kernel: [  604.924807]  ? __switch_to_asm+0x40/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924809]  ? __switch_to_asm+0x34/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924810]  ? __switch_to_asm+0x40/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924811]  ? __switch_to_asm+0x34/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924813]  ? __switch_to_asm+0x40/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924814]  __pm_runtime_resume+0x52/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924816]  usb_autopm_get_interface+0x1d/0x50
Feb  3 12:17:06 my-machine kernel: [  604.924818]  rtl_hw_phy_work_func_t+0x29/0xa0 [r8152]
Feb  3 12:17:06 my-machine kernel: [  604.924821]  process_one_work+0x1eb/0x3b0
Feb  3 12:17:06 my-machine kernel: [  604.924822]  worker_thread+0x4d/0x400
Feb  3 12:17:06 my-machine kernel: [  604.924824]  kthread+0x104/0x140
Feb  3 12:17:06 my-machine kernel: [  604.924825]  ? process_one_work+0x3b0/0x3b0
Feb  3 12:17:06 my-machine kernel: [  604.924826]  ? kthread_park+0x90/0x90
Feb  3 12:17:06 my-machine kernel: [  604.924828]  ret_from_fork+0x35/0x40
Feb  3 12:17:06 my-machine kernel: [  604.924831] INFO: task kworker/3:3:220 blocked for more than 120 seconds.
Feb  3 12:17:06 my-machine kernel: [  604.924833]       Not tainted 5.4.0-65-generic #73-Ubuntu
Feb  3 12:17:06 my-machine kernel: [  604.924833] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb  3 12:17:06 my-machine kernel: [  604.924834] kworker/3:3     D    0   220      2 0x80004000
Feb  3 12:17:06 my-machine kernel: [  604.924838] Workqueue: usb_hub_wq hub_event
Feb  3 12:17:06 my-machine kernel: [  604.924839] Call Trace:
Feb  3 12:17:06 my-machine kernel: [  604.924841]  __schedule+0x2e3/0x740
Feb  3 12:17:06 my-machine kernel: [  604.924842]  schedule+0x42/0xb0
Feb  3 12:17:06 my-machine kernel: [  604.924843]  rpm_resume+0x174/0x780
Feb  3 12:17:06 my-machine kernel: [  604.924844]  ? wait_woken+0x80/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924845]  rpm_resume+0x31d/0x780
Feb  3 12:17:06 my-machine kernel: [  604.924847]  ? get_registers+0x94/0xc0 [r8152]
Feb  3 12:17:06 my-machine kernel: [  604.924848]  __pm_runtime_resume+0x52/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924850]  usb_autopm_get_interface+0x1d/0x50
Feb  3 12:17:06 my-machine kernel: [  604.924851]  rtl8152_set_mac_address+0x54/0x1c0 [r8152]
Feb  3 12:17:06 my-machine kernel: [  604.924853]  ? __queue_work+0x14c/0x3f0
Feb  3 12:17:06 my-machine kernel: [  604.924854]  set_ethernet_addr+0x78/0x80 [r8152]
Feb  3 12:17:06 my-machine kernel: [  604.924856]  ? trace_event_raw_event_workqueue_work+0xa0/0xa0
Feb  3 12:17:06 my-machine kernel: [  604.924858]  rtl8152_reset_resume+0x50/0x60 [r8152]
Feb  3 12:17:06 my-machine kernel: [  604.924859]  usb_resume_interface.isra.0+0x4a/0xd0
Feb  3 12:17:06 my-machine kernel: [  604.924861]  usb_resume_both+0xf0/0x140
Feb  3 12:17:06 my-machine kernel: [  604.924862]  ? usb_runtime_suspend+0x70/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924864]  usb_runtime_resume+0x1a/0x20
Feb  3 12:17:06 my-machine kernel: [  604.924865]  __rpm_callback+0x8c/0x150
Feb  3 12:17:06 my-machine kernel: [  604.924866]  rpm_callback+0x57/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924867]  ? usb_runtime_suspend+0x70/0x70
Feb  3 12:17:06 my-machine kernel: [  604.924868]  rpm_resume+0x568/0x780
Feb  3 12:17:06 my-machine kernel: [  604.924871]  ? try_to_del_timer_sync+0x54/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924872]  __pm_runtime_resume+0x52/0x80
Feb  3 12:17:06 my-machine kernel: [  604.924873]  usb_autoresume_device+0x20/0x60
Feb  3 12:17:06 my-machine kernel: [  604.924875]  usb_remote_wakeup+0x4c/0x90
Feb  3 12:17:06 my-machine kernel: [  604.924877]  port_event+0x40b/0x780
Feb  3 12:17:06 my-machine kernel: [  604.924879]  hub_event+0x152/0x390
Feb  3 12:17:06 my-machine kernel: [  604.924880]  process_one_work+0x1eb/0x3b0
Feb  3 12:17:06 my-machine kernel: [  604.924882]  worker_thread+0x4d/0x400
Feb  3 12:17:06 my-machine kernel: [  604.924883]  kthread+0x104/0x140
Feb  3 12:17:06 my-machine kernel: [  604.924885]  ? process_one_work+0x3b0/0x3b0
Feb  3 12:17:06 my-machine kernel: [  604.924886]  ? kthread_park+0x90/0x90
Feb  3 12:17:06 my-machine kernel: [  604.924887]  ret_from_fork+0x35/0x40
```

~ 2 minutes after the USB reset, the kernel detects two hung tasks.
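
The gap can be read straight off the kernel timestamps in the extract. Here is a small sketch of that calculation, with a hypothetical helper name (`seconds_between_reset_and_hang`) and assuming the `[ seconds.micros]` prefix shown above:

```python
import re

# Kernel timestamp prefix such as "[  604.924779]".
STAMP_RE = re.compile(r"\[\s*(?P<t>\d+\.\d+)\]")


def seconds_between_reset_and_hang(lines):
    """Seconds from the last USB reset to the first hung-task report, or None."""
    last_reset = None
    for line in lines:
        m = STAMP_RE.search(line)
        if not m:
            continue
        t = float(m.group("t"))
        if "reset" in line and "USB device" in line:
            last_reset = t
        elif "INFO: task" in line and "blocked for more than" in line:
            return None if last_reset is None else t - last_reset
    return None


# Two lines copied from the syslog extract above.
sample = [
    "Feb  3 12:14:52 my-machine kernel: [  471.045687] usb 2-2.3.4: reset SuperSpeed Gen 1 USB device number 4 using xhci_hcd",
    "Feb  3 12:17:06 my-machine kernel: [  604.924779] INFO: task kworker/3:2:127 blocked for more than 120 seconds.",
]
print(seconds_between_reset_and_hang(sample))
```

For the extract above this comes out at about 134 seconds, which is consistent with the 120-second hung-task timeout reported in the log.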
Comment 27 Ronan Pigott 2021-08-20 20:17:14 UTC
Anyone want to test if this is fixed by 776ac63a98?
Comment 28 Ronan Pigott 2021-08-20 20:43:51 UTC
Oh nvm, didn't realize it had already been cherry picked to stable. Indeed it does work.

This has been fixed in linux 5.13.9.
Comment 29 tobias 2021-09-04 06:16:59 UTC
Freeze still occurs with

Linux 5.13.13-arch1-1 #1 SMP PREEMPT Thu, 26 Aug 2021 19:14:36 +0000 x86_64 GNU/Linux

journal shows the following:

Tainted: G           OE     5.13.13-arch1-1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/2:1     state:D stack:    0 pid:   86 ppid:     2 flags:0x00004000
Workqueue: events rtl_work_func_t [r8152]
Call Trace:
 __schedule+0x310/0x930
 schedule+0x5b/0xc0
 rpm_resume+0x18c/0x800
 ? do_wait_intr_irq+0xc0/0xc0
 rpm_resume+0x30d/0x800
 __pm_runtime_resume+0x4a/0x80
 usb_autopm_get_interface+0x18/0x40
 rtl_work_func_t+0x69/0x290 [r8152]
 process_one_work+0x1e0/0x3b0
 worker_thread+0x50/0x3b0
 ? process_one_work+0x3b0/0x3b0
 kthread+0x130/0x160
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30

Do you have any further advice or suggestion what to do?
3+ daily freezes are quite annoying...

Let me know if you need further information.
Comment 30 tobias 2021-09-04 19:17:06 UTC
(In reply to tobias from comment #29)
Update: Sorry, my report seems to be an arch kernel specific issue. The patch just seems not to be included...  

Now I use the mainstream kernel
5.14.0-1-git-09284-gf1583cb1be35
and no freeze happened for longer than usual...
Comment 31 Ronan Pigott 2021-09-05 08:40:50 UTC
It is included in 5.13.9-arch1 [1]

I will say that just as reported in Takashi's mail [2], there is still a crash that can occur.

However, before the patch I had a 100% reproduction rate. In my experience, plugging in my ethernet adapter, even briefly or by accident, was guaranteed to cause the crash, and it happened precisely on schedule too, within a few minutes.

Now I can actually use the device, sometimes for hours, without error. But it can still crash, just rarely.

[1] https://github.com/archlinux/linux/commit/a6b2ef5b5ff
[2] https://marc.info/?l=linux-netdev&m=162628202522057
Comment 32 Adam Mizerski 2021-10-15 11:51:49 UTC
I recently tried to use my adapter and got a freeze after some time, with this dmesg message:

[228382.213739] ------------[ cut here ]------------
[228382.213762] NETDEV WATCHDOG: enp0s20f0u2 (r8152): transmit queue 0 timed out
[228382.213854] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x24a/0x250
[228382.213888] Modules linked in: vhost_net vhost vhost_iotlb tap ip6t_REJECT r8153_ecm cdc_ether r8152 tun ccm rfcomm cmac algif_hash algif_skcipher af_alg bnep ath3k btusb btrtl btbcm btintel bluetooth ecdh_generic xfs snd_seq_dummy snd_hrtimer snd_seq af_packet nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CHECKSUM ipt_REJECT xt_tcpudp nf_nat_tftp nft_objref nf_conntrack_tftp wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 curve25519_x86_64 libcurve25519_generic libchacha libblake2s_generic ip6_udp_tunnel udp_tunnel xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype snd_usb_audio snd_usbmidi_lib snd_rawmidi nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib snd_seq_device br_netfilter nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject bridge stp llc nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack
[228382.214379]  nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter dmi_sysfs msr uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev ax88179_178a mc usbnet mii snd_soc_skl ath9k snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc ath9k_common snd_soc_sst_dsp ath9k_hw snd_soc_acpi_intel_match snd_hda_codec_hdmi snd_soc_acpi ath snd_soc_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp snd_compress coretemp mac80211 snd_pcm_dmaengine snd_hda_intel kvm_intel dwc3 snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec iTCO_wdt intel_pmc_bxt intel_rapl_msr ee1004 iTCO_vendor_support kvm ulpi snd_hda_core cfg80211 snd_hwdep snd_pcm processor_thermal_device_pci_legacy snd_timer processor_thermal_device processor_thermal_rfim irqbypass snd
[228382.214828]  processor_thermal_mbox joydev pcspkr processor_thermal_rapl intel_pch_thermal intel_rapl_common i2c_i801 intel_pmc_core rfkill i2c_smbus soundcore libarc4 int340x_thermal_zone topstar_laptop dwc3_pci ac sparse_keymap intel_xhci_usb_role_switch roles intel_soc_dts_iosf thermal tiny_power_button button fuse configfs binfmt_misc dm_crypt essiv authenc trusted asn1_encoder tee hid_generic usbhid uas usb_storage i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ttm drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm xhci_pci xhci_pci_renesas xhci_hcd aesni_intel crypto_simd cryptd usbcore serio_raw battery pinctrl_sunrisepoint video btrfs blake2b_generic libcrc32c crc32c_intel xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
[228382.215118] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.6-1-default #1 openSUSE Tumbleweed 539bdc3afabb63aa01656ffde27ce5025c985e0c
[228382.215133] Hardware name: Purism Librem 15 v3/Librem 15 v3, BIOS 4.14-Purism-1 06/18/2021
[228382.215139] RIP: 0010:dev_watchdog+0x24a/0x250
[228382.215159] Code: ac 4f fd ff eb a9 4c 89 f7 c6 05 f1 8c 49 01 01 e8 0b 32 fa ff 44 89 e9 4c 89 f6 48 c7 c7 a8 20 c2 8e 48 89 c2 e8 3f 41 16 00 <0f> 0b eb 8a 66 90 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55 4d 89
[228382.215167] RSP: 0018:ffffa3ba80003ea8 EFLAGS: 00010286
[228382.215178] RAX: 0000000000000000 RBX: ffff941e90ef8000 RCX: 000000000000083f
[228382.215185] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
[228382.215192] RBP: ffff941c6e0f83dc R08: ffffffff8f357ec8 R09: 00000000ffffdfff
[228382.215198] R10: ffffffff8f277ee0 R11: ffffffff8f277ee0 R12: ffff941c6e0f8480
[228382.215205] R13: 0000000000000000 R14: ffff941c6e0f8000 R15: ffff941e90ef8080
[228382.215211] FS:  0000000000000000(0000) GS:ffff941fbfc00000(0000) knlGS:0000000000000000
[228382.215220] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[228382.215227] CR2: 000038401d546000 CR3: 0000000150e10003 CR4: 00000000003706f0
[228382.215235] Call Trace:
[228382.215242]  <IRQ>
[228382.215250]  ? pfifo_fast_enqueue+0x150/0x150
[228382.215265]  call_timer_fn+0x26/0xf0
[228382.215278]  __run_timers.part.0+0x1cc/0x230
[228382.215289]  ? __hrtimer_run_queues+0x136/0x270
[228382.215301]  ? recalibrate_cpu_khz+0x10/0x10
[228382.215312]  ? ktime_get+0x38/0x90
[228382.215325]  run_timer_softirq+0x26/0x50
[228382.215335]  __do_softirq+0xc6/0x268
[228382.215348]  __irq_exit_rcu+0xb7/0xe0
[228382.215358]  sysvec_apic_timer_interrupt+0x72/0x90
[228382.215372]  </IRQ>
[228382.215377]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[228382.215387] RIP: 0010:cpuidle_enter_state+0xc7/0x350
[228382.215401] Code: 8b 3d 65 7c 05 72 e8 e8 a9 91 ff 49 89 c5 0f 1f 44 00 00 31 ff e8 d9 bf 91 ff 45 84 ff 0f 85 fa 00 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 06 01 00 00 49 63 d6 4c 2b 2c 24 48 8d 04 52 48 8d
[228382.215408] RSP: 0018:ffffffff8f203e50 EFLAGS: 00000246
[228382.215418] RAX: ffff941fbfc2f900 RBX: 0000000000000008 RCX: 000000000000001f
[228382.215425] RDX: 0000000000000000 RSI: 00000000316204ea RDI: 0000000000000000
[228382.215431] RBP: ffff941fbfc3ac00 R08: 0000cfb660b8ab4d R09: 0000000000021240
[228382.215437] R10: 0000000000000001 R11: 00000000000013d1 R12: ffffffff8f45a280
[228382.215443] R13: 0000cfb660b8ab4d R14: 0000000000000008 R15: 0000000000000000
[228382.215456]  cpuidle_enter+0x29/0x40
[228382.215468]  do_idle+0x1f2/0x2b0
[228382.215479]  cpu_startup_entry+0x19/0x20
[228382.215489]  start_kernel+0x90b/0x932
[228382.215507]  secondary_startup_64_no_verify+0xc2/0xcb
[228382.215523] ---[ end trace dc56753d11ce3fd7 ]---
[228382.215536] r8152 2-2:1.0 enp0s20f0u2: Tx timeout
[228387.337474] r8152 2-2:1.0 enp0s20f0u2: Tx timeout
[228393.221356] r8152 2-2:1.0 enp0s20f0u2: Tx timeout
Comment 33 Vitaly 2021-11-22 07:48:42 UTC
I also have the same problem.
Kernel: 5.14.10-1-MANJARO
Type-C Ethernet adapter

14:40:02: ------------[ cut here ]------------
14:40:02: NETDEV WATCHDOG: enp0s13f0u1u1 (r8152): transmit queue 0 timed out
14:40:02: WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x25b/0x270
14:40:02: Modules linked in: rfcomm r8153_ecm r8152 cdc_ether usbnet mii squashfs qrtr ns loop cmac algif_hash algif_skcipher af_alg bnep dm_crypt cbc encrypted_keys trusted asn1_encoder tee snd_soc_skl_hda_dsp snd_soc_intel_hda_dsp_common snd_soc_hdac_hdmi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_dmic snd_sof_pci_intel_icl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp intel_tcc_cooling snd_sof x86_pkg_temp_thermal intel_powerclamp coretemp btusb snd_soc_hdac_hda btrtl btbcm snd_hda_ext_core btintel kvm_intel snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus bluetooth usbhid ledtrig_audio snd_soc_core kvm joydev mousedev ecdh_generic snd_compress ecc iTCO_wdt intel_pmc_bxt hid_multitouch iTCO_vendor_support mei_hdcp iwlmvm intel_rapl_msr ac97_bus irqbypass crct10dif_pclmul intel_wmi_thunderbolt nouveau wmi_bmof snd_pcm_dmaengine crc32_pclmul mac80211 i915
14:40:02:  ghash_clmulni_intel snd_hda_intel aesni_intel libarc4 crypto_simd cryptd snd_intel_dspcfg snd_intel_sdw_acpi rapl intel_cstate mxm_wmi intel_uncore iwlwifi drm_ttm_helper i2c_algo_bit snd_hda_codec ttm drm_kms_helper snd_hda_core intel_spi_pci intel_spi pcspkr psmouse spi_nor vfat mtd cfg80211 dm_mod fat snd_hwdep i2c_i801 cec processor_thermal_device_pci_legacy i2c_smbus processor_thermal_device intel_gtt ucsi_acpi processor_thermal_rfim agpgart processor_thermal_mbox processor_thermal_rapl mei_me intel_lpss_pci syscopyarea typec_ucsi intel_rapl_common intel_lpss sysfillrect mei idma64 rfkill typec intel_soc_dts_iosf sysimgblt i2c_hid_acpi fb_sys_fops roles tpm_crb int3403_thermal i2c_hid int340x_thermal_zone tpm_tis wmi tpm_tis_core video acpi_pad tpm mac_hid int3400_thermal acpi_thermal_rel rng_core acpi_tad snd_aloop snd_pcm snd_timer snd soundcore v4l2loopback_dc(OE) videodev drm fuse mc crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 serio_raw
14:40:02:  atkbd libps2 i8042 crc32c_intel xhci_pci serio
14:40:02: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G           OE     5.14.10-1-MANJARO #1
14:40:02: Hardware name: TIMI RedmiBook 14 II/TM2001, BIOS RMAIC4B0P0303 05/29/2020
14:40:02: RIP: 0010:dev_watchdog+0x25b/0x270
14:40:02: Code: 1a 44 70 ff eb 94 4c 89 f7 c6 05 23 2d 2c 01 01 e8 ba f7 f9 ff 44 89 e9 4c 89 f6 48 c7 c7 30 32 03 94 48 89 c2 e8 61 ca 17 00 <0f> 0b e9 72 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f
14:40:02: RSP: 0018:ffffb32cc02d0ea8 EFLAGS: 00010286
14:40:02: RAX: 0000000000000000 RBX: ffff8c7157397200 RCX: 0000000000000027
14:40:02: RDX: ffff8c74dfb98728 RSI: 0000000000000001 RDI: ffff8c74dfb98720
14:40:02: RBP: ffff8c71545a93dc R08: 0000000000000000 R09: ffffb32cc02d0cd8
14:40:02: R10: ffffb32cc02d0cd0 R11: ffffffff946cd108 R12: ffff8c71545a9480
14:40:02: R13: 0000000000000000 R14: ffff8c71545a9000 R15: ffff8c7157397280
14:40:02: FS:  0000000000000000(0000) GS:ffff8c74dfb80000(0000) knlGS:0000000000000000
14:40:02: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
14:40:02: CR2: 00007faedc149808 CR3: 000000017356a002 CR4: 0000000000770ee0
14:40:02: PKRU: 55555554
14:40:02: Call Trace:
14:40:02:  <IRQ>
14:40:02:  ? pfifo_fast_reset+0x120/0x120
14:40:02:  ? pfifo_fast_reset+0x120/0x120
14:40:02:  call_timer_fn+0x24/0x130
14:40:02:  __run_timers+0x1e6/0x270
14:40:02:  run_timer_softirq+0x19/0x30
14:40:02:  __do_softirq+0xcd/0x2b4
14:40:02:  irq_exit_rcu+0xa9/0xc0
14:40:02:  sysvec_apic_timer_interrupt+0x72/0x90
14:40:02:  </IRQ>
14:40:02:  asm_sysvec_apic_timer_interrupt+0x12/0x20
14:40:02: RIP: 0010:cpuidle_enter_state+0xc7/0x380
14:40:02: Code: 8b 3d b5 97 be 6c e8 58 8e 8a ff 49 89 c5 0f 1f 44 00 00 31 ff e8 79 9b 8a ff 45 84 ff 0f 85 da 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 d6 4c 2b 2c 24 48 8d 04 52 48 8d
14:40:02: RSP: 0018:ffffb32cc0157ea8 EFLAGS: 00000246
14:40:02: RAX: ffff8c74dfbace00 RBX: 0000000000000002 RCX: 000000000000001f
14:40:02: RDX: 0000000000000000 RSI: 0000000055785785 RDI: 0000000000000000
14:40:02: RBP: ffff8c74dfbb7f00 R08: 0000004352fa6c45 R09: 0000000000000018
14:40:02: R10: 0000000000003b3c R11: 00000000000005b3 R12: ffffffff94748120
14:40:02: R13: 0000004352fa6c45 R14: 0000000000000002 R15: 0000000000000000
14:40:02:  ? cpuidle_enter_state+0xb7/0x380
14:40:02:  cpuidle_enter+0x29/0x40
14:40:02:  do_idle+0x1e1/0x270
14:40:02:  cpu_startup_entry+0x19/0x20
14:40:02:  secondary_startup_64_no_verify+0xc2/0xcb
14:40:02: ---[ end trace 83f747ac5e12b99e ]---
14:40:02: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:05: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
14:40:05: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
14:40:05: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
14:40:05: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
14:40:07: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:12: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:17: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:22: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:27: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:32: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:37: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
14:40:42: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
Comment 34 Ronan Pigott 2022-01-30 09:47:51 UTC
I tried it out again and the lockup I had appears to be completely fixed. I've used my adapter without issue for a few days now. There have been several changes in the driver since I last commented, but I haven't bisected them.
Comment 35 Ronan Pigott 2022-01-30 09:50:25 UTC
Eh, accidentally hit save before I added version info:

$ pacman -Q linux
linux 5.16.3.arch1-1
Comment 36 tobias 2022-01-30 09:52:58 UTC
I confirm no regression since 5.14.0-1-git.
It has been working smoothly for months now.
From my point of view the ticket can be closed.
Comment 37 Vitaly 2022-02-01 03:55:43 UTC
I still get the same error, but very rarely, on kernel 5.15.16-1-MANJARO.
Possibly it's related to suspend/lock now... I will try to reproduce.
Comment 38 Ronan Pigott 2022-02-01 18:54:22 UTC
@Vitaly I think yours and Adam's issue is something else entirely. It's missing the characteristic "INFO: task kworker blocked for more than 122 seconds" that appears in Tormen's reports and my own in the duplicate issue https://bugzilla.kernel.org/show_bug.cgi?id=205557#c4
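For anyone unsure which of the two cases they are hitting, here is a rough sketch (not part of any report above) that checks a saved kernel log for either signature. The function name classify and the two regular expressions are my own assumptions, derived only from the log lines quoted in this thread; the exact wording of these messages may differ on other kernel versions.

```python
import re

# Hung-task detector message seen in Tormen's reports (assumed wording,
# taken from the lines quoted in this thread).
HUNG_TASK = re.compile(r"INFO: task (\S+) blocked for more than (\d+) seconds")

# Transmit-queue watchdog warning seen in the later reports.
TX_WATCHDOG = re.compile(
    r"NETDEV WATCHDOG: (\S+) \((\w+)\): transmit queue (\d+) timed out"
)


def classify(log_text):
    """Report which of the two signatures appear in the given log text."""
    return {
        "hung_task": bool(HUNG_TASK.search(log_text)),
        "tx_watchdog": bool(TX_WATCHDOG.search(log_text)),
    }


if __name__ == "__main__":
    import sys

    # Usage: python classify.py saved-journal.log
    with open(sys.argv[1], errors="replace") as fh:
        print(classify(fh.read()))
```

The log can be saved with something like "journalctl -k -b -1 > saved-journal.log" after the reboot. A result with only tx_watchdog set would match the traces in comments 33 and 39, while hung_task would point to the original lockup.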
Comment 39 Vitaly 2022-02-02 05:35:11 UTC
Nope, the errors are still there for me. Today the network went down a couple of minutes after system boot.

kernel: 5.15.16-1-MANJARO

kernel: ------------[ cut here ]------------
kernel: NETDEV WATCHDOG: enp0s13f0u1u1 (r8152): transmit queue 0 timed out
kernel: WARNING: CPU: 6 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x29a/0x2b0
kernel: Modules linked in: iptable_filter rfcomm qrtr ns squashfs loop tun cmac algif_hash algif_skcipher af_alg bnep dm_crypt cbc encrypted_keys trusted asn1_encoder tee snd_soc_skl_hda_>
kernel:  i915 snd_intel_dspcfg vfat intel_cstate wmi_bmof intel_wmi_thunderbolt intel_rapl_msr nouveau btbcm snd_intel_sdw_acpi intel_spi_pci fat intel_uncore intel_spi i2c_i801 psmouse b>
kernel:  xhci_pci serio
kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G           OE     5.15.16-1-MANJARO #1 b9dc4de00f51b119e02c16819fef338d2dcefc27
kernel: Hardware name: TIMI RedmiBook 14 II/TM2001, BIOS RMAIC4B0P0303 05/29/2020
kernel: RIP: 0010:dev_watchdog+0x29a/0x2b0
kernel: Code: ff ff 48 8b 1c 24 c6 05 3d 1c 3e 01 01 48 89 df e8 8b 86 f9 ff 44 89 e9 48 89 de 48 c7 c7 d0 62 d5 bb 48 89 c2 e8 18 24 19 00 <0f> 0b e9 5b ff ff ff 66 66 2e 0f 1f 84 00 00 >
kernel: RSP: 0018:ffffae53802d8ea0 EFLAGS: 00010246
kernel: RAX: 0000000000000000 RBX: ffff8b3645e2d000 RCX: 0000000000000000
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
kernel: RBP: ffff8b3645e2d41c R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b3642b6ec80
kernel: R13: 0000000000000000 R14: ffff8b3645e2d4c0 R15: 0000000000000001
kernel: FS:  0000000000000000(0000) GS:ffff8b39dfb80000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fcc13edfb40 CR3: 0000000171d36003 CR4: 0000000000770ee0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel:  <IRQ>
kernel:  ? pfifo_fast_reset+0x140/0x140
kernel:  ? pfifo_fast_reset+0x140/0x140
kernel:  call_timer_fn+0x24/0x130
kernel:  __run_timers+0x1fc/0x290
kernel:  run_timer_softirq+0x19/0x30
kernel:  __do_softirq+0xcd/0x2ca
kernel:  irq_exit_rcu+0x9b/0xc0
kernel:  sysvec_apic_timer_interrupt+0x72/0x90
kernel:  </IRQ>
kernel:  <TASK>
kernel:  asm_sysvec_apic_timer_interrupt+0x12/0x20
kernel: RIP: 0010:cpuidle_enter_state+0xc7/0x380
kernel: Code: 8b 3d 45 a7 f1 44 e8 78 f9 7d ff 49 89 c5 0f 1f 44 00 00 31 ff e8 19 07 7e ff 45 84 ff 0f 85 e8 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 1f 01 00 00 49 63 d6 4c 2b 2c >
kernel: RSP: 0018:ffffae538015fea8 EFLAGS: 00000246
kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
kernel: RBP: ffff8b39dfbbc000 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffbc548740
kernel: R13: 00000049363341dd R14: 0000000000000001 R15: 0000000000000000
kernel:  ? cpuidle_enter_state+0xb7/0x380
kernel:  cpuidle_enter+0x29/0x40
kernel:  do_idle+0x1e9/0x280
kernel:  cpu_startup_entry+0x19/0x20
kernel:  secondary_startup_64_no_verify+0xc2/0xcb
kernel:  </TASK>
kernel: ---[ end trace 77aabeceedf6d6b1 ]---
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx status -2
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
kernel: r8152 2-1.1:1.0 enp0s13f0u1u1: Tx timeout
