Bug 212509
Summary: | [rtl8139] HARDIRQ-safe -> HARDIRQ-unsafe lock order detected | ||
---|---|---|---|
Product: | Process Management | Reporter: | Erhard F. (erhard_f) |
Component: | Other | Assignee: | process_other |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | kubakici |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: |
https://bugzilla.kernel.org/show_bug.cgi?id=211575 https://bugzilla.kernel.org/show_bug.cgi?id=214027 |
||
Kernel Version: | 5.11.11 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg (kernel 5.11.11, Shuttle XPC FS51, Pentium 4)
kernel .config (kernel 5.11.11, Shuttle XPC FS51, Pentium 4) kernel .config (5.16-rc4, PowerMac G4 DP) dmesg (kernel 5.17-rc3, Shuttle XPC FS51, Pentium 4) kernel .config (5.18-rc3, PowerMac G4 DP) dmesg (5.18-rc3, PowerMac G4 DP) |
Description
Erhard F.
2021-03-31 19:24:19 UTC
Created attachment 296169 [details]
kernel .config (kernel 5.11.11, Shuttle XPC FS51, Pentium 4)
Created attachment 299953 [details]
kernel .config (5.16-rc4, PowerMac G4 DP)
Got this on my PowerMac G4 too with lock debugging enabled:
[...]
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
5.16.0-rc4-PowerMacG4+ #2 Not tainted
-----------------------------------------------------
swapper/0/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
c1e32b50 (_xmit_ETHER#2){+.-.}-{2:2}, at: gem_poll+0xaf4/0x13f4
and this task is already holding:
c133cd90 (target_list_lock){-...}-{2:2}, at: write_msg+0x6c/0x154
which would create a new lock dependency:
(target_list_lock){-...}-{2:2} -> (_xmit_ETHER#2){+.-.}-{2:2}
but this new dependency connects a HARDIRQ-irq-safe lock:
(target_list_lock){-...}-{2:2}
... which became HARDIRQ-irq-safe at:
lock_acquire+0x1c0/0x460
_raw_spin_lock_irqsave+0x68/0x94
write_msg+0x6c/0x154
console_unlock+0x558/0x65c
vprintk_emit+0x220/0x228
_printk+0xa8/0xd4
crng_fast_load+0xf8/0x154
add_interrupt_randomness+0x1e4/0x320
handle_irq_event_percpu+0x8c/0xdc
handle_irq_event+0x60/0x98
handle_fasteoi_irq+0xd4/0x1a8
handle_irq_desc+0x40/0x54
__do_irq+0x1b0/0x2dc
__do_IRQ+0xd0/0xd8
__do_IRQ+0x1c/0xd8
do_IRQ+0x20/0x34
HardwareInterrupt_virt+0x100/0x104
early_fixmap_pagetable+0x0/0x2000
default_idle_call+0xac/0xd8
do_idle+0x134/0x148
cpu_startup_entry+0x24/0x28
start_kernel+0x2fc/0x320
0x3610
to a HARDIRQ-irq-unsafe lock:
(_xmit_ETHER#2){+.-.}-{2:2}
... which became HARDIRQ-irq-unsafe at:
...
lock_acquire+0x1c0/0x460
_raw_spin_lock+0x50/0x70
gem_set_link_modes+0x12c/0x4e0
gem_link_timer+0x1a8/0x524
call_timer_fn+0x10c/0x398
__run_timers.part.0+0x2f4/0x314
run_timer_softirq+0x78/0xf4
__do_softirq+0x3cc/0x660
__irq_exit_rcu+0x1bc/0x274
irq_exit+0x14/0x68
Decrementer_virt+0x100/0x104
0x1baff47
default_idle_call+0x84/0xd8
do_idle+0x134/0x148
cpu_startup_entry+0x24/0x28
start_kernel+0x2fc/0x320
0x3610
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(_xmit_ETHER#2);
local_irq_disable();
lock(target_list_lock);
lock(_xmit_ETHER#2);
<Interrupt>
lock(target_list_lock);
*** DEADLOCK ***
3 locks held by swapper/0/1:
#0: c1193e60 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x1f4/0x228
#1: c1194200 (console_owner){-.-.}-{0:0}, at: console_unlock+0x2e0/0x65c
#2: c133cd90 (target_list_lock){-...}-{2:2}, at: write_msg+0x6c/0x154
[...]
Created attachment 300411 [details]
dmesg (kernel 5.17-rc3, Shuttle XPC FS51, Pentium 4)
Still shows up on 5.17-rc3.
[...]
=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
5.17.0-rc3-P4 #1 Not tainted
-----------------------------------------------------
systemd-shutdow/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
c12d98a4 (&tp->rx_lock){+.-.}-{2:2}, at: rtl8139_poll+0x2b/0x404
and this task is already holding:
daf40608 (target_list_lock){-.-.}-{2:2}, at: write_msg+0x4b/0x114
which would create a new lock dependency:
(target_list_lock){-.-.}-{2:2} -> (&tp->rx_lock){+.-.}-{2:2}
but this new dependency connects a HARDIRQ-irq-safe lock:
(target_list_lock){-.-.}-{2:2}
... which became HARDIRQ-irq-safe at:
lock_acquire+0xa7/0x165
_raw_spin_lock_irqsave+0x3f/0x53
write_msg+0x4b/0x114
console_unlock+0x395/0x440
vprintk_emit+0x57/0x8d
vprintk_default+0x10/0x15
vprintk+0x49/0x50
_printk+0x14/0x19
perf_duration_warn+0x40/0x45
irq_work_single+0x3c/0x7b
irq_work_run_list+0x2b/0x39
irq_work_run+0x14/0x27
__sysvec_irq_work+0x27/0x2e
sysvec_irq_work+0x18/0x2b
handle_exception_return+0x0/0xad
free_unref_page_list+0x199/0x1a5
release_pages+0x289/0x291
__pagevec_release+0x29/0x40
__invalidate_mapping_pages+0x14b/0x18d
invalidate_mapping_pages+0x10/0x16
btrfs_relocate_block_group+0x1e8/0x2a7
btrfs_relocate_chunk+0x46/0x139
btrfs_shrink_device+0x36c/0x6b8
btrfs_rm_device+0x1ed/0x625
btrfs_ioctl_rm_dev_v2+0x120/0x19f
btrfs_ioctl+0x730/0x737
__ia32_sys_ioctl+0x99d/0x9e8
__do_fast_syscall_32+0xbc/0xe1
do_fast_syscall_32+0x29/0x59
do_SYSENTER_32+0x12/0x14
entry_SYSENTER_32+0x98/0xf1
to a HARDIRQ-irq-unsafe lock:
(&tp->rx_lock){+.-.}-{2:2}
... which became HARDIRQ-irq-unsafe at:
...
lock_acquire+0xa7/0x165
_raw_spin_lock+0x23/0x31
rtl8139_poll+0x2b/0x404
__napi_poll+0x26/0x13d
net_rx_action+0xf0/0x213
__do_softirq+0x136/0x27f
do_softirq_own_stack+0x50/0x56
irq_exit_rcu+0x96/0xa9
common_interrupt+0x29/0x3c
asm_common_interrupt+0xff/0x180
default_idle+0xe/0x10
arch_cpu_idle+0xd/0xf
default_idle_call+0x48/0x7c
do_idle+0xc7/0x1f2
cpu_startup_entry+0x17/0x19
rest_init+0x1af/0x1b2
arch_call_rest_init+0x8/0xa
start_kernel+0x2c6/0x33e
i386_start_kernel+0x94/0x97
startup_32_smp+0x161/0x164
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&tp->rx_lock);
local_irq_disable();
lock(target_list_lock);
lock(&tp->rx_lock);
<Interrupt>
lock(target_list_lock);
*** DEADLOCK ***
3 locks held by systemd-shutdow/1:
#0: dae82e00 (console_lock){+.+.}-{0:0}, at: console_trylock_spinning+0xe/0x17e
#1: daeb4ff0 (console_owner){-.-.}-{0:0}, at: console_lock_spinning_enable+0x23/0x42
#2: daf40608 (target_list_lock){-.-.}-{2:2}, at: write_msg+0x4b/0x114
the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (target_list_lock){-.-.}-{2:2} ops: 2816 {
IN-HARDIRQ-W at:
lock_acquire+0xa7/0x165
_raw_spin_lock_irqsave+0x3f/0x53
write_msg+0x4b/0x114
console_unlock+0x395/0x440
vprintk_emit+0x57/0x8d
vprintk_default+0x10/0x15
vprintk+0x49/0x50
_printk+0x14/0x19
perf_duration_warn+0x40/0x45
irq_work_single+0x3c/0x7b
irq_work_run_list+0x2b/0x39
irq_work_run+0x14/0x27
__sysvec_irq_work+0x27/0x2e
sysvec_irq_work+0x18/0x2b
handle_exception_return+0x0/0xad
free_unref_page_list+0x199/0x1a5
release_pages+0x289/0x291
__pagevec_release+0x29/0x40
__invalidate_mapping_pages+0x14b/0x18d
invalidate_mapping_pages+0x10/0x16
btrfs_relocate_block_group+0x1e8/0x2a7
btrfs_relocate_chunk+0x46/0x139
btrfs_shrink_device+0x36c/0x6b8
btrfs_rm_device+0x1ed/0x625
btrfs_ioctl_rm_dev_v2+0x120/0x19f
btrfs_ioctl+0x730/0x737
__ia32_sys_ioctl+0x99d/0x9e8
__do_fast_syscall_32+0xbc/0xe1
do_fast_syscall_32+0x29/0x59
do_SYSENTER_32+0x12/0x14
entry_SYSENTER_32+0x98/0xf1
IN-SOFTIRQ-W at:
lock_acquire+0xa7/0x165
_raw_spin_lock_irqsave+0x3f/0x53
write_msg+0x4b/0x114
console_unlock+0x395/0x440
vprintk_emit+0x57/0x8d
vprintk_default+0x10/0x15
vprintk+0x49/0x50
_printk+0x14/0x19
crng_finalize_init+0xc8/0x118
crng_reseed+0x137/0x141
credit_entropy_bits+0xb6/0xbe
add_timer_randomness+0x8b/0x93
add_disk_randomness+0x27/0x40
scsi_end_request+0x42/0x221
scsi_io_completion+0x51/0xd6
scsi_finish_command+0x9a/0xa2
scsi_complete+0x62/0x9b
blk_complete_reqs+0x28/0x32
blk_done_softirq+0x14/0x16
__do_softirq+0x136/0x27f
do_softirq_own_stack+0x50/0x56
irq_exit_rcu+0x96/0xa9
common_interrupt+0x29/0x3c
asm_common_interrupt+0xff/0x180
default_idle+0xe/0x10
arch_cpu_idle+0xd/0xf
default_idle_call+0x48/0x7c
do_idle+0xc7/0x1f2
cpu_startup_entry+0x17/0x19
rest_init+0x1af/0x1b2
arch_call_rest_init+0x8/0xa
start_kernel+0x2c6/0x33e
i386_start_kernel+0x94/0x97
startup_32_smp+0x161/0x164
INITIAL USE at:
lock_acquire+0xa7/0x165
_raw_spin_lock_irqsave+0x3f/0x53
init_netconsole+0x13f/0x208
do_one_initcall+0xae/0x16d
do_initcall_level+0x7b/0x8d
do_initcalls+0x3c/0x5d
do_basic_setup+0x17/0x19
kernel_init_freeable+0x156/0x1a2
kernel_init+0x13/0x11a
ret_from_fork+0x1c/0x28
}
... key at: [<daf40608>] target_list_lock+0x10/0x7c
the dependencies between the lock to be acquired
and HARDIRQ-irq-unsafe lock:
-> (&tp->rx_lock){+.-.}-{2:2} ops: 232598 {
HARDIRQ-ON-W at:
lock_acquire+0xa7/0x165
_raw_spin_lock+0x23/0x31
rtl8139_poll+0x2b/0x404
__napi_poll+0x26/0x13d
net_rx_action+0xf0/0x213
__do_softirq+0x136/0x27f
do_softirq_own_stack+0x50/0x56
irq_exit_rcu+0x96/0xa9
common_interrupt+0x29/0x3c
asm_common_interrupt+0xff/0x180
default_idle+0xe/0x10
arch_cpu_idle+0xd/0xf
default_idle_call+0x48/0x7c
do_idle+0xc7/0x1f2
cpu_startup_entry+0x17/0x19
rest_init+0x1af/0x1b2
arch_call_rest_init+0x8/0xa
start_kernel+0x2c6/0x33e
i386_start_kernel+0x94/0x97
startup_32_smp+0x161/0x164
IN-SOFTIRQ-W at:
lock_acquire+0xa7/0x165
_raw_spin_lock+0x23/0x31
rtl8139_poll+0x2b/0x404
__napi_poll+0x26/0x13d
net_rx_action+0xf0/0x213
__do_softirq+0x136/0x27f
do_softirq_own_stack+0x50/0x56
irq_exit_rcu+0x96/0xa9
common_interrupt+0x29/0x3c
asm_common_interrupt+0xff/0x180
default_idle+0xe/0x10
arch_cpu_idle+0xd/0xf
default_idle_call+0x48/0x7c
do_idle+0xc7/0x1f2
cpu_startup_entry+0x17/0x19
rest_init+0x1af/0x1b2
arch_call_rest_init+0x8/0xa
start_kernel+0x2c6/0x33e
i386_start_kernel+0x94/0x97
startup_32_smp+0x161/0x164
INITIAL USE at:
lock_acquire+0xa7/0x165
_raw_spin_lock+0x23/0x31
rtl8139_poll+0x2b/0x404
__napi_poll+0x26/0x13d
net_rx_action+0xf0/0x213
__do_softirq+0x136/0x27f
do_softirq_own_stack+0x50/0x56
irq_exit_rcu+0x96/0xa9
common_interrupt+0x29/0x3c
asm_common_interrupt+0xff/0x180
default_idle+0xe/0x10
arch_cpu_idle+0xd/0xf
default_idle_call+0x48/0x7c
do_idle+0xc7/0x1f2
cpu_startup_entry+0x17/0x19
rest_init+0x1af/0x1b2
arch_call_rest_init+0x8/0xa
start_kernel+0x2c6/0x33e
i386_start_kernel+0x94/0x97
startup_32_smp+0x161/0x164
}
... key at: [<db65be44>] rtl8139_init_one.__key.13+0x0/0x8
... acquired at:
lock_acquire+0xa7/0x165
_raw_spin_lock+0x23/0x31
rtl8139_poll+0x2b/0x404
netpoll_poll_dev+0xd7/0x161
netpoll_send_skb+0x1d7/0x27d
netpoll_send_udp+0x3f1/0x432
write_msg+0xbf/0x114
console_unlock+0x395/0x440
vprintk_emit+0x57/0x8d
devkmsg_emit+0x1f/0x24
devkmsg_write+0xff/0x122
do_iter_readv_writev+0x118/0x138
do_iter_write+0x60/0xf0
vfs_writev+0x66/0x11c
do_writev+0x68/0xde
__ia32_sys_writev+0x12/0x17
__do_fast_syscall_32+0xbc/0xe1
do_fast_syscall_32+0x29/0x59
do_SYSENTER_32+0x12/0x14
entry_SYSENTER_32+0x98/0xf1
stack backtrace:
CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.17.0-rc3-P4 #1
Hardware name: /FS51, BIOS 6.00 PG 12/02/2003
Call Trace:
dump_stack_lvl+0x77/0xae
dump_stack+0xd/0x12
validate_chain+0x1ee2/0x2125
__lock_acquire+0x8d3/0xbb6
lock_acquire+0xa7/0x165
? rtl8139_poll+0x2b/0x404
_raw_spin_lock+0x23/0x31
? rtl8139_poll+0x2b/0x404
rtl8139_poll+0x2b/0x404
? __irq_put_desc_unlock+0x11/0x2f
? enable_irq+0x3e/0x57
netpoll_poll_dev+0xd7/0x161
netpoll_send_skb+0x1d7/0x27d
netpoll_send_udp+0x3f1/0x432
write_msg+0xbf/0x114
? write_ext_msg+0x20e/0x20e
console_unlock+0x395/0x440
vprintk_emit+0x57/0x8d
devkmsg_emit+0x1f/0x24
devkmsg_write+0xff/0x122
do_iter_readv_writev+0x118/0x138
do_iter_write+0x60/0xf0
vfs_writev+0x66/0x11c
do_writev+0x68/0xde
__ia32_sys_writev+0x12/0x17
__do_fast_syscall_32+0xbc/0xe1
? irqentry_exit+0x32/0x74
? irqentry_exit_to_user_mode+0x2a/0x2d
do_fast_syscall_32+0x29/0x59
do_SYSENTER_32+0x12/0x14
entry_SYSENTER_32+0x98/0xf1
EIP: 0xb7f41509
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 0f 1f 00 58 b8 77 00 00 00 cd 80 90 0f 1f
EAX: ffffffda EBX: 00000003 ECX: bfb274e4 EDX: 00000005
ESI: b7c21000 EDI: b7f36748 EBP: bfb2751b ESP: bfb27400
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200246
? asm_exc_nmi+0xad/0x10a
systemd-shutdown[1]: Unmounting file systems.
Created attachment 300770 [details]
kernel .config (5.18-rc3, PowerMac G4 DP)
Created attachment 300771 [details]
dmesg (5.18-rc3, PowerMac G4 DP)
====================================================== WARNING: possible circular locking dependency detected 5.18.0-rc3-PMacG4 #3 Not tainted ------------------------------------------------------ ksoftirqd/1/19 is trying to acquire lock: c1167d74 (console_owner){..-.}-{0:0}, at: console_unlock+0x210/0x79c but task is already holding lock: c1d05c50 (_xmit_ETHER#2){+.-.}-{2:2}, at: gem_poll+0x6ec/0x1194 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (_xmit_ETHER#2){+.-.}-{2:2}: _raw_spin_lock+0x40/0x58 gem_poll+0x794/0x1194 netpoll_poll_dev+0x154/0x360 netpoll_send_skb+0x384/0x3e8 write_msg+0x158/0x17c console_unlock+0x4d4/0x79c vprintk_emit+0x154/0x474 _printk+0x5c/0x84 register_console+0x28c/0x370 init_netconsole+0x2e4/0x41c do_one_initcall+0x6c/0x384 kernel_init_freeable+0x234/0x340 kernel_init+0x28/0x144 ret_from_kernel_thread+0x5c/0x64 -> #1 (target_list_lock){..-.}-{2:2}: _raw_spin_lock_irqsave+0x58/0x8c write_msg+0x78/0x17c console_unlock+0x4d4/0x79c register_console+0x260/0x370 init_netconsole+0x2e4/0x41c do_one_initcall+0x6c/0x384 kernel_init_freeable+0x234/0x340 kernel_init+0x28/0x144 ret_from_kernel_thread+0x5c/0x64 -> #0 (console_owner){..-.}-{0:0}: __lock_acquire+0x17b4/0x265c lock_acquire.part.0+0x110/0x2e4 console_unlock+0x250/0x79c vprintk_emit+0x154/0x474 dev_vprintk_emit+0x174/0x1b8 dev_printk_emit+0x50/0x78 netdev_err+0x74/0x9c gem_poll+0xbf4/0x1194 __napi_poll.constprop.0+0x4c/0x244 net_rx_action+0x244/0x728 __do_softirq+0x1a8/0x5a4 run_ksoftirqd+0x70/0x98 smpboot_thread_fn+0x1a8/0x21c kthread+0x108/0x10c ret_from_kernel_thread+0x5c/0x64 other info that might help us debug this: Chain exists of: console_owner --> target_list_lock --> _xmit_ETHER#2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(_xmit_ETHER#2); lock(target_list_lock); lock(_xmit_ETHER#2); lock(console_owner); *** DEADLOCK *** 2 locks held by ksoftirqd/1/19: #0: c1d05c50 (_xmit_ETHER#2){+.-.}-{2:2}, at: gem_poll+0x6ec/0x1194 #1: c1167c48 (console_lock){+.+.}-{0:0}, at: dev_vprintk_emit+0x174/0x1b8 stack backtrace: CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.18.0-rc3-PMacG4 #3 Call Trace: [e90799c0] [c06e87dc] dump_stack_lvl+0x80/0xc0 (unreliable) [e90799e0] [c00d3288] check_noncircular+0x188/0x1a0 [e9079a30] [c00d7770] __lock_acquire+0x17b4/0x265c [e9079ab0] [c00d536c] lock_acquire.part.0+0x110/0x2e4 [e9079b20] [c00e2228] console_unlock+0x250/0x79c [e9079bf0] [c00e3c4c] vprintk_emit+0x154/0x474 [e9079c30] [c07aaa24] dev_vprintk_emit+0x174/0x1b8 [e9079cb0] [c07aaab8] dev_printk_emit+0x50/0x78 [e9079ce0] [c08b9d9c] netdev_err+0x74/0x9c [e9079d20] [c082db58] gem_poll+0xbf4/0x1194 [e9079db0] [c08b5578] __napi_poll.constprop.0+0x4c/0x244 [e9079de0] [c08b59b4] net_rx_action+0x244/0x728 [e9079e50] [c0aa0398] __do_softirq+0x1a8/0x5a4 [e9079ec0] [c0057c54] run_ksoftirqd+0x70/0x98 [e9079ed0] [c0088114] smpboot_thread_fn+0x1a8/0x21c [e9079f00] [c0082088] kthread+0x108/0x10c [e9079f30] [c001e334] ret_from_kernel_thread+0x5c/0x64 Have not seen this on 5.18.x, 5.19.x nor on 6.0-rc on any of my affected systems. I think it's safe to close here. |