Bug 211877 - Make "unregister_netdevice: waiting for dev to become free" diagnostic useful
Status: RESOLVED CODE_FIX
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Sanitizers
Hardware: All Linux
Importance: P1 enhancement
Assignee: MM/Sanitizers virtual assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-21 14:29 UTC by Dmitry Vyukov
Modified: 2022-03-30 09:19 UTC

See Also:
Kernel Version: ALL
Subsystem:
Regression: No
Bisected commit-id:


Description Dmitry Vyukov 2021-02-21 14:29:48 UTC
syzkaller triggers a tremendous number of "unregister_netdevice: waiting for dev to become free" warnings:
https://syzkaller.appspot.com/bug?id=949ecf93b67ab1df8f890571d24ef9db50872c96
The warning comes from:
https://elixir.bootlin.com/linux/v5.11/source/net/core/dev.c#L10261
The warning is triggered after 10 seconds of waiting for the device to become free (that is, for all references to be dropped). While a 10-second wait generally should not happen in normal operation (the NETDEV_UNREGISTER notification should make everybody drop their references), the warning seems to fire falsely very often during fuzzing. At the very least, it's not possible to tell whether a given report is a false positive or not. All the messages are identical, so we can't, e.g., react only to the 10th such message. We raise other stall/hang timeouts to 100-140 seconds, and 3x more in qemu. 10 seconds is far too unreliable a timeout.
We used to ignore these messages entirely, but then real hangs are detected only as unhelpful "no output" reports.
We need to make this timeout configurable and/or fire a real WARNING after some timeout.
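For illustration only, here is a minimal userspace sketch of the idea: a polling wait loop whose warning threshold is a parameter instead of a hard-coded 10 seconds. This is not the kernel code. The names `wait_for_refs` and `simulate`, the 0.25-second poll interval, and the log text are all assumptions made for the example.

```python
import time


def wait_for_refs(get_refcount, warn_after_secs=10.0, poll_secs=0.25,
                  now=time.monotonic, sleep=time.sleep, log=print):
    """Poll until get_refcount() returns 0.

    Emits a log line each time warn_after_secs elapses without the
    references being dropped, and returns how many lines were emitted.
    All names here are hypothetical; this only models the behaviour.
    """
    warnings = 0
    last_warn = now()
    while True:
        refs = get_refcount()
        if refs == 0:
            return warnings
        if now() - last_warn >= warn_after_secs:
            log(f"waiting for device to become free, refcount = {refs}")
            warnings += 1
            last_warn = now()
        sleep(poll_secs)


def simulate(release_at, warn_after_secs):
    """Run wait_for_refs on a fake clock; refs drop at release_at seconds."""
    clock = {"t": 0.0}

    def now():
        return clock["t"]

    def sleep(secs):
        clock["t"] += secs  # advance the fake clock instead of sleeping

    def get_refcount():
        return 0 if clock["t"] >= release_at else 1

    return wait_for_refs(get_refcount, warn_after_secs,
                         now=now, sleep=sleep, log=lambda msg: None)


if __name__ == "__main__":
    # A slow but healthy teardown that takes 25 simulated seconds.
    print(simulate(25.0, 10.0))   # 2 messages with a 10-second threshold
    print(simulate(25.0, 100.0))  # 0 messages with a 100-second threshold
```

With the threshold raised to the range used for other stall/hang timeouts, a teardown that is merely slow produces no messages, while a genuine leak would still be reported eventually.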
If we keep the "unregister_netdevice: waiting for dev to become free" message and add a WARNING, then we need to change the message text somehow, so that syzkaller no longer considers it a bug (while still treating the old message as a bug on older kernels).
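A rough sketch of why the text would have to change, assuming a log classifier that matches on message patterns. This is not syzkaller's actual parser. The reworded message ("still waiting for ...") is invented for the example, and `classify` is a hypothetical helper.

```python
import re

# Old wording as quoted in this report; the device name varies.
OLD_MSG = re.compile(r"unregister_netdevice: waiting for \S+ to become free")
# Hypothetical reworded informational message (NOT real kernel text).
NEW_INFO_MSG = re.compile(r"unregister_netdevice: still waiting for \S+")
# Kernel WARNING reports start with this prefix.
KERNEL_WARNING = re.compile(r"WARNING: ")


def classify(line):
    """Return "bug" or "ignore" for a single console line."""
    if KERNEL_WARNING.search(line):
        return "bug"     # a real WARNING is always a bug
    if NEW_INFO_MSG.search(line):
        return "ignore"  # new kernels: informational only
    if OLD_MSG.search(line):
        return "bug"     # older kernels: the only signal available
    return "ignore"
```

Because the old and new patterns do not overlap, the same classifier can run against both old and new kernels without special-casing the kernel version.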
Comment 1 Dmitry Vyukov 2021-03-20 14:41:56 UTC
FTR mailed "net: make unregister netdev warning timeout configurable".
Comment 2 Andrey Konovalov 2022-03-29 18:00:47 UTC
The patch was merged [1], this issue is resolved, right? 

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5aa3afe107d9099fc0dea2acf82c3e3c8f0f20e2
Comment 3 Dmitry Vyukov 2022-03-30 09:19:38 UTC
Yes, +Eric's patches for debug refcounts in the net subsystem.
