Bug 212227 - unregister netdevice unexpected fail
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV6
Hardware: All Linux
Importance: P1 high
Assignee: Hideaki YOSHIFUJI
Depends on:
Reported: 2021-03-11 09:24 UTC by Yi Chen
Modified: 2021-03-19 13:00 UTC

See Also:
Kernel Version: 5.11.5
Regression: No
Bisected commit-id:


Description Yi Chen 2021-03-11 09:24:42 UTC
Unregistering a netdevice fails after sending SCTP traffic in a net namespace.

Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 2f9165cb5217 - nvme-pci: add quirks for Lexar 256GB SSD

ip netns add N
ip -n N link set lo up
ip -n N link add type veth
ip -n N addr add 2001:db8:ffff:21::1/64 dev veth0
ip -n N addr add 2001:db8:ffff:21::2/64 dev veth1
ip -n N link set veth0 up
ip -n N link set veth1 up
sleep 1

ip netns exec N sctp_test -H 2001:db8:ffff:21::2 -P 9999 -l &
sleep 1
ip netns exec N timeout 5 sctp_test -H 2001:db8:ffff:21::1 -P 6013 -h 2001:db8:ffff:21::2 -p 9999 -s -c 1 -x 1 -X 1
echo $?
ip netns del N

Then wait a few seconds and check dmesg:
[  422.889040] unregister_netdevice: waiting for veth0 to become free. Usage count = 1
[  433.039034] unregister_netdevice: waiting for veth0 to become free. Usage count = 1
[  443.158895] unregister_netdevice: waiting for veth0 to become free. Usage count = 1
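
To check for the hang without reading dmesg by eye, the messages above can be counted with a small helper. This is only a minimal sketch: count_unregister_waits is a hypothetical name (not part of any existing tool), and it matches only the exact message text shown above.

```shell
# Hypothetical helper: read kernel log text on stdin and print how many
# "unregister_netdevice: waiting for <dev> to become free" lines it contains.
# $1 is the device name, e.g. veth0.
count_unregister_waits() {
    # -F: match the string literally; -c: print the number of matching lines.
    # grep exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e".
    grep -cF "unregister_netdevice: waiting for $1 to become free" || true
}

# Usage on a live system (needs permission to read the kernel log):
#   dmesg | count_unregister_waits veth0
```

A non-zero count some seconds after "ip netns del N" suggests the device is still waiting for a reference to be dropped.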

Comment 1 Xin Long 2021-03-17 08:42:59 UTC
This is actually an SCTP bug: the pernet ctrlsock can hold the dst_entry and not release it until sctp_ctrlsock_ops->exit() is called. That is too late, because default_device_ops->exit() is called before it and hangs there waiting for the dst_entry to be released.

The fix should keep the ctrlsock from holding the dst_entry in sctp_packet_transmit():

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 6614c9fdc51e..a6aa17df09ef 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -584,13 +584,6 @@ int sctp_packet_transmit(struct sctp_packet *packet, gfp_t gfp)
                goto out;

-       rcu_read_lock();
-       if (__sk_dst_get(sk) != tp->dst) {
-               dst_hold(tp->dst);
-               sk_setup_caps(sk, tp->dst);
-       }
-       rcu_read_unlock();
        /* pack up chunks */
        pkt_count = sctp_packet_pack(packet, head, gso, gfp);
        if (!pkt_count) {
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 3fd06a27105d..5cb1aa5f067b 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -1135,6 +1135,7 @@ static void sctp_outq_flush_data(struct sctp_flush_ctx *ctx,

 static void sctp_outq_flush_transports(struct sctp_flush_ctx *ctx)
 {
+       struct sock *sk = ctx->asoc->base.sk;
        struct list_head *ltransport;
        struct sctp_packet *packet;
        struct sctp_transport *t;
@@ -1144,6 +1145,12 @@ static void sctp_outq_flush_transports(struct sctp_flush_ctx *ctx)
                t = list_entry(ltransport, struct sctp_transport, send_ready);
                packet = &t->packet;
                if (!sctp_packet_empty(packet)) {
+                       rcu_read_lock();
+                       if (t->dst && __sk_dst_get(sk) != t->dst) {
+                               dst_hold(t->dst);
+                               sk_setup_caps(sk, t->dst);
+                       }
+                       rcu_read_unlock();
                        error = sctp_packet_transmit(packet, ctx->gfp);
                        if (error < 0)
                                ctx->q->asoc->base.sk->sk_err = -error;

Moving sk_setup_caps() out of sctp_packet_transmit() also saves some redundant checks when packets are sent over only one transport at a time.

I will post it upstream soon.

Comment 3 Yi Chen 2021-03-19 13:00:18 UTC
Glad to see it resolved so quickly.
