Bug 64911 - [bisected]spinlock bad magic on br_stp_rcv
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-11-13 09:49 UTC by Alexander Y. Fomichev
Modified: 2014-05-16 09:21 UTC

See Also:
Kernel Version: 3.12.0 (mainline), >=3.10.17 (stable)
Subsystem:
Regression: No
Bisected commit-id:


Attachments
spinlock bad magic trace (136.26 KB, application/octet-stream)
2013-11-13 09:49 UTC, Alexander Y. Fomichev
check for br_port_exists in br_stp_rcv (477 bytes, patch)
2013-11-13 09:57 UTC, Alexander Y. Fomichev

Description Alexander Y. Fomichev 2013-11-13 09:49:44 UTC
Created attachment 114501 [details]
spinlock bad magic trace

Hello,
A few days ago I tried to switch to 3.10.18 and hit "spinlock bad magic".

[    6.120692] Freeing unused kernel memory: 1144k freed
[   19.062504] BUG: spinlock bad magic on CPU#6, swapper/6/0
[   19.068011]  lock: 0xffff8810529ec000, .magic: 3a310000, .owner: <none>/-1, .owner_cpu: 825110576
[   19.076984] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.10.17 #1
[   19.083065] Hardware name: Supermicro X9DRG-HF/X9DRG-HF, BIOS 1.0c 08/22/2012
[   19.090278]  0000000000000000 ffff88107fc03b70 ffffffff817823c4 ffff88107fc03b90
[   19.098057]  ffffffff81782450 ffff8810529ec000 ffffffff81a21568 ffff88107fc03bb0
[   19.105841]  ffffffff81782476 ffff8810529ec000 ffff8810529ec000 ffff88107fc03bd0
[   19.113624] Call Trace:
[   19.116149]  <IRQ>  [<ffffffff817823c4>] dump_stack+0x19/0x1b
[   19.122114]  [<ffffffff81782450>] spin_dump+0x8a/0x8f
[   19.127242]  [<ffffffff81782476>] spin_bug+0x21/0x26
[   19.132291]  [<ffffffff81357f22>] do_raw_spin_lock+0xb2/0xc0
[   19.138029]  [<ffffffff8178db8e>] _raw_spin_lock+0xe/0x10
[   19.143506]  [<ffffffff81760d03>] br_stp_rcv+0x73/0x370
....................................

It looks like br->lock is uninitialized in br_stp_rcv:
./net/bridge/br_stp_bpdu.c +158

        p = br_port_get_rcu(dev);

        br = p->br;
        spin_lock(&br->lock); <- here
-----------------------------------------------
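For readers unfamiliar with the message: with spinlock debugging enabled, each initialized lock carries a magic value (SPINLOCK_MAGIC, 0xdead4ead), and taking a lock whose magic field does not match triggers the report. The following is only a rough userspace sketch of that check, not the kernel code; struct dbg_spinlock, dbg_spin_lock_init and dbg_spin_lock are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic value the kernel's spinlock debugging stores in an initialized lock. */
#define SPINLOCK_MAGIC 0xdead4eadu

/* Hypothetical stand-in for a debug-enabled spinlock; not the kernel's type. */
struct dbg_spinlock {
    uint32_t magic;
    int locked;
};

/* Initialization is the only place where the magic value gets written. */
static void dbg_spin_lock_init(struct dbg_spinlock *lock)
{
    assert(lock != NULL);
    lock->magic = SPINLOCK_MAGIC;
    lock->locked = 0;
}

/* Returns 0 on success, -1 where the kernel would report "bad magic". */
static int dbg_spin_lock(struct dbg_spinlock *lock)
{
    assert(lock != NULL);
    if (lock->magic != SPINLOCK_MAGIC)
        return -1;
    lock->locked = 1;
    return 0;
}
```

In this model, memory that was never set up as a lock (for example a magic field of 3a310000, as in the trace above) fails the comparison, which would be consistent with br pointing at something that is not a bridge.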

Bisecting leads me to 716ec052d2280d511e10e90ad54a86f5b5d4dcc2 (mainline) / 960b8e5018a552f62cfbc0dfe94be7b6ba178f13 (stable).
Before that commit br_port_get_rcu had an internal check for br_port_exists; without it a scenario like this is possible:

- STP broadcasts to 01:80:c2:00:00:00 arrive on some interface
- the bridge module is loaded but there is no bridge on that interface (br->lock is not initialized)
- the interface is in promiscuous mode
 (llc_rcv drops PACKET_OTHERHOST to protect us in promiscuous mode, but apparently not broadcasts like this)
- rx_handler_data is initialised (it was a macvlan in my case, but it could be team or something else)

So it seems that STP needs its own IFF_BRIDGE_PORT check.
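To make the scenario above concrete, here is a small userspace model, assuming a heavily simplified net_device. It is a sketch of the idea only, not the attached patch and not the kernel code: the struct layouts, the flag value, and the names port_get_unchecked, port_get_checked and stp_rcv_model are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag value for this model only. */
#define IFF_BRIDGE_PORT 0x4000u

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct bridge { int lock_initialized; };
struct bridge_port { struct bridge *br; };
struct net_device {
    unsigned int priv_flags;
    void *rx_handler_data;   /* owned by whichever rx_handler is registered */
};

/* Unchecked lookup: blindly treats rx_handler_data as a bridge port. */
static struct bridge_port *port_get_unchecked(const struct net_device *dev)
{
    return (struct bridge_port *)dev->rx_handler_data;
}

/* Checked lookup: only trusts rx_handler_data on a real bridge port. */
static struct bridge_port *port_get_checked(const struct net_device *dev)
{
    return (dev->priv_flags & IFF_BRIDGE_PORT) ? port_get_unchecked(dev) : NULL;
}

/* Model of the STP receive path: 1 if the BPDU is processed, 0 if dropped. */
static int stp_rcv_model(const struct net_device *dev)
{
    struct bridge_port *p = port_get_checked(dev);

    if (p == NULL)
        return 0;            /* not a bridge port: never touch p->br->lock */
    return p->br->lock_initialized;
}
```

With a device whose rx_handler_data belongs to something else (macvlan in my case), the unchecked lookup still returns a non-NULL pointer, while the checked one returns NULL and the frame is dropped before br->lock is touched.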
Comment 1 Alexander Y. Fomichev 2013-11-13 09:57:31 UTC
Created attachment 114511 [details]
check for br_port_exists in br_stp_rcv

A simple fix, assuming the easiest option is to check br_port_exists in br_stp_rcv as before.
Comment 2 Alexander Y. Fomichev 2014-05-16 09:21:20 UTC
Fixed by commit 859828c0ea476b42f3a93d69d117aaba90994b6f.
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1025770
