Created attachment 71312 [details]
Log for circular locking dependency

While running a CPU hotplug stress test + kernel compilation + pm_test (at the core level), unsafe locking scenarios are detected. Excerpt:

[  807.775666] =======================================================
[  807.775994] [ INFO: possible circular locking dependency detected ]
[  807.776253] 3.1.0-rc2 #1
[  807.776364] -------------------------------------------------------
[  807.776621] kworker/u:6/29543 is trying to acquire lock:
[  807.776915]  (alc_key){..-...}, at: [<ffffffff81168fa9>] kmem_cache_free+0x1a9/0x240
[  807.777415]
[  807.777417] but task is already holding lock:
[  807.777816]  (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff81169294>] __drain_alien_cache+0x64/0xa0
[  807.778398]
[  807.778399] which lock already depends on the new lock.
[  807.778401]
[  807.778975]
[  807.778976] the existing dependency chain (in reverse order) is:
[  807.779425]
[  807.779426] -> #1 (&(&parent->list_lock)->rlock){-.-...}:
[  807.779891]        [<ffffffff810ab50c>] validate_chain+0x6cc/0x7d0
[  807.780227]        [<ffffffff810ab914>] __lock_acquire+0x304/0x500
[  807.780557]        [<ffffffff810ac1d2>] lock_acquire+0xa2/0x130
[  807.780877]        [<ffffffff815349b6>] _raw_spin_lock+0x36/0x70
[  807.781208]        [<ffffffff81169294>] __drain_alien_cache+0x64/0xa0
[  807.781548]        [<ffffffff811698cb>] kfree+0x1db/0x2a0
[  807.781847]        [<ffffffff81169a21>] free_alien_cache+0x91/0xa0
[  807.782178]        [<ffffffff8152e9b9>] cpuup_prepare+0x168/0x1a9
[  807.782507]        [<ffffffff8152ea2f>] cpuup_callback+0x35/0xc5
[  807.782829]        [<ffffffff815393a4>] notifier_call_chain+0x94/0xd0
[  807.783173]        [<ffffffff8109770e>] __raw_notifier_call_chain+0xe/0x10
[  807.783535]        [<ffffffff8106d000>] __cpu_notify+0x20/0x40
[  807.783858]        [<ffffffff8152cf02>] _cpu_up+0x6e/0x10e
[  807.784172]        [<ffffffff8152d07b>] cpu_up+0xd9/0xec
[  807.784468]        [<ffffffff81e21bd6>] smp_init+0x41/0x96
[  807.784771]        [<ffffffff81e03791>] kernel_init+0x1ef/0x2a6
[  807.785092]        [<ffffffff81540184>] kernel_thread_helper+0x4/0x10
[  807.785433]
[  807.785434] -> #0 (alc_key){..-...}:
[  807.785824]        [<ffffffff810aae18>] check_prev_add+0x528/0x550
[  807.786156]        [<ffffffff810ab50c>] validate_chain+0x6cc/0x7d0
[  807.786488]        [<ffffffff810ab914>] __lock_acquire+0x304/0x500
[  807.786823]        [<ffffffff810ac1d2>] lock_acquire+0xa2/0x130
[  807.787143]        [<ffffffff815349b6>] _raw_spin_lock+0x36/0x70
[  807.787467]        [<ffffffff81168fa9>] kmem_cache_free+0x1a9/0x240
[  807.787802]        [<ffffffff81169094>] slab_destroy+0x54/0x80
[  807.788123]        [<ffffffff8116911d>] free_block+0x5d/0x170
[  807.788439]        [<ffffffff811692bc>] __drain_alien_cache+0x8c/0xa0
[  807.788778]        [<ffffffff811698cb>] kfree+0x1db/0x2a0
[  807.789087]        [<ffffffff8144aeb0>] skb_release_data+0xd0/0x100
[  807.789426]        [<ffffffff8144aefe>] __kfree_skb+0x1e/0xa0
[  807.789741]        [<ffffffff8144afb1>] consume_skb+0x31/0x80
[  807.790058]        [<ffffffffa01d4e74>] bnx2_free_skbs+0x234/0x390 [bnx2]
[  807.790416]        [<ffffffffa01d5096>] bnx2_suspend+0xc6/0xe0 [bnx2]
[  807.790758]        [<ffffffff812978a6>] pci_legacy_suspend+0x46/0xe0
[  807.791101]        [<ffffffff8129854d>] pci_pm_freeze+0xad/0xd0
[  807.791422]        [<ffffffff8135e7f6>] pm_op+0x136/0x1a0
[  807.791724]        [<ffffffff8135f18b>] __device_suspend+0x26b/0x2d0
[  807.792063]        [<ffffffff8136016f>] async_suspend+0x1f/0xa0
[  807.792384]        [<ffffffff81099634>] async_run_entry_fn+0x84/0x160
[  807.792726]        [<ffffffff8108938a>] process_one_work+0x1aa/0x520
[  807.793065]        [<ffffffff8108ba7b>] worker_thread+0x17b/0x3b0
[  807.793394]        [<ffffffff81090af6>] kthread+0xb6/0xc0
[  807.793698]        [<ffffffff81540184>] kernel_thread_helper+0x4/0x10
[  807.794041]
[  807.794042] other info that might help us debug this:
[  807.794043]
[  807.794602]  Possible unsafe locking scenario:
[  807.794603]
[  807.794995]        CPU0                    CPU1
[  807.795250]        ----                    ----
[  807.795505]   lock(&(&parent->list_lock)->rlock);
[  807.795789]                                lock(alc_key);
[  807.796101]                                lock(&(&parent->list_lock)->rlock);
[  807.796557]   lock(alc_key);
[  807.796771]
[  807.796772]  *** DEADLOCK ***
[  807.796773]
[  807.797252] 5 locks held by kworker/u:6/29543:
[  807.797503]  #0:  (events_unbound){.+.+.+}, at: [<ffffffff8108931d>] process_one_work+0x13d/0x520
[  807.798058]  #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff8108931d>] process_one_work+0x13d/0x520
[  807.798591]  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff8135efc3>] __device_suspend+0xa3/0x2d0
[  807.799160]  #3:  (&(&nc->lock)->rlock){-.-...}, at: [<ffffffff811698b4>] kfree+0x1c4/0x2a0
[  807.799672]  #4:  (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff81169294>] __drain_alien_cache+0x64/0xa0
[  807.800270]
[  807.800271] stack backtrace:
[  807.800608] Pid: 29543, comm: kworker/u:6 Not tainted 3.1.0-rc2 #1
[  807.800929] Call Trace:
[  807.801115]  [<ffffffff810a8e39>] print_circular_bug+0x109/0x110
[  807.801428]  [<ffffffff810aae18>] check_prev_add+0x528/0x550
[  807.801728]  [<ffffffff810ab50c>] validate_chain+0x6cc/0x7d0
[  807.802040]  [<ffffffff8101a3f9>] ? sched_clock+0x9/0x10
[  807.802327]  [<ffffffff8109839d>] ? sched_clock_cpu+0xcd/0x110
[  807.802635]  [<ffffffff810ab914>] __lock_acquire+0x304/0x500
[  807.802938]  [<ffffffff810ac1d2>] lock_acquire+0xa2/0x130
[  807.803230]  [<ffffffff81168fa9>] ? kmem_cache_free+0x1a9/0x240
[  807.803541]  [<ffffffff815349b6>] _raw_spin_lock+0x36/0x70
[  807.803835]  [<ffffffff81168fa9>] ? kmem_cache_free+0x1a9/0x240
[  807.804147]  [<ffffffff81168fa9>] kmem_cache_free+0x1a9/0x240
[  807.804453]  [<ffffffff81169094>] slab_destroy+0x54/0x80
[  807.804740]  [<ffffffff8116911d>] free_block+0x5d/0x170
[  807.805027]  [<ffffffff811692bc>] __drain_alien_cache+0x8c/0xa0
[  807.805337]  [<ffffffff811698cb>] kfree+0x1db/0x2a0
[  807.805610]  [<ffffffff8144aeb0>] skb_release_data+0xd0/0x100
[  807.805916]  [<ffffffff8144aefe>] __kfree_skb+0x1e/0xa0
[  807.806201]  [<ffffffff8144afb1>] consume_skb+0x31/0x80
[  807.806487]  [<ffffffffa01d4e74>] bnx2_free_skbs+0x234/0x390 [bnx2]
[  807.806813]  [<ffffffffa01d5096>] bnx2_suspend+0xc6/0xe0 [bnx2]
[  807.807130]  [<ffffffff812978a6>] pci_legacy_suspend+0x46/0xe0
[  807.807445]  [<ffffffff8129854d>] pci_pm_freeze+0xad/0xd0
[  807.807736]  [<ffffffff8135e7f6>] pm_op+0x136/0x1a0
[  807.808015]  [<ffffffff8135f18b>] __device_suspend+0x26b/0x2d0
[  807.808334]  [<ffffffff8136016f>] async_suspend+0x1f/0xa0
[  807.808625]  [<ffffffff81099634>] async_run_entry_fn+0x84/0x160
[  807.808943]  [<ffffffff8108938a>] process_one_work+0x1aa/0x520
[  807.809255]  [<ffffffff8108931d>] ? process_one_work+0x13d/0x520
[  807.809568]  [<ffffffff810995b0>] ? async_schedule+0x20/0x20
[  807.809870]  [<ffffffff8108ba7b>] worker_thread+0x17b/0x3b0
[  807.810167]  [<ffffffff8108b900>] ? manage_workers+0x120/0x120
[  807.810476]  [<ffffffff81090af6>] kthread+0xb6/0xc0
[  807.810746]  [<ffffffff810aa5fd>] ? trace_hardirqs_on_caller+0x10d/0x1a0
[  807.811092]  [<ffffffff81540184>] kernel_thread_helper+0x4/0x10
[  807.811404]  [<ffffffff81535774>] ? retint_restore_args+0x13/0x13
[  807.811723]  [<ffffffff81090a40>] ? __init_kthread_worker+0x70/0x70
[  807.812053]  [<ffffffff81540180>] ? gs_change+0x13/0x13
Does the problem still exist in the latest upstream kernel?
Bug closed as there was no response from the bug reporter. Please feel free to reopen it if the problem still exists in the latest upstream kernel.