Bug 42242

Summary: Circular locking dependency detected during cpu hotplug + pm_test + kernel compilation
Product: Power Management
Reporter: Srivatsa S. Bhat (srivatsa)
Component: Hibernation/Suspend
Assignee: power-management_other
Status: CLOSED INSUFFICIENT_DATA
Severity: normal
CC: rui.zhang, srivatsa
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.1.0-rc2
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Log for circular locking dependency

Description Srivatsa S. Bhat 2011-09-02 10:51:18 UTC
Created attachment 71312: Log for circular locking dependency

While running a CPU hotplug stress test together with a kernel compilation and pm_test (at the "core" level), lockdep reports a possible circular locking dependency.

Excerpt:
 [  807.775666] =======================================================
 [  807.775994] [ INFO: possible circular locking dependency detected ]
 [  807.776253] 3.1.0-rc2 #1
 [  807.776364] -------------------------------------------------------
 [  807.776621] kworker/u:6/29543 is trying to acquire lock:
 [  807.776915]  (alc_key){..-...}, at: [<ffffffff81168fa9>] kmem_cache_free+0x1a9/0x240
 [  807.777415] 
 [  807.777417] but task is already holding lock:
 [  807.777816]  (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff81169294>] __drain_alien_cache+0x64/0xa0
 [  807.778398] 
 [  807.778399] which lock already depends on the new lock.
 [  807.778401] 
 [  807.778975] 
 [  807.778976] the existing dependency chain (in reverse order) is:
 [  807.779425] 
 [  807.779426] -> #1 (&(&parent->list_lock)->rlock){-.-...}:
 [  807.779891]        [<ffffffff810ab50c>] validate_chain+0x6cc/0x7d0
 [  807.780227]        [<ffffffff810ab914>] __lock_acquire+0x304/0x500
 [  807.780557]        [<ffffffff810ac1d2>] lock_acquire+0xa2/0x130
 [  807.780877]        [<ffffffff815349b6>] _raw_spin_lock+0x36/0x70
 [  807.781208]        [<ffffffff81169294>] __drain_alien_cache+0x64/0xa0
 [  807.781548]        [<ffffffff811698cb>] kfree+0x1db/0x2a0
 [  807.781847]        [<ffffffff81169a21>] free_alien_cache+0x91/0xa0
 [  807.782178]        [<ffffffff8152e9b9>] cpuup_prepare+0x168/0x1a9
 [  807.782507]        [<ffffffff8152ea2f>] cpuup_callback+0x35/0xc5
 [  807.782829]        [<ffffffff815393a4>] notifier_call_chain+0x94/0xd0
 [  807.783173]        [<ffffffff8109770e>] __raw_notifier_call_chain+0xe/0x10
 [  807.783535]        [<ffffffff8106d000>] __cpu_notify+0x20/0x40
 [  807.783858]        [<ffffffff8152cf02>] _cpu_up+0x6e/0x10e
 [  807.784172]        [<ffffffff8152d07b>] cpu_up+0xd9/0xec
 [  807.784468]        [<ffffffff81e21bd6>] smp_init+0x41/0x96
 [  807.784771]        [<ffffffff81e03791>] kernel_init+0x1ef/0x2a6
 [  807.785092]        [<ffffffff81540184>] kernel_thread_helper+0x4/0x10
 [  807.785433] 
 [  807.785434] -> #0 (alc_key){..-...}:
 [  807.785824]        [<ffffffff810aae18>] check_prev_add+0x528/0x550
 [  807.786156]        [<ffffffff810ab50c>] validate_chain+0x6cc/0x7d0
 [  807.786488]        [<ffffffff810ab914>] __lock_acquire+0x304/0x500
 [  807.786823]        [<ffffffff810ac1d2>] lock_acquire+0xa2/0x130
 [  807.787143]        [<ffffffff815349b6>] _raw_spin_lock+0x36/0x70
 [  807.787467]        [<ffffffff81168fa9>] kmem_cache_free+0x1a9/0x240
 [  807.787802]        [<ffffffff81169094>] slab_destroy+0x54/0x80
 [  807.788123]        [<ffffffff8116911d>] free_block+0x5d/0x170
 [  807.788439]        [<ffffffff811692bc>] __drain_alien_cache+0x8c/0xa0
 [  807.788778]        [<ffffffff811698cb>] kfree+0x1db/0x2a0
 [  807.789087]        [<ffffffff8144aeb0>] skb_release_data+0xd0/0x100
 [  807.789426]        [<ffffffff8144aefe>] __kfree_skb+0x1e/0xa0
 [  807.789741]        [<ffffffff8144afb1>] consume_skb+0x31/0x80
 [  807.790058]        [<ffffffffa01d4e74>] bnx2_free_skbs+0x234/0x390 [bnx2]
 [  807.790416]        [<ffffffffa01d5096>] bnx2_suspend+0xc6/0xe0 [bnx2]
 [  807.790758]        [<ffffffff812978a6>] pci_legacy_suspend+0x46/0xe0
 [  807.791101]        [<ffffffff8129854d>] pci_pm_freeze+0xad/0xd0
 [  807.791422]        [<ffffffff8135e7f6>] pm_op+0x136/0x1a0
 [  807.791724]        [<ffffffff8135f18b>] __device_suspend+0x26b/0x2d0
 [  807.792063]        [<ffffffff8136016f>] async_suspend+0x1f/0xa0
 [  807.792384]        [<ffffffff81099634>] async_run_entry_fn+0x84/0x160
 [  807.792726]        [<ffffffff8108938a>] process_one_work+0x1aa/0x520
 [  807.793065]        [<ffffffff8108ba7b>] worker_thread+0x17b/0x3b0
 [  807.793394]        [<ffffffff81090af6>] kthread+0xb6/0xc0
 [  807.793698]        [<ffffffff81540184>] kernel_thread_helper+0x4/0x10
 [  807.794041] 
 [  807.794042] other info that might help us debug this:
 [  807.794043] 
 [  807.794602]  Possible unsafe locking scenario:
 [  807.794603] 
 [  807.794995]        CPU0                    CPU1
 [  807.795250]        ----                    ----
 [  807.795505]   lock(&(&parent->list_lock)->rlock);
 [  807.795789]                                lock(alc_key);
 [  807.796101]                                lock(&(&parent->list_lock)->rlock);
 [  807.796557]   lock(alc_key);
 [  807.796771] 
 [  807.796772]  *** DEADLOCK ***
 [  807.796773] 
 [  807.797252] 5 locks held by kworker/u:6/29543:
 [  807.797503]  #0:  (events_unbound){.+.+.+}, at: [<ffffffff8108931d>] process_one_work+0x13d/0x520
 [  807.798058]  #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff8108931d>] process_one_work+0x13d/0x520
 [  807.798591]  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff8135efc3>] __device_suspend+0xa3/0x2d0
 [  807.799160]  #3:  (&(&nc->lock)->rlock){-.-...}, at: [<ffffffff811698b4>] kfree+0x1c4/0x2a0
 [  807.799672]  #4:  (&(&parent->list_lock)->rlock){-.-...}, at: [<ffffffff81169294>] __drain_alien_cache+0x64/0xa0
 [  807.800270] 
 [  807.800271] stack backtrace:
 [  807.800608] Pid: 29543, comm: kworker/u:6 Not tainted 3.1.0-rc2 #1
 [  807.800929] Call Trace:
 [  807.801115]  [<ffffffff810a8e39>] print_circular_bug+0x109/0x110
 [  807.801428]  [<ffffffff810aae18>] check_prev_add+0x528/0x550
 [  807.801728]  [<ffffffff810ab50c>] validate_chain+0x6cc/0x7d0
 [  807.802040]  [<ffffffff8101a3f9>] ? sched_clock+0x9/0x10
 [  807.802327]  [<ffffffff8109839d>] ? sched_clock_cpu+0xcd/0x110
 [  807.802635]  [<ffffffff810ab914>] __lock_acquire+0x304/0x500
 [  807.802938]  [<ffffffff810ac1d2>] lock_acquire+0xa2/0x130
 [  807.803230]  [<ffffffff81168fa9>] ? kmem_cache_free+0x1a9/0x240
 [  807.803541]  [<ffffffff815349b6>] _raw_spin_lock+0x36/0x70
 [  807.803835]  [<ffffffff81168fa9>] ? kmem_cache_free+0x1a9/0x240
 [  807.804147]  [<ffffffff81168fa9>] kmem_cache_free+0x1a9/0x240
 [  807.804453]  [<ffffffff81169094>] slab_destroy+0x54/0x80
 [  807.804740]  [<ffffffff8116911d>] free_block+0x5d/0x170
 [  807.805027]  [<ffffffff811692bc>] __drain_alien_cache+0x8c/0xa0
 [  807.805337]  [<ffffffff811698cb>] kfree+0x1db/0x2a0
 [  807.805610]  [<ffffffff8144aeb0>] skb_release_data+0xd0/0x100
 [  807.805916]  [<ffffffff8144aefe>] __kfree_skb+0x1e/0xa0
 [  807.806201]  [<ffffffff8144afb1>] consume_skb+0x31/0x80
 [  807.806487]  [<ffffffffa01d4e74>] bnx2_free_skbs+0x234/0x390 [bnx2]
 [  807.806813]  [<ffffffffa01d5096>] bnx2_suspend+0xc6/0xe0 [bnx2]
 [  807.807130]  [<ffffffff812978a6>] pci_legacy_suspend+0x46/0xe0
 [  807.807445]  [<ffffffff8129854d>] pci_pm_freeze+0xad/0xd0
 [  807.807736]  [<ffffffff8135e7f6>] pm_op+0x136/0x1a0
 [  807.808015]  [<ffffffff8135f18b>] __device_suspend+0x26b/0x2d0
 [  807.808334]  [<ffffffff8136016f>] async_suspend+0x1f/0xa0
 [  807.808625]  [<ffffffff81099634>] async_run_entry_fn+0x84/0x160
 [  807.808943]  [<ffffffff8108938a>] process_one_work+0x1aa/0x520
 [  807.809255]  [<ffffffff8108931d>] ? process_one_work+0x13d/0x520
 [  807.809568]  [<ffffffff810995b0>] ? async_schedule+0x20/0x20
 [  807.809870]  [<ffffffff8108ba7b>] worker_thread+0x17b/0x3b0
 [  807.810167]  [<ffffffff8108b900>] ? manage_workers+0x120/0x120
 [  807.810476]  [<ffffffff81090af6>] kthread+0xb6/0xc0
 [  807.810746]  [<ffffffff810aa5fd>] ? trace_hardirqs_on_caller+0x10d/0x1a0
 [  807.811092]  [<ffffffff81540184>] kernel_thread_helper+0x4/0x10
 [  807.811404]  [<ffffffff81535774>] ? retint_restore_args+0x13/0x13
 [  807.811723]  [<ffffffff81090a40>] ? __init_kthread_worker+0x70/0x70
 [  807.812053]  [<ffffffff81540180>] ? gs_change+0x13/0x13
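
The "possible unsafe locking scenario" in the excerpt is an ordering inversion between two lock classes: the existing chain records list_lock being taken in a context that already depends on alc_key, and the new acquisition takes alc_key while list_lock is held. A minimal sketch of that kind of check follows. It is a toy model, not lockdep's implementation; the class and method names are hypothetical, and the first recorded ordering is an assumption read off the "which lock already depends on the new lock" message.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy model of lock-order tracking (hypothetical, not lockdep itself)."""

    def __init__(self):
        # edges[a] holds every lock class seen acquired while a was held.
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over recorded "held -> acquired" edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns True if this ordering closes a cycle, i.e. `held` was
        already reachable from `new` through earlier recorded orderings.
        """
        circular = self._reachable(new, held)
        self.edges[held].add(new)
        return circular


graph = LockOrderGraph()
# Assumed earlier ordering: list_lock taken while alc_key is held.
first = graph.acquire("alc_key", "list_lock")
# Ordering from the report: alc_key taken while list_lock is held.
second = graph.acquire("list_lock", "alc_key")
```

In this model `first` is not flagged and `second` is, because the second acquisition reverses an ordering that was already recorded. A report of this kind says that the two orderings were both observed, not that a deadlock actually occurred.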
Comment 1 Zhang Rui 2012-01-18 05:53:36 UTC
Does the problem still exist in the latest upstream kernel?
Comment 2 Zhang Rui 2012-05-24 08:09:03 UTC
Bug closed as there has been no response from the bug reporter.
Please feel free to reopen it if the problem still exists in the latest upstream kernel.