Bug 65071

Summary: lockdep warning in pci_enable_sriov() path
Product: Drivers
Reporter: Bjorn Helgaas (bjorn)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
URL: http://lkml.kernel.org/r/CAE9FiQXYQEAZ=0sG6+2OdffBqfLS9MpoN1xviRR9aDbxPxcKxQ@mail.gmail.com
Kernel Version: 3.10-rc1
Subsystem:
Regression: No
Bisected commit-id:

Description Bjorn Helgaas 2013-11-16 00:00:37 UTC
Yinghai reported the following lockdep warning:

mlx4_core 0000:02:00.0: NOP command IRQ test passed

=============================================
[ INFO: possible recursive locking detected ]
3.10.0-rc1-yh-00114-gf59c98e-dirty #1588 Not tainted
---------------------------------------------
kworker/0:1/2285 is trying to acquire lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff810ab745>] flush_work+0x5/0x280

but task is already holding lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff810aabe2>] process_one_work+0x202/0x490

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock((&wfc.work));
  lock((&wfc.work));

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/0:1/2285:
 #0:  (events){.+.+.+}, at: [<ffffffff810aabe2>] process_one_work+0x202/0x490
 #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff810aabe2>] process_one_work+0x202/0x490
 #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff81765eea>] device_attach+0x2a/0xc0

stack backtrace:
CPU: 0 PID: 2285 Comm: kworker/0:1 Not tainted 3.10.0-rc1-yh-00114-gf59c98e-dirty #1588
Hardware name: Oracle Corporation  unknown       / , BIOS 11016600    05/17/2011
Workqueue: events work_for_cpu_fn
 ffffffff83350bc0 ffff881025c11778 ffffffff82093a74 ffff881025c11838
 ffffffff810ed194 ffff881025c117b8 ffff881025c38000 0000b787702301dc
 ffff881000000000 0000000000000002 ffffffff8322cba0 ffff881025c11878
Call Trace:
 [<ffffffff82093a74>] dump_stack+0x19/0x1b
 [<ffffffff810ed194>] validate_chain.isra.19+0x8f4/0x1210
 [<ffffffff810f0c40>] __lock_acquire+0xac0/0xce0
 [<ffffffff810f150a>] lock_acquire+0xda/0x130
 [<ffffffff810ab78c>] flush_work+0x4c/0x280
 [<ffffffff810aba42>] work_on_cpu+0x82/0x90
 [<ffffffff8151ebcf>] pci_device_probe+0xaf/0x110
 [<ffffffff8176608d>] driver_probe_device+0xdd/0x220
 [<ffffffff817662b3>] __device_attach+0x33/0x50
 [<ffffffff817640b6>] bus_for_each_drv+0x56/0xa0
 [<ffffffff81765f48>] device_attach+0x88/0xc0
 [<ffffffff81515b49>] pci_bus_add_device+0x39/0x60
 [<ffffffff81540605>] pci_bus_add_vf+0x25/0x40
 [<ffffffff81540834>] pci_bus_add_device_vfs+0xa4/0xe0
 [<ffffffff81c1faa6>] __mlx4_init_one+0xa96/0xc90
 [<ffffffff81c1fd0d>] mlx4_init_one+0x4d/0x60
 [<ffffffff8151e2db>] local_pci_probe+0x4b/0x80
 [<ffffffff810a7958>] work_for_cpu_fn+0x18/0x30
 [<ffffffff810aac6b>] process_one_work+0x28b/0x490
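
Comment 1 Bjorn Helgaas 2013-11-16 00:05:30 UTC
This problem occurs when a driver calls pci_enable_sriov() from its .probe() method. The PF probe is already running inside work_on_cpu(), and adding each new VF calls work_on_cpu() again for the VF probe, via this path: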

  pci_call_probe
    work_on_cpu(cpu, local_pci_probe, ...)                 # .probe() for PF
      driver .probe
        pci_enable_sriov
          ...
            pci_bus_add_device                             # add new VF
              ...
                pci_call_probe
                  work_on_cpu(cpu, local_pci_probe, ...)   # .probe() for VF
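
To make the nesting concrete, here is a toy userspace model in C. It is not kernel code: every toy_* name is hypothetical, and it assumes (as the warning above suggests) that all on-stack wfc.work items share one lock class, so waiting for a second one while a first is held looks like recursion to lockdep.

```c
#include <stdio.h>
#include <string.h>

/* Toy model only; all toy_* names are hypothetical, not kernel APIs. */
#define TOY_MAX_HELD  8
#define TOY_WFC_CLASS "(&wfc.work)"

struct toy_task {
    const char *held[TOY_MAX_HELD]; /* lock classes this task holds */
    int depth;
};

typedef void (*toy_probe_fn)(struct toy_task *self);

static int toy_reports; /* number of "possible recursive locking" reports */

static int toy_holds(const struct toy_task *t, const char *cls)
{
    for (int i = 0; i < t->depth; i++)
        if (strcmp(t->held[i], cls) == 0)
            return 1;
    return 0;
}

/* Models work_on_cpu(): the caller waits for the work item (the
 * flush_work() step) while a worker runs fn holding the item's class
 * (the process_one_work() step). */
static void toy_work_on_cpu(struct toy_task *caller, toy_probe_fn fn)
{
    struct toy_task worker = { .depth = 0 };

    /* Waiting on a class the caller already holds is what gets flagged. */
    if (toy_holds(caller, TOY_WFC_CLASS)) {
        toy_reports++;
        printf("possible recursive locking on %s\n", TOY_WFC_CLASS);
    }

    worker.held[worker.depth++] = TOY_WFC_CLASS;
    fn(&worker);
    worker.depth--;
}

/* VF probe: does nothing interesting in this model. */
static void toy_vf_probe(struct toy_task *self)
{
    (void)self;
}

/* PF probe that enables SR-IOV, so each VF is probed via a nested
 * toy_work_on_cpu() call from inside the worker. */
static void toy_pf_probe(struct toy_task *self)
{
    toy_work_on_cpu(self, toy_vf_probe);
}

/* Runs one top-level probe from a task holding no locks and returns
 * how many reports the model produced. */
static int toy_count_reports(toy_probe_fn top_level_probe)
{
    struct toy_task loader = { .depth = 0 };

    toy_reports = 0;
    toy_work_on_cpu(&loader, top_level_probe);
    return toy_reports;
}
```

In this model, toy_count_reports(toy_pf_probe) yields one report because the VF probe is queued from inside the PF worker, while toy_count_reports(toy_vf_probe) yields none. The model only tracks lock classes; it does not decide whether a real deadlock is possible.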

Drivers that support SR-IOV should implement the .sriov_configure() method
and enable SR-IOV there, because then users can use the sysfs interface to
configure each instance of a PF differently.  But some drivers enable
SR-IOV in .probe() so they can continue supporting module parameters for
the number of VFs.
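
A minimal sketch of what such a .sriov_configure() callback might look like follows. The struct pci_dev, pci_enable_sriov(), and pci_disable_sriov() definitions here are userspace stand-ins (assumptions, so the sketch compiles outside the kernel); in a real driver they come from <linux/pci.h>. The toy_sriov_configure name and the total_vfs limit are hypothetical.

```c
#include <errno.h>

/* Userspace stand-ins for kernel definitions; simplified assumptions,
 * not the real implementations. */
struct pci_dev {
    int enabled_vfs; /* VFs currently enabled */
    int total_vfs;   /* hypothetical per-device limit */
};

static int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{
    if (nr_virtfn < 0 || nr_virtfn > dev->total_vfs)
        return -EINVAL;
    dev->enabled_vfs = nr_virtfn;
    return 0;
}

static void pci_disable_sriov(struct pci_dev *dev)
{
    dev->enabled_vfs = 0;
}

/* Hypothetical driver callback, assigned to .sriov_configure in the
 * driver's struct pci_driver. The PCI core calls it when the user writes
 * to the sriov_numvfs sysfs attribute, i.e. outside .probe(). It returns
 * the number of VFs enabled, 0 after disabling, or a negative errno. */
static int toy_sriov_configure(struct pci_dev *dev, int num_vfs)
{
    int err;

    if (num_vfs == 0) {
        pci_disable_sriov(dev);
        return 0;
    }

    err = pci_enable_sriov(dev, num_vfs);
    return err ? err : num_vfs;
}
```

With this arrangement, VFs are enabled per PF by writing a count to that PF's sriov_numvfs file, so pci_enable_sriov() is no longer reached from the PF's .probe() path.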

Drivers that use SR-IOV but don't yet implement .sriov_configure()
include: be, cxgb4, efx, enic, mlx4, and vxge.