Bug 87591

Summary: Host prints a call trace (kernel oops) when loading igbvf.
Product: Drivers
Reporter: Zhou, Chao (chao.zhou)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: RESOLVED CODE_FIX
Severity: normal
CC: bjorn, bonzini, yinghai
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.18.0-rc2
Subsystem:
Regression: Yes
Bisected commit-id:

Description Zhou, Chao 2014-11-03 01:58:52 UTC
Environment:
------------
Host OS (ia32/ia32e/IA64):ia32e
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):Linux
kvm.git Commit:cac7f2429872d3733dc3f9915857b1691da2eb2f
qemu.git Commit:3e9418e160cd8901c83a3c88967158084f5b5c03
Host Kernel Version:3.18.0-rc2
Hardware:Haswell_EP, Ivytown_EP


Bug detailed description:
--------------------------
After igbvf is loaded, the host kernel oopses with a NULL pointer dereference and prints a call trace.

note:
this is a kernel bug
kvm.git  + qemu.git  = result
cac7f242 + 3e9418e1  = bad
da01e614 + 3e9418e1  = good


Reproduce steps:
----------------
1. Start up a host with kvm
2. rmmod igb
3. modprobe igb max_vfs=3
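
The log below notes that enabling VFs through the module parameter is deprecated in favor of the PCI sysfs interface. A minimal sketch of step 3 done that way, assuming the standard sriov_numvfs sysfs attribute and the PF address 0000:86:00.0 taken from the log. The helper name set_sriov_numvfs is hypothetical, and this route was not tested against this bug:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write the requested VF count to an sriov_numvfs-style sysfs attribute.
 * Returns 0 on success, -1 on failure.
 * (Hypothetical helper, not part of the original report.)
 */
int set_sriov_numvfs(const char *attr_path, unsigned int num_vfs)
{
	FILE *f;
	int ok;

	assert(attr_path != NULL);

	f = fopen(attr_path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%u\n", num_vfs) > 0;

	/*
	 * The buffered data reaches the file at fclose(), so a write
	 * rejected by the kernel shows up as an fclose() failure.
	 */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

On the reporter's host this would presumably be called as root, after loading igb without max_vfs, as set_sriov_numvfs("/sys/bus/pci/devices/0000:86:00.0/sriov_numvfs", 3).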

Current result:
----------------
The host prints a call trace after igbvf is loaded.

Expected result:
----------------
The host works fine after igbvf is loaded.

Basic root-causing log:
----------------------
igb 0000:86:00.0: Enabling SR-IOV VFs using the module parameter is deprecated - please use the pci sysfs interface.
igb 0000:86:00.0: irq 42 for MSI/MSI-X
igb 0000:86:00.0: irq 43 for MSI/MSI-X
igb 0000:86:00.0: irq 44 for MSI/MSI-X
pci 0000:87:10.0: [8086:10ca] type 00 class 0x020000
BUG: unable to handle kernel NULL pointer dereference at 00000000000002c8
IP: [<ffffffff8127fedb>] pci_get_hp_params+0x36/0x164
PGD 1036cd2067 PUD 103c31b067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: igb(+) nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs fscache lockd sunrpc grace kvm_intel kvm bridge stp llc autofs4 8021q cpufreq_ondemand ipv6 joydev microcode pcspkr i2c_algo_bit i2c_i801 i2c_core ehci_pci ehci_hcd xhci_pci xhci_hcd ixgbe ptp pps_core hwmon mdio tpm_tis tpm ipmi_si ipmi_msghandler acpi_cpufreq button dm_mirror dm_region_hash dm_log dm_mod [last unloaded: igb]
CPU: 4 PID: 7215 Comm: modprobe Not tainted 3.18.0-rc2 #2
Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.TP2.0025.R02.1403131625 03/13/2014
task: ffff88103e5aa010 ti: ffff881036e60000 task.ti: ffff881036e60000
RIP: 0010:[<ffffffff8127fedb>]  [<ffffffff8127fedb>] pci_get_hp_params+0x36/0x164
RSP: 0018:ffff881036e638a8  EFLAGS: 00010206
RAX: 0000000000000090 RBX: ffff881036e63918 RCX: 0000000000000000
RDX: ffff88203ecc0800 RSI: ffff881036e63918 RDI: ffff88103ed39000
RBP: ffff881036e63908 R08: ffff88203ecbd340 R09: 0000000000000008
R10: 0000000000000005 R11: 0000000000000000 R12: ffff88103ed39090
R13: ffff88103c202400 R14: ffff88103ee367c0 R15: 0000000000000000
FS:  00007f6dfaa3e700(0000) GS:ffff88107f480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002c8 CR3: 0000001034ffa000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff88103ed39490 0000000000000097 ffff881036e638f8 ffffffff813e9af3
 ffff881036e63944 ffffffff00000002 ffff881036e638e8 ffff88103ed39000
 ffff88103ed39090 ffff88103c202400 ffff88103ee367c0 0000000000000000
Call Trace:
 [<ffffffff813e9af3>] ? pci_conf1_read+0xdc/0xe9
 [<ffffffff81269245>] pci_configure_device+0x24/0x63
 [<ffffffff81267607>] ? pci_bus_read_config_word+0x6e/0x7c
 [<ffffffff812692e9>] pci_device_add+0x1f/0x114
 [<ffffffff8127f363>] virtfn_add.clone.1+0x223/0x328
 [<ffffffff8127f6ff>] sriov_enable+0x297/0x370
 [<ffffffff8127f806>] pci_enable_sriov+0x2e/0x30
 [<ffffffffa03de9f7>] igb_enable_sriov+0xdc/0x187 [igb]
 [<ffffffffa03e6530>] igb_pci_enable_sriov+0x11/0x2d [igb]
 [<ffffffffa03e66bf>] igb_sw_init+0x173/0x19d [igb]
 [<ffffffffa03e69d7>] igb_probe+0x2ee/0xb47 [igb]
 [<ffffffff8126fce1>] local_pci_probe+0x3a/0x81
 [<ffffffff8126fdab>] pci_call_probe+0x83/0x8e
 [<ffffffff8126fc5d>] ? pci_match_device+0xc7/0xe4
 [<ffffffff8126ffe4>] pci_device_probe+0x57/0x7d
 [<ffffffff813067db>] ? driver_sysfs_add+0x6e/0x93
 [<ffffffff813069e9>] really_probe+0x9c/0x1a9
 [<ffffffff81306b28>] driver_probe_device+0x32/0x4e
 [<ffffffff81306b9c>] __driver_attach+0x58/0x7c
 [<ffffffff81306b44>] ? driver_probe_device+0x4e/0x4e
 [<ffffffff813052ec>] bus_for_each_dev+0x53/0x91
 [<ffffffff8130676b>] driver_attach+0x19/0x1b
 [<ffffffff81305a83>] bus_add_driver+0xd7/0x1cf
 [<ffffffff813070e5>] driver_register+0x89/0xc1
 [<ffffffffa03fe000>] ? 0xffffffffa03fe000
 [<ffffffff812700b3>] __pci_register_driver+0x46/0x48
 [<ffffffffa03fe04f>] igb_init_module+0x4f/0x51 [igb]
 [<ffffffff810002ab>] do_one_initcall+0xe3/0x170
 [<ffffffff81117e3b>] ? __vunmap+0xad/0xb8
 [<ffffffff8109c12b>] do_init_module+0x2b/0x174
 [<ffffffff8109cfb0>] load_module+0x43e/0x569
 [<ffffffff8109c274>] ? do_init_module+0x174/0x174
 [<ffffffff8109b361>] ? module_sect_show+0x20/0x20
 [<ffffffff8109d1fb>] SyS_init_module+0x54/0x81
 [<ffffffff814ab792>] system_call_fastpath+0x12/0x17
Code: 54 53 48 89 f3 48 83 ec 38 48 8b 47 10 eb 4b 48 8b 50 10 48 85 d2 75 09 48 8b 80 10 01 00 00 eb 0a 48 8b 40 38 48 05 90 00 00 00 <48> 8b 80 38 02 00 00 48 85 c0 74 20 4c 8b 60 08 4d 85 e4 74 17
RIP  [<ffffffff8127fedb>] pci_get_hp_params+0x36/0x164
 RSP <ffff881036e638a8>
CR2: 00000000000002c8
---[ end trace 846b555b0bbcaaf1 ]---
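
The faulting address can be cross-checked against the register dump and the Code: bytes above. The bytes just before the marked instruction decode to "mov 0x38(%rax),%rax" and "add $0x90,%rax", and the marked instruction "<48> 8b 80 38 02 00 00" decodes to "mov 0x238(%rax),%rax". With RAX = 0x90 at the time of the fault, the pointer fetched from offset 0x38 must have been NULL. A minimal sketch of that arithmetic follows. The function names are hypothetical, and reading 0x38, 0x90 and 0x238 as structure field offsets is an assumption that only holds for this particular build:

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a "mov disp(%base),%reg" load. (Hypothetical helper.) */
uint64_t load_address(uint64_t base, uint64_t disp)
{
	return base + disp;
}

/*
 * Replay the last two instructions before the fault:
 *   add $0x90,%rax        -> rax = ptr + 0x90
 *   mov 0x238(%rax),%rax  -> load from rax + 0x238
 * where ptr is the value fetched by the preceding "mov 0x38(%rax),%rax".
 */
uint64_t fault_address_for(uint64_t ptr)
{
	uint64_t rax = ptr + 0x90;

	return load_address(rax, 0x238);
}

/* Check the result against the values printed in the oops. */
void check_against_oops(void)
{
	/* A NULL ptr leaves RAX = 0x90 and faults at CR2 = 0x2c8. */
	assert(fault_address_for(0) == 0x2c8);
}
```

A NULL pointer therefore reproduces both RAX (0x90) and CR2 (0x2c8) exactly as they appear in the dump.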
Comment 1 Zhou, Chao 2014-11-03 01:59:47 UTC
the first bad commit is:

commit 6cd33649fa83d97ba7b66f1d871a360e867c5220
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Aug 27 14:29:47 2014 -0600

    PCI: Add pci_configure_device() during enumeration

    Some platforms can tell the OS how to configure PCI devices, e.g., how to
    set cache line size, error reporting enables, etc.  ACPI defines _HPP and
    _HPX methods for this purpose.

    This configuration was previously done by some of the hotplug drivers using
    pci_configure_slot().  But not all hotplug drivers did this, and per the
    spec (ACPI rev 5.0, sec 6.2.7), we can also do it for "devices not
    configured by the BIOS at system boot."

    Move this configuration into the PCI core by adding pci_configure_device()
    and calling it from pci_device_add(), so we do this for all devices as we
    enumerate them.

    This is based on pci_configure_slot(), which is used by hotplug drivers.
    I omitted:

      - pcie_bus_configure_settings() because it configures MPS and MRRS, which
        requires global knowledge of the fabric and must be done later, and

      - configuration of subordinate devices; that will happen when we call
        pci_device_add() for those devices.

    Because pci_configure_slot() was only done by hotplug drivers, this initial
    version of pci_configure_device() only configures hot-added devices,
    ignoring anything added during boot.

    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Acked-by: Yinghai Lu <yinghai@kernel.org>
Comment 2 Yinghai Lu 2014-11-03 17:06:05 UTC
https://lkml.org/lkml/2014/10/29/880
Comment 3 Zhou, Chao 2014-11-27 07:41:58 UTC
Tested with commit d3fccc7ef831d1d829b4da5eaa081db55b1e38f3 (kvm.git master branch).
Kernel version: 3.18.0-rc6+
After loading igbvf, the host works fine.
Comment 4 Zhou, Chao 2014-11-27 07:43:31 UTC
this commit fixed the bug:
commit 32f638fc11db0526c706454d9ab4339d55ac89f3
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Thu Oct 30 10:17:25 2014 -0600

    PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle()
    
    acpi_pci_get_bridge_handle() returns the ACPI handle for the bridge device
    (either a host bridge or a PCI-to-PCI bridge) leading to a PCI bus.  But
    SR-IOV virtual functions can be on a virtual bus with no bridge leading to
    it.  Return a NULL acpi_handle in this case instead of trying to
    dereference the NULL pointer to the bridge.
    
    This fixes a NULL pointer dereference oops in pci_get_hp_params() when
    adding SR-IOV VF devices on virtual buses.
    
    [bhelgaas: changelog, add comment in code]
    Fixes: 6cd33649fa83 ("PCI: Add pci_configure_device() during enumeration")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=87591
    Reported-by: Chao Zhou <chao.zhou@intel.com>
    Reported-by: Joerg Roedel <joro@8bytes.org>
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
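
The check described in the changelog above can be modeled in user space. This is a sketch under stated assumptions, not the kernel code: the structs are simplified stand-ins for the kernel types, get_bridge_handle and nearest_bridge_handle are hypothetical names, and the handle field stands in for the kernel's ACPI companion lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
typedef void *acpi_handle;

struct device {
	acpi_handle handle;      /* stand-in for the ACPI companion lookup */
};

struct pci_dev {
	struct device dev;
};

struct pci_bus {
	struct pci_bus *parent;  /* NULL for a root bus */
	struct pci_dev *self;    /* bridge leading to this bus, NULL if virtual */
	struct device *bridge;   /* host bridge device, used for root buses */
};

/* Return the handle of the bridge leading to pbus, or NULL if there is none. */
acpi_handle get_bridge_handle(const struct pci_bus *pbus)
{
	const struct device *dev;

	assert(pbus != NULL);

	if (pbus->parent == NULL) {
		dev = pbus->bridge;
	} else {
		/* A virtual bus (e.g. for SR-IOV VFs) has no bridge leading to it. */
		if (pbus->self == NULL)
			return NULL;
		dev = &pbus->self->dev;
	}

	return dev->handle;
}

/* Hypothetical caller: walk toward the root until some bridge has a handle. */
acpi_handle nearest_bridge_handle(const struct pci_bus *pbus)
{
	for (; pbus != NULL; pbus = pbus->parent) {
		acpi_handle h = get_bridge_handle(pbus);

		if (h != NULL)
			return h;
	}
	return NULL;
}
```

Without the self == NULL check, the else branch would compute &pbus->self->dev from a NULL pointer and then read through it, which is the pattern seen in the oops. With the check, a caller that walks up the bus hierarchy simply moves on to the parent bus.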