Bug 60111
Summary: | NULL pointer deref in ASPM alloc_pcie_link_state() on non-compliant topology | |
---|---|---|---
Product: | Drivers | Reporter: | Bjorn Helgaas (bjorn)
Component: | PCI | Assignee: | drivers_pci (drivers_pci)
Status: | REOPENED | |
Severity: | normal | CC: | fancer.lancer
Priority: | P1 | |
Hardware: | All | |
OS: | Linux | |
URL: | https://lkml.kernel.org/r/1371668174-32115-1-git-send-email-rkrcmar@redhat.com | |
Kernel Version: | 3.10 | Subsystem: |
Regression: | No | Bisected commit-id: |
Attachments: |
- dmesg with aspm bug
- lspci -vv with pcie_aspm=off
- patch to deal with this topology
- boot log with issue
- log of successful boot
- lspci -vvnnt; lspci -vv; logs
- Patch to fix the issue
Description
Bjorn Helgaas
2013-06-25 16:50:14 UTC
Created attachment 106031 [details]
dmesg with aspm bug
Created attachment 106041 [details]
lspci -vv with pcie_aspm=off
After opening this bz, I found this Red Hat report of the same issue:
https://bugzilla.redhat.com/show_bug.cgi?id=972381

Reproduced with v3.10-rc1 with the following qemu invocation:

    /usr/local/bin/qemu-system-x86_64 -M q35 -enable-kvm -m 512 \
      -device x3130-upstream,bus=pcie.0,id=upstream \
      -device xio3130-downstream,bus=upstream,id=downstream,chassis=1 \
      -drive file=ubuntu.img,if=none,id=mydisk \
      -device ide-drive,drive=mydisk,bus=ide.0 \
      -drive file=scratch.img,id=disk1 \
      -device virtio-blk-pci,bus=downstream,id=virtio-disk1,drive=disk1 \
      -nographic -kernel ~/linux/arch/x86/boot/bzImage \
      -append "console=ttyS0,115200n8 root=/dev/sda1 ignore_loglevel"

Created attachment 107346 [details]
patch to deal with this topology
This patch *would* avoid the issue, but it's a fair bit of code and
comments that future code readers would have to wade through, and the
only benefit is for topologies that are completely invalid and will
never be seen in real hardware. Therefore, I'm inclined to do nothing
right now.
I think that we could also avoid this issue by reworking our ASPM
support, and if we were to do anything, that's what I would prefer.
For example, consider the following (legal and common) topology:
00:1c.0 Root Port (bridge leading to [bus 01-03])
01:00.0 Upstream Port (bridge leading to [bus 02-03])
02:00.0 Downstream Port (bridge leading to [bus 03])
When we initialize ASPM link state for 02:00.0, we look all the way up
to the link state for 00:1c.0 (the
"pdev->bus->parent->self->link_state" in alloc_pcie_link_state()). I
think that's crazy. ASPM is fundamentally a feature of a *link*, and
the only link directly related to 02:00.0 is the one leading
*downstream* to device 03:00.0.
I know there are latency questions where we do care about upstream
links, and there's commit 46bbdfa44c, which is what originally added
this "pdev->bus->parent->self->link_state" expression. It suggests
that we need to check the whole hierarchy up to the root port. But
there's not enough information in the changelog to really explain it.
So I strongly support rework that would simplify ASPM management
overall. I just don't want to add point fixes that only handle
special cases, especially when they don't occur in the real world.
I'm closing this as invalid because the simulated machine where the problem occurs has an invalid PCIe topology (an Upstream Port with no Downstream Port or Root Port above it). As far as I know, there is no valid topology, e.g., a real hardware machine in the field, that would cause this failure.

Reopening because Serge Semin <fancer.lancer@gmail.com> reported two systems that hit this issue:
http://lkml.kernel.org/r/1475746455-20665-1-git-send-email-fancer.lancer@gmail.com

Serge, can you please attach complete dmesg logs and "lspci -vv" output for these systems?

Created attachment 240981 [details]
boot log with issue
Created attachment 240991 [details]
log of successful boot
Boot log when the issue is fixed by the provided patch
Created attachment 241001 [details]
lspci -vvnnt; lspci -vv; logs
Output of the lspci utility on the system with the issue fixed. As you can see, ASPM is enabled everywhere except the 01:00.0 Upstream Port.
Created attachment 241011 [details]
Patch to fix the issue
I don't know what the previous patch did to fix the problem, but mine just checks whether the parent bus of the current device (the one with the secondary link) is not a root bus. So we have two cases:

1) Traditional PCIe bus topology:

    RC ---------------- bus 0 (root bus - root complex)
    [RP]
     |
     | bus 1 (leading to a first Upstream/Downstream device)
     |
    US0 --------------- bus 2 (device virtual bus)
    [DS2]

In accordance with the algorithm, "struct pcie_link_state" is created for RP and DS2. So when we evaluate "pdev->bus->parent->self->link_state" for DS2, we retrieve RP's link state structure.

2) Root-port-less PCIe bus topology:

    RC+RP (Host-PCIe bridge is merged with the root port)
     |
     | bus 1 (root bus)
     |
    US0 --------------- bus 2 (device virtual bus)
    [DS2]

First of all, the algorithm doesn't create a link state structure for the root port, since there isn't one. So when the algorithm tries to enumerate [DS2] and create its link state structure, it evaluates "pdev->bus->parent->self->link_state". Since "RC+RP" is not an ordinary PCI device but a Host-PCI bridge, bus 1 turns out to be a root bus with no upstream device, so that bus's "pdev->bus->parent->self" pointer is NULL. That's how the NULL dereference happens.

So the fix is easy: just check whether the parent bus is a root bus. Of course, this method has a drawback. With the attached patch applied, bus 1 is considered a root bus. That means the PCIe link on that bus can't change state, so the ASPM service is not enabled for the US0 port. As you can see, ASPM is disabled for device 01:00.0 in the attached "lspci -vv" log. As a consequence, if the PCIe bus has nothing but a US0 device directly connected to the merged RC+RP, ASPM will be de facto disabled.
(In reply to Radim Krčmář from comment #2)
> Created attachment 106041 [details]
> lspci -vv with pcie_aspm=off

Notes from dmesg (comment #1) and lspci (comment #2):

    Linux version 3.10.0-0.rc6.git0.2.fc20.x86_64 (mockbuild@buildvm-02.phx2.fedoraproject.org) (gcc version 4.8.1 20130612 (Red Hat 4.8.1-2) (GCC) ) #1 SMP Mon Jun 17 15:02:16 UTC 2013

    PCI host bridge to bus 0000:00
    00:03.0 Upstream Port bridge to [bus 01-02]
    01:00.0 Downstream Port bridge to [bus 02]
    02:00.0 virtio SCSI (non-PCIe)

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
    IP: [<ffffffff813218f1>] pcie_aspm_init_link_state+0x701/0x7d0

No ASPM possible anywhere. 01:00.0 should be the upstream end of a Link where ASPM could potentially be used, but the downstream device (02:00.0) is not a PCIe device. We should fix ASPM to handle this gracefully instead of crashing.

(In reply to fancer from comment #9)
> Created attachment 241001 [details]
> lspci -vvnnt; lspci -vv; logs

Notes from dmesg (comment #7) and lspci (comment #9):

    Linux version 3.19.12-cnccu (builder@baikal.int) (gcc version 5.2.0 (crosstool-NG) ) #0 SMP Wed Sep 7 11:04:07 MSK 2016

    PCI host bridge to bus 0000:01
    01:00.0 Upstream Port bridge to [bus 02-06]
    02:02.0 Downstream Port bridge to [bus 03-05]
    02:04.0 Downstream Port bridge to [bus 06]
    03:00.0 Upstream Port bridge to [bus 04-05]
    03:00.1 Endpoint
    04:00.0 Downstream Port bridge to [bus 05]
    06:00.0 Legacy Endpoint

    CPU 0 Unable to handle kernel paging request at virtual address 00000060, epc == 80316d78, ra == 80316d28

ASPM is possible on:
- 02:02.0 (link to 03:00.0, 03:00.1; both PCIe devices)
- 02:04.0 (link to 06:00.0)

ASPM is not possible on 04:00.0 (PCIe downstream port but no device on the other end). Again, we should fix ASPM to handle this gracefully instead of crashing. I don't have a proposal yet.