Bug 84281
Summary: | [BISECTED] LSI PCI FC Adapter not enumerated | ||
---|---|---|---|
Product: | Drivers | Reporter: | Bjorn Helgaas (bjorn) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | dirk |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | http://lkml.kernel.org/r/ghiol53r9u.fsf@lena.gouders.net | ||
Kernel Version: | 3.14 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments:
lspci (showing all devices)
dmesg (working)
dmesg (failing)
Output of dmidecode
AIDA64 report (txt)
AIDA64 report (html)
dmesg with Win2k/XP BIOS setting
dmesg with ACPI SRAT Table = disable BIOS setting
dmesg with Secured Setup Configurations = Yes BIOS setting
Description
Bjorn Helgaas
2014-09-11 17:24:44 UTC
Created attachment 149811 [details]
lspci (showing all devices)

Extracted from Dirk's email: http://lkml.kernel.org/r/ghwq9k3m6z.fsf@lena.gouders.net

Created attachment 149891 [details]
dmesg (working)

Dmesg log from 3.14.17, which works correctly, with PCI debug messages turned on. From http://lkml.kernel.org/r/ghioku5l29.fsf@lena.gouders.net (timestamps removed).

Created attachment 149901 [details]
dmesg (failing)

Dmesg log from 1820ffdccb9b4398 (commit identified by bisection as the first bad commit), with PCI debug messages turned on. From http://lkml.kernel.org/r/ghioku5l29.fsf@lena.gouders.net (timestamps removed).

Created attachment 150211 [details]
Output of dmidecode
Analysis (I wrote this up based on diagnosis by Andreas Noever):

Dirk tested a Tyan VX50 (B4985) with this device that worked like this prior to 1820ffdccb9b:

  bus: [bus 00-7f] on node 0 link 1
  ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-07])
  pci 0000:00:0e.0: PCI bridge to [bus 0a]
  pci_bus 0000:0a: busn_res: can not insert [bus 0a] under [bus 00-07] (conflicts with (null) [bus 00-07])
  pci 0000:0a:00.0: [1000:0646] type 00 class 0x0c0400 (FC adapter)

Note that the root bridge [bus 00-07] aperture is wrong; this is a BIOS defect in the PCI0 _CRS method. But prior to 1820ffdccb9b, we didn't enforce that aperture, and the FC adapter worked fine at 0a:00.0.

After 1820ffdccb9b, we notice that 00:0e.0's aperture is not contained in the root bridge's aperture, so we reconfigure it so it *is* contained:

  pci 0000:00:0e.0: bridge configuration invalid ([bus 0a-0a]), reconfiguring
  pci 0000:00:0e.0: PCI bridge to [bus 06-07]

This effectively moves the FC device from 0a:00.0 to 07:00.0, which should be legal. But when we enumerate bus 06, the FC device doesn't respond, so we don't find anything. This is probably a defect in the FC device.

[Oops, the FC device moves from 0a:00.0 to *06:00.0*]

Created attachment 151071 [details]
AIDA64 report (txt)
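The containment test at the heart of the analysis above can be illustrated with a small sketch. This is not the kernel's code: `BusRange` and `contains` are hypothetical names chosen for this example, and the ranges are simply the ones quoted from the dmesg logs. It only shows why the BIOS-programmed bridge range [bus 0a] falls outside the root bridge range reported by _CRS, while the reconfigured range [bus 06-07] fits inside it.

```python
from typing import NamedTuple


class BusRange(NamedTuple):
    """Inclusive range of PCI bus numbers (hypothetical helper, not a kernel type)."""
    start: int
    end: int

    def contains(self, other: "BusRange") -> bool:
        # A child range is contained when both of its ends lie inside the parent.
        return self.start <= other.start and other.end <= self.end

    def __str__(self) -> str:
        # Mimic the "[bus 00-07]" / "[bus 0a]" notation used in dmesg.
        if self.start == self.end:
            return f"[bus {self.start:02x}]"
        return f"[bus {self.start:02x}-{self.end:02x}]"


root = BusRange(0x00, 0x07)          # root bridge range from the PCI0 _CRS
bios_bridge = BusRange(0x0A, 0x0A)   # 00:0e.0 as programmed by the BIOS
linux_bridge = BusRange(0x06, 0x07)  # 00:0e.0 after Linux reconfigured it

for label, rng in (("BIOS", bios_bridge), ("reconfigured", linux_bridge)):
    verdict = "inside" if root.contains(rng) else "outside"
    print(f"{label}: {rng} is {verdict} {root}")
```

The sketch covers only the range comparison; how the kernel picks the replacement bus numbers is not modelled here.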
Created attachment 151081 [details]
AIDA64 report (html)
Created attachment 151091 [details]
dmesg with Win2k/XP BIOS setting
Output of dmesg when started with BIOS setting:
Advanced->Installed O/S->Win2k/XP
More tests with the VX50:

* The BIOS has a choice entry "Advanced -> Installed O/S" with the following possible choices:
  * Other
  * Win95
  * Win98
  * WinMe
  * Win2k/XP
  * Linux
  Our setting is "Linux" but today I tested all others to verify if this setting is relevant to this issue. None of these choices helped and comparing timestamp-cleaned files, besides sorting, CPU and process numbers, I don't see differences, so I upload just one of those files and will provide others if wanted.

* I tested two other toggled settings:
  1) Advanced -> Hammer Configurations -> ACPI SRAT Table = disable
  2) Advanced -> Secured Setup Configurations = Yes
  Both also did not help but caused different dmesg output that I also attach.

* I created an AIDA64 report on Win2008 (Win98 failed to start after installation). I could not manage the FC adapter with the management software and could not identify it in the hardware list or AIDA report but will leave serious evaluation to more competent persons.

Created attachment 151101 [details]
dmesg with ACPI SRAT Table = disable BIOS setting
Created attachment 151111 [details]
dmesg with Secured Setup Configurations = Yes BIOS setting
Win98 doesn't start, and Windows Server 2008 starts but the FC adapter doesn't work at all. The initial configuration from BIOS is the same as for Linux:

  ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-07])
  pci 0000:00:0e.0: PCI bridge to [bus 0a]

The AIDA64 report in comment #7 shows that Windows reconfigured the 00:0e.0 bridge to fit within the host bridge's [bus 00-07] range:

  B00 D0E F00: nVIDIA nForce Pro 2200 (CK8-04 Pro) - PCI Express Root Port
  Offset 010: 00 00 00 00 00 00 00 00 00 07 07 00 31 31 00 20

The secondary (at 0x19) and subordinate (at 0x1a) bus numbers are both 07. There is no device on bus 07. The [1000:0646] LSI FC adapter is not visible at all.

This is essentially the same behavior that prompted this bug report. The only real difference is that Linux reprogrammed the bridge to [bus 06-07], while Windows set it to [bus 07].

This should be fixed by 12d8706963f0 (Revert "PCI: Make sure bus number resources stay within their parents bounds"), which appeared in v3.17-rc7.

In 12d8706963f0, I mentioned other possible fixes:

  1) Add a quirk to fix the _CRS information based on what amd_bus.c read from the hardware
  2) Reset the FC device after we change its bus number

I'm not sure 1) is a good idea because it makes the hardware configuration out of sync with the platform's idea of it. But this would be a corner case quirk and maybe good enough for this one platform.

For 2), it seems a little too aggressive to always reset devices when bus numbers change, but we could conceivably have a quirk to say "this device needs to be reset on bus number changes." That would let us handle this broken LSI device even on other platforms.
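For readers who want to check the bus numbers in the AIDA64 config-space row quoted above, here is a minimal sketch of the decoding. It assumes the row really starts at offset 0x10 as labelled, and uses the standard bus number register offsets of a PCI-to-PCI bridge (type 1) header; the constant and function names are made up for this example.

```python
# Bytes at offset 0x10 of the 00:0e.0 config space, copied from the AIDA64 dump.
ROW_OFFSET = 0x10
row = bytes.fromhex("00 00 00 00 00 00 00 00 00 07 07 00 31 31 00 20")

# Bus number registers in a type 1 (PCI-to-PCI bridge) configuration header.
PRIMARY_BUS = 0x18
SECONDARY_BUS = 0x19
SUBORDINATE_BUS = 0x1A


def read_byte(offset: int) -> int:
    """Return the config-space byte at `offset`, which must lie within the row."""
    return row[offset - ROW_OFFSET]


primary = read_byte(PRIMARY_BUS)
secondary = read_byte(SECONDARY_BUS)
subordinate = read_byte(SUBORDINATE_BUS)

print(f"primary {primary:02x}, secondary {secondary:02x}, subordinate {subordinate:02x}")
```

Decoded this way, the row gives primary bus 00 with secondary and subordinate both 07, matching the reading in the comment.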