Bug 15362 - MPT Fusion SCSI drives no longer appear - suspect PCI bus scan bug
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: Bjorn Helgaas
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-02-20 15:04 UTC by Sean M. Pappalardo
Modified: 2013-01-02 16:58 UTC

See Also:
Kernel Version: 2.6.30, 2.6.32, 3.5.5
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
2.6.26 dmesg (40.52 KB, text/plain)
2010-02-20 15:04 UTC, Sean M. Pappalardo
2.6.26 lspci -n (1.03 KB, text/plain)
2010-02-20 15:05 UTC, Sean M. Pappalardo
2.6.26 lspci -v (9.93 KB, text/plain)
2010-02-20 15:06 UTC, Sean M. Pappalardo
2.6.32 dmesg (48.29 KB, text/plain)
2010-02-20 15:07 UTC, Sean M. Pappalardo
2.6.32 lspci -n (645 bytes, text/plain)
2010-02-20 15:07 UTC, Sean M. Pappalardo
dmesg log from 3.5.5 (51.12 KB, text/plain)
2012-10-04 19:02 UTC, Sean M. Pappalardo
ACPI dump (under kernel 3.5.5) (97.98 KB, text/plain)
2012-10-04 19:04 UTC, Sean M. Pappalardo
3.5.5 lspci -n output (645 bytes, text/plain)
2012-10-04 19:05 UTC, Sean M. Pappalardo
AIDA64 report (447.67 KB, text/html)
2012-10-09 09:30 UTC, Sean M. Pappalardo
AIDA64 report (HP xw9300 Windows Vista 32 SP2, segmentation ON) (450.26 KB, text/html)
2012-10-10 22:35 UTC, Bjorn Helgaas
AIDA64 report (HP xw9300 Windows Vista 32 SP2, segmentation OFF) (449.74 KB, text/html)
2012-10-10 23:53 UTC, Bjorn Helgaas
dmesg from 3.5.5 with 'ACPI Bus Segmentation' disabled (50.67 KB, text/plain)
2012-10-11 09:21 UTC, Sean M. Pappalardo
ignore _SEG on xw9300 (3.06 KB, patch)
2012-10-11 22:32 UTC, Bjorn Helgaas
dmesg log after patch application (55.73 KB, text/plain)
2012-10-15 13:43 UTC, Sean M. Pappalardo
3.5.5 lspci -n output after patch (909 bytes, text/plain)
2012-10-15 13:44 UTC, Sean M. Pappalardo

Description Sean M. Pappalardo 2010-02-20 15:04:14 UTC
Created attachment 25130
2.6.26 dmesg

Since upgrading to the 2.6.30 kernel on an AMD64 platform, drives attached to my LSI/MPT SCSI controller are no longer visible. If I boot the 2.6.26-2 kernel, everything works fine. The SCSI controller doesn't even show up in lspci in kernel versions above 2.6.26. (I do have the controller's BIOS disabled, but I understand that doesn't matter since the kernel will poll it anyway, as 2.6.26 does. I also tested with the controller's BIOS enabled, and it makes no difference.)

This has already been submitted to Debian as bug #543308, but evidence points to a bug in the kernel PCI bus scanning code, since the following PCI devices show up on 2.6.26-2 but not 2.6.30-1 and up:

0001:40:01.0 0604: 1022:7450 (rev 12)
0001:40:01.1 0800: 1022:7451 (rev 01)
0001:40:02.0 0604: 1022:7450 (rev 12)
0001:40:02.1 0800: 1022:7451 (rev 01)
0001:61:06.0 0100: 1000:0030 (rev 07)
0001:61:06.1 0100: 1000:0030 (rev 07)
0002:80:00.0 0580: 10de:005e (rev a3)
0002:80:01.0 0580: 10de:00d3 (rev a3)
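
For readers comparing the attached lspci -n listings, the following is a minimal sketch of how one such line breaks down into domain, bus, device, function, class, and vendor:device IDs. The struct and function names are made up for illustration and are not part of pciutils; all fields are hexadecimal, and the domain is treated as 0 when a line omits it.

```c
#include <stdio.h>

/* One device line of "lspci -n" output, for example
 * "0001:61:06.1 0100: 1000:0030 (rev 07)". All fields are hexadecimal.
 * (Hypothetical helper type, not part of pciutils.) */
struct lspci_entry {
    unsigned int domain, bus, dev, fn;
    unsigned int class_code, vendor, device;
};

/* Returns 0 on success, -1 if the line matches neither form.
 * A line without a leading domain is treated as domain 0. */
static int parse_lspci_n(const char *line, struct lspci_entry *e)
{
    /* Form with explicit domain: dddd:bb:dd.f cccc: vvvv:dddd */
    if (sscanf(line, "%x:%x:%x.%x %x: %x:%x",
               &e->domain, &e->bus, &e->dev, &e->fn,
               &e->class_code, &e->vendor, &e->device) == 7)
        return 0;

    /* Form without domain: bb:dd.f cccc: vvvv:dddd */
    e->domain = 0;
    if (sscanf(line, "%x:%x.%x %x: %x:%x",
               &e->bus, &e->dev, &e->fn,
               &e->class_code, &e->vendor, &e->device) == 6)
        return 0;

    return -1;
}
```

Parsing the line for 0001:61:06.1 this way yields domain 1, bus 0x61, device 6, function 1, which is one of the two LSI SCSI functions that disappear on the newer kernels.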
Comment 1 Sean M. Pappalardo 2010-02-20 15:05:37 UTC
Created attachment 25131
2.6.26 lspci -n
Comment 2 Sean M. Pappalardo 2010-02-20 15:06:08 UTC
Created attachment 25132
2.6.26 lspci -v
Comment 3 Sean M. Pappalardo 2010-02-20 15:07:10 UTC
Created attachment 25133
2.6.32 dmesg
Comment 4 Sean M. Pappalardo 2010-02-20 15:07:48 UTC
Created attachment 25134
2.6.32 lspci -n
Comment 5 Sean M. Pappalardo 2010-02-20 15:10:08 UTC
I have dmesg and lspci -n and -v from 2.6.30 as well but they're similar to 2.6.32. Just let me know if you want them too.
Comment 6 Matthew Wilcox 2010-02-20 15:22:09 UTC
OK, this is clearly a PCI issue, nothing to do with SCSI.

It looks like everything outside domain 0 is now not found.  This seems to be due to the MMCONFIG access option not being used.
Comment 7 Sean M. Pappalardo 2010-02-20 15:48:01 UTC
Ah, I just found a workaround: my BIOS offers the option to disable ACPI bus segmentation. Its help text explicitly mentions this issue of PCI-X devices not showing up on Linux. When I do that, the devices show up again in the later kernels.

The question is: is this a solution or do the newer kernels need fixing?
Comment 8 Matthew Wilcox 2010-02-20 17:42:22 UTC
I don't consider this an acceptable solution (and I wish BIOS people would talk to us instead of adding options to disable features).

I see this message in your dmesg:

[    0.358390] PCI: BIOS Bug: MCFG area at e0000000 is not reserved in ACPI motherboard resources

I don't suppose there's an updated BIOS version, is there?

There is a kernel boot option -- try specifying pci=check_enable_amd_mmconf
though I'm not familiar with it, and don't know whether it'll help this situation.
Comment 9 Sean M. Pappalardo 2010-02-22 07:57:10 UTC
No, there's no more recent BIOS. I have v2.09a, which is the latest for this machine and is from 8 Jan 2007. (What's strange is that this hasn't been an issue at all for me until 2.6.30, so how did they know to add that ACPI segmentation disable option so long ago?)

I tried the boot option you gave and it didn't make any difference.

Anything else you'd like me to try? Will there be any problems later on if I use the segmentation disable option for now, or could it mess up Windows (I dual-boot)? Or should I just stick to 2.6.26 until this is sorted out?

Thanks a lot for your time.
Comment 10 Andrew Morton 2010-02-24 19:56:24 UTC
(In reply to comment #6)
> OK, this is clearly a PCI issue, nothing to do with SCSI.
> 

How come 2.6.26 worked OK?  ACPI changes?
Comment 11 Marc Bejarano 2010-04-27 18:49:13 UTC
Sean: any chance you're willing to bisect this, to hopefully get the momentum going on this again?
Comment 12 Sean M. Pappalardo 2010-04-27 19:50:50 UTC
Sure...what do you mean by bisect?
Comment 13 Jesse Barnes 2010-05-18 21:52:54 UTC
git help bisect should give you the overview.
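
As a rough illustration of the idea behind bisection, here is a minimal sketch of the underlying binary search. It assumes a simplified linear history numbered by index, whereas git bisect itself works on the real commit graph; first_bad_commit and example_is_bad are hypothetical names, and the "first broken commit is number 37" predicate is invented purely for the example.

```c
#include <stddef.h>

/* Hypothetical callback: returns nonzero if the kernel built from
 * commit number `idx` shows the bug. */
typedef int (*is_bad_fn)(size_t idx);

/* Given a linear history where commit `good` works and commit `bad`
 * (good < bad) is broken, return the index of the first broken commit.
 * Each loop iteration corresponds to one build, boot, and test cycle. */
static size_t first_bad_commit(size_t good, size_t bad, is_bad_fn is_bad)
{
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;   /* the breakage is at mid or earlier */
        else
            good = mid;  /* the breakage is after mid */
    }
    return bad;
}

/* Invented predicate for illustration: everything from commit 37 on is bad. */
static int example_is_bad(size_t idx)
{
    return idx >= 37;
}
```

With a known-good and known-bad endpoint, the number of kernels to build and boot grows only logarithmically with the number of commits in between.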
Comment 14 Sean M. Pappalardo 2012-06-27 14:06:40 UTC
I do still plan to work on bisecting this. I only recently started building my own kernels (on a different system), so I am now familiar with the process using incremental patches.
Comment 15 Bjorn Helgaas 2012-06-27 14:32:29 UTC
> I do still plan to work on bisecting this.

Can you also attach a dmesg log from a current kernel, e.g., 3.4 or
newer?  We now print a lot more information during PCI enumeration.

But I guess the problem is that 2.6.26 finds devices in domains 1 and
2, while 2.6.32 does not.  I think MMCONFIG is the only config access
method we have for domains other than 0.  That suggests that MMCONFIG
used to work but doesn't any more.  The dmesg logs claim that we're
not using MMCONFIG in either 2.6.26 or 2.6.32 though, so I don't know
why we found anything in 2.6.26.
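
To make the point about config access methods concrete, here is a sketch of how a legacy type 1 ("conf1") access address is composed before being written to I/O port 0xCF8. The layout follows the standard PCI configuration mechanism #1; the handling of register bits 8-11 is an assumption modelled on the extended-register variant, and the function names are illustrative only. The thing to notice is that the 32-bit address has fields for bus, device/function, and register, but none for a segment.

```c
#include <stdint.h>

/* Combine device (0-31) and function (0-7) numbers into an 8-bit devfn. */
static uint32_t pci_devfn(uint32_t dev, uint32_t fn)
{
    return ((dev & 0x1f) << 3) | (fn & 0x07);
}

/* Address value for a type 1 configuration access (written to port 0xCF8,
 * data then transferred via port 0xCFC). There is no segment field. */
static uint32_t conf1_address(uint32_t bus, uint32_t devfn, uint32_t reg)
{
    return 0x80000000u              /* enable bit */
         | ((reg & 0xf00u) << 16)   /* extended register bits 8-11 (assumed) */
         | ((bus & 0xffu) << 16)    /* bus number */
         | ((devfn & 0xffu) << 8)   /* device and function */
         | (reg & 0xfcu);           /* dword-aligned register offset */
}
```

Because the segment cannot be encoded, a conf1 read aimed at "0001:40:01.0" can only ever reach bus 40, device 1, function 0 of the one segment the hardware decodes.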
Comment 16 Yinghai Lu 2012-06-27 20:33:21 UTC
On Wed, Jun 27, 2012 at 7:32 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> I do still plan to work on bisecting this.
>
> Can you also attach a dmesg log from a current kernel, e.g., 3.4 or
> newer?  We now print a lot more information during PCI enumeration.
>
> But I guess the problem is that 2.6.26 finds devices in domains 1 and
> 2, while 2.6.32 does not.  I think MMCONFIG is the only config access
> method we have for domains other than 0.  That suggests that MMCONFIG
> used to work but doesn't any more.  The dmesg logs claim that we're
> not using MMCONFIG in either 2.6.26 or 2.6.32 though, so I don't know
> why we found anything in 2.6.26.

In short: the BIOS is broken; it returns the wrong segment in the DSDT.

In both cases, only pci_conf1 is used.  The CPU is not new enough.

Comparing the 2.6.26 and 2.6.32 code: 2.6.26 does not check
seg in pci_conf1_read(), but 2.6.32 does.

2.6.26:
static int pci_conf1_read(unsigned int seg, unsigned int bus,
                          unsigned int devfn, int reg, int len, u32 *value)
{
        unsigned long flags;

        if ((bus > 255) || (devfn > 255) || (reg > 255)) {
                *value = -1;
                return -EINVAL;
        }

2.6.32:
static int pci_conf1_read(unsigned int seg, unsigned int bus,
                          unsigned int devfn, int reg, int len, u32 *value)
{
        unsigned long flags;

        if (seg || (bus > 255) || (devfn > 255) || (reg > 4095)) {
                *value = -1;
                return -EINVAL;
        }

So it happens to work on 2.6.26.

Please try to get a new BIOS from your vendor,
or you will need to override your DSDT.

Thanks

Yinghai
Comment 17 Sean M. Pappalardo 2012-06-27 21:08:44 UTC
I just called HP, and the system is too old for them to put any resources into making a new BIOS. So in light of that and your comment, Yinghai, this is in fact not a kernel bug (rather, an unintended consequence of a fix), and my only option is to use the BIOS-provided workaround (disable ACPI bus segmentation).

Let me know if any of that is incorrect.

Thank you all very much for your time and patience on this.
Comment 18 Bjorn Helgaas 2012-07-02 18:54:46 UTC
> in short: the bios is broken, it return wrong segment in DSDT.

I *think* what Yinghai is saying is:

  - MMCONFIG is not used either in 2.6.26 or 2.6.32.
  - BIOS reports these host bridges via DSDT PNP0A08 devices:
        [PCI0] leading to segment 0000 bus 00
        [PCI1] leading to segment 0001 bus 40
        [PCI2] leading to segment 0002 bus 80
  - Buses 40 and 80 are actually in segment 0, not segments 1 and 2.
  - When we enumerate bus 40 and bus 80, we pass seg=1 and seg=2,
respectively, to pci_conf1_read(), but 2.6.26 ignores seg.  For
example, when we think we're reading 0001:40:01.0 config space, 2.6.26
actually reads 0000:40:01.0 config space instead.
  - In 2.6.32, instead of ignoring seg, we return an error if it is
not zero.  Therefore, we fail to find anything on bus 40 and bus 80.
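
The difference between the two behaviours listed above can be modelled in user space as follows. This is only a sketch of the argument checks quoted in comment 16, not the kernel functions themselves: the function names are made up, and the len and value parameters and the actual port I/O are left out.

```c
#include <errno.h>

/* Model of the 2.6.26 argument check: seg is accepted but never examined,
 * so a request for segment 1 or 2 silently proceeds as if it were segment 0. */
static int conf1_check_2_6_26(unsigned int seg, unsigned int bus,
                              unsigned int devfn, int reg)
{
    (void)seg; /* ignored */
    if ((bus > 255) || (devfn > 255) || (reg > 255))
        return -EINVAL;
    return 0;
}

/* Model of the 2.6.32 argument check: any nonzero seg is rejected. */
static int conf1_check_2_6_32(unsigned int seg, unsigned int bus,
                              unsigned int devfn, int reg)
{
    if (seg || (bus > 255) || (devfn > 255) || (reg > 4095))
        return -EINVAL;
    return 0;
}
```

For a read of 0001:40:01.0 (seg 1, bus 0x40, devfn 0x08), the first check passes and the second returns -EINVAL, which matches the devices on buses 40 and 80 vanishing.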

Sean, what system and BIOS version is this?  (The 3.4.x dmesg log or
the "dmidecode" output will contain this information.)  I don't expect
HP to change the BIOS, and it wouldn't be reasonable to require users
to debug this issue and upgrade their BIOS in any case.

But I would like to read the release notes or help text that mentions
this issue.  If all the buses were in fact in segment 0, the DSDT
would typically not have any _SEG methods at all, because segment 0 is
the default.  Yinghai is assuming that HP went to the trouble to *add*
_SEG methods that returned incorrect values.  But the fact that HP was
aware of the issue and provided the BIOS "disable ACPI bus
segmentation" option makes it less likely that this is the case.

Also, the system was very likely tested with Windows, and the fact
that the BIOS option is to *disable* segmentation suggests that the
default is "segmentation enabled."  So my guess is that segmentation
does work with Windows.  Sean, can you confirm or deny that?  The
AIDA64 tool (free trial version at http://www.aida64.com/) generates a
report with useful information.

I agree with Jonathan's assertion here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=543308#87 that the
BIOS switch is not adequate.  Neither is a patched DSDT.

I think it's likely that Windows works with segmentation, using
MMCONFIG, and that Linux is a bit too quick to disable MMCONFIG in
this case.
Comment 19 Sean M. Pappalardo 2012-07-06 13:56:12 UTC
Yes, ACPI bus segmentation is enabled by default and Windows works fine with it (even with the SCSI card's BIOS disabled.) The system is also supported _by HP_ running Red Hat Enterprise Linux 3, 4 and 5. (I happen to run Debian though.)

http://h20000.www2.hp.com/bizsupport/TechSupport/DriverDownload.jsp?lang=en&cc=us&prodNameId=459220&taskId=135&prodTypeId=12454&prodSeriesId=459226&lang=en&cc=us

I will get you dmesg output from kernel 3.4 in a few days. (I need the machine right now for building the Windows versions of Mixxx for our upcoming v1.11.0.)
Comment 20 Bjorn Helgaas 2012-08-23 23:35:04 UTC
Ping, it would be good if we could figure out a way to make progress on this.  Would it be possible to get the 3.4 (or 3.5 now that it's out) dmesg log?
Comment 21 Bjorn Helgaas 2012-10-01 20:01:27 UTC
Ping, Sean, could you collect a dmesg log from a current kernel, e.g., 3.6?
Comment 22 Bjorn Helgaas 2012-10-03 21:12:59 UTC
I think the BIOS switch you're talking about is the "ACPI Bus Segmentation" option, which I found mentioned in several HP docs (Google search for "ACPI bus segmentation" site:hp.com), including Doc ID c00555221.

I suspect this problem affects the xw9300 (Sean, is that what you have?)  The xw9400 has the same switch, but a friend booted a current kernel on xw9400 with segmentation enabled, and it worked fine.

I'm closing this because:
  - current kernel seems to work on xw9400
  - current kernel might be broken on xw9300 but we don't have the information to work on fixing it

If anybody still cares about this, please collect a complete dmesg log and acpidump from xw9300 with segmentation enabled and attach them here, and we'll reopen it.
Comment 23 Sean M. Pappalardo 2012-10-04 19:02:41 UTC
Created attachment 82151
dmesg log from 3.5.5

dmesg log from latest kernel where the problem continues to appear
Comment 24 Sean M. Pappalardo 2012-10-04 19:04:02 UTC
Created attachment 82161
ACPI dump (under kernel 3.5.5)
Comment 25 Sean M. Pappalardo 2012-10-04 19:05:02 UTC
Created attachment 82171
3.5.5 lspci -n output
Comment 26 Sean M. Pappalardo 2012-10-04 19:09:46 UTC
Correct, this is an xw9300, which uses first-gen Opteron CPUs (e.g., 2xx models). The xw9400 uses second-gen ones (23xx models).

Sorry, I only now had time to collect the data. Life has been extremely busy.
Comment 27 Bjorn Helgaas 2012-10-05 16:43:14 UTC
Thanks, Sean!

Here's the MMCONFIG info from the MCFG table:

PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xe0000000-0xe3ffffff] (base 0xe0000000)
PCI: MMCONFIG for domain 0002 [bus 80-ff] at [mem 0xe8000000-0xefffffff] (base 0xe0000000)

There are no ACPI _CBA methods (another source of MMCONFIG info), and the legacy PCI config accessors only support segment 0, so I don't see how any OS could possibly discover devices on segment 1.
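
As a cross-check on the two MMCONFIG lines above, here is a sketch of the standard memory-mapped (ECAM-style) address calculation: each bus gets 1 MB, each device/function 4 KB, on top of the base address. The function name is illustrative. Plugging in the logged base 0xe0000000 reproduces both windows, and also shows that neither entry covers buses 40-7f, where the PCI-X tunnels and the SCSI controller sit.

```c
#include <stdint.h>

/* Physical address of a config register under memory-mapped config access:
 * base + (bus << 20 | devfn << 12 | reg), with reg in 0..4095. */
static uint64_t mmconfig_address(uint64_t base, unsigned int bus,
                                 unsigned int devfn, unsigned int reg)
{
    return base + (((uint64_t)(bus & 0xffu) << 20) |
                   ((uint64_t)(devfn & 0xffu) << 12) |
                   (uint64_t)(reg & 0xfffu));
}
```

For example, bus 0x80 starts at 0xe8000000 and bus 0xff ends at 0xefffffff, matching the "domain 0002 [bus 80-ff]" line, while bus 0x3f ends at 0xe3ffffff, matching the "domain 0000 [bus 00-3f]" line.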

If you have Windows on this box, could you collect and attach an AIDA64 report (free trial version at http://www.aida64.com/) with segmentation enabled?
Comment 28 Sean M. Pappalardo 2012-10-09 09:30:43 UTC
Created attachment 82691
AIDA64 report

Sorry, I obtained this a while ago and thought I had attached it.
Comment 29 Bjorn Helgaas 2012-10-09 23:02:53 UTC
Your AIDA64 report is from Windows XP.  Here's what I gleaned from it:

   Linux addr  VEND:DEV   Win addr  Description
  ------------ ---------  --------  ----------------------------
       00:00.0 10de:005e    0/ 0/0  nVidia HyperTransport Bridge
       00:01.0 10de:0051    0/ 1/0  nVidia LPC Bridge
       00:01.1 10de:0052    0/ 1/1  nVidia SMBus Controller
       00:02.0 10de:005a    0/ 2/0  nVidia OHCI USB1.1
       00:02.1 10de:005b    0/ 2/1  nVidia EHCI USB2.0
       00:04.0 10de:0059    0/ 4/0  nVidia Audio Codec
       00:06.0 10de:0053    0/ 6/0  nVidia Parallel ATA
       00:07.0 10de:0054    0/ 7/0  nVidia SATA Controller
       00:08.0 10de:0055    0/ 8/0  nVidia SATA Controller
       00:09.0 10de:005c    0/ 9/0  nVidia PCI-PCI Bridge
       00:0a.0 10de:0057    0/10/0  nVidia LAN
       00:0e.0 10de:005d    0/14/0  nVidia PCIe Root Port
       00:18.0 1022:1100    0/24/0  AMD HyperTransport Config
       00:18.1 1022:1101    0/24/1  AMD Address Map
       00:18.2 1022:1102    0/24/2  AMD DRAM Controller
       00:18.3 1022:1103    0/24/3  AMD Misc Control
       00:19.0 1022:1100    0/25/0  AMD HyperTransport Config
       00:19.1 1022:1101    0/25/1  AMD Address Map
       00:19.2 1022:1102    0/25/2  AMD DRAM Controller
       00:19.3 1022:1103    0/25/3  AMD Misc Control
       05:05.0 104c:8023    5/ 5/0  TI 1394 OHCI
       0a:00.0 10de:00ce   10/ 0/0  nVidia Quadro FX 1400 Video
* 0001:40:01.0 1022:7450   64/ 1/0  AMD PCI-X Tunnel
* 0001:40:01.1 1022:7451   64/ 1/1  AMD IOAPIC
* 0001:40:02.0 1022:7450   64/ 2/0  AMD PCI-X Tunnel
* 0001:40:02.1 1022:7451   64/ 2/1  AMD IOAPIC
* 0001:61:06.0 1000:0030   97/ 6/0  LSI 53C1030 SCSI
* 0001:61:06.1 1000:0030   97/ 6/1  LSI 53C1030 SCSI
* 0002:80:00.0 10de:005e  128/ 0/0  nVidia HyperTransport Bridge
* 0002:80:01.0 10de:00d3  128/ 1/0  nVidia LPC Bridge

The ones marked with "*" are the devices not seen by 2.6.30 and later.

It's interesting that Windows believes these are all in PCI domain 0.  The bus numbers match what Linux sees (the Windows numbers are decimal while Linux uses hex).
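
The decimal-to-hex correspondence in the table can be checked mechanically. This is a small sketch, with a made-up function name, that converts an AIDA64-style decimal "bus/dev/fn" triple into the hex "bb:dd.f" form that lspci prints (without the domain).

```c
#include <stdio.h>

/* Convert a decimal "bus/dev/fn" string (as in the "Win addr" column) into
 * lspci-style hex "bb:dd.f". Returns 0 on success, -1 on malformed or
 * out-of-range input. */
static int win_to_linux_addr(const char *win, char *out, size_t outlen)
{
    unsigned int bus, dev, fn;

    /* %u skips leading blanks, so padded fields such as "97/ 6/1" parse. */
    if (sscanf(win, "%u/%u/%u", &bus, &dev, &fn) != 3)
        return -1;
    if (bus > 255 || dev > 31 || fn > 7)
        return -1;

    snprintf(out, outlen, "%02x:%02x.%x", bus, dev, fn);
    return 0;
}
```

For instance, "97/ 6/1" becomes "61:06.1" and "128/ 0/0" becomes "80:00.0", which are the bus/device/function parts of 0001:61:06.1 and 0002:80:00.0 in the Linux column.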

A couple of Microsoft presentations I found with Google suggest that Windows didn't support PCI segment groups (a.k.a. domains) until Vista.  So it may be that Windows XP is just ignoring the _SEG methods in the PNP0A08 devices.

Sean, do you happen to have Vista on this box?  I'd like to know how it behaves.  I wonder if it works like Linux does and only finds the "*" devices when segmentation is disabled in the BIOS.
Comment 30 Sean M. Pappalardo 2012-10-10 15:25:52 UTC
I only have Windows XP x64 and Linux installed (Debian Squeeze to be precise.) I can try to make a Windows 7 live CD from the installation disc from one of my newer PCs if that will help, but I doubt AIDA64 will run in that environment. Are there any OS-provided tools that will give you what you need? Some PowerShell script perhaps?
Comment 31 Bjorn Helgaas 2012-10-10 22:35:21 UTC
Created attachment 82861
AIDA64 report (HP xw9300 Windows Vista 32 SP2, segmentation ON)

A friend at HP collected this report with Windows Vista 32 SP2 on an xw9300 with "ACPI Bus Segmentation" turned ON.

There are two interesting things:
  1) The PCI device addresses look exactly the same as they do under Windows XP.  There's no mention of a PCI segment other than 0.
  2) The MCFG dump only mentions PCI segment 0, buses 00h-3fh.  This is the same as on your system, Sean, but it's interesting that Linux found an entry for domain 0002 [bus 80-ff] that Windows doesn't mention.

I'll cook up a patch to ignore these _SEG descriptors that seem incorrect.
Comment 32 Bjorn Helgaas 2012-10-10 23:53:05 UTC
Created attachment 82881
AIDA64 report (HP xw9300 Windows Vista 32 SP2, segmentation OFF)

Again, collected by a friend at HP with Windows Vista 32 SP2 on an xw9300, this time with "ACPI Bus Segmentation" turned OFF.

The PCI addresses again look the same as under Windows XP and with Vista segmentation enabled.

It's interesting that this doesn't mention the MCFG table at all.  Sean, can you attach a dmesg log from Linux with segmentation disabled in the BIOS?  It's possible the BIOS doesn't build the MCFG table in that case.
Comment 33 Sean M. Pappalardo 2012-10-11 09:21:06 UTC
Created attachment 82901
dmesg from 3.5.5 with 'ACPI Bus Segmentation' disabled
Comment 34 Sean M. Pappalardo 2012-10-11 09:23:58 UTC
Yes, sir. That indeed appears to be the case, since searching for 'MCFG' in this dmesg finds nothing.
Comment 35 Bjorn Helgaas 2012-10-11 22:32:33 UTC
Created attachment 83011
ignore _SEG on xw9300

Sean, would you mind trying this patch?  It is based on v3.6, but will probably apply to v3.5.5 as well.  Attach the dmesg log here if all goes well.  Thanks!
Comment 36 Sean M. Pappalardo 2012-10-15 13:43:17 UTC
Created attachment 83521
dmesg log after patch application

Seems to work fine with 'ACPI Bus Segmentation' enabled in the BIOS and the patch applied. Thank you very much for all your work on this!
Comment 37 Sean M. Pappalardo 2012-10-15 13:44:44 UTC
Created attachment 83531
3.5.5 lspci -n output after patch

For reference, the lspci -n output after the patch appears to match Windows now.
Comment 38 Florian Mickler 2012-12-22 09:36:09 UTC
A patch referencing this bug report has been merged in Linux v3.8-rc1:

commit 1f09b09b4de0e120800e49d806d264e7446ed446
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Mon Oct 29 17:26:54 2012 -0600

    x86/PCI: Ignore _SEG on HP xw9300
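
The general shape of such a quirk can be sketched in user space as below. This is not the merged patch: the struct, the function names, and the vendor/product strings are assumptions for illustration only. The idea is simply that when the machine matches a list of known-bad platforms, the segment reported by ACPI _SEG is replaced with 0 before the host bridge is scanned.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a machine whose _SEG values should be ignored. */
struct seg_quirk {
    const char *vendor;
    const char *product;
};

/* Assumed identification strings; the real values would come from DMI. */
static const struct seg_quirk seg_quirks[] = {
    { "Hewlett-Packard", "HP xw9300 Workstation" },
};

/* Returns true if this vendor/product pair is on the known-bad list. */
static bool ignore_seg_for(const char *vendor, const char *product)
{
    size_t i;

    for (i = 0; i < sizeof(seg_quirks) / sizeof(seg_quirks[0]); i++) {
        if (strcmp(vendor, seg_quirks[i].vendor) == 0 &&
            strcmp(product, seg_quirks[i].product) == 0)
            return true;
    }
    return false;
}

/* Domain to use for a host bridge: force 0 when the quirk applies,
 * otherwise trust the value reported by _SEG. */
static unsigned int effective_domain(unsigned int acpi_seg, bool ignore_seg)
{
    return ignore_seg ? 0 : acpi_seg;
}
```

With the quirk active, the bridges the BIOS reports as segments 1 and 2 are enumerated in domain 0, so legacy config accesses to buses 40 and 80 are no longer rejected.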
