Bug 41802

Summary: 3.1-rc2 boot failure, 2.6.35.14 worked - ProLiant DL385 G1
Product: Drivers
Reporter: John 'Warthog9' Hawley (warthog9)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: RESOLVED OBSOLETE
Severity: normal
CC: bjorn, jbarnes, lenb, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.1-rc2
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: Serial Output from Machine as it boots

Description John 'Warthog9' Hawley 2011-08-26 21:34:58 UTC
Created attachment 70432
Serial Output from Machine as it boots

I've got a machine that seems to be unable to boot with 3.1-rc2, but does boot fine with a 2.6.35.14-derived kernel (from Fedora 14).

The machine in question is a ProLiant DL385 G1, which is about the oldest machine I have in production.  When booted with a 3.1-rc2 kernel (in fact the same one currently running on hera/master), it starts booting but then stops dead, with the last output on the serial port being:

[...]
[    0.362147] ACPI: Interpreter enabled
[    0.362958] ACPI: (supports S0 S4 S5)
[    0.364958] ACPI: Using IOAPIC for interrupt routing
[    0.370203] ACPI: No dock devices found.
[    0.370956] HEST: Table not found.
[    0.371956] PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug
[    0.373965] ACPI: PCI Root Bridge [CFG0] (domain 0000 [bus 00-03])

Rebooting with pci=use_crs added to the kernel command line gets further, but the tg3 driver / network bring-up code (from anaconda) then prevents a full boot: it gets stuck in a bring-up, fail, try-again loop.  The tg3 issue could just be a firmware issue.
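
To confirm that a given boot really had pci=use_crs in effect, the kernel command line can be checked programmatically. The following is only a minimal Python sketch: the helper names pci_options and has_pci_option are hypothetical, and it assumes a simple whitespace-separated command line with no quoted arguments.

```python
def pci_options(cmdline):
    """Return the options from every pci= argument on a kernel command line.

    Assumption: arguments are separated by whitespace and none are quoted.
    The pci= parameter takes a comma-separated list, e.g. pci=use_crs,nomsi.
    """
    options = []
    for arg in cmdline.split():
        if arg.startswith("pci="):
            # Drop the "pci=" prefix and split the remaining option list.
            options.extend(opt for opt in arg[len("pci="):].split(",") if opt)
    return options


def has_pci_option(cmdline, option):
    """True if the given pci= option appears on the command line."""
    return option in pci_options(cmdline)
```

On a running system the string to check would typically come from /proc/cmdline, for example has_pci_option(open("/proc/cmdline").read(), "use_crs").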

Either way, I'm reporting the issue as the kernel's boot message requested.  Attaching the serial log from the machine in question.  I'm happy to provide anything else that might be needed.

Comment 1 Bjorn Helgaas 2011-08-26 22:35:18 UTC
Hi John, thanks for the report!  Can you also attach the log from the "pci=use_crs" boot and a complete dmesg log, /proc/iomem, and "lspci -vv" from the Fedora 14 kernel?  That box probably has an iLO for both serial and VGA console, which is a PCI device; I wonder if something's going wrong there.  I'm on my way out of town for the weekend, so I apologize in advance for not being able to work on it right away.
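
One way to gather the requested dmesg log, /proc/iomem, and "lspci -vv" output in a single pass is sketched below. This is an illustration rather than anything prescribed in this report: the collect function and the output file names are hypothetical, and it assumes dmesg and lspci are installed and on the PATH of the Fedora 14 system.

```python
import subprocess
from pathlib import Path

# Commands and files named in the request; the output file names are hypothetical.
COMMANDS = {
    "dmesg.txt": ["dmesg"],
    "lspci-vv.txt": ["lspci", "-vv"],
}
FILES = {
    "iomem.txt": "/proc/iomem",
}


def collect(outdir, commands=COMMANDS, files=FILES):
    """Save each command's output and each file's contents under outdir.

    Returns a dict mapping output file name to "ok" or an error string,
    so that one missing tool does not abort the rest of the collection.
    """
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    status = {}
    for name, argv in commands.items():
        try:
            result = subprocess.run(argv, capture_output=True, text=True, check=True)
            (out / name).write_text(result.stdout)
            status[name] = "ok"
        except (OSError, subprocess.CalledProcessError) as exc:
            status[name] = f"failed: {exc}"
    for name, src in files.items():
        try:
            (out / name).write_text(Path(src).read_text())
            status[name] = "ok"
        except OSError as exc:
            status[name] = f"failed: {exc}"
    return status
```

Calling collect("bug41802-logs") would leave three text files ready to attach; running it as root is likely to give more complete lspci and /proc/iomem output.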

Comment 2 Len Brown 2011-08-26 22:50:27 UTC
John, did the 3.0.0 kernel work?
If not, any clues as to when things broke after 2.6.35?

Comment 3 Bjorn Helgaas 2011-09-06 05:10:01 UTC
Ping.  Bug 30552 is another case (also an AMD box) where "pci=use_crs" helps things.  The problem in bug 30552 is that amd_bus.c doesn't give us complete host bridge information, and I'd like to figure out whether this problem is similar.

Comment 4 Bjorn Helgaas 2012-08-23 23:24:28 UTC
If this is still reproducible on a current kernel, please reopen with the info requested in comment #1.