Bug 41802 - 3.1-rc2 boot failure, 2.6.35.14 worked - ProLiant DL385 G1
Summary: 3.1-rc2 boot failure, 2.6.35.14 worked - ProLiant DL385 G1
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-26 21:34 UTC by John 'Warthog9' Hawley
Modified: 2012-08-23 23:24 UTC
CC List: 4 users

See Also:
Kernel Version: 3.1-rc2
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Serial Output from Machine as it boots (13.30 KB, text/x-log)
2011-08-26 21:34 UTC, John 'Warthog9' Hawley

Description John 'Warthog9' Hawley 2011-08-26 21:34:58 UTC
Created attachment 70432
Serial Output from Machine as it boots

I've got a machine that seems to be unable to boot with 3.1-rc2, but does boot fine with a 2.6.35.14-derived kernel (from Fedora 14).

Machine in question is a ProLiant DL385 G1, which is about the oldest machine I have in production.  Upon booting the machine with a 3.1-rc2 kernel (in fact the same one currently running on hera/master), it boots but then stops dead, with the only information on the serial port being:

[...]
[    0.362147] ACPI: Interpreter enabled
[    0.362958] ACPI: (supports S0 S4 S5)
[    0.364958] ACPI: Using IOAPIC for interrupt routing
[    0.370203] ACPI: No dock devices found.
[    0.370956] HEST: Table not found.
[    0.371956] PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug
[    0.373965] ACPI: PCI Root Bridge [CFG0] (domain 0000 [bus 00-03])

Rebooting with pci=use_crs added to the kernel command line gets further, but the tg3 driver / network bring-up code (from anaconda) seems to prevent a full boot: it gets stuck in a bring-up, fail, retry loop.  The tg3 issue could just be a firmware issue.

Either way, I'm reporting the issue as the kernel's boot message asked.  I'm attaching the serial log from the machine in question, and I'm happy to provide anything else that might be needed.
Comment 1 Bjorn Helgaas 2011-08-26 22:35:18 UTC
Hi John, thanks for the report!  Can you also attach the log from the "pci=use_crs" boot and a complete dmesg log, /proc/iomem, and "lspci -vv" from the Fedora 14 kernel?  That box probably has an iLO for both serial and VGA console, which is a PCI device; I wonder if something's going wrong there.  I'm on my way out of town for the weekend, so I apologize in advance for not being able to work on it right away.
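The items requested in comment 1 (a complete dmesg log, /proc/iomem, and "lspci -vv") could be gathered with a small script along these lines. This is only a sketch: the function name `collect`, the output file names, and the output directory are hypothetical choices made for illustration, not something the thread specifies. Only the three data sources come from the comment itself.

```python
import subprocess
from pathlib import Path

# Hypothetical output file names; the commands are the ones named in comment 1.
COMMANDS = {
    "dmesg.txt": ["dmesg"],
    "lspci-vv.txt": ["lspci", "-vv"],
}


def collect(outdir, commands=COMMANDS, iomem_path="/proc/iomem"):
    """Save each command's output and a copy of iomem_path into outdir.

    Returns the list of file names written, in the order they were written.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, argv in commands.items():
        try:
            result = subprocess.run(argv, capture_output=True, text=True, check=False)
            # Keep stderr too, so permission errors show up in the saved file.
            text = result.stdout + result.stderr
        except FileNotFoundError:
            # Tool not installed: record that instead of aborting the whole run.
            text = "command not found: " + argv[0] + "\n"
        (outdir / name).write_text(text)
        written.append(name)
    # /proc/iomem is a plain readable file, so it is copied rather than executed.
    source = Path(iomem_path)
    if source.exists():
        (outdir / "iomem.txt").write_text(source.read_text())
        written.append("iomem.txt")
    return written
```

Calling `collect("bug-41802-info")` (the directory name is again just an example) on the working Fedora 14 kernel would leave three text files ready to attach. Running it as root is likely to give more complete "lspci -vv" output.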
Comment 2 Len Brown 2011-08-26 22:50:27 UTC
John, did the 3.0.0 kernel work?
If not, any clues as to when things broke after 2.6.35?
Comment 3 Bjorn Helgaas 2011-09-06 05:10:01 UTC
Ping.  Bug 30552 is another case (also an AMD box) where "pci=use_crs" helps things.  The problem in bug 30552 is that amd_bus.c doesn't give us complete host bridge information, and I'd like to figure out whether this problem is similar.
Comment 4 Bjorn Helgaas 2012-08-23 23:24:28 UTC
If this is still reproducible on a current kernel, please reopen with the info requested in comment #1.
