Kernel Bug Tracker – Bug 11054
Boot hangs in x86_64 OKI ANIMA 3300 laptop
Last modified: 2012-05-22 12:45:18 UTC
Latest working kernel version: commit 5eb7f9fa847b8ab6e4864bfb8cb45f370844a47c (somewhere after 2.6.25-rc7)
Earliest failing kernel version: commit 12c22d6ef299ccf0955e5756eb57d90d7577ac68 (somewhere after 2.6.25-rc7)
Distribution: Gentoo Linux
Hardware Environment: x86_64 OKI ANIMA 3300 laptop
Software Environment: Kernel.
Problem Description: The boot hangs after printing that MSI signaling has been enabled on a pair of PCI bridges. Sometimes the console gets garbled when the hang happens, sometimes not.
I bisected the problem to this specific commit:
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed Mar 26 11:22:40 2008 -0700
Revert "PCI: remove transparent bridge sizing"
This reverts commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f, which
caused various interesting problems for people, including wrong resource
allocations. See for example bugzilla entry "2.6.25-rc2: ohci1394
problem (MMIO broken)" at
I applied a reversion of that commit onto the mainline head (2.6.26-rc9) and it now boots flawlessly (rt61pci still doesn't work reliably, but I found the system stable when using a PCMCIA ath5k card).
Steps to reproduce: Boot with the bad version and it hangs; boot with the good version and it works.
I don't know about the other interesting problems caused by the commit reverted
in the problematic commit. I know I need it un-reverted to boot my machine.
Perhaps some kind of quirk is needed.
Created attachment 16878
Detailed lspci output
I've attached the detailed output of lspci on my machine running some 2.6.24 kernel.
I added debug messages to see which was the transparent bridge that required the patch, and this is the specific device info:
00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2) (prog-if 01 [Subtractive decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Bus: primary=00, secondary=04, subordinate=08, sec-latency=64
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: c3000000-c30fffff
Prefetchable memory behind bridge: fff00000-000fffff
Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr+ DiscTmrStat- DiscTmrSERREn-
Capabilities: [b8] Subsystem: Gammagraphx, Inc. Device 0000
Capabilities: [8c] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
00: de 10 6f 02 07 01 b0 00 a2 01 04 06 00 00 81 00
10: 00 00 00 00 00 00 00 00 00 04 08 40 f0 00 80 02
20: 00 c3 00 c3 f0 ff 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 b8 00 00 00 00 00 00 00 ff 00 04 02
40: 00 00 03 00 01 00 02 00 05 00 00 00 00 00 44 00
50: 00 00 fe 3f 00 00 00 00 ff 1f ff 1f 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 a8
90: 00 00 e0 fe 00 00 00 00 00 00 00 00 00 00 00 00
a0: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 ff ff 00 00 0d 8c 00 00 00 00 00 00
c0: 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
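For reference, the three windows that lspci reports for 00:10.0 can be recomputed from the raw config-space bytes above. This is only a sketch in Python: the register offsets (0x1c/0x1d, 0x20/0x22, 0x24/0x26) are the standard PCI-to-PCI bridge header layout, the helper names are made up for illustration, and the upper-bits registers are ignored because they are all zero in this dump.

```python
# First 0x30 bytes of the config space of 00:10.0, copied from the dump above.
CONFIG = bytes.fromhex(
    "de 10 6f 02 07 01 b0 00 a2 01 04 06 00 00 81 00 "
    "00 00 00 00 00 00 00 00 00 04 08 40 f0 00 80 02 "
    "00 c3 00 c3 f0 ff 00 00 00 00 00 00 00 00 00 00 "
)


def u16(cfg, off):
    """Read a little-endian 16-bit register."""
    return cfg[off] | (cfg[off + 1] << 8)


def io_window(cfg):
    """I/O base (0x1c) and limit (0x1d); 4 KiB granularity, 16-bit decode assumed."""
    base = (cfg[0x1C] & 0xF0) << 8
    limit = ((cfg[0x1D] & 0xF0) << 8) | 0xFFF
    return base, limit


def mem_window(cfg, base_off):
    """Memory window; base_off is 0x20 (MMIO) or 0x24 (prefetchable), 1 MiB granularity."""
    base = (u16(cfg, base_off) & 0xFFF0) << 16
    limit = ((u16(cfg, base_off + 2) & 0xFFF0) << 16) | 0xFFFFF
    return base, limit


def is_enabled(base, limit):
    """A bridge window only forwards (positively decodes) when base <= limit."""
    return base <= limit


if __name__ == "__main__":
    windows = {
        "I/O": io_window(CONFIG),
        "Memory": mem_window(CONFIG, 0x20),
        "Prefetchable": mem_window(CONFIG, 0x24),
    }
    for name, (base, limit) in windows.items():
        state = "enabled" if is_enabled(base, limit) else "disabled"
        print(f"{name}: {base:08x}-{limit:08x} ({state})")
```

With the bytes above, only the memory window (c3000000-c30fffff) comes out enabled; the I/O window (0000f000-00000fff) and the prefetchable window (fff00000-000fffff) have base above limit, so the bridge does not positively decode them on the 2.6.24 kernel.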
From Juan Jesus's email (http://lkml.org/lkml/2008/11/12/60):
After reading a little bit on how PCI and PCI-to-PCI bridges work, and
how they're handled in linux (http://tldp.org/LDP/tlk/dd/pci.html), I
now know that ranges in the bridge (either I/O, mmio or prefetchable
mem) are simply disabled when start > end, and that's the original
configuration that the BIOS enforces when the bridge is not sized by
After inserting kprintf()'s, I see that the hang happens actually while
the positive decoding of the I/O range in the bridge is being activated
in pci_setup_bridge(), sometime in between the writes to the I/O
base/limit registers of the bridge; I don't remember exactly which was
the last pci_write_config_dword() that allowed the next kprintf() to
succeed. I'll look at it tonight again, but I suspect that the final
enabling write (the one that updates the PCI_IO_BASE_UPPER16 register
with its final value) was the one hanging the machine.
The I/O range being activated was the one in the range 0x1000-0x1fff,
apparently correctly sized to accommodate the two I/O ranges
(0x1000-0x10ff, 0x1400-0x14ff) assigned to the CardBus bridge on the
One theory is that my system has something actually mapped to that I/O
range in the root PCI bus. When only subtractive decoding is in place
(the I/O range isn't activated), access to the secondary bus behind the
PCI-to-PCI bridge is done when the transaction isn't claimed by any
device in the root bus, after what the PCI docs describe as a 4-cycle
timeout. When the I/O range is activated, that range is positively
decoded by the bridge, which tries to claim the transaction before the
timeout. Perhaps two devices (the bridge and the unknown device on the
root bus) conflict when claiming the same transaction?
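The conflict theory can be written down as a toy model. This is a sketch only: it is not how the hardware or the kernel is implemented, the "hidden device" and its exact range are hypothetical (nothing in lspci shows such a device), and the function names are invented for illustration.

```python
def claimants(addr, positive_decoders):
    """Names of all agents on the root bus that positively decode addr."""
    return [
        name
        for name, (start, end) in positive_decoders.items()
        if start <= addr <= end
    ]


def decode(addr, positive_decoders):
    """Classify one I/O access on the root bus under this toy model."""
    hits = claimants(addr, positive_decoders)
    if not hits:
        # Nobody claimed it: the subtractive-decode bridge picks it up
        # after the timeout and forwards it to the secondary bus.
        return "subtractive"
    if len(hits) == 1:
        return "positive"
    # Two agents drive the same transaction: the suspected hang.
    return "conflict"


# HYPOTHETICAL device on the root bus, unknown to the kernel.
HIDDEN = {"hidden device": (0x1000, 0x1FFF)}

# Bridge left unsized: it has no positive I/O window.
before = dict(HIDDEN)

# Bridge sized: it now positively decodes 0x1000-0x1fff as well.
after = dict(HIDDEN, **{"bridge 00:10.0": (0x1000, 0x1FFF)})

if __name__ == "__main__":
    for label, decoders in (("unsized", before), ("sized", after)):
        print(label, hex(0x1000), decode(0x1000, decoders))
```

In this model an access to 0x1000 has a single claimant while the bridge is unsized and two claimants once the I/O window is written, which is the situation the first theory describes.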
Another possibility could be that activating the I/O range disables the
negative decoding in the secondary-to-primary sense of the bridge for
that I/O range. Perhaps some device behind the bridge depends on being
able to forward transactions to the primary bus on that I/O range, but
it's disallowed after the range is configured. For me this seems rather
unlikely, because of the nature of the devices behind the bridge.
I'll look at it more closely again, and I will test whether commenting
out the I/O range sizing (leaving the other ranges to be sized) is
enough to allow the system to run. If so, is there any way to use a
system-specific quirk in order to remove the PCI-to-PCI bridge I/O range
from being sized/activated?
Created attachment 18961
Disassembled DSDT contents
From my e-mail: http://permalink.gmane.org/gmane.linux.kernel.pci/1991
PCI bus conflict hang: how to avoid the allocation of an I/O range.
From: GARCIA DE SORIA LUCENA, JUAN JESUS <juanj.g_soria <at> grupobbva.com>
Subject: PCI bus conflict hang: how to avoid the allocation of an I/O range.
Date: 2008-11-17 14:05:22 GMT
Author: Linus Torvalds <torvalds <at> linux-foundation.org>
Date: Wed Mar 26 11:22:40 2008 -0700
Revert "PCI: remove transparent bridge sizing"
My laptop began hanging when booting, and I filed
I had to disable the sizing of transparent bridges until, after a
conversation in the kernel mailing list, I think I've found the root of
A CardBus bridge is on the secondary bus of a transparent bridge. By
default it gets assigned two I/O ranges: 0x1000-0x10ff and
0x1400-0x14ff, which is translated to the transparent bridge positively
forwarding the range 0x1000-0x1fff. There are no more I/O resources
allocated behind the transparent PCI to PCI bridge.
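The step from the two CardBus ranges to the bridge window follows from the 4 KiB granularity of PCI-to-PCI bridge I/O windows. A minimal sketch of that arithmetic, assuming the window is simply the smallest 4 KiB-aligned span covering every range behind the bridge (the function name is made up for illustration and this is not the kernel's sizing code):

```python
IO_WINDOW_GRANULARITY = 0x1000  # bridge I/O base/limit have 4 KiB resolution


def bridge_io_window(child_ranges):
    """Smallest 4 KiB-aligned (base, limit) covering all (start, end) ranges."""
    lowest = min(start for start, _ in child_ranges)
    highest = max(end for _, end in child_ranges)
    base = lowest & ~(IO_WINDOW_GRANULARITY - 1)   # round down to 4 KiB
    limit = highest | (IO_WINDOW_GRANULARITY - 1)  # round up to end of 4 KiB block
    return base, limit


if __name__ == "__main__":
    cardbus = [(0x1000, 0x10FF), (0x1400, 0x14FF)]
    base, limit = bridge_io_window(cardbus)
    print(f"bridge forwards {base:#06x}-{limit:#06x}")
```

For the two ranges above this gives 0x1000-0x1fff, the window whose activation coincides with the hang.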
I suspect there's "something" (some device unknown by the kernel)
decoding I/O accesses in the primary PCI bus, in the 0x1000-0x1fff
range. This device must be causing bus conflicts with the range
allocated to the PCI to PCI bridge. Not sizing the transparent bridge
wouldn't configure any I/O range in it for positive decoding, thus
avoiding the conflict.
The system hangs when the bridge register for the IO base/limit (lower
16 bits, since it's 16 bit only) gets written to with the value
If I force the range to be allocated to be above 0x4000, everything
works flawlessly. I've been able to do so by two means:
1. Changing the definition of PCIBIOS_MIN_IO in
arch/x86/include/asm/pci.h from 0x1000 to 0x4000. This forces the
CardBus ranges to be allocated above the problematic area, making the
bridge forward 0x4000-0x4fff I/O addresses. BTW, PCIBIOS_MIN_CARDBUS_IO
is defined to be 0x4000 in the same header, but it's only used in
drivers/pcmcia/yenta_socket.c, not apparently when assigning resources
to the CardBus bridge in the functions pci_setup_cardbus() or
pci_bus_size_cardbus() in drivers/pci/setup-bus.c. I suppose that making
the CardBus bridge I/O range allocation respect the defined
PCIBIOS_MIN_CARDBUS_IO limit would fix my issue, but I don't know
whether that's "the right fix" (TM).
2. I've managed to boot a stock Ubuntu Intrepid Ibex x86_64 kernel by
supplying the parameter "pci=cbiosize=8k" to the grub command line. It
doesn't work with a smaller size. With 8k the CardBus bridge I/O ranges
are big enough that they have to be allocated above the "problem area"
because of natural alignment restrictions.
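Both workarounds can be checked with a simplified model of where the first CardBus I/O window lands. The sketch assumes an otherwise empty I/O space above the minimum address and natural (size) alignment of the window; the real allocator also has to account for resources that are already claimed, and the function names are invented for illustration.

```python
PROBLEM_AREA = (0x1000, 0x1FFF)  # range suspected of conflicting on the root bus


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def first_window(min_io, window_size):
    """Lowest naturally aligned (start, end) window at or above min_io."""
    start = align_up(min_io, window_size)
    return start, start + window_size - 1


def hits_problem_area(window):
    """True if the window overlaps the suspected problem area."""
    start, end = window
    return start <= PROBLEM_AREA[1] and end >= PROBLEM_AREA[0]


if __name__ == "__main__":
    cases = {
        "default (min 0x1000, 256 bytes)": (0x1000, 0x100),
        "pci=cbiosize=4k": (0x1000, 0x1000),
        "pci=cbiosize=8k": (0x1000, 0x2000),
        "PCIBIOS_MIN_IO = 0x4000": (0x4000, 0x100),
    }
    for label, (min_io, size) in cases.items():
        window = first_window(min_io, size)
        print(f"{label}: {window[0]:#06x}-{window[1]:#06x}",
              "CONFLICT" if hits_problem_area(window) else "ok")
```

Under these assumptions a 256-byte or 4k window still starts at 0x1000, while an 8k window cannot start below 0x2000, which matches the observation that no size smaller than 8k works.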
So far I've got what I really wanted (to be able to use my laptop with
modern distributions without having to recompile each kernel version),
although to do so I'm depending on the fact that a kernel parameter
intended for a different use will alter I/O range alignment one PCI to
PCI bridge away.
I write to ask whether the definition of PCIBIOS_MIN_CARDBUS_IO was
indeed intended to affect my case too (in which case what is happening
is the result of a kernel bug that should be fixed) or not. And if it's
not a bug, I'd like to know if there exists any reliable way to
pre-allocate a given I/O range (0x1000-0x1fff in my case) so that it
won't be assigned to PCI busses/devices (without the need to recompile
every kernel version).
Moving the full version info here to clean up the formatting of the lists:
2.6.25 + 12c22d6ef299ccf0955e5756eb57d90d7577ac68 up to mainline
Closing as obsolete. Please re-open and update the kernel version to a recent one if the problem is still seen.