Bug 26732 - Problem: PCIE hot-plug resource assignments hangs kernel during boot
Status: RESOLVED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: Bjorn Helgaas
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-01-14 19:39 UTC by Kushal Koolwal
Modified: 2011-06-02 23:27 UTC

See Also:
Kernel Version: 2.6.32 onwards including 2.6.37
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg 2.6.32 - kernel hangs during early boot (12.29 KB, text/plain)
2011-01-14 19:39 UTC, Kushal Koolwal
dmesg 2.6.31 - kernel boots fine (25.57 KB, text/plain)
2011-01-14 19:40 UTC, Kushal Koolwal
dmesg 2.6.32 - suspected commit reverted kernel boots fine now (24.86 KB, text/plain)
2011-01-14 19:42 UTC, Kushal Koolwal
2.6.31 kernel config file (55.24 KB, application/octet-stream)
2011-01-14 19:42 UTC, Kushal Koolwal
2.6.32 kernel config file (56.32 KB, application/octet-stream)
2011-01-14 19:44 UTC, Kushal Koolwal
lspci -vvvxxx output for 2.6.31 kernel (26.92 KB, text/plain)
2011-01-14 19:46 UTC, Kushal Koolwal
2.6.39 dmesg showing kernel hang (12.02 KB, text/plain)
2011-06-01 23:58 UTC, Kushal Koolwal
2.6.39 dmesg log with "ignore_loglevel" (20.95 KB, text/plain)
2011-06-02 16:22 UTC, Kushal Koolwal

Description Kushal Koolwal 2011-01-14 19:39:43 UTC
Created attachment 43572 [details]
dmesg 2.6.32 - kernel hangs during early boot

This is my first time reporting a Linux kernel bug so I apologize in advance if I made any mistakes while reporting or did not live up to the expectations.

We have an x86 board based on the Intel Atom/Menlow platform. Every Linux kernel from 2.6.32 onwards hangs during the early boot phase (PCI enumeration); please see the attachment dmesg-2.6.32.txt, captured using serial console redirection. Note that I have also verified this problem with the 2.6.37 release.

The 2.6.31 kernel works fine (see attachment dmesg-2.6.31.txt).

Upon performing a git bisect between 2.6.31 and 2.6.32 the following commit was reported as the "first bad" commit.
******************************************************************************
debian:/usr/src/linux-2.6# git bisect good
28760489a3f1e136c5ae8581c0fa8f63511f2f4c is first bad commit
commit 28760489a3f1e136c5ae8581c0fa8f63511f2f4c
Author: Eric W. Biederman <ebiederm@aristanetworks.com>
Date:   Wed Sep 9 14:09:24 2009 -0700

    PCI: pcie: Ensure hotplug ports have a minimum number of resources
    
    In general a BIOS may goof or we may hotplug in a hotplug controller.
    In either case the kernel needs to reserve resources for plugging
    in more devices in the future instead of creating a minimal resource
    assignment.
    
    We already do this for cardbus bridges I am just adding a variant
    for pcie bridges.
    
    v2: Make testing for pcie hotplug bridges based on a flag.
    
        So far we only set the flag for pcie but a header_quirk
        could easily be added for the non-standard pci hotplug
        bridges.
    
    Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

:040000 040000 78e8a6a1e09897d5c41e4cbaeff7d6a8e16d4c5b 4997bed0a33bd5f7353cb014427f7ff121f96712 M	drivers
:040000 040000 34dc4b5b81d929b03321af21b45a35d965b9d434 6e2646c422ae237989a1138b9e24122edb5305f2 M	include
******************************************************************************

To confirm that the above commit is indeed the only one causing the Linux kernel to hang, I did the following:

******************************************************************************
debian:/usr/src/linux-2.6# git checkout v2.6.32
Checking out files: 100% (25064/25064), done.
Note: moving to "v2.6.32" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b <new_branch_name>
HEAD is now at 22763c5... Linux 2.6.32
versaportal:/usr/src/epm-24_hang/linux-2.6# git-revert 28760489a3f1e136c5ae8581c0fa8f63511f2f4c
warning: too many files, skipping inexact rename detection
Auto-merged drivers/pci/pci.c
Auto-merged drivers/pci/probe.c
Auto-merged include/linux/pci.h
Finished one revert.
Created commit bbc6df5: Revert "PCI: pcie: Ensure hotplug ports have a minimum number of resources"
 4 files changed, 5 insertions(+), 49 deletions(-)
debian:/usr/src/linux-2.6# git log
******************************************************************************

And indeed I was then able to boot from this 2.6.32 kernel omitting that one particular commit (see dmesg-2.6.32-revertbadcommit.txt).

System Description
==================
We have two PCI-E ports on the system. An Intel 82574L Ethernet controller is connected downstream of one PCI-E port, and a Pericom (PI7C9X111SL) PCIe-to-PCI bridge, which provides up to 4 PCI slots, is connected downstream of the other. By default the system has no PCI device connected to any of those PCI slots. It is in this state that we see the Linux kernel (2.6.32+) hang. However, as soon as we put any PCI card in one of those slots, the Linux kernel boots fine.

BIOS Description
================
The BIOS does not assign resources (ranges to forward) to bridges with no downstream devices. It also does not assign resources to bridges whose only downstream devices are more bridges with no downstream devices. Linux does not seem to be comfortable with this, perhaps because of a hotplug possibility, so it tries to allocate resources to the PCIe root port whose only attached device is the Pericom bridge, with no devices past that.

When the BIOS enumerates the bridge due to there being an actual device present, it picks ranges from the top of memory space instead of the bottom.

Experiment
==========
To confirm that this entire issue is caused by the hot-plug capability reported by the PCI-E ports, I explicitly told Linux that the ports are not hot-pluggable by changing the code added in commit 28760489a3f1e136c5ae8581c0fa8f63511f2f4c:
- setting pdev->is_hotplug_bridge = 1; 
+ setting pdev->is_hotplug_bridge = 0; 

and indeed now the kernel boots fine.

Additional Details
==================
1. The problem is reproducible exactly in the same manner every single time.

2. We have been able to reproduce this problem on the original Intel Menlow Customer Enabling Reference board.

3. Our system has Phoenix BIOS loaded on it.

Please let me know if you need any more information. I would be more than happy to provide it.
Comment 1 Kushal Koolwal 2011-01-14 19:40:37 UTC
Created attachment 43582 [details]
dmesg 2.6.31 - kernel boots fine
Comment 2 Kushal Koolwal 2011-01-14 19:42:13 UTC
Created attachment 43592 [details]
dmesg 2.6.32 - suspected commit reverted kernel boots fine now
Comment 3 Kushal Koolwal 2011-01-14 19:42:51 UTC
Created attachment 43602 [details]
2.6.31 kernel config file
Comment 4 Kushal Koolwal 2011-01-14 19:44:07 UTC
Created attachment 43612 [details]
2.6.32 kernel config file

Both config files are essentially the same. However, the make command during kernel compilation explicitly asked me to take action on certain new items that were introduced in the 2.6.32 kernel. I selected the default action for most of the items.
Comment 5 Kushal Koolwal 2011-01-14 19:46:47 UTC
Created attachment 43622 [details]
lspci -vvvxxx output for 2.6.31 kernel
Comment 6 Kushal Koolwal 2011-01-14 20:21:09 UTC
I forgot to mention one more piece of information. We do not see this problem on Windows XP and Windows 7.
Comment 7 Bjorn Helgaas 2011-05-27 16:31:38 UTC
I'm sorry this report has been neglected.  I assume it's still an issue?  If so, would it be possible to attach a serial console log from a current kernel, e.g., 2.6.39?
Comment 8 Kushal Koolwal 2011-06-01 21:35:23 UTC
It seems that the conflicting I/O address range was 0x1000-0x1fff. For some reason the Linux kernel seems to hang upon discovering this I/O range. To resolve the issue we modified our BIOS/firmware code to move the I/O base addresses for the ACPI Power Management Block and the SM Bus controller, which were defined at 0x1000 and 0x2000 respectively.

If you would like to debug this issue further at the Linux kernel level I would be more than happy to attach the serial console output from 2.6.39. Let me know.

Thanks for checking back.

Also it seems that this issue might be related to:
https://bugzilla.kernel.org/show_bug.cgi?id=36462
Comment 9 Bjorn Helgaas 2011-06-01 21:47:07 UTC
Heh, that's really funny that you found bug #36362 already.  I bet it is related, especially since you mention the fixed hardware that you had in the 0x1000 range, which got assigned to the 1c.0 bridge.

It would be useful if you could attach the console output from 2.6.39, booted with "ignore_loglevel".  That will show more details about the ACPI/PNP devices we find and the PCI resource assignment.
Comment 10 Kushal Koolwal 2011-06-01 23:58:27 UTC
Created attachment 60522 [details]
2.6.39 dmesg showing kernel hang

Attached is the full 2.6.39 dmesg log from the serial output showing Linux kernel hang with the unmodified BIOS.

Also, there was a typo in my previous comment. To solve the problem we moved the I/O base address of the ACPI Power Management Block, which was initially defined at 0x1000, to its new location, 0x2000, to make the Linux kernel happy.

It seems that there is no way for the BIOS to tell the Linux kernel that certain I/O space (0x1000-0x1fff in this case) is reserved.
Comment 11 Bjorn Helgaas 2011-06-02 05:09:36 UTC
You didn't use the "ignore_loglevel" kernel parameter, so we don't see the PNP resources ... we see that you have 7 devices, but not the resources they use.

Moving the PM block to 0x2000 manages to avoid the problem for now, but it doesn't actually *solve* anything, it just moves the landmine elsewhere.  If we do any more PCI allocation, we could still step on it.

ACPI is the mechanism the BIOS is supposed to use to tell the kernel that I/O space like this is reserved.  It's just a bad Linux bug that we happen to ignore most of that information.  I think most of the time we're lucky because hardware like this is below 0x1000, and we have "#define PCIBIOS_MIN_IO 0x1000" that keeps PCI from allocating anything down there.

It happens to be PCI that trips over this, but it's really a PNP bug.
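
To make the "landmine" point concrete, here is a minimal sketch of a first-fit I/O port allocator with a floor at PCIBIOS_MIN_IO. It is a toy model, not the kernel's actual allocation code: the function name allocate_io, the 0xFFFF ceiling, and the 0x80-byte size of the PM block are all assumptions made for illustration. It only shows that an allocator which is never told about fixed hardware at 0x1000 will place a bridge window right on top of it.

```python
# Toy model only: NOT the kernel's allocator. Names and sizes are hypothetical.
PCIBIOS_MIN_IO = 0x1000  # floor quoted in the comment above


def allocate_io(size, claimed, min_io=PCIBIOS_MIN_IO, max_io=0xFFFF):
    """Return the lowest size-aligned start >= min_io that avoids every
    (start, end) range in `claimed`, or None if nothing fits."""
    # Round the floor up to the requested alignment (size is a power of two).
    start = (min_io + size - 1) // size * size
    while start + size - 1 <= max_io:
        end = start + size - 1
        clashes = [c for c in claimed if c[0] <= end and start <= c[1]]
        if not clashes:
            return start
        # Skip past the highest conflicting range, then re-align.
        next_free = max(c[1] for c in clashes) + 1
        start = (next_free + size - 1) // size * size
    return None


# Assumed extent of the ACPI PM block at 0x1000 (size chosen for illustration).
PM_BLOCK = (0x1000, 0x107F)

# Firmware did not describe the PM block, so the allocator cannot avoid it.
unaware = allocate_io(0x1000, claimed=[])

# The PM block was claimed before PCI allocation ran.
aware = allocate_io(0x1000, claimed=[PM_BLOCK])

print(hex(unaware), hex(aware))
```

In this model the unaware call hands out 0x1000, overlapping the PM block, while the aware call moves the window up to 0x2000. Relocating the PM block in the BIOS only changes which address is unprotected.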
Comment 12 Kushal Koolwal 2011-06-02 16:22:19 UTC
Created attachment 60562 [details]
2.6.39 dmesg log with "ignore_loglevel"

My bad. Attached is the 2.6.39 dmesg log with "ignore_loglevel" option. Please let me know if you need any more information.
Comment 13 Bjorn Helgaas 2011-06-02 17:10:06 UTC
Wait, you said the problem happens when you have the ACPI PM block at 0x1000, didn't you?  That PM block would be described in the FADT.  My Lenovo laptop also has a PNP0C02 device that describes it, so my /proc/ioports looks like this:

  1000-107f : pnp 00:03  <-- this comes from the PNP0C02 device
    1000-1003 : ACPI PM1a_EVT_BLK  <-- these come from the FADT fields
    1004-1005 : ACPI PM1a_CNT_BLK
    1008-100b : ACPI PM_TMR
    1010-1015 : ACPI CPU throttle
    1020-102f : ACPI GPE0_BLK
    1050-1050 : ACPI PM2_CNT_BLK

I don't know if it's actually a spec requirement to have a PNP0C02 device for the PM areas or not, but if you can add one (or just add the FADT fields to the one you already have), I think it will prevent the problem.

Linux has a special case for PNP0C02 devices -- we bind the driver earlier than normal, and the driver claims the resources.  This is done early enough that PCI allocations will see the already-claimed PNP0C02 resources and avoid them.
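
If it helps when checking a BIOS change, the following is a small sketch that parses text in the /proc/ioports format shown above and lists which owners overlap a given port range. The helper names parse_ioports and owners_overlapping are hypothetical, and the sample text is copied from the listing in this comment rather than from the affected board.

```python
import re

# Sample copied from the /proc/ioports excerpt above (Lenovo laptop, not the
# affected board).
SAMPLE = """\
1000-107f : pnp 00:03
  1000-1003 : ACPI PM1a_EVT_BLK
  1004-1005 : ACPI PM1a_CNT_BLK
  1008-100b : ACPI PM_TMR
  1010-1015 : ACPI CPU throttle
  1020-102f : ACPI GPE0_BLK
  1050-1050 : ACPI PM2_CNT_BLK
"""

# "<indent><start>-<end> : <owner>", with hexadecimal port numbers.
LINE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_ioports(text):
    """Return a list of (depth, start, end, owner) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            indent, start, end, owner = m.groups()
            # Nesting is shown with two spaces per level.
            entries.append(
                (len(indent) // 2, int(start, 16), int(end, 16), owner.strip())
            )
    return entries


def owners_overlapping(entries, start, end):
    """Return the owners whose range intersects [start, end]."""
    return [owner for _, s, e, owner in entries if s <= end and start <= e]


entries = parse_ioports(SAMPLE)
conflicts = owners_overlapping(entries, 0x1000, 0x1FFF)
print(conflicts)
```

On a real system the text would come from reading /proc/ioports after boot. If the range that PCI wants to hand out has no "pnp" owner covering the PM registers, that suggests the PNP0C02 device is not describing them.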
Comment 14 Kushal Koolwal 2011-06-02 20:26:05 UTC
I think you are right. It seems that we have two PNP0C02 devices in our ACPI DSDT, but the conflicting I/O base address was not included in their resource lists, which it probably should be.

Currently I do not have a timeline for when we will be able to test this fix in our BIOS, so in the meantime you can consider this issue low priority (or maybe even close it?). If this fix does not work, I can just re-open this issue.

Thanks for all your help!
Comment 15 Bjorn Helgaas 2011-06-02 23:27:52 UTC
OK, I'm going to close this as "invalid" on the assumption that a BIOS fix will resolve it.  If it doesn't, please re-open and we'll take another look.
