Bug 42853

Summary: SilverSeraph: Kernel Hangs on Windowing PCI Bridge
Product: ACPI
Reporter: Elmar Stellnberger (estellnb)
Component: Other
Assignee: acpi_other
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: blocking
CC: alan, bjorn, lenb, rjw
Priority: P1
Hardware: i386
OS: Linux
URL: https://bugzilla.novell.com/show_bug.cgi?id=748616
Kernel Version: 3.1.0-1.2-default
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: hwinfo --all
DSDT.dat (SilverSeraph)
screenshot of hang (initcall_debug, ignore_loglevel)
regression with 3.2.0-4-486

Description Elmar Stellnberger 2012-03-03 13:12:05 UTC
Created attachment 72533
hwinfo --all

error: hang during boot process
last known working kernel: 2.6.37.1-1.2-default
not working: 3.1.0-1.2-default
reproducible: always

last screen output:
PCI bridge to (bus 02-02)
bridge window (io 0x3000-0x3fff)
              (mem 0xe000000?..0xe0fffff?)
              (mem 0x84000000..0x8bffffff)
Comment 1 Elmar Stellnberger 2012-05-12 09:43:25 UTC
  The range of working kernels could be narrowed down further, to 2.6.37.6-0.11-default and 3.0.0-1-686-pae. Boot videos with and without the initcall_debug option were uploaded at http://www.elstel.org/uploads/silver:kernel-3.1/ (USB debugging is not supported by the machine's interface).
  Rafael Wysocki and Bjorn Helgaas have further analyzed the problem:

Thanks!  Your comment #32 video shows:

    pci 0000:02:04.0: pci_set_power_state: state 0
    pci 0000:02:04.0: __pci_start_power_transition: state 0
    pci 0000:02:04.0: pci_platform_power_transition: state 0
    pci 0000:02:04.0: acpi_pci_set_power_state: state 0
    acpi device:05: __acpi_bus_set_power: state 0 current state 3

I think we must be stuck evaluating _PS0 for device:05.  Can you try this patch
to confirm?  I put an image here: http://helgaas.com/linux/suse748616/bzImage2

It looks like _PS0 does run a method (CAIN) with a loop that could conceivably
not terminate, but my ASL-fu is pretty weak.  Rafael is going to have to jump
in here.

I wouldn't worry (yet) about the errors you get when recompiling your DSDT. 
Linux used to work with that DSDT, and presumably Windows does, so we should be
able to get Linux to work again.

Here's the code (with the debug patch):

    dev_info(&device->dev, "%s: evaluating %s\n", __func__, object_name);
    status = acpi_evaluate_object(device->handle, object_name, NULL, NULL);
    dev_info(&device->dev, "%s: %s status %d\n", __func__, object_name,
             status);

It looks like writing 1 to PMS0 actually clears it, so this probably is a
status flag of some sort.  If clearing it doesn't work, it probably means a
signal is continuously asserted.

Or, if it is mapped incorrectly, CAIN() may be writing into a RAM location
instead of the register and in that case the loop will be infinite.
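
The two hypotheses above (a signal that stays asserted, or a write that lands in RAM rather than in the register) can be illustrated with a small C model. This is only a sketch, not the DSDT's actual CAIN method: `struct status_reg`, `reg_write`, `clear_flag_bounded` and the bit value `PMS0_BIT` are hypothetical names chosen for illustration, and the write-1-to-clear behaviour is assumed from the observation quoted above. The polling loop is bounded so that the model reports the failure instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMS0_BIT 0x01u /* hypothetical bit position of the PMS0 flag */

/* Hypothetical model of the location the AML code ends up writing to. */
struct status_reg {
    uint8_t value;        /* current contents */
    bool is_w1c;          /* true: write-1-to-clear register, false: plain RAM */
    bool signal_asserted; /* true: hardware keeps raising the flag again */
};

static uint8_t reg_read(const struct status_reg *r)
{
    return r->value;
}

static void reg_write(struct status_reg *r, uint8_t v)
{
    if (r->is_w1c) {
        r->value &= (uint8_t)~v;  /* every bit written as 1 is cleared */
        if (r->signal_asserted)
            r->value |= PMS0_BIT; /* the source re-asserts the flag at once */
    } else {
        r->value = v;             /* RAM just stores the 1, so the bit stays set */
    }
}

/*
 * Write 1 to the flag until it reads back as 0.
 * Returns the number of writes needed, or -1 if the flag is still set
 * after max_tries writes (the case where an unbounded loop never exits).
 */
static int clear_flag_bounded(struct status_reg *r, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 1; i <= max_tries; i++) {
        reg_write(r, PMS0_BIT);
        if ((reg_read(r) & PMS0_BIT) == 0)
            return i;
    }
    return -1;
}
```

With a well-behaved write-1-to-clear register, `clear_flag_bounded()` succeeds on the first write. In both failure cases it gives up after `max_tries` writes, which is the point where an unbounded loop would spin forever.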
Comment 2 Elmar Stellnberger 2012-05-12 09:46:19 UTC
Created attachment 73258
DSDT.dat (SilverSeraph)
Comment 3 Elmar Stellnberger 2012-08-27 11:29:22 UTC
Created attachment 78541
screenshot of hang (initcall_debug, ignore_loglevel)

With kernel 3.4.6-1.1-desktop things look much better. There is no longer an immediate hang when windowing the CardBus bridge, but the boot still stops at 'setting the latency timer to 64', one line after the PCI bridge is initialized.
Comment 4 Alan 2012-08-28 11:19:02 UTC
The 'setting latency timer to 64' message is expected; it's just informational.
Comment 5 Elmar Stellnberger 2012-08-28 18:39:13 UTC
Hmm, but it still hangs at the next, empty line.
Comment 6 Elmar Stellnberger 2012-09-06 15:33:56 UTC
Confirmed for 3.4.6-2.10.1 (exactly the same behavior).
Comment 7 Elmar Stellnberger 2012-09-06 21:51:46 UTC
vanilla-3.6.rc4-1.1.i686 works without any hangs. I think Rafael has fixed the problem.
Comment 8 Elmar Stellnberger 2015-03-19 16:08:42 UTC
Created attachment 171191
regression with 3.2.0-4-486

Oops, there is a regression with 3.2.0-4-486 #1 Debian 3.2.65-1 i686 GNU/Linux.
last known good: 3.17.1-4-desktop #1 SMP, vanilla
Comment 9 Elmar Stellnberger 2015-03-19 16:10:56 UTC
Note: The regression may be related to the fix for bug 63171, comment 54.
Comment 10 Bjorn Helgaas 2015-03-19 17:40:42 UTC
Elmar, I'm going to reassign this to ACPI and add Rafael to the CC: list, based on the guesses in comment #1.

But I'm confused about what kernels are involved here.  If 3.2.0 fails and 3.17.1 works, that sounds like a good thing, not a bug to be reported.

So there must be more going on.  Can you give more details about exactly what these kernels are?  Are they Debian kernels?  Can you reproduce the working and failing results with upstream kernels?
Comment 11 Elmar Stellnberger 2015-03-19 21:58:25 UTC
Indeed, already resolved: it also works with Linux archiso 3.18.6-1-ARCH #1 SMP PREEMPT Sat Feb 7 08:59:29 CET 2015 i686. In theory, some additional Debian patch could also have caused the problem. Nonetheless, the s2ram bug 63171 still reports memory corruption; see my latest posting there.