Bug 14669 - Operation “PM: Adding info for pci” slows down boot process - Acer Extensa 5620
Summary: Operation “PM: Adding info for pci” slows down boot process - Acer Extensa 5620
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other (show other bugs)
Hardware: i386 Linux
: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-23 12:37 UTC by Raffaello D. Di Napoli
Modified: 2012-05-24 07:23 UTC (History)
4 users (show)

See Also:
Kernel Version: >=2.6.29
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Output of acpidump (188.68 KB, application/octet-stream)
2009-11-23 12:37 UTC, Raffaello D. Di Napoli
Details
Customized .config used (55.41 KB, text/plain)
2009-11-23 12:39 UTC, Raffaello D. Di Napoli
Details
Stock .config used (includes necessary minimal modifications) (58.27 KB, text/plain)
2009-11-23 12:41 UTC, Raffaello D. Di Napoli
Details
dmesg (65.29 KB, text/plain)
2009-11-24 02:06 UTC, Raffaello D. Di Napoli
Details
lspci -vv output (25.42 KB, text/plain)
2009-11-26 01:09 UTC, Raffaello D. Di Napoli
Details
Customized .config used (updated) (55.44 KB, text/plain)
2009-11-26 01:53 UTC, Raffaello D. Di Napoli
Details
dmesg (intelfb, acpi=on) (61.66 KB, text/plain)
2009-11-26 01:55 UTC, Raffaello D. Di Napoli
Details
dmesg (intelfb, acpi=off) (47.66 KB, text/plain)
2009-11-26 01:56 UTC, Raffaello D. Di Napoli
Details
dmesg (intelfb, acpi=off, irqpoll) (46.03 KB, text/plain)
2009-11-26 02:01 UTC, Raffaello D. Di Napoli
Details
dmesg (vga, acpi=off) (47.14 KB, text/plain)
2009-11-26 02:06 UTC, Raffaello D. Di Napoli
Details
dmesg (vga, acpi=on) (61.30 KB, text/plain)
2009-11-26 02:07 UTC, Raffaello D. Di Napoli
Details
dmesg (intelfb, acpi=on, -AC +battery) (61.71 KB, text/plain)
2009-11-26 02:13 UTC, Raffaello D. Di Napoli
Details
dmesg 2.6.32.3 (acpi=on) (60.33 KB, text/plain)
2010-01-13 17:32 UTC, Raffaello D. Di Napoli
Details
dmesg 2.6.32.3 (acpi=on, initcall_debug) (103.71 KB, text/plain)
2010-01-13 17:38 UTC, Raffaello D. Di Napoli
Details
debug patch (1.94 KB, patch)
2010-01-14 07:11 UTC, Zhang Rui
Details | Diff
dmesg 2.6.32.3 (acpi=on, initcall_debug, with proposed patch #24547) (105.44 KB, text/plain)
2010-01-14 17:13 UTC, Raffaello D. Di Napoli
Details
debug patch 2 (2.05 KB, patch)
2010-01-26 07:16 UTC, Zhang Rui
Details | Diff
dmesg 2.6.32.3 (acpi=on, initcall_debug, with proposed patch #24712) (263.83 KB, text/plain)
2010-01-26 10:30 UTC, Raffaello D. Di Napoli
Details
dmesg 2.6.32.3 (acpi=on, initcall_debug, acpi_platform_notify debugging) (150.28 KB, text/plain)
2010-01-26 11:26 UTC, Raffaello D. Di Napoli
Details
dmesg 2.6.34 (acpi=on, initcall_debug) (116.24 KB, text/plain)
2010-07-03 18:57 UTC, Raffaello D. Di Napoli
Details

Description Raffaello D. Di Napoli 2009-11-23 12:37:24 UTC
Created attachment 23878 [details]
Output of acpidump

Hi all, I have a nuisance with booting ACPI-enabled kernels on my notebook.
Since I don’t know much about kernel operation, all I can do is show you the issue I’m talking about:

[    0.552322] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT]
[    0.753070] PM: Adding info for pci:0000:00:00.0
...
[    4.372027] PM: Adding info for No Bus:0000:0f
[    4.372115] ACPI: PCI Interrupt Link [LNKA] (IRQs 10 *11)

I get 30 “PM: Adding info...” messages, and as you can see in this dmesg excerpt, they take up a considerable amount of time (about 3.6 seconds, that’s a lot compared to the full kernel boot, which takes about 7.8 seconds). Compare to my old Athlon XP, which boots the same kernel version in 2.8 seconds.

Also, the reason I consider this a bug is that the screen stays black until after the last of those “PM: ...” messages. In order to understand what was going on, I had to check out dmesg, and recompile enabling “Verbose Power Management debugging”; I figured that would help, since the black-out happened between the two “ACPI: PCI Interrupt” messages.


Configuration information:

I’m currently running kernel 2.6.30 with the Gentoo patch set (2.6.30-gentoo-r8, latest stable), but I got the same results with a few stock kernels, versions ranging from 2.6.29.6 to 2.6.31.6 (latest stable).

I tried each kernel using both a very customized config file (tailored for my computer) and the default config file, only adding necessary drivers. In both cases, I get the same delay. This was mostly to rule out the impression I have I misconfigured my kernel at some point.

I use no initramd/initrd.

If relevant, the customized config file comes from a customization process that resulted in no modules being necessary upon boot. Every module that would be loaded during the boot process, I embedded in the kernel; drivers/features that are loaded on demand, I compiled as modules.


System information:

The computer is an Acer Extensa 5620 with Intel i965GM-E and Core 2 Duo T5550; I once updated the BIOS (before switching to Linux), but I haven’t checked in about 8 months for further updates. I can do that if necessary.


Further information:

Please notice I don’t experience this delay with any other Linux system, either real (my old Athlon XP on nForce2 400 Ultra and my gf’s Turion 64 notebook) or virtualized.
Comment 1 Raffaello D. Di Napoli 2009-11-23 12:39:44 UTC
Created attachment 23879 [details]
Customized .config used
Comment 2 Raffaello D. Di Napoli 2009-11-23 12:41:42 UTC
Created attachment 23880 [details]
Stock .config used (includes necessary minimal modifications)
Comment 3 Raffaello D. Di Napoli 2009-11-23 12:48:09 UTC
I forgot to mention, the kernel is loaded by GRUB with:
kernel /boot/linux-2.6.30-gentoo-r8 vga=0x360 video=intelfb:mode=1280x800 ro root=/dev/sda1 resume=/dev/sda2

“linux-2.6.30-gentoo-r8” varies, of course.
The /boot directory now is in the / partition, but even before I merged them, the behavior was the same (I know it’s irrelevant... just in case).

Adding the noacpi (or was it acpi=off? I tried that a week ago) option does cause the delay to disappear, but it’s a solution I’m not interested in.
Comment 4 Zhang Rui 2009-11-24 01:46:45 UTC
please attach the full dmesg output after boot.
Comment 5 Raffaello D. Di Napoli 2009-11-24 02:06:38 UTC
Created attachment 23904 [details]
dmesg

Ha, sure. I obviously forgot to attach this one.
The timings don’t match exactly my earlier post, I guess I restarted the computer meanwhile. Nothing changes, anyway.
Comment 6 Len Brown 2009-11-25 06:36:36 UTC
it appears that the time-consuming region is all PCI PM messages:

[    0.754070] PM: Adding info for pci:0000:00:00.0
[    0.955069] PM: Adding info for pci:0000:00:02.0
[    1.156067] PM: Adding info for pci:0000:00:02.1
[    1.357069] PM: Adding info for pci:0000:00:1a.0
[    1.558072] PM: Adding info for pci:0000:00:1a.1
[    1.759070] PM: Adding info for pci:0000:00:1a.7
[    1.960069] PM: Adding info for pci:0000:00:1b.0
[    2.161070] PM: Adding info for pci:0000:00:1c.0
[    2.362070] PM: Adding info for pci:0000:00:1c.1
[    2.563070] PM: Adding info for pci:0000:00:1c.2
[    2.764070] PM: Adding info for pci:0000:00:1d.0
[    2.965073] PM: Adding info for pci:0000:00:1d.1
[    3.166071] PM: Adding info for pci:0000:00:1d.2
[    3.367071] PM: Adding info for pci:0000:00:1d.7
[    3.568071] PM: Adding info for pci:0000:00:1e.0
[    3.769074] PM: Adding info for pci:0000:00:1f.0
[    3.970072] PM: Adding info for pci:0000:00:1f.1
[    4.171072] PM: Adding info for pci:0000:00:1f.2
[    4.372073] PM: Adding info for pci:0000:00:1f.3
[    4.372091] PM: Adding info for pci:0000:02:00.0
Comment 7 Raffaello D. Di Napoli 2009-11-25 10:18:20 UTC
Yeah, I also pointed that out in the opening message. The thing is, these PM messages only appear in dmesg if I compile with verbose PM debugging, and even then, the screen will stay black on boot the whole duration of this series of messages.
The black screen is actually what prompted me to open a bug report, otherwise I would’ve thought it was just working the way it’s meant to.

To the assignee: Jesse, if you want me to perform any tests, or if you want the output of any more diagnostic programs, just ask.
If told how to, I can also debug the kernel (I’m a regular programmer from i686 asm to Python, but no “serious OS” kernel stuff yet).
Comment 8 Len Brown 2009-11-25 22:33:00 UTC
Please attach the dmesg resulting from the "acpi=off" boot test.

please attach the output from lspci -vv

> video=intelfb:mode=1280x800

I don't know if the intelfb driver is related to your blank screen,
but as you've got the i915 driver built into the kernel, i would think
that you would not need/use the intelfb driver.

for comparison, here is the analogous output on my ich9 based system,
which is apparently running 100x faster...

[    0.220492] PM: Adding info for pci:0000:00:00.0
[    0.222289] PM: Adding info for pci:0000:00:01.0
[    0.224084] PM: Adding info for pci:0000:00:02.0
[    0.225887] PM: Adding info for pci:0000:00:19.0
[    0.227894] PM: Adding info for pci:0000:00:1a.0
[    0.229472] PM: Adding info for pci:0000:00:1a.1
[    0.231259] PM: Adding info for pci:0000:00:1a.2
[    0.233048] PM: Adding info for pci:0000:00:1a.7
[    0.234841] PM: Adding info for pci:0000:00:1b.0
[    0.236942] PM: Adding info for pci:0000:00:1c.0
[    0.238419] PM: Adding info for pci:0000:00:1c.1
[    0.240204] PM: Adding info for pci:0000:00:1c.3
[    0.241993] PM: Adding info for pci:0000:00:1d.0
[    0.243790] PM: Adding info for pci:0000:00:1d.1
[    0.245576] PM: Adding info for pci:0000:00:1d.2
[    0.247372] PM: Adding info for pci:0000:00:1d.7
[    0.249159] PM: Adding info for pci:0000:00:1e.0
[    0.250957] PM: Adding info for pci:0000:00:1f.0
[    0.252822] PM: Adding info for pci:0000:00:1f.2
[    0.254533] PM: Adding info for pci:0000:00:1f.3
[    0.256092] PM: Adding info for pci:0000:01:00.0
Comment 9 Len Brown 2009-11-25 23:01:36 UTC
do you get similar timing when booting on AC vs battery?
Comment 10 Raffaello D. Di Napoli 2009-11-26 01:09:24 UTC
Created attachment 23941 [details]
lspci -vv output
Comment 11 Raffaello D. Di Napoli 2009-11-26 01:53:51 UTC
Created attachment 23943 [details]
Customized .config used (updated)

Updated to reflect a few very minor changes I did today (should be irrelevant to the bug report). Sorry, I wasn’t thinking I should freeze my kernel for the bug report, I promise I’ll stop fidgeting with make menuconfig :)
Comment 12 Raffaello D. Di Napoli 2009-11-26 01:55:59 UTC
Created attachment 23944 [details]
dmesg (intelfb, acpi=on)

I redid this with the rebuilt kernel, so to allow for comparison with the other dmesg’s that follow.
Comment 13 Raffaello D. Di Napoli 2009-11-26 01:56:45 UTC
Created attachment 23945 [details]
dmesg (intelfb, acpi=off)
Comment 14 Raffaello D. Di Napoli 2009-11-26 02:01:23 UTC
Created attachment 23946 [details]
dmesg (intelfb, acpi=off, irqpoll)

I run this additional test because the kernel told me so near the stackdump in the acpi=off run.

I don’t really know what the irqpoll flag does (other than what I can infer from its name), but it completely killed disk performance. Just so you know, in case that’s not normal. It seemed like it was reading one sector at a time.
Comment 15 Raffaello D. Di Napoli 2009-11-26 02:06:53 UTC
Created attachment 23947 [details]
dmesg (vga, acpi=off)

Stackdump occurred here, too. I couldn’t bother trying with irqpoll again, though, as the vga vs. intelfb seemed to make no difference performance-wise.
Comment 16 Raffaello D. Di Napoli 2009-11-26 02:07:55 UTC
Created attachment 23948 [details]
dmesg (vga, acpi=on)
Comment 17 Raffaello D. Di Napoli 2009-11-26 02:13:01 UTC
Created attachment 23949 [details]
dmesg (intelfb, acpi=on, -AC +battery)

I might be wrong, but it seems like it made no difference whatsoever.
In case you’re wondering what I believe you can see in the dmesg logs, I normally run without the battery, so to preserve it.
Comment 18 Raffaello D. Di Napoli 2009-11-26 02:45:24 UTC
The black screen is indeed related to intelfb; without it, I see the messages before the slow series of instructions (nothing seems to happen during them, tough).
Comment 19 Len Brown 2009-11-26 05:24:26 UTC
> The black screen is indeed related to intelfb

lets ignore intelfb's black screen issue.
intelfb simply is not interesting.

> irqpoll

lets ignore the acpi=off interrupt issue too --
the test achieved its purpose.

> dmesg (intelfb, acpi=on)
> dmesg (intelfb, acpi=off)

The dmesg in comment #12 and comment #13 clearly show that the
"PM: Adding info for pci" slow-down happens only when ACPI is enabled.

acpi=off:

[    0.019787] PM: Adding info for pci:0000:00:00.0
[    0.019827] PM: Adding info for pci:0000:00:02.0
[    0.019864] PM: Adding info for pci:0000:00:02.1
[    0.019901] PM: Adding info for pci:0000:00:1a.0
[    0.019937] PM: Adding info for pci:0000:00:1a.1
[    0.019973] PM: Adding info for pci:0000:00:1a.7
[    0.020022] PM: Adding info for pci:0000:00:1b.0
[    0.020061] PM: Adding info for pci:0000:00:1c.0
[    0.020098] PM: Adding info for pci:0000:00:1c.1
[    0.020134] PM: Adding info for pci:0000:00:1c.2
[    0.020170] PM: Adding info for pci:0000:00:1d.0
[    0.020209] PM: Adding info for pci:0000:00:1d.1
[    0.020246] PM: Adding info for pci:0000:00:1d.2
[    0.020283] PM: Adding info for pci:0000:00:1d.7
[    0.020319] PM: Adding info for pci:0000:00:1e.0
[    0.020359] PM: Adding info for pci:0000:00:1f.0
[    0.020396] PM: Adding info for pci:0000:00:1f.1
[    0.020433] PM: Adding info for pci:0000:00:1f.2
[    0.020470] PM: Adding info for pci:0000:00:1f.3
[    0.020509] PM: Adding info for pci:0000:02:00.0


acpi enabled:

[    0.754070] PM: Adding info for pci:0000:00:00.0
[    0.955069] PM: Adding info for pci:0000:00:02.0
[    1.156067] PM: Adding info for pci:0000:00:02.1
[    1.357070] PM: Adding info for pci:0000:00:1a.0
[    1.558072] PM: Adding info for pci:0000:00:1a.1
[    1.759070] PM: Adding info for pci:0000:00:1a.7
[    1.960070] PM: Adding info for pci:0000:00:1b.0
[    2.161070] PM: Adding info for pci:0000:00:1c.0
[    2.362071] PM: Adding info for pci:0000:00:1c.1
[    2.563071] PM: Adding info for pci:0000:00:1c.2
[    2.764071] PM: Adding info for pci:0000:00:1d.0
[    2.965075] PM: Adding info for pci:0000:00:1d.1
[    3.166072] PM: Adding info for pci:0000:00:1d.2
[    3.367072] PM: Adding info for pci:0000:00:1d.7
[    3.568072] PM: Adding info for pci:0000:00:1e.0
[    3.769075] PM: Adding info for pci:0000:00:1f.0
[    3.970072] PM: Adding info for pci:0000:00:1f.1
[    4.171073] PM: Adding info for pci:0000:00:1f.2
[    4.372073] PM: Adding info for pci:0000:00:1f.3
[    4.372091] PM: Adding info for pci:0000:02:00.0
Comment 20 Raffaello D. Di Napoli 2009-11-26 10:04:34 UTC
> lets ignore intelfb's black screen issue.
> intelfb simply is not interesting.

Plus, it’s only an issue because something *slow* happens while the screen is black. If the duration of the black out was the same as in acpi=off (excluding the interrupt issue), the screen would stay black for 0.001 seconds instead of 3.6, that’s quite different.


> lets ignore the acpi=off interrupt issue too --
> the test achieved its purpose.
>
> The dmesg in comment #12 and comment #13 clearly show that the
> "PM: Adding info for pci" slow-down happens only when ACPI is enabled.

Yeah, that’s what I meant in comment #3. The only thing is, when I originally tried acpi=off, I was misled in judgment by the interrupt-related slowdown (more than 2 seconds), so I failed to notice the 3.6 to 0.001 improvement.

Please tell me if you need me to do anything more.
Comment 21 Raffaello D. Di Napoli 2009-12-13 14:30:07 UTC
Err, I noticed this bug seems to have been forgotten, possibly due to being stuck to NEEDINFO, that’s no longer the case (it’s only been NEEDINFO for real for a couple days overall).

Could anyone please mark it as OPEN or something similar?
The only option I could set it to that seems fit is VERIFIED, but I don’t know its exact meaning and don’t want to cause confusion.
Comment 22 Zhang Rui 2010-01-13 06:59:37 UTC
does the problem still exists in the latest kernel, say 2.6.32?

If yes, please attach the dmesg output with boot option "initcall_debug".

>Please notice I don’t experience this delay with any other Linux system,
>either real (my old Athlon XP on nForce2 400 Ultra and my gf’s Turion 64
>notebook) or virtualized.

but the problem always exists on this laptop, I mean all the Linux kernels you've tries do have the same delay problem, right?
Comment 23 Raffaello D. Di Napoli 2010-01-13 17:32:58 UTC
Created attachment 24540 [details]
dmesg 2.6.32.3 (acpi=on)

First run with the 2.6.32.3 kernel, for comparison results. No improvements.
Comment 24 Raffaello D. Di Napoli 2010-01-13 17:38:08 UTC
Created attachment 24541 [details]
dmesg 2.6.32.3 (acpi=on, initcall_debug)

This is the required test run with the 2.6.32.3 kernel.
I believe the relevant lines are:

[    0.740064] PM: Adding info for pci:0000:00:00.0
...
[    4.340085] PM: Adding info for No Bus:0000:0f
[    4.340085] initcall acpi_pci_root_init+0x0/0x25 returned 0 after 3716799 usecs
[    4.340085] calling  acpi_pci_link_init+0x0/0x3f @ 1

So, almost four seconds are spent in acpi_pci_root_init. I hope this is getting where you expected...
Comment 25 Raffaello D. Di Napoli 2010-01-13 17:44:37 UTC
(In reply to comment #22)
> does the problem still exists in the latest kernel, say 2.6.32?

As it’s evident from the two dmesgs I just attached, yes.


> If yes, please attach the dmesg output with boot option "initcall_debug".

Done. Hope that helps.


> but the problem always exists on this laptop, I mean all the Linux kernels
> you've tries do have the same delay problem, right?

So far, yes. As I said in the opening post, when I started to assume it was really a bug (by comparison with similar or slower machines), I did try a few older and newer kernels, from 2.6.29.6 to 2.6.31.6, with no change in behavior.
Comment 26 Zhang Rui 2010-01-14 07:11:06 UTC
Created attachment 24547 [details]
debug patch

please attach the dmesg output after applying this patch.
Comment 27 Raffaello D. Di Napoli 2010-01-14 17:13:19 UTC
Created attachment 24561 [details]
dmesg 2.6.32.3 (acpi=on, initcall_debug, with proposed patch #24547)

Done, here’s the relevant part with your additional printks:

[    0.540316] Rui: acpi_pci_root_add->acpi_pci_bridge_scan
[    0.540321] Rui: acpi_pci_root_add done
[    0.540325] Rui: in acpi_pci_root_start, device PNP0A08:00
[    0.740064] PM: Adding info for pci:0000:00:00.0
[    0.740087] Rui: pci_bus_add_device 0000:00:00.0
[    0.940066] PM: Adding info for pci:0000:00:02.0
[    0.940087] Rui: pci_bus_add_device 0000:00:02.0
[    1.140062] PM: Adding info for pci:0000:00:02.1
[    1.140083] Rui: pci_bus_add_device 0000:00:02.1
[    1.340064] PM: Adding info for pci:0000:00:1a.0
[    1.340084] Rui: pci_bus_add_device 0000:00:1a.0
[    1.540064] PM: Adding info for pci:0000:00:1a.1
[    1.540084] Rui: pci_bus_add_device 0000:00:1a.1
[    1.740064] PM: Adding info for pci:0000:00:1a.7
[    1.740084] Rui: pci_bus_add_device 0000:00:1a.7
[    1.940064] PM: Adding info for pci:0000:00:1b.0
[    1.940085] Rui: pci_bus_add_device 0000:00:1b.0
[    2.140065] PM: Adding info for pci:0000:00:1c.0
[    2.140085] Rui: pci_bus_add_device 0000:00:1c.0
[    2.340067] PM: Adding info for pci:0000:00:1c.1
[    2.340087] Rui: pci_bus_add_device 0000:00:1c.1
[    2.540065] PM: Adding info for pci:0000:00:1c.2
[    2.540085] Rui: pci_bus_add_device 0000:00:1c.2
[    2.740065] PM: Adding info for pci:0000:00:1d.0
[    2.740085] Rui: pci_bus_add_device 0000:00:1d.0
[    2.940065] PM: Adding info for pci:0000:00:1d.1
[    2.940085] Rui: pci_bus_add_device 0000:00:1d.1
[    3.140068] PM: Adding info for pci:0000:00:1d.2
[    3.140088] Rui: pci_bus_add_device 0000:00:1d.2
[    3.340066] PM: Adding info for pci:0000:00:1d.7
[    3.340086] Rui: pci_bus_add_device 0000:00:1d.7
[    3.540066] PM: Adding info for pci:0000:00:1e.0
[    3.540086] Rui: pci_bus_add_device 0000:00:1e.0
[    3.740067] PM: Adding info for pci:0000:00:1f.0
[    3.740087] Rui: pci_bus_add_device 0000:00:1f.0
[    3.940068] PM: Adding info for pci:0000:00:1f.1
[    3.940089] Rui: pci_bus_add_device 0000:00:1f.1
[    4.140067] PM: Adding info for pci:0000:00:1f.2
[    4.140087] Rui: pci_bus_add_device 0000:00:1f.2
[    4.340067] PM: Adding info for pci:0000:00:1f.3
[    4.340087] Rui: pci_bus_add_device 0000:00:1f.3
[    4.340087] Rui: scan subordinate bus
[    4.340087] PM: Adding info for pci:0000:02:00.0
[    4.340087] Rui: pci_bus_add_device 0000:02:00.0
[    4.340087] Rui: scan subordinate bus
[    4.340087] PM: Adding info for No Bus:0000:02
[    4.340087] PM: Adding info for pci:0000:04:00.0
[    4.340087] Rui: pci_bus_add_device 0000:04:00.0
[    4.340087] Rui: scan subordinate bus
[    4.340087] PM: Adding info for No Bus:0000:04
[    4.340087] Rui: scan subordinate bus
[    4.340087] PM: Adding info for No Bus:0000:06
[    4.340087] PM: Adding info for pci:0000:0f:06.0
[    4.340087] Rui: pci_bus_add_device 0000:0f:06.0
[    4.340087] PM: Adding info for pci:0000:0f:06.1
[    4.340087] Rui: pci_bus_add_device 0000:0f:06.1
[    4.340087] PM: Adding info for pci:0000:0f:06.2
[    4.340087] Rui: pci_bus_add_device 0000:0f:06.2
[    4.340087] PM: Adding info for pci:0000:0f:06.3
[    4.340087] Rui: pci_bus_add_device 0000:0f:06.3
[    4.340087] Rui: scan subordinate bus
[    4.340087] Rui: scan subordinate bus
[    4.340087] PM: Adding info for No Bus:0000:10
[    4.340087] PM: Adding info for No Bus:0000:0f
[    4.340087] Rui: acpi_pci_root_start done
[    4.340087] initcall acpi_pci_root_init+0x0/0x25 returned 0 after 3716799 usecs
[    4.340087] calling  acpi_pci_link_init+0x0/0x3f @ 1


Thank you for your interest in this issue.
Comment 28 Zhang Rui 2010-01-26 07:16:04 UTC
Created attachment 24712 [details]
debug patch 2

This is really weird.
I don't see any code that may cause this 200ms delay.
please remove the first debug patch and apply this one, and attach the dmesg output after boot.
Comment 29 Raffaello D. Di Napoli 2010-01-26 10:30:46 UTC
Created attachment 24716 [details]
dmesg 2.6.32.3 (acpi=on, initcall_debug, with proposed patch #24712)

Done. I had a hunk rejected, weird. This is the diff, after I hand-applied the hunk:
_____
--- drivers/base/core.c.wo-hunk3	2010-01-26 10:22:40.283310267 +0100
+++ drivers/base/core.c	2010-01-26 10:29:25.868306904 +0100
@@ -907,8 +907,11 @@
 		dev->init_name = NULL;
 	}
 
-	if (!dev_name(dev))
+	printk("Rui: in device_add2\n");
+	if (!dev_name(dev)) {
+		error = -EINVAL;
 		goto name_error;
+	}
 
 	pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
 
_____

Anyway, back to the .2 seconds delay, I attached the dmesg. Which, by the way, required me to enlarge the kernel log size :)
Now I hope it’s more clear to you what happens. This is the spot:

[    0.549335] Rui: in device_add4
[    0.749050] Rui: in device_add5

Same delay for every occurrence of device_add4 to device_add5.
Comment 30 Raffaello D. Di Napoli 2010-01-26 11:26:36 UTC
Created attachment 24718 [details]
 dmesg 2.6.32.3 (acpi=on, initcall_debug, acpi_platform_notify debugging)

Okay, I took the freedom to do myself what I guess would’ve been your next proposal: I looked at core.c, and found the only line between printk 3 and 4 was a call to platform_notify, which I gathered in my case could only be a pointer to acpi_platform_notify; so I undid your last patch, and added a few printk’s to drivers/acpi/glue.c, here:

static int acpi_platform_notify(struct device *dev)
{
	struct acpi_bus_type *type;
	acpi_handle handle;
	int ret = -EINVAL;

	printk("RDDN: in acpi_platform_notify\n");
	if (!dev->bus || !dev->parent) {
		/* bridge devices genernally haven't bus or parent */
		ret = acpi_find_bridge_device(dev, &handle);
		printk("RDDN: in acpi_platform_notify: acpi_find_bridge_device\n");
		goto end;
	}
	type = acpi_get_bus_type(dev->bus);
	printk("RDDN: in acpi_platform_notify: acpi_get_bus_type\n");
	if (!type) {
		DBG("No ACPI bus support for %s\n", dev_name(dev));
		ret = -EINVAL;
		goto end;
	}
	if ((ret = type->find_device(dev, &handle)) != 0)
		DBG("Can't get handler for %s\n", dev_name(dev));
	printk("RDDN: in acpi_platform_notify: type->find_device\n");
      end:
	if (!ret) {
		acpi_bind_one(dev, handle);
		printk("RDDN: in acpi_platform_notify: acpi_bind_one\n");
	}



Attached is the resulting dmesg, and this is the relevant excerpt:

[    0.743095] RDDN: in acpi_platform_notify
[    0.743095] RDDN: in acpi_platform_notify: acpi_get_bus_type
[    0.943048] RDDN: in acpi_platform_notify: type->find_device
[    0.943055] RDDN: in acpi_platform_notify: acpi_bind_one

Does this help?
Comment 31 Zhang Rui 2010-06-29 05:29:49 UTC
hmm, does the problem still exist in the latest upstream kernel?
Comment 32 Raffaello D. Di Napoli 2010-07-03 18:57:07 UTC
Created attachment 27009 [details]
dmesg 2.6.34 (acpi=on, initcall_debug)

I take it by “latest” you mean 2.6.34, please tell me otherwise.

Yes, the problem is still there, and it looks exactly the same.

I couldn’t bother to inject additional traces like I did in acpi_platform_notify for 2.6.32.3, but I will if you think they might be helpful.

Thank you for being still interested in this bug.
Comment 33 Santi 2011-02-15 23:53:45 UTC
Has there been any progress on this bug.

Disabling ACPI is not an option as performance of laptop is adversely affected.
Comment 34 Raffaello D. Di Napoli 2011-02-16 00:54:12 UTC
(In reply to comment #33)
> Has there been any progress on this bug.

None so far. Currently using 2.6.36 with Gentoo patchset 5, still same delay, same messages, although I haven’t checked with initcall_debug lately. But I can’t see why anything should’ve changed, if the delay is still there.


> Disabling ACPI is not an option as performance of laptop is adversely
> affected.

Yeah, I never considered that option. I simply put up with having the kernel take over 9 seconds to get to init, and curse Acer for their buggy BIOS (I assume that’s where the problem is).


This bug shouldn’t really be NEEDINFO, since there’s no information I haven’t provided, but I don’t know what to change it to. VERIFIED sound the fittest, but I think that’d imply that you devs actually managed to reproduce it, which isn’t the case as far as I know.

Thanks for the bump.
Comment 35 Zhang Rui 2012-01-18 01:53:11 UTC
It's great that kernel bugzilla is back.

can you please verify if the problem still exists in the latest upstream
kernel?
Comment 36 Raffaello D. Di Napoli 2012-01-18 04:27:40 UTC
In all these months, the computer has run 2.6.36, 37, 38, 39, 3.0.6 (now), and the issue is still there.
I will try the latest upstream as I understand it’s the procedure, but I don’t really expect any difference at all.
Comment 37 Raffaello D. Di Napoli 2012-01-24 01:44:38 UTC
I tried 3.1.6, still the same behavior (no change at all).

Upon suggestion arrived in form of a private e-mail, I tried a change in drivers/acpi/acpica/nsxfname.c, to skip execution of the _STA method on all devices:

--- nsxfname.c.orig	2011-10-24 03:10:05.000000000 -0400
+++ nsxfname.c	2012-01-23 00:17:55.953371956 -0500
@@ -376,7 +376,11 @@
 
 		/* Execute the Device._STA method */
 
+#ifdef CONFIG_ACPICA_DEVSTA_OK
+		status = AE_OK;
+#else
 		status = acpi_ut_execute_STA(node, &info->current_status);
+#endif
 		if (ACPI_SUCCESS(status)) {
 			valid |= ACPI_VALID_STA;
 		}

(I also created a config option to enable/disable it as necessary.)

This effectively removed the 4s delay upon boot. Now, that’s interesting.

So I investigated on this mysterious (to me) _STA method.
I undid the patch above, and instead added a printk BEFORE the execution of _STA, that also printed the device name (I’ll omit the patch for brevity, it’s trivial). This revealed a real surprise:

...
[    0.684241] _STA on P_D0
[    0.684354] _STA on P_D1
[    0.884069] _STA on SATA
[    1.084159] _STA on PRT0
[    1.084296] _STA on PRT1
[    1.084408] _STA on PRT2
...
[    1.093202] _STA on LPCB
[    1.093286] _STA on PATA
[    1.093368] _STA on SATA
[    1.292136] _STA on SBUS
[    1.292249] _STA on MCHC
[    1.292335] _STA on PDRC
...
[    1.293885] _STA on LPCB
[    1.293969] _STA on PATA
[    1.294051] _STA on SATA
[    1.492149] _STA on SBUS
[    1.492262] _STA on MCHC
[    1.492347] _STA on PDRC
...

... and so on. See the pattern: every call to _STA on SATA, causes a delay of approx. 200ms. Does this number sound familiar? :)

So I went further, and decompiled my BIOS’s DSDT. This revealed something I really did NOT expect:

...
Device (SATA)
{
...
    Method (_STA, 0, NotSerialized)
    {
        Sleep (0xC8)
        Return (0x0F)
    }
...

Do I need to clarify that 0xc8 is 200? :)

Now, I think my only real option (avoiding the hack above) is to modify and recompile my DSDT, eliminating the silly Sleep call.
Can anybody come up with a reason why there should be a sleep in _STA? Synchronization issues?
Just to be safe, I’ll also check Acer’s website to see if they mention ever adding such thing in the BIOS revision log.

I can attach the whole DSDT if you want, or the dmesg with the revealing printk if you want. Or just ask me anything, I’ll be glad to experiment more.
Comment 38 Zhang Rui 2012-02-02 06:25:46 UTC
wonderful findings.
I didn't realize that the delay is caused by the AML code.
It would be great if you can attach the full acpidump output.
Comment 39 Zhang Rui 2012-05-24 07:23:57 UTC
bug closed as there is no response from the bug reporter.
please feel free to reopen it if you can provide the info request in comment #38.

Note You need to log in before you can comment on or make changes to this bug.