Bug 85311

Summary: PCI dmesg spam when opening lid
Product: Drivers Reporter: Pali Rohár (pali)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: RESOLVED CODE_FIX    
Severity: normal CC: alan, alexdeucher, bjorn, lbdkmjdf, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.17-rc6 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: dmesg log
full dmesg 4.0-rc6
Remove res_to_dev_res() debug message
lspci -v -v

Description Pali Rohár 2014-09-30 10:25:06 UTC
Created attachment 152131 [details]
dmesg log

Every time when I open LID on my notebook I see tons of new error lines in dmesg with kernel 3.17-rc6. And more lines come from radeon card (dmesg log is attached). I do not see these log lines with 3.13 kernel, so this is regression.
Comment 1 Alex Deucher 2014-09-30 13:33:19 UTC
Does radeon.runpm=0 "fix" this?  I think the kernel is trying to access the device while it is powered off.  Seems like a possible bug in the kernel runtime pm code.
Comment 2 Pali Rohár 2014-10-01 16:11:54 UTC
Yes, with radeon.runpm=0 radeon lines are not written anymore. But acpi and i915 are still present.
Comment 3 Alex Deucher 2014-10-01 16:22:19 UTC
When the radeon device is powered down, all the PCI registers and BARs read back as ones. which is why you are seeing the radeon errors.  Apparently the pci runtime pm code does not properly ask the driver to power the card up before accessing those registers.  Unfortunately, I'm not that familiar with the Linux runtime pm code.  I have no idea about the i915 and apci messages.
Comment 4 Pali Rohár 2014-10-19 15:50:52 UTC
Ok, so to who or where to report this problem?
Comment 5 Alan 2014-10-23 15:55:49 UTC
Moving to PCI
Comment 6 Pali Rohár 2014-10-28 12:37:34 UTC
Ok, do you need some more logs or some other tests?
Comment 7 Bjorn Helgaas 2015-03-09 21:55:18 UTC
Pali, would you mind attaching a complete dmesg log and "lspci -vv" output, please?
Comment 8 Pali Rohár 2015-04-05 21:45:28 UTC
@Bjorn Helgaas: Now I'm testing 4.0-rc6 kernel version and I do not see those "radeon 0000:01:00.0: Max Payload Size 16384" messages anymore when opening LID. I still see lines from i915, but it is irrelevant to radeon. So looks like problem with radeon is fixed...
Comment 9 Pali Rohár 2015-04-06 10:06:08 UTC
Anyway, I'm still getting lot of pci messages in dmesg after opening LID. And I think this is problem/regression (those are not generated with 3.13 kernels).

I'm attaching full dmesg log as you wanted. LID was opened at line [   62.015885].
Comment 10 Pali Rohár 2015-04-06 10:07:14 UTC
Created attachment 173171 [details]
full dmesg 4.0-rc6
Comment 11 Jose P. 2015-07-24 04:52:13 UTC
Any updates on this?
Comment 12 Pali Rohár 2015-07-29 09:18:59 UTC
I already attached full dmesg log. What else is needed? Maybe changing NEEDINFO status?
Comment 13 Jose P. 2016-05-30 21:00:08 UTC
Is anything being done about these spammy messages?

I'm running kernel 4.6, and I still get them (same as I wrote here: https://bugzilla.kernel.org/show_bug.cgi?id=79701#c55 ), and not only when opening the LID of my laptop, but all the time. For example, when laptop-mode reloads, when adjusting the brightness of the screen, and so on.

My laptop is a HP dv6-6174la, an AMD APU+dGPU system.

Not sure if it helps, but even Linus found a similar problem:  https://lists.freedesktop.org/archives/intel-gfx/2015-April/065146.html
Comment 14 Bjorn Helgaas 2016-12-28 23:39:34 UTC
Hi Pali, Jose, can you please test with a recent kernel?

63e22924f5dc ("PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED") appeared in v4.6-rc1, and I think it should have fixed the "BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment" messages.

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=63e22924f5dc
Comment 15 Pali Rohár 2016-12-29 08:39:58 UTC
Hi! Now I tested 4.9 kernel. When opening LID I get these lines in dmesg:

[  521.002853] pci_bus 0000:02: Allocating resources
[  521.002879] pcieport 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus 02] add_size 1000
[  521.002886] pcieport 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 02] add_size 200000 add_align 100000
[  521.002890] pcieport 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff] to [bus 02] add_size 200000 add_align 100000
[  521.002904] pci_bus 0000:03: Allocating resources
[  521.002930] pcieport 0000:00:1c.2: bridge window [io  0x1000-0x0fff] to [bus 03] add_size 1000
[  521.002934] pcieport 0000:00:1c.2: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 03] add_size 200000 add_align 100000
[  521.002948] pci_bus 0000:04: Allocating resources
[  521.002976] pci_bus 0000:05: Allocating resources
[  521.003004] pci_bus 0000:06: Allocating resources
[  521.003095] pcieport 0000:00:1c.7: bridge window [io  0x1000-0x0fff] to [bus 06] add_size 1000
[  521.003099] pcieport 0000:00:1c.7: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 06] add_size 200000 add_align 100000
[  521.003114] pcieport 0000:00:1c.0: res[14]=[mem 0x00100000-0x000fffff] res_to_dev_res add_size 200000 min_align 100000
[  521.003118] pcieport 0000:00:1c.0: res[14]=[mem 0x00100000-0x002fffff] res_to_dev_res add_size 200000 min_align 100000
[  521.003121] pcieport 0000:00:1c.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
[  521.003125] pcieport 0000:00:1c.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
[  521.003128] pcieport 0000:00:1c.2: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
[  521.003131] pcieport 0000:00:1c.2: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
[  521.003135] pcieport 0000:00:1c.7: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
[  521.003138] pcieport 0000:00:1c.7: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
[  521.003142] pcieport 0000:00:1c.0: res[13]=[io  0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
[  521.003145] pcieport 0000:00:1c.0: res[13]=[io  0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
[  521.003148] pcieport 0000:00:1c.2: res[13]=[io  0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
[  521.003151] pcieport 0000:00:1c.2: res[13]=[io  0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
[  521.003154] pcieport 0000:00:1c.7: res[13]=[io  0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
[  521.003157] pcieport 0000:00:1c.7: res[13]=[io  0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
[  521.003169] pcieport 0000:00:1c.0: BAR 14: assigned [mem 0xcf200000-0xcf3fffff]
[  521.003186] pcieport 0000:00:1c.0: BAR 15: assigned [mem 0xcf400000-0xcf5fffff 64bit pref]
[  521.003199] pcieport 0000:00:1c.2: BAR 15: assigned [mem 0xcf600000-0xcf7fffff 64bit pref]
[  521.003210] pcieport 0000:00:1c.7: BAR 15: assigned [mem 0xcf800000-0xcf9fffff 64bit pref]
[  521.003216] pcieport 0000:00:1c.0: BAR 13: assigned [io  0x2000-0x2fff]
[  521.003221] pcieport 0000:00:1c.2: BAR 13: assigned [io  0x3000-0x3fff]
[  521.003225] pcieport 0000:00:1c.7: BAR 13: assigned [io  0x4000-0x4fff]
[  521.004497] acpi device:41: Cannot transition to power state D3hot for parent in (unknown)
Comment 16 Pali Rohár 2016-12-29 12:46:29 UTC
Info about that device:41:

$ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/adr
0x00000001
$ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/path
\_SB_.PCI0.SDHC.WI01
$ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/power_state
D0
$ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/status
15

And about parent of device:41:
$ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/status
0

Looks like parent is disabled, but device:41 is not disabled? It smells...
Comment 17 Bjorn Helgaas 2016-12-29 17:39:45 UTC
Created attachment 249131 [details]
Remove res_to_dev_res() debug message

Thanks, Pali!  I think the main point of this bug is extra messages when opening the lid.  I'm going to focus on those for now, and leave the device:41 power state question for later (we should probably open a different bugzilla just for that).

This patch should remove some of the messages.  I'll have to look closer at the rest.  Would you mind attaching the complete "lspci -vv" output (as root) and the complete dmesg log with this patch applied?  It's based on v4.10-rc1, but it should apply to v4.9 as well.
Comment 18 Pali Rohár 2017-01-02 18:11:49 UTC
Created attachment 249661 [details]
lspci -v -v

lspci output is attached, hope it helps.

I will try to patch and compile new kernel later.
Comment 19 Bjorn Helgaas 2017-01-25 22:19:49 UTC
I'll include the comment #17 patch in v4.11.  If there's still too much dmesg spam, please attach a dmesg and we can work on that.
Comment 20 Bjorn Helgaas 2019-04-04 21:55:47 UTC
I think this was resolved by 25e77388e1ab ("PCI: Remove res_to_dev_res() debug message"), which appeared in v4.11.

I'm going to close this, but if there's still too much spam, please open a new bugzilla and attach a recent dmesg log.