Bug 85311
Summary: | PCI dmesg spam when opening lid | ||
---|---|---|---|
Product: | Drivers | Reporter: | Pali Rohár (pali) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | alan, alexdeucher, bjorn, lbdkmjdf, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.17-rc6 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
dmesg log
full dmesg 4.0-rc6 Remove res_to_dev_res() debug message lspci -v -v |
Does radeon.runpm=0 "fix" this? I think the kernel is trying to access the device while it is powered off. Seems like a possible bug in the kernel runtime pm code. Yes, with radeon.runpm=0 radeon lines are not written anymore. But acpi and i915 are still present. When the radeon device is powered down, all the PCI registers and BARs read back as ones. which is why you are seeing the radeon errors. Apparently the pci runtime pm code does not properly ask the driver to power the card up before accessing those registers. Unfortunately, I'm not that familiar with the Linux runtime pm code. I have no idea about the i915 and apci messages. Ok, so to who or where to report this problem? Moving to PCI Ok, do you need some more logs or some other tests? Pali, would you mind attaching a complete dmesg log and "lspci -vv" output, please? @Bjorn Helgaas: Now I'm testing 4.0-rc6 kernel version and I do not see those "radeon 0000:01:00.0: Max Payload Size 16384" messages anymore when opening LID. I still see lines from i915, but it is irrelevant to radeon. So looks like problem with radeon is fixed... Anyway, I'm still getting lot of pci messages in dmesg after opening LID. And I think this is problem/regression (those are not generated with 3.13 kernels). I'm attaching full dmesg log as you wanted. LID was opened at line [ 62.015885]. Created attachment 173171 [details]
full dmesg 4.0-rc6
Any updates on this? I already attached full dmesg log. What else is needed? Maybe changing NEEDINFO status? Is anything being done about these spammy messages? I'm running kernel 4.6, and I still get them (same as I wrote here: https://bugzilla.kernel.org/show_bug.cgi?id=79701#c55 ), and not only when opening the LID of my laptop, but all the time. For example, when laptop-mode reloads, when adjusting the brightness of the screen, and so on. My laptop is a HP dv6-6174la, an AMD APU+dGPU system. Not sure if it helps, but even Linus found a similar problem: https://lists.freedesktop.org/archives/intel-gfx/2015-April/065146.html Hi Pali, Jose, can you please test with a recent kernel? 63e22924f5dc ("PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED") appeared in v4.6-rc1, and I think it should have fixed the "BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment" messages. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=63e22924f5dc Hi! Now I tested 4.9 kernel. When opening LID I get these lines in dmesg: [ 521.002853] pci_bus 0000:02: Allocating resources [ 521.002879] pcieport 0000:00:1c.0: bridge window [io 0x1000-0x0fff] to [bus 02] add_size 1000 [ 521.002886] pcieport 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 02] add_size 200000 add_align 100000 [ 521.002890] pcieport 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff] to [bus 02] add_size 200000 add_align 100000 [ 521.002904] pci_bus 0000:03: Allocating resources [ 521.002930] pcieport 0000:00:1c.2: bridge window [io 0x1000-0x0fff] to [bus 03] add_size 1000 [ 521.002934] pcieport 0000:00:1c.2: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 03] add_size 200000 add_align 100000 [ 521.002948] pci_bus 0000:04: Allocating resources [ 521.002976] pci_bus 0000:05: Allocating resources [ 521.003004] pci_bus 0000:06: Allocating resources [ 521.003095] pcieport 0000:00:1c.7: bridge window [io 0x1000-0x0fff] to [bus 06] add_size 1000 [ 521.003099] pcieport 0000:00:1c.7: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 06] add_size 200000 add_align 100000 [ 521.003114] pcieport 0000:00:1c.0: res[14]=[mem 0x00100000-0x000fffff] res_to_dev_res add_size 200000 min_align 100000 [ 521.003118] pcieport 0000:00:1c.0: res[14]=[mem 0x00100000-0x002fffff] res_to_dev_res add_size 200000 min_align 100000 [ 521.003121] pcieport 0000:00:1c.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000 [ 521.003125] pcieport 0000:00:1c.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000 [ 521.003128] pcieport 0000:00:1c.2: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000 [ 521.003131] pcieport 0000:00:1c.2: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000 [ 521.003135] pcieport 0000:00:1c.7: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000 [ 521.003138] pcieport 0000:00:1c.7: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000 [ 521.003142] pcieport 0000:00:1c.0: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000 [ 521.003145] pcieport 0000:00:1c.0: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000 [ 521.003148] pcieport 0000:00:1c.2: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000 [ 521.003151] pcieport 0000:00:1c.2: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000 [ 521.003154] pcieport 0000:00:1c.7: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000 [ 521.003157] pcieport 0000:00:1c.7: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000 [ 521.003169] pcieport 0000:00:1c.0: BAR 14: assigned [mem 0xcf200000-0xcf3fffff] [ 521.003186] pcieport 0000:00:1c.0: BAR 15: assigned [mem 0xcf400000-0xcf5fffff 64bit pref] [ 521.003199] pcieport 0000:00:1c.2: BAR 15: assigned [mem 0xcf600000-0xcf7fffff 64bit pref] [ 521.003210] pcieport 0000:00:1c.7: BAR 15: assigned [mem 0xcf800000-0xcf9fffff 64bit pref] [ 521.003216] pcieport 0000:00:1c.0: BAR 13: assigned [io 0x2000-0x2fff] [ 521.003221] pcieport 0000:00:1c.2: BAR 13: assigned [io 0x3000-0x3fff] [ 521.003225] pcieport 0000:00:1c.7: BAR 13: assigned [io 0x4000-0x4fff] [ 521.004497] acpi device:41: Cannot transition to power state D3hot for parent in (unknown) Info about that device:41: $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/adr 0x00000001 $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/path \_SB_.PCI0.SDHC.WI01 $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/power_state D0 $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/device:41/status 15 And about parent of device:41: $ cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/INT33C6:00/status 0 Looks like parent is disabled, but device:41 is not disabled? It smells... Created attachment 249131 [details]
Remove res_to_dev_res() debug message
Thanks, Pali! I think the main point of this bug is extra messages when opening the lid. I'm going to focus on those for now, and leave the device:41 power state question for later (we should probably open a different bugzilla just for that).
This patch should remove some of the messages. I'll have to look closer at the rest. Would you mind attaching the complete "lspci -vv" output (as root) and the complete dmesg log with this patch applied? It's based on v4.10-rc1, but it should apply to v4.9 as well.
Created attachment 249661 [details]
lspci -v -v
lspci output is attached, hope it helps.
I will try to patch and compile new kernel later.
I'll include the comment #17 patch in v4.11. If there's still too much dmesg spam, please attach a dmesg and we can work on that. I think this was resolved by 25e77388e1ab ("PCI: Remove res_to_dev_res() debug message"), which appeared in v4.11. I'm going to close this, but if there's still too much spam, please open a new bugzilla and attach a recent dmesg log. |
Created attachment 152131 [details] dmesg log Every time when I open LID on my notebook I see tons of new error lines in dmesg with kernel 3.17-rc6. And more lines come from radeon card (dmesg log is attached). I do not see these log lines with 3.13 kernel, so this is regression.