Bug 208597 - runtime pm broken on Lenovo P1G2 with Nouveau since "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-17 11:24 UTC by Karol Herbst
Modified: 2023-01-17 23:00 UTC

See Also:
Kernel Version: 5.7.8
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
lspci -vv (114.28 KB, text/plain)
2020-07-17 11:24 UTC, Karol Herbst
dmesg (103.06 KB, text/plain)
2020-07-17 11:25 UTC, Karol Herbst
Wait for maximum 1100ms if child device has PCI PM disabled (784 bytes, patch)
2020-07-21 14:25 UTC, Mika Westerberg

Description Karol Herbst 2020-07-17 11:24:19 UTC
Created attachment 290329 [details]
lspci -vv

With commit "afaff825e3a436f9d1e3986530133b1c91b54cd1" applied, runtime PM seems to be broken. Reverting it on top of the 5.7 branch fixes the issue. The problem seems to occur when the GPU is accessed after it has been runtime suspended.

git bisect log:

git bisect start
# bad: [a92b984a110863b42a3abf32e3f049b02b19e350] clk: samsung: exynos5433: Add IGNORE_UNUSED flag to sclk_i2s1
git bisect bad a92b984a110863b42a3abf32e3f049b02b19e350
# good: [4da858c086433cd012c0bb16b5921f6fafe3f803] Merge branch 'linux-5.7' of git://github.com/skeggsb/linux into drm-fixes
git bisect good 4da858c086433cd012c0bb16b5921f6fafe3f803
# good: [d5dfe4f1b44ed532653c2335267ad9599c8a698e] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good d5dfe4f1b44ed532653c2335267ad9599c8a698e
# good: [b24e451cfb8c33ef5b8b4a80e232706b089914fb] ipv6: fix IPV6_ADDRFORM operation logic
git bisect good b24e451cfb8c33ef5b8b4a80e232706b089914fb
# good: [d843ffbce812742986293f974d55ba404e91872f] nvmet: fix memory leak when removing namespaces and controllers concurrently
git bisect good d843ffbce812742986293f974d55ba404e91872f
# good: [be66f10a60e3ec0b589898f78a428bcb34095730] staging: wfx: fix output of rx_stats on big endian hosts
git bisect good be66f10a60e3ec0b589898f78a428bcb34095730
# good: [a4482984c41f5cc1d217aa189fe51bbbc0500f98] s390/qdio: consistently restore the IRQ handler
git bisect good a4482984c41f5cc1d217aa189fe51bbbc0500f98
# good: [bec32a54a4de62b46466f4da1beb9ddd42db81b8] f2fs: fix potential use-after-free issue
git bisect good bec32a54a4de62b46466f4da1beb9ddd42db81b8
# bad: [044aaaa8b1b15adb397ce423a6d97920a46b3893] habanalabs: increase timeout during reset
git bisect bad 044aaaa8b1b15adb397ce423a6d97920a46b3893
# good: [6fe8ed270763a6a2e350bf37eee0f3857482ed48] arm64: dts: qcom: db820c: Fix invalid pm8994 supplies
git bisect good 6fe8ed270763a6a2e350bf37eee0f3857482ed48
# good: [363e8bfc96b4e9d9e0a885408cecaf23df468523] tty: n_gsm: Fix waking up upper tty layer when room available
git bisect good 363e8bfc96b4e9d9e0a885408cecaf23df468523
# bad: [afaff825e3a436f9d1e3986530133b1c91b54cd1] PCI/PM: Assume ports without DLL Link Active train links in 100 ms
git bisect bad afaff825e3a436f9d1e3986530133b1c91b54cd1
# good: [be0ed15d88c65de0e28ff37a3b242e65a782fd98] HID: Add quirks for Trust Panora Graphic Tablet
git bisect good be0ed15d88c65de0e28ff37a3b242e65a782fd98
# first bad commit: [afaff825e3a436f9d1e3986530133b1c91b54cd1] PCI/PM: Assume ports without DLL Link Active train links in 100 ms
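
As a reading aid, a bisect log in this format can be summarized mechanically. The following is a minimal sketch in Python and is not part of the original report: the function name `summarize_bisect_log` and the abbreviated `SAMPLE` text are illustrative assumptions. It relies only on the `# good:`, `# bad:` and `# first bad commit:` annotation lines that `git bisect log` emits.

```python
import re

# Matches annotation lines such as "# bad: [<sha>] <subject>" in `git bisect log` output.
ENTRY = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return (good_count, bad_count, first_bad) parsed from a bisect log.

    first_bad is a (sha, subject) tuple, or None if the log has no verdict.
    """
    good = bad = 0
    first_bad = None
    for line in text.splitlines():
        m = ENTRY.match(line.strip())
        if not m:
            continue  # skip "git bisect ..." commands and anything else
        kind, sha, subject = m.groups()
        if kind == "good":
            good += 1
        elif kind == "bad":
            bad += 1
        else:
            first_bad = (sha, subject)
    return good, bad, first_bad


# Abbreviated sample (hypothetical input): the last entries of the log in this report.
SAMPLE = """\
# bad: [afaff825e3a436f9d1e3986530133b1c91b54cd1] PCI/PM: Assume ports without DLL Link Active train links in 100 ms
git bisect bad afaff825e3a436f9d1e3986530133b1c91b54cd1
# good: [be0ed15d88c65de0e28ff37a3b242e65a782fd98] HID: Add quirks for Trust Panora Graphic Tablet
git bisect good be0ed15d88c65de0e28ff37a3b242e65a782fd98
# first bad commit: [afaff825e3a436f9d1e3986530133b1c91b54cd1] PCI/PM: Assume ports without DLL Link Active train links in 100 ms
"""

if __name__ == "__main__":
    print(summarize_bisect_log(SAMPLE))
```

To re-run the bisection itself rather than just summarize it, the full log can be saved to a file and passed to `git bisect replay <file>`.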
Comment 1 Karol Herbst 2020-07-17 11:25:17 UTC
Created attachment 290331 [details]
dmesg
Comment 2 Kai-Heng Feng 2020-07-17 12:29:10 UTC
I wonder if removing "Linux-Lenovo-NV-HDMI-Audio" helps?
Also, maybe increasing the 100 ms delay to a larger value would help...
Comment 3 Mika Westerberg 2020-07-21 14:25:26 UTC
Created attachment 290437 [details]
Wait for maximum 1100ms if child device has PCI PM disabled

Can you try this patch (on top of v5.8-rc6, or one that does not have the DLL Link Active commit reverted) and see if it works around the issue?
Comment 4 Marcin Zajaczkowski 2020-08-03 08:34:42 UTC
FYI: 5.7.11 fixed the problem with suspend on Hyperbook NH5/Clevo NH55RCQ with GeForce GTX 1660 Mobile (TU116M). That revert (https://github.com/torvalds/linux/commit/e0b8a866eba09fceea7bab7732eee7ad1077732e), which refers to this issue, seems to be the most probable reason for the improvement, thanks.

If you agree that it might be related and you have further patches that could affect my hardware, I can test them.

More details are in the nouveau issue tracker: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/540
Comment 5 Bjorn Helgaas 2023-01-17 23:00:33 UTC
Since the nouveau issue tracker (#540) was closed as fixed in v5.8.0, I'm closing this as well.  Please reopen with any details if necessary.
