Created attachment 303512
output of git bisect log
After commit 5e85eba6f50dc288c22083a7e213152bcc4b8208 ("PCI/ASPM: Refactor L1 PM Substates Control Register programming"), my laptop no longer brings its PCI devices back up when resuming from suspend.
My laptop is a Tuxedo Infinitybook S 14 v5; as far as I can tell, it uses a Clevo L140CU mainboard.
The main symptom is:
iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
nvme 0000:03:00.0: Unable to change power state from D3hot to D0, device inaccessible
After that, the level of interaction I still have with the laptop varies, but I cannot run dmesg and I cannot do a clean reboot. The issue occurs on every suspend/resume cycle.
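The two messages above share a fixed format, so the affected devices can be pulled out of a saved kernel log mechanically. Below is a minimal sketch in Python, assuming the log is available as plain text. The function name `find_inaccessible_devices` and the regular expression are illustrative only and are not part of any kernel tooling.

```python
import re

# Matches the tail of lines such as:
#   iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
# search() is used so that timestamp or journal prefixes before the driver name are tolerated.
PATTERN = re.compile(
    r"(?P<driver>\S+) "
    r"(?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"Unable to change power state from (?P<src>\S+) to (?P<dst>\S+), "
    r"device inaccessible"
)


def find_inaccessible_devices(log_text):
    """Return (driver, PCI address, from_state, to_state) for each matching line."""
    hits = []
    for line in log_text.splitlines():
        m = PATTERN.search(line)
        if m:
            hits.append((m["driver"], m["bdf"], m["src"], m["dst"]))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python find_inaccessible.py saved-log.txt
    with open(sys.argv[1], errors="replace") as f:
        for driver, bdf, src, dst in find_inaccessible_devices(f.read()):
            print(f"{bdf} ({driver}): stuck going {src} -> {dst}")
```

The PCI addresses it reports can then be matched against the "lspci -vv" output to see which devices and upstream bridges are involved.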
Created attachment 303513
Created attachment 303514
output of netconsole after the issue happens
Created attachment 303515
lspci -vv, as root user
Created attachment 303516
journal before suspend
This is the output of "journalctl -k -b -1".
Is this still broken in 6.1.4?
(In reply to Artem S. Tashkinov from comment #5)
> Is this still broken in 6.1.4?
Yes, same symptoms.
It's being discussed on LKML as well, thanks!
This should be fixed by
which will appear in v6.2-rc8.
Please reopen if you can reproduce the problem on v6.2-rc8 or later.
This still affects my system on v6.2.2 (with zen patches). I have to restart NetworkManager.service to get any networking working again. Additionally, startup got a lot slower after updating to v6.2.2.
To clarify, the slow startup seems to be caused by NetworkManager taking too long.
Raghav, thanks very much for your report. I'm not sure the problem you're seeing is the same as what Thomas reported here. Thomas reported that his system was completely unusable after suspend/resume, and the only thing he could really do was reboot (and even that didn't work reliably).
In your case, it sounds like the system is slow and something is wrong with NetworkManager. Is this a regression? If it used to work, and simply upgrading the kernel to v6.2.2 caused problems, then we should look for a kernel issue.
If it seems like a kernel issue, can you please open a new report with more details (distro details, dmesg log, "sudo lspci -vv" output)? Most kernel subsystems don't pay attention to bugzilla, so email to email@example.com and whatever other list seems relevant is probably best. Maybe firstname.lastname@example.org if you think it's network-related, or email@example.com if you think it's PCI-related.
Ah, thank you for your response. I actually managed to fix it by passing ibt=off on the kernel command line, as that feature was causing issues with systemd service bringup. Thank you again for taking the time to help me here.
Hmmm. That's horrible. "ibt=off" isn't documented at all, and even if it were, users should not be required to diagnose the slowdown and somehow figure out to use "ibt=off" to avoid it, so I would definitely consider "ibt=off" as an interim *workaround*, but not an actual *fix*.
I did find a couple of bug reports that mention "ibt=off" as a workaround:
Both are related to the nvidia driver, and it looks like you should see "Missing ENDBR" in your dmesg log if you are seeing the same problem. So, if you're seeing that problem, I guess using "ibt=off" is OK.
But if you're seeing something different, i.e., you're not using nvidia, please report it to firstname.lastname@example.org and Peter Zijlstra <email@example.com> (and cc: me). We would want to see the complete dmesg log to help figure this out.
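As a rough way to tell the two cases apart, the sketch below looks for the "Missing ENDBR" string in a saved dmesg log and checks whether "ibt=off" is present on the kernel command line (read from /proc/cmdline on the running system). The function names are made up for illustration, and nothing is assumed about the kernel message beyond that substring.

```python
def find_missing_endbr_lines(dmesg_text):
    """Return every log line that contains the "Missing ENDBR" marker."""
    return [line for line in dmesg_text.splitlines() if "Missing ENDBR" in line]


def ibt_disabled(cmdline):
    """True if "ibt=off" appears as its own parameter on the command line."""
    return "ibt=off" in cmdline.split()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_ibt.py saved-dmesg.txt
    with open(sys.argv[1], errors="replace") as f:
        hits = find_missing_endbr_lines(f.read())
    with open("/proc/cmdline") as f:
        workaround_active = ibt_disabled(f.read())

    print(f"'Missing ENDBR' lines found: {len(hits)}")
    for line in hits:
        print("  " + line)
    print(f"ibt=off on the command line: {workaround_active}")
```

If the workaround is already active, the log from that boot may no longer show the message, so a log captured without "ibt=off" is the more useful one to check.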
I came across the "ibt=off" workaround through discussions of that problem, such as this one: https://bbs.archlinux.org/viewtopic.php?id=276805. Before this, I had tried a few other kernel versions, and I believe I had the same issue on 5.19 as well. The thread I've linked is about 5.18. I do use NVIDIA hardware, which could be what caused this. I'm not sure whether I saw the string you describe when I looked at my dmesg.