Created attachment 305259 [details] dmesg-6.6.0-rc6-v2-suspend.txt

When suspending and resuming from RAM on the Lenovo V15 G4 AMN, multiple NVME IOMMU page faults occur, showing up in dmesg as repeated errors:

nvme 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000b address=0xb6674000 flags=0x0000]

The system is unstable afterwards. Of three attempts:
- one resulted in an unusable file system (read or write attempts resulted in IO errors),
- one appeared fine, but still logged the above errors (see attachment dmesg-6.6.0-rc6-v2-suspend.txt),
- one could not restart the wifi card (dmesg-6.6.0-rc6-v2-suspend-2.txt).

This was discovered while investigating bug 218003 (malfunctioning keyboard caused by uninitialized PIC), but it's believed to be a separate issue.
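For anyone comparing their own logs against the attachments, here is a minimal sketch that pulls the IO_PAGE_FAULT events out of saved dmesg text. The regular expression is modeled only on the single line quoted above, so treat it as an assumption that other kernel versions format the event the same way; the names `FAULT_RE` and `find_page_faults` are made up for this example.

```python
import re

# Pattern modeled on the one log line quoted in this report (an assumption:
# other kernels may word or order the fields differently).
FAULT_RE = re.compile(
    r"(?P<dev>\S+) (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT "
    r"domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)


def find_page_faults(dmesg_text):
    """Return one dict per IO_PAGE_FAULT event found in the given text."""
    return [m.groupdict() for m in FAULT_RE.finditer(dmesg_text)]


# The line from this report, used as sample input.
SAMPLE = (
    "nvme 0000:01:00.0: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x000b address=0xb6674000 flags=0x0000]"
)

if __name__ == "__main__":
    for fault in find_page_faults(SAMPLE):
        print(fault["dev"], fault["bdf"], fault["address"])
```

Feeding it the contents of a saved dmesg file (for example one of the attachments) gives a quick count of how many faults a given suspend/resume cycle produced.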
Created attachment 305260 [details] dmesg-6.6.0-rc6-v2-suspend-2.txt
Can you experiment with adding the system into this quirk list? https://github.com/torvalds/linux/blob/master/drivers/platform/x86/amd/pmc/pmc-quirks.c#L25 It's not exactly the same bug, but it is similarish.
Created attachment 305262 [details] add-s2idle-quirk.patch This is the patch I've used to add it to the quirks list. Suspend to RAM seems to work fine now, and the page faults no longer appear in the dmesg output (attached, below).
Created attachment 305263 [details] dmesg-6.6.0-rc6-quirks.txt
Well that's great! I think we need something from Lenovo to confirm what values represent all those Mendocino systems. We don't want that patch to apply to "ALL Lenovo laptops" from this year, only the Mendocino ones that are affected by this issue. @Mark, Can you please confirm the strings for the Mendocino ones?
I'll try, though as a note this is a BU I don't deal with, so it might take me a bit to track down.

As a note - looking at https://psref.lenovo.com/Product/Lenovo/Lenovo_V15_G4_AMN
If you go to the models tab and click on Machine type there are 82YU and 83CQ model numbers for this platform - so at least for this one it's probably good to cover both.

Wish there was a way of filtering on CPU across the whole portfolio.

Mark
Thanks Mark! The other way to attack this is of course to fix the BIOS. This appears to be the same issue you fixed on all those other systems, but I suspected this one wasn't in the list getting the fix. If that way is preferable, we can close this issue from Linux side and you can go that way too.
Well either way we have to identify the systems :) For my reference: Internal ticket is LO-2698 I have noted that preferred fix is to fix BIOS - but as these platform(s) aren't in the Linux program I can't promise anything.
Not guaranteed conclusive - but there are no Mendocino parts used on Think platforms, AIO, SMB or desktop. The only ones we found referenced are Ideapad1 and V15.

Checking psref.lenovo.com, the only matches I could find for 7x20 (which I think is what Mendocino is) are:

V15 AMN (82YU and 83CQ model)
Ideapad1 14 AMN7 (82VF model)
Ideapad1 15 AMN7 (82VG & 82X5 model)

Chances of me getting FW fixes for these platforms are very very low - so I recommend going with the kernel patch.

Mark
Thanks Mark! David, do you mind squeezing all those into your patch and sending it out to the mailing lists?
(In reply to Mark Pearson from comment #9)
> Checking psref.lenovo.com only matches I could find for 7x20 (which I think
> is what Mendocino is) are:
>
> V15 AMN (82YU and 83CQ model)
> Ideapad1 14 AMN7 (82VF model)
> Ideapad1 15 AMN7 (82VG & 82X5 model)

Isn't V14 G4 AMN (82YT, 83GE) also affected?
IdeaPad Slim 3 14AMN8 (82XN)?
IdeaPad Slim 3 14AMN8 (82XQ)?

I'm starting to suspect that "AMN" stands for "AMD MendociNo".

(In reply to Mario Limonciello (AMD) from comment #10)
> David, do you mind squeezing all those into your patch and sending it out to
> the mailing lists?

Happy to do it, but it's my first kernel patch since the early 2000s, so I'll have to read up on the process first. :-) Any pointers greatly appreciated.
> Happy to do it, but it's my first kernel patch since the early 2000s, so I'll
> have to read up on the process first. :-) Any pointers greatly appreciated.

Thanks! https://www.kernel.org/doc/html/latest/process/submitting-patches.html
> I'm starting to suspect that "AMN" stands for "AMD MendociNo".

Oh man - can't believe I didn't notice that! I think you're right.

I have no experience with Ideapad naming (I've always thought ThinkBook was strange...I think it follows similar).

I checked on PSREF and it gave the platforms you found. I think you're good to go.
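To check whether a given machine is one of the types named in this thread, something like the sketch below could be used. The two sets contain only the machine types mentioned in the comments above; whether the merged patch covers exactly this list is not stated here, so check the commit itself. The helper names are made up, and taking the first four characters of the DMI product name as the machine type is an assumption about how these systems report themselves.

```python
# Machine types from Mark's PSREF search in this thread.
CONFIRMED = {"82YU", "83CQ", "82VF", "82VG", "82X5"}
# Additional machine types raised in the follow-up comment.
SUGGESTED = {"82YT", "83GE", "82XN", "82XQ"}


def machine_type(product_name):
    """Assumption: the machine type is the first four characters of the
    DMI product name (e.g. "82YU"), ignoring case and whitespace."""
    return product_name.strip().upper()[:4]


def classify(product_name):
    """Say whether a DMI product name matches a type named in this thread."""
    mt = machine_type(product_name)
    if mt in CONFIRMED:
        return "confirmed"
    if mt in SUGGESTED:
        return "suggested"
    return "not listed"


if __name__ == "__main__":
    # Standard sysfs location for the DMI product name on Linux.
    with open("/sys/class/dmi/id/product_name") as f:
        name = f.read()
    print(name.strip(), "->", classify(name))
```

A "not listed" result only means the machine type was not mentioned in this thread; it says nothing about whether the machine is affected.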
Merged as https://github.com/torvalds/linux/commit/3bde7ec13c971445faade32172cb0b4370b841d9
Note the fixes for this have landed in 6.5.10 now, which should be available soon through your distro if you're an Arch or Fedora user.
The fixes also landed in 6.1.61, for distros that follow that older branch.

And, as a note of caution, make sure the module containing this code is loaded, otherwise you'll still see those page faults. Ask me how I know... For the 6.5+ kernels, the module is called 'amd-pmc', while in 6.1 this code still lived in 'thinkpad_acpi'.
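A quick way to check the module point above is to look at /proc/modules. This is only a sketch: the function names are made up, and it relies on /proc/modules listing module names with underscores (so 'amd-pmc' shows up as amd_pmc). If the code is built into the kernel rather than built as a module, it will not appear in /proc/modules at all, so a False result is not conclusive on its own.

```python
# Module names as they appear in /proc/modules (hyphens become underscores).
QUIRK_MODULES = {"amd_pmc", "thinkpad_acpi"}


def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.
    Each line starts with the module name followed by size, refcount, etc."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def quirk_module_loaded(proc_modules_text):
    """True if either module named in this thread is listed as loaded."""
    return bool(loaded_modules(proc_modules_text) & QUIRK_MODULES)


if __name__ == "__main__":
    with open("/proc/modules") as f:
        text = f.read()
    print("quirk module loaded:", quirk_module_loaded(text))
```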