Bug 218900
Summary: | amdgpu: Fatal error during GPU init | ||
---|---|---|---|
Product: | Drivers | Reporter: | Jean-Christophe Guillain (jean-christophe) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | RESOLVED CODE_FIX | ||
Severity: | blocking | CC: | alexdeucher, bp, dreamlike_clinking040, i.r.e.c.c.a.k.u.n+bugzilla.kernel.org, jd.girard, mario.limonciello, regressions, suravee.suthikulpanit, vasant.hegde |
Priority: | P3 | ||
Hardware: | AMD | ||
OS: | Linux | ||
Kernel Version: | 6.10.0-rc1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | c4cb23111103a841c2df30058597398443bcad5f |
Attachments: | Full logs of the boot; Check Enhanced PPR support before enabling PPR; Full dmesg after applying Vasant's patch; Complete dmesg | ||
Description
Jean-Christophe Guillain
2024-05-27 14:52:00 UTC
Created attachment 306354 [details]
Full logs of the boot.
I added the full log of the boot process showing all the errors.
Can you bisect? https://docs.kernel.org/admin-guide/bug-bisect.html

Bisecting: 5720 revisions left to test after this (roughly 13 steps)
I'll try, but it will take some time. My machine is not very powerful...

Possibly the same as this report: https://lore.kernel.org/all/20240527192159.GEZlTdV7OoOuJrHmI0@fat_crate.local/

Created attachment 306364 [details]
Check Enhanced PPR support before enabling PPR
Hi,
Attached patch should fix this issue. Can you please test it? I will send proper patch to mailing list soon.
-Vasant

Also can you please attach full dmesg? I want to see IOMMU feature list and confirm what I am doing is right.
-Vasant

Hi,
I plan to finish the bisection today, and I'll test your patch.
jC

(In reply to Jean-Christophe Guillain from comment #8)
> Hi,
>
> I plan to finish the bisection today, and I'll test your patch.
>
You mean bisecting for this issue? If so we know the culprit commit. Issue is happening because IOMMU driver tried to enable PPR bit in DTE without checking Enhanced PPR support in EFR register.
-Vasant

I applied your patch to the 6.10.0-rc1 kernel, and I confirm that it fixes this bug. Thank you very much !
jC
(full dmesg attached)

Created attachment 306367 [details]
Full dmesg after applying Vasant's patch
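
Editor's sketch: the root cause described in this thread (the IOMMU driver set the PPR bit in the DTE without first checking Enhanced PPR support in the EFR register) can be illustrated with a small, self-contained C model. This is not the attached patch and not the kernel's code. The function names and the `SKETCH_` bit positions are assumptions made for illustration only and should be checked against the AMD IOMMU specification and the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative bit positions. These are ASSUMPTIONS for this sketch,
 * not values taken from the bug report; verify them before relying on them.
 */
#define SKETCH_EFR_EPHSUP (1ULL << 50) /* assumed: Enhanced PPR support flag in EFR */
#define SKETCH_DTE_PPR    (1ULL << 52) /* assumed: PPR enable flag in DTE[0] */

/* Pre-fix behaviour as described in the thread: PPR is set unconditionally. */
static inline uint64_t dte_enable_ppr_unchecked(uint64_t dte0, bool dev_wants_ppr)
{
    if (dev_wants_ppr)
        dte0 |= SKETCH_DTE_PPR;
    return dte0;
}

/* Post-fix behaviour: PPR is set only if the EFR advertises Enhanced PPR support. */
static inline uint64_t dte_enable_ppr_checked(uint64_t dte0, uint64_t efr,
                                              bool dev_wants_ppr)
{
    if (dev_wants_ppr && (efr & SKETCH_EFR_EPHSUP))
        dte0 |= SKETCH_DTE_PPR;
    return dte0;
}

/* Small self-check contrasting the two variants on an IOMMU without the feature. */
static inline void ppr_sketch_selfcheck(void)
{
    const uint64_t efr_without_ephsup = 0;

    /* Unchecked variant sets a bit the hardware does not support. */
    assert(dte_enable_ppr_unchecked(0, true) & SKETCH_DTE_PPR);
    /* Checked variant leaves the DTE untouched in that case. */
    assert(dte_enable_ppr_checked(0, efr_without_ephsup, true) == 0);
}
```

On hardware whose EFR does not advertise the feature, the unchecked variant produces a DTE the IOMMU can reject, which is consistent with the `ILLEGAL_DEV_TABLE_ENTRY` events reported in this bug.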
(I still finished my bisection, and as you said, c4cb23111103a841c2df30058597398443bcad5f is the first bad commit.)

Thanks Jean for testing. I will send patch with your Tested-by today.
-Vasant

*** Bug 218921 has been marked as a duplicate of this bug. ***

(In reply to Vasant Hegde from comment #5)
> Created attachment 306364 [details]
> Check Enhanced PPR support before enabling PPR
I applied your patch on top of rc2 and also confirm that it works.
Thank you.

(In reply to Hanabishi from comment #15)
> (In reply to Vasant Hegde from comment #5)
> > Created attachment 306364 [details]
> > Check Enhanced PPR support before enabling PPR
>
> I applied your patch on top of rc2 and also confirm that it works.
> Thank you.
Thanks Hanabishi for testing.
FYI. Patches merged into -rc3.
-Vasant

I seem to have a similar problem on 6.10-rc5 after suspend. I get a black screen on resume.
[ 269.157149] amdgpu 0000:02:00.0: amdgpu: reserve 0x400000 from 0xf41f800000 for PSP TMR
[ 269.159956] iommu ivhd0: AMD-Vi: Event logged [ILLEGAL_DEV_TABLE_ENTRY device=0000:02:00.0 pasid=0x00000 address=0x131400000 flags=0x0180]
[ 269.159960] AMD-Vi: DTE[0]: 6190000000000003
[ 269.159962] AMD-Vi: DTE[1]: 00001001049e000b
[ 269.159963] AMD-Vi: DTE[2]: 200000013c610013
[ 269.159963] AMD-Vi: DTE[3]: 0000000000000000
[ 269.160104] amdgpu 0000:02:00.0: amdgpu: failed to load ucode SDMA0(0x1)
[ 269.160108] amdgpu 0000:02:00.0: amdgpu: psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xF)

Created attachment 306495 [details]
Complete dmesg
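
Editor's sketch: when reading the DTE dump in the log above, it can help to list which bits are set in each 64-bit word. The helper below is generic C, and the name `set_bit_positions` is hypothetical rather than a kernel function. It deliberately does not assign meanings to the bits; those have to be looked up in the AMD IOMMU specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store the positions of all set bits of `value` in `out`, in ascending
 * order, and return how many were found. `out` must have room for 64 entries.
 */
static inline size_t set_bit_positions(uint64_t value, unsigned out[64])
{
    size_t count = 0;

    assert(out != NULL);
    for (unsigned bit = 0; bit < 64; bit++) {
        if (value & (1ULL << bit))
            out[count++] = bit;
    }
    return count;
}
```

For `DTE[0]: 6190000000000003` from the log, this yields bits 0, 1, 52, 55, 56, 61 and 62.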
Unfortunately there was another bug in suspend/resume path. Can you please test with below patch?
https://lore.kernel.org/linux-iommu/ZnqzXyCU8bn32j4-@8bytes.org/T/#m1cd1520facb8b758efdf7a8c0261f9ee2ec217d7
-Vasant

Yes, I confirm the patch "iommu/amd: Fix GT feature enablement again" applied to 6.10-rc5 fixes resume on my machine. Thanks for prompt reply!

(In reply to Vasant Hegde from comment #19)
> Unfortunately there was another bug in suspend/resume path. Can you please
> test with below patch?
>
> https://lore.kernel.org/linux-iommu/ZnqzXyCU8bn32j4-@8bytes.org/T/#m1cd1520facb8b758efdf7a8c0261f9ee2ec217d7
>
> -Vasant
Can confirm this patch also fixes my suspend/resume issue, thanks!

(In reply to dreamlike_clinking040 from comment #21)
> (In reply to Vasant Hegde from comment #19)
>
> Can confirm this patch also fixes my suspend/resume issue, thanks!
Thanks a lot.
-Vasant