Bug 208825 - lspci triggers NULL pointer dereference on AMD Renoir 4800H/5600M laptop
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 high
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-08-06 00:33 UTC by Jon Tourville
Modified: 2020-09-14 19:13 UTC
CC List: 1 user

See Also:
Kernel Version: 5.8.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
lspci triggers NULL pointer dereference on AMD Renoir laptop (196.05 KB, text/plain)
2020-08-06 00:33 UTC, Jon Tourville

Description Jon Tourville 2020-08-06 00:33:47 UTC
Created attachment 290791
lspci triggers NULL pointer dereference on AMD Renoir laptop

Running Arch Linux with a 5.8.0 kernel built from linux-mainline on a Dell G5 15 SE 5505 laptop with an AMD 4800H Renoir APU and a 5600M discrete GPU.

On a fresh install of Arch, running lspci triggers a kernel oops with a NULL pointer dereference. The oops does not occur when the kernel is booted with amdgpu.runpm=0, so it appears to be related to runtime power management. The log leading up to the oops starts with the following errors (full dmesg and lspci -vvv output attached):

[   93.485414] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
[   93.485452] [drm] PSP is resuming...
[   93.514696] [drm] reserve 0x900000 from 0x800f400000 for PSP TMR
[   93.684656] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[   93.704673] amdgpu: SMU is resuming...
[   95.835970] amdgpu: failed send message:     RunBtc (58) 	param: 0x00000000 response 0xffffffc2
[   95.835971] amdgpu: RunBtc failed!
[   95.836016] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
[   95.836053] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-62).
[   95.851331] snd_hda_intel 0000:03:00.1: refused to change power state from D3hot to D0
[   95.956286] snd_hda_intel 0000:03:00.1: CORB reset timeout#2, CORBRP = 65535
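
When triaging a long dmesg attachment like this one, it can help to pull out only the lines where a resume step reports a negative errno. The following is a minimal sketch, not part of the original report: the function name `find_resume_failures`, the regular expressions, and the `SAMPLE` excerpt are illustrative assumptions, and it only looks at lines tagged `*ERROR*`, so the earlier "RunBtc failed!" message is not captured. The mapping of 62 to ETIME ("Timer expired") follows the Linux asm-generic errno.h numbering.

```python
import re

# Matches kernel log lines of the form "[   95.836016] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Matches a trailing negative errno such as "failed -62" or "failed (-62).".
ERRNO_RE = re.compile(r"failed \(?-(\d+)\)?\.?$")

# Linux errno value seen in this report (asm-generic/errno.h numbering).
ERRNO_NAMES = {62: "ETIME"}


def find_resume_failures(log_text):
    """Return (timestamp, errno, message) for each *ERROR* line ending in an errno."""
    failures = []
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line or "*ERROR*" not in line.group(2):
            continue
        err = ERRNO_RE.search(line.group(2))
        if err:
            failures.append((float(line.group(1)), int(err.group(1)), line.group(2)))
    return failures


# Excerpt copied from the log above, used here only as sample input.
SAMPLE = """\
[   93.704673] amdgpu: SMU is resuming...
[   95.836016] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
[   95.836053] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-62).
"""

if __name__ == "__main__":
    for ts, code, msg in find_resume_failures(SAMPLE):
        name = ERRNO_NAMES.get(code, "unknown")
        print(f"{ts:.6f}s errno={code} ({name}): {msg}")
```

On the excerpt above, this reports two failures, both with errno 62, which is consistent with the SMU timing out while resuming rather than rejecting the request outright.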

Comment 1 Jon Tourville 2020-09-08 20:16:33 UTC
Appears to be resolved as of 5.8.6 or 5.8.7.

Comment 2 Alex Deucher 2020-09-14 05:52:14 UTC
Can you bisect and determine what patch fixed it?

Comment 3 Jon Tourville 2020-09-14 19:13:17 UTC
I am now unable to reproduce the problem even on versions earlier than 5.8.6, which I know were still affected. I suspect a firmware update or some other change resolved the issue for me.