Bug 209939 - radeontop causes kernel panic
Summary: radeontop causes kernel panic
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-29 13:22 UTC by Janpieter Sollie
Modified: 2020-11-04 07:41 UTC
CC List: 1 user

See Also:
Kernel Version: 5.9.1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel .config file of 3 PCs (400.00 KB, application/x-tar)
2020-10-29 13:22 UTC, Janpieter Sollie

Description Janpieter Sollie 2020-10-29 13:22:23 UTC
Created attachment 293297 [details]
kernel .config file of 3 PCs

(view 3 .config files)
> PC1: problem PC, Ryzen 2400GE APU with Vega 11 and 5.9.1 kernel (Xorg running)
> PC2: working PC, Ryzen V1605 APU with Vega 8 and 5.8.14 kernel (Xorg running)
> PC3: working PC, Threadripper 1950 + Fiji GPU and 5.9.1 kernel (CLI only)

As the subject states: on PC1, the kernel does not survive running the radeontop program. Whether it fails depends on when radeontop is started:
> - while hardware-accelerated content is running: panic
> - in console mode: fine
> - when switching from console to X: fine for a few moments
> - when trying it early (X running sddm, radeontop via ssh): panic

By *panic* I mean that the PC no longer reacts: Num Lock no longer toggles, no input is accepted, the clock in the GUI stops updating, and SSH is unreachable.

I tried the following to capture a trace:
> - pstore is empty
> - dd if=/dev/kmsg of=/dev/sdb1 & while [ 1 ]; do echo s > /proc/sysrq-trigger;
> sleep 10; done & radeontop (and reading the log back from this partition afterwards)
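
The capture attempt above can be written out as a small script. This is only a sketch: it assumes /dev/sdb1 is a scratch partition whose contents may be overwritten, that SysRq is enabled, and that it is run as root. The names LOG_DEV, SYNC_SECS, SYSRQ, request_sync and capture_and_run are made up for illustration.

```shell
#!/bin/sh
# Sketch only: stream the kernel log to a raw scratch partition while
# requesting periodic emergency syncs, then start radeontop.
# All variable and function names here are illustrative.
LOG_DEV="${LOG_DEV:-/dev/sdb1}"        # WARNING: contents are overwritten
SYNC_SECS="${SYNC_SECS:-10}"           # seconds between emergency syncs
SYSRQ="${SYSRQ:-/proc/sysrq-trigger}"

# Ask the kernel for one emergency sync (SysRq 's').
request_sync() {
    echo s > "$SYSRQ"
}

capture_and_run() {
    # Copy kernel messages to the scratch partition in the background.
    dd if=/dev/kmsg of="$LOG_DEV" &
    # Flush dirty buffers every SYNC_SECS seconds in the background.
    while true; do
        request_sync
        sleep "$SYNC_SECS"
    done &
    # Finally start the program that triggers the freeze.
    radeontop
}

# Only do anything when explicitly asked to: ./capture.sh run
if [ "${1:-}" = "run" ]; then
    capture_and_run
fi
```

After the forced reboot, the partition can be inspected with something like `strings /dev/sdb1 | less`.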

The mainboard does not have an RS232 port, so debugging over a serial console is not possible;
also, I doubt I would be able to use KDB if the screen is stuck in GUI mode ...

If I can do anything to gather more info, let me know.
Comment 1 Alex Deucher 2020-10-29 13:31:38 UTC
Does setting amdgpu.runpm=0 on the kernel command line in grub fix the issue?  How are you running radeontop?  If you are running it such that it tries to access MMIO space directly rather than going through the kernel, that could cause an issue.
Comment 2 Janpieter Sollie 2020-10-29 20:21:01 UTC
I am running radeontop the usual way: without arguments, default compile.
amdgpu.runpm=0 has no effect.
Comment 3 Alex Deucher 2020-10-29 21:00:27 UTC
Does setting amdgpu.ppfeaturemask=0xffff3fff on the kernel command line in grub fix it?
Comment 4 Janpieter Sollie 2020-10-30 07:53:31 UTC
Sorry, no, still the same ...
Just to be sure: if I set this on the kernel command line, it overrides the settings in /etc/modprobe.d/amdgpu.conf, right?
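
Rather than reasoning about precedence, the values actually in effect can be read back after boot. A minimal sketch, assuming the amdgpu module exposes runpm and ppfeaturemask as readable entries under /sys/module/amdgpu/parameters (the usual sysfs location for module parameters); PARAM_DIR and show_param are made-up names:

```shell
#!/bin/sh
# Sketch: print the parameter values the loaded amdgpu module reports.
# PARAM_DIR and show_param are illustrative names.
PARAM_DIR="${PARAM_DIR:-/sys/module/amdgpu/parameters}"

show_param() {
    # $1 = parameter name, e.g. runpm or ppfeaturemask
    if [ -r "$PARAM_DIR/$1" ]; then
        printf '%s=%s\n' "$1" "$(cat "$PARAM_DIR/$1")"
    else
        printf '%s=<not exposed>\n' "$1"
    fi
}

show_param runpm
show_param ppfeaturemask

# The command line the kernel was actually booted with.
if [ -r /proc/cmdline ]; then
    cat /proc/cmdline
fi
```

If the printed value matches what was passed in grub, the command-line setting took effect regardless of what amdgpu.conf says.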
Comment 5 Janpieter Sollie 2020-11-04 07:41:47 UTC
I also tried netconsole (thanks to a hint from Gentoo).
With netconsole, nothing is logged during the freeze: the kernel buffer from before starting 'radeontop' is printed correctly, but no further output arrives during the "kernel panic". Apparently either the kernel does not live long enough to push it to netconsole, or a bug in radeontop causes a hardware freeze.
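
For reference, a sketch of how a netconsole boot parameter of the kind used here is put together, following the documented `netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]` format. The addresses, ports and interface name below are placeholders, not the values from this setup, and netconsole_arg is a made-up helper.

```shell
#!/bin/sh
# Sketch: build a netconsole= kernel parameter from its parts.
# netconsole_arg is an illustrative helper; all values are placeholders.
netconsole_arg() {
    # $1 src-port  $2 src-ip  $3 dev  $4 tgt-port  $5 tgt-ip  $6 tgt-mac (may be empty)
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Sender 192.168.1.10 on eth0, receiver 192.168.1.20, target MAC left empty.
netconsole_arg 6665 192.168.1.10 eth0 6666 192.168.1.20 ""
```

Adding `ignore_loglevel` to the same command line makes the kernel print all messages to its consoles, which may help rule out that the last messages were filtered by log level rather than lost.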
