Bug 215652 - kernel 5.17-rc fail to load radeon DRM "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)"
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: PPC-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-03-02 23:04 UTC by Erhard F.
Modified: 2022-03-18 01:08 UTC

See Also:
Kernel Version: 5.17-rc5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel dmesg (kernel 5.17-rc5, CONFIG_DRM_RADEON=m, Talos II) (73.83 KB, text/plain)
2022-03-02 23:04 UTC, Erhard F.
kernel dmesg (kernel 5.17-rc5, CONFIG_DRM_RADEON=y, Talos II) (77.50 KB, text/plain)
2022-03-02 23:07 UTC, Erhard F.
kernel .config (kernel 5.17-rc5, CONFIG_DRM_RADEON=m, Talos II) (110.14 KB, text/plain)
2022-03-02 23:07 UTC, Erhard F.
kernel dmesg (kernel 5.17-rc7, CONFIG_DRM_RADEON=m, Talos II) (61.44 KB, text/plain)
2022-03-10 13:19 UTC, Erhard F.

Description Erhard F. 2022-03-02 23:04:42 UTC
Created attachment 300520
kernel dmesg (kernel 5.17-rc5, CONFIG_DRM_RADEON=m, Talos II)

Kernel 5.17-rc5 has problems loading the radeon KMS module on my Talos II:

 # modprobe -v radeon
insmod /lib/modules/5.17.0-rc5-P9+/kernel/drivers/gpu/drm/drm_kms_helper.ko 
modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)

dmesg shows no further output after that.
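
When modprobe reports "Unknown symbol in module", the kernel log usually names the unresolved symbols, so filtering the log for them is a natural first check (here it came up empty). The following is a minimal sketch, assuming log lines of the form "<module>: Unknown symbol <name> (err <n>)"; the function name unknown_symbols is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Print the unresolved symbol names for one module from kernel log text
# read on stdin. Assumes log lines of the form
#   "<module>: Unknown symbol <name> (err <n>)"
# unknown_symbols is a hypothetical helper name.
unknown_symbols() {
    mod="$1"
    sed -n "s/.*${mod}: Unknown symbol \([A-Za-z0-9_.]*\).*/\1/p" | sort -u
}

# Possible usage on the affected machine (reading the log may need root):
#   dmesg | unknown_symbols radeon
#   modinfo -F depends radeon        # modules radeon expects to be loaded first
#   modprobe --show-depends radeon   # what modprobe would insert, in order
```

If the filter prints nothing, as in this report, comparing the output of the two commented commands against what modprobe -v actually inserted may still show which dependency was skipped.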

When I build a kernel with CONFIG_DRM_RADEON=y I get this in dmesg:
[...]
ATOM BIOS: X1550
[drm] Generation 2 PCI interface, using max accessible memory
radeon 0000:01:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
radeon 0000:01:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
[drm] Detected VRAM RAM=512M, BAR=256M
[drm] RAM width 128bits DDR
radeon 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
[drm] radeon: 512M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
[drm] radeon: 1 quad pipes, 1 z pipes initialized.
[drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
radeon 0000:01:00.0: WB enabled
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000000
radeon 0000:01:00.0: radeon: MSI limited to 32-bit
[drm] radeon: irq initialized.
[drm] Loading R500 Microcode
radeon 0000:01:00.0: Direct firmware load for radeon/R520_cp.bin failed with error -2
radeon_cp: Failed to load firmware "radeon/R520_cp.bin"
[drm:.r100_cp_init] *ERROR* Failed to load firmware!
radeon 0000:01:00.0: failed initializing CP (-2).
radeon 0000:01:00.0: Disabling GPU acceleration

So it complains about not finding the relevant firmware. But the firmware is located in /lib/firmware/radeon, so everything should be ok. This used to work on kernels 5.16.x and before.

 # ls -al /lib/firmware/radeon/R520_cp.bin
-rw-r--r-- 1 root root 2048  2. Mär 20:43 /lib/firmware/radeon/R520_cp.bin
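
Error -2 in the log above is ENOENT, meaning the firmware loader did not find the file in any location it searched. A minimal sketch of walking the loader's default search order by hand follows; the function name find_firmware and its optional root-prefix argument are hypothetical, and a custom path set through firmware_class.path is not covered.

```shell
#!/bin/sh
# Print the first location where the named firmware file exists, following
# the kernel firmware loader's default search order.
# $1 = firmware name, e.g. radeon/R520_cp.bin
# $2 = optional root prefix (hypothetical convenience for inspecting an
#      unpacked image); empty means the running system.
find_firmware() {
    name="$1"
    root="${2:-}"
    rel="$(uname -r)"
    for dir in \
        "/lib/firmware/updates/$rel" \
        "/lib/firmware/updates" \
        "/lib/firmware/$rel" \
        "/lib/firmware"
    do
        if [ -f "$root$dir/$name" ]; then
            printf '%s\n' "$root$dir/$name"
            return 0
        fi
    done
    return 1
}

# Possible usage:
#   find_firmware radeon/R520_cp.bin || echo "not found in default paths"
```

This only checks the mounted filesystem. It says nothing about whether the file is reachable at the moment the driver probes the device.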


Some data about the machine:
 # lspci 
0000:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV516 [Radeon X1300/X1550 Series]
0000:01:00.1 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] RV516 [Radeon X1300/X1550 Series] (Secondary)
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation Device 5008 (rev 01)
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
0005:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)

 # lspci -s 0000:01:00.0 -vv
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV516 [Radeon X1300/X1550 Series] (prog-if 00 [VGA controller])
	Subsystem: PC Partner Limited / Sapphire Technology RV516 [Radeon X1300/X1550 Series]
	Device tree node: /sys/firmware/devicetree/base/pciex@600c3c0000000/pci@0/vga@0
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 42
	NUMA node: 0
	IOMMU group: 0
	Region 0: Memory at 6000000000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at 600c000020000 (64-bit, non-prefetchable) [size=64K]
	Region 4: I/O ports at <unassigned> [disabled]
	Expansion ROM at 600c000000000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset- SlotPowerLimit 0.000W
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (ok), Width x16 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Kernel driver in use: radeon
Comment 1 Erhard F. 2022-03-02 23:07:28 UTC
Created attachment 300521
kernel dmesg (kernel 5.17-rc5, CONFIG_DRM_RADEON=y, Talos II)
Comment 2 Erhard F. 2022-03-02 23:07:59 UTC
Created attachment 300522
kernel .config (kernel 5.17-rc5, CONFIG_DRM_RADEON=m, Talos II)
Comment 3 Alex Deucher 2022-03-03 02:38:41 UTC
If you are using an initrd, the firmware must be present on the initrd for the driver to find it when it loads.  Please make sure the firmware is available in your initrd.
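
For setups that do use an initrd, one way to check this is to list the image contents and look for the firmware path. The following is a minimal sketch, assuming either dracut's lsinitrd or initramfs-tools' lsinitramfs is installed; the function name initrd_has_firmware is hypothetical and the image paths in the comments are examples that vary by distribution.

```shell
#!/bin/sh
# Succeed if the initrd file listing read on stdin contains the given
# firmware path under lib/firmware. initrd_has_firmware is a hypothetical
# helper name.
# $1 = firmware name, e.g. radeon/R520_cp.bin
initrd_has_firmware() {
    grep -qF "lib/firmware/$1"
}

# Possible usage (image paths are examples and differ between distributions):
#   lsinitrd /boot/initramfs-$(uname -r).img | initrd_has_firmware radeon/R520_cp.bin
#   lsinitramfs /boot/initrd.img-$(uname -r) | initrd_has_firmware radeon/R520_cp.bin
```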
Comment 4 Erhard F. 2022-03-03 10:33:08 UTC
No, I don't use an initrd. The kernel that boots the Talos is on a boot partition, and the modules are loaded from the root partition. This worked in 5.16 and before. Now I am getting this "could not insert 'radeon': Unknown symbol in module".

This may explain the second error, though, which I got when building radeon statically into the kernel. I'll build the firmware into the kernel and see if the error message changes and whether it actually works then.
Comment 5 Erhard F. 2022-03-03 16:33:39 UTC
Ok, changed my config to include the firmware via
CONFIG_EXTRA_FIRMWARE="radeon/R520_cp.bin"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"

With CONFIG_DRM_RADEON=y the machine boots now as expected with no radeon error messages.
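
Before rebuilding, it can be worth confirming that every file listed in CONFIG_EXTRA_FIRMWARE actually exists under CONFIG_EXTRA_FIRMWARE_DIR. A minimal sketch follows; the function name check_extra_firmware is hypothetical, and falling back to /lib/firmware when the directory option is absent is an assumption about the Kconfig default.

```shell
#!/bin/sh
# Check that every file named in CONFIG_EXTRA_FIRMWARE exists under
# CONFIG_EXTRA_FIRMWARE_DIR. check_extra_firmware is a hypothetical helper.
# $1 = path to a kernel .config
check_extra_firmware() {
    config="$1"
    files=$(sed -n 's/^CONFIG_EXTRA_FIRMWARE="\(.*\)"$/\1/p' "$config")
    dir=$(sed -n 's/^CONFIG_EXTRA_FIRMWARE_DIR="\(.*\)"$/\1/p' "$config")
    # Assumption: /lib/firmware is used when the directory option is unset.
    dir="${dir:-/lib/firmware}"
    status=0
    # CONFIG_EXTRA_FIRMWARE is a space-separated list, so rely on word splitting.
    for f in $files; do
        if [ -f "$dir/$f" ]; then
            echo "ok: $dir/$f"
        else
            echo "missing: $dir/$f"
            status=1
        fi
    done
    return "$status"
}

# Possible usage from the top of a kernel source tree:
#   check_extra_firmware .config
```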

With CONFIG_DRM_RADEON=m radeon does not load, and I still get "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)" when trying to load it manually via modprobe -v radeon.
Comment 6 Alex Deucher 2022-03-03 17:49:08 UTC
(In reply to Erhard F. from comment #5)
> 
> With CONFIG_DRM_RADEON=m radeon does not load and I still get "modprobe:
> ERROR: could not insert 'radeon': Unknown symbol in module, or unknown
> parameter (see dmesg)" trying to load it manually via modprobe -v radeon.

You need to make sure the firmware is in your initrd.  When the kernel loads, it loads from the initrd.  There is no filesystem mounted yet when the radeon driver is loaded so the firmwares need to be in the initrd.
Comment 7 Erhard F. 2022-03-03 18:21:49 UTC
(In reply to Alex Deucher from comment #6)
> You need to make sure the firmware is in your initrd.  When the kernel
> loads, it loads from the initrd.  There is no filesystem mounted yet when
> the radeon driver is loaded so the firmwares need to be in the initrd.
As I said, I am not using an initrd, and this config worked on kernels before 5.17-rc. So something must have changed, since I now get "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)". I'll try a bisect and see if I can dig out more.
Comment 8 Erhard F. 2022-03-10 13:19:29 UTC
Created attachment 300550
kernel dmesg (kernel 5.17-rc7, CONFIG_DRM_RADEON=m, Talos II)

It seems this issue is already fixed in -rc7.

v5.17-rc7 boots on the Talos II again with radeon drm loaded from disk without an initrd or firmware being built in.

Out of curiosity, I'll do a bisect next week anyway to find out which commit fixed the issue.
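
Searching for the commit that fixed a problem, rather than the one that introduced it, can be done with git bisect's custom terms. A minimal sketch follows, assuming a kernel git tree in which v5.17-rc5 shows the failure and v5.17-rc7 does not; the function name start_reverse_bisect and the term names "broken" and "fixed" are arbitrary choices.

```shell
#!/bin/sh
# Start a "reverse" bisect in the current git tree, looking for the first
# commit where the problem is gone. start_reverse_bisect is a hypothetical
# helper name.
# $1 = revision known to show the problem
# $2 = revision known to work
start_reverse_bisect() {
    git bisect start --term-old=broken --term-new=fixed &&
    git bisect broken "$1" &&
    git bisect fixed "$2"
}

# Possible usage in a kernel tree:
#   start_reverse_bisect v5.17-rc5 v5.17-rc7
# Then build and boot each candidate and mark it with one of:
#   git bisect broken   # modprobe radeon still fails
#   git bisect fixed    # modprobe radeon works
# When finished, return to the original checkout with:
#   git bisect reset
```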

But feel free to close here if it is not appropriate to hold this bug open any longer.
Comment 9 Alex Deucher 2022-03-10 14:35:51 UTC
Only the person that filed the bug can close it.  If it's fixed for you, please close it.  Thanks!
Comment 10 Erhard F. 2022-03-18 01:08:35 UTC
I did not get a meaningful result out of my reverse bisect... But v5.17.0-rc7 and v5.17.0-rc8 do not show this issue.

So closing here.
