Bug 205521 - 5.3.11 update broke AMDGPU Raven Ridge
Summary: 5.3.11 update broke AMDGPU Raven Ridge
Status: RESOLVED UNREPRODUCIBLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 blocking
Assignee: drivers_video-dri
URL: https://bugzilla.redhat.com/show_bug....
Keywords:
Depends on:
Blocks:
 
Reported: 2019-11-14 06:15 UTC by Luya Tshimbalanga
Modified: 2019-11-15 16:07 UTC

See Also:
Kernel Version: 5.3.11 and up
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg reporting broken amd raven ridge firmware (220.95 KB, text/plain)
2019-11-14 06:15 UTC, Luya Tshimbalanga
dmesg with the latest git snapshot (322.92 KB, text/plain)
2019-11-14 06:16 UTC, Luya Tshimbalanga

Description Luya Tshimbalanga 2019-11-14 06:15:28 UTC
Created attachment 285903 [details]
dmesg reporting broken amd raven ridge firmware

AMD Raven Ridge firmware is currently broken with the recent stable kernel release, resulting in a blank screen on boot and preventing the system from reaching the login screen in either graphical or text mode.

Extract from boot:

nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Direct firmware load for amdgpu/raven_gpu_info.bin failed with error -2
nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Failed to load gpu_info firmware "amdgpu/raven_gpu_info.bin"
nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Fatal error during GPU init
Comment 1 Luya Tshimbalanga 2019-11-14 06:16:43 UTC
Created attachment 285905 [details]
dmesg with the latest git snapshot

A recent kernel git snapshot is also affected.
Comment 2 Luya Tshimbalanga 2019-11-14 07:40:03 UTC
Added a similar report from freedesktop.org.
Comment 3 Alex Deucher 2019-11-14 13:56:16 UTC
(In reply to Luya Tshimbalanga from comment #0)
> 
> nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Direct firmware load for
> amdgpu/raven_gpu_info.bin failed with error -2
> nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Failed to load gpu_info
> firmware "amdgpu/raven_gpu_info.bin"
> nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Fatal error during GPU init

The kernel is not able to find the firmware image.  If you are using an initrd, please make sure to include the firmware in the initrd.  If you are building the driver into the kernel, you need to build the firmware into the kernel as well.
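
To see which firmware files the kernel could not find, the log can be scanned for the loader message quoted above. The following is only a minimal sketch in Python, assuming the message format matches the dmesg extract in this report; `missing_firmware` is a hypothetical helper name, not part of any tool. Error -2 corresponds to ENOENT (no such file or directory), which is why the firmware file is reported as not found.

```python
import errno
import re

# Pattern for the firmware loader message as it appears in this report's dmesg.
# Assumption: other kernels may word this message differently.
_FW_FAIL = re.compile(r"Direct firmware load for (\S+) failed with error (-?\d+)")


def missing_firmware(log_text):
    """Return (firmware path, errno name) pairs found in kernel log text."""
    results = []
    for match in _FW_FAIL.finditer(log_text):
        path = match.group(1)
        code = abs(int(match.group(2)))
        # Map the numeric code to its symbolic name, e.g. 2 -> "ENOENT".
        results.append((path, errno.errorcode.get(code, str(code))))
    return results


# Log line copied from the extract in this report.
sample = (
    "nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Direct firmware load for "
    "amdgpu/raven_gpu_info.bin failed with error -2\n"
)
print(missing_firmware(sample))
```

Any path reported this way is a file to look for in the initrd, or in the kernel image if the driver is built in.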
Comment 4 Luya Tshimbalanga 2019-11-14 16:58:15 UTC
(In reply to Alex Deucher from comment #3)
> (In reply to Luya Tshimbalanga from comment #0)
> > 
> > nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Direct firmware load for
> > amdgpu/raven_gpu_info.bin failed with error -2
> > nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Failed to load gpu_info
> > firmware "amdgpu/raven_gpu_info.bin"
> > nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Fatal error during GPU init
> 
> The kernel is not able to find the firmware image.  If you are using an
> initrd, please make sure to include the firmware in the initrd.  If you
> are building the driver into the kernel, you need to build the firmware into
> the kernel as well.

It is a Fedora kernel. I don't know how that happened with a simple update and I included the dmesg for investigation. I linked the Fedora bug report as well for reference.
Comment 5 Luya Tshimbalanga 2019-11-15 16:07:49 UTC
I am closing this report for now as I reinstalled the system. The update proceeded normally with the result:

sudo lsinitrd /boot/initramfs-5.3.11-300.fc31.x86_64.img | grep raven
-rw-r--r--   2 root     root        86528 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_asd.bin
-rw-r--r--   1 root     root         9344 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_ce.bin
-rw-r--r--   1 root     root          316 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_gpu_info.bin
-rw-r--r--   1 root     root        17536 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_me.bin
-rw-r--r--   2 root     root       268048 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_mec2.bin
-rw-r--r--   2 root     root            0 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_mec.bin
-rw-r--r--   1 root     root        21632 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_pfp.bin
-rw-r--r--   1 root     root        38324 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_rlc.bin
-rw-r--r--   1 root     root        17408 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_sdma.bin
-rw-r--r--   1 root     root       343456 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_vcn.bin
-rw-r--r--   1 root     root        78336 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_asd.bin
-rw-r--r--   1 root     root         9344 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_ce.bin
-rw-r--r--   1 root     root        23152 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_dmcu.bin
-rw-r--r--   2 root     root          316 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_gpu_info.bin
-rw-r--r--   1 root     root        39084 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_kicker_rlc.bin
-rw-r--r--   1 root     root        17536 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_me.bin
-rw-r--r--   2 root     root       268048 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_mec2.bin
-rw-r--r--   2 root     root            0 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_mec.bin
-rw-r--r--   1 root     root        21632 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_pfp.bin
-rw-r--r--   1 root     root        39084 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_rlc.bin
-rw-r--r--   2 root     root        17408 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_sdma.bin
-rw-r--r--   2 root     root       341728 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_vcn.bin

It appears dracut somehow failed to include the firmware in the initramfs before the failure. I can no longer reproduce the problem after the reinstall.
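
For anyone hitting the same symptom, the check above can be automated against the `lsinitrd` output. This is a rough sketch in Python, assuming the listing format shown in this comment and that firmware lives under `usr/lib/firmware/` inside the image, as it does here; `firmware_in_listing` is a hypothetical helper name.

```python
def firmware_in_listing(listing, firmware):
    """Return True if an lsinitrd-style listing contains the firmware path."""
    # Assumption: firmware is stored under usr/lib/firmware/ in the image,
    # as in the listing from this report.
    wanted = "usr/lib/firmware/" + firmware
    for line in listing.splitlines():
        fields = line.split()
        # The path is the last whitespace-separated field of each entry.
        if fields and fields[-1] == wanted:
            return True
    return False


# Two entries copied from the listing in this report.
listing = (
    "-rw-r--r--   1 root     root          316 Jul 24 15:24 "
    "usr/lib/firmware/amdgpu/raven2_gpu_info.bin\n"
    "-rw-r--r--   2 root     root          316 Jul 24 15:24 "
    "usr/lib/firmware/amdgpu/raven_gpu_info.bin\n"
)
print(firmware_in_listing(listing, "amdgpu/raven_gpu_info.bin"))
```

The comparison is on the full path rather than a substring, so an entry such as `raven2_gpu_info.bin` is not mistaken for `raven_gpu_info.bin`. The function deliberately ignores the size column, since the listing shows some entries with a size of 0.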
