# Context

I'm using Xubuntu 20.04

I compiled Kernel 5.18.11+ myself (shows bug)
I compiled Kernel 5.13.7+ myself (does not show bug)

My GPU is AMD Radeon 6800 XT 16GB, I don't have an iGPU (CPU is Ryzen 5900X)

Mesa is:
OpenGL renderer string: AMD Radeon RX 6800 XT (sienna_cichlid, LLVM 14.0.1, DRM 3.46, 5.18.11+)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 22.0.5 - kisak-mesa PPA
OpenGL core profile shading language version string: 4.60

# Steps to reproduce

1. Turn on the PC
2. On *some* occasions X11 will crash, taking down the keyboard and leaving the computer in a seemingly frozen state while displaying the tty with the last info messages
3. As a workaround, I can log in via ssh and type `sudo service lightdm restart`; the X11 server then starts and everything works perfectly fine

# Diagnostic

It seems X11 doesn't wait for amdgpu to be up. This can be seen by checking /var/log/Xorg.0.log (attached):

[ 7.718] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[ 7.718] (II) FBDEV: driver for framebuffer: fbdev
[ 7.718] (II) VESA: driver for VESA chipsets: vesa
[ 7.718] (WW) xf86OpenConsole: setpgid failed: Operation not permitted
[ 7.718] (WW) xf86OpenConsole: setsid failed: Operation not permitted
[ 7.719] (EE) open /dev/dri/card0: No such file or directory
[ 7.719] (WW) Falling back to old probe method for modesetting
[ 7.719] (EE) open /dev/dri/card0: No such file or directory

Visually speaking, I *think* that X11 tries to init while the tty is still in VESA mode, before/during the switch to 1920x1080.

AFAIK, systemd is responsible for waiting until the GPU drivers are up.

Does anybody know where I should look? Does systemd need an update?

Could this be a libdrm issue? I currently have libdrm 2.4.110 installed in /usr/lib and libdrm 2.4.111, compiled from source, in /usr/local/lib.

I could try bisecting, but unfortunately the bug doesn't reproduce on every boot, which makes it hard to debug.

All of this has been working fine with Kernel 5.13.7+

Cheers
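For anyone comparing their own logs against the failing one, a rough sketch of a check for the error shown in the Diagnostic section. The function name `count_drm_open_failures` is made up for illustration, and the pattern only covers the exact `(EE) open /dev/dri/cardN: No such file or directory` message quoted above; other failure modes would not be counted.

```shell
# Hypothetical helper (not part of Xorg or any distro tooling): count how
# often Xorg failed to open a DRM device node in a given log file.
# Usage: count_drm_open_failures /var/log/Xorg.0.log
count_drm_open_failures() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function's exit status at 0.
    grep -c -E '\(EE\) open /dev/dri/card[0-9]+: No such file or directory' "$1" || true
}
```

A count above zero in a log from a frozen boot, and zero in a log from a good boot, would be consistent with the race described above, though it does not prove it.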
Created attachment 301479 Xorg log when it fails
Created attachment 301480 Xorg log when it succeeds
Maybe the driver or firmware is not available in your initrd so the driver can't be loaded during boot?
Thanks for the hint! Neither the amdgpu driver nor the firmware is in /boot/initrd.img-5.18.11+. I will have to look up how to include them in the initrd.

It may be worth mentioning, though, that they are not included in 5.13 (my custom build) or in Ubuntu's official kernels either.

I do wonder whether it previously worked by mere luck (i.e. the race condition was just much harder to trigger), or whether something changed so that whatever Ubuntu does to wait on amdgpu no longer waits.
Your hint is very good. It tells me upstream kernel devs expect the amdgpu driver & firmware to be in the initrd, while Ubuntu does not do that. This is starting to look more and more like an Ubuntu bug.

I looked further into the matter and found out that /lib/udev/rules.d/78-graphics-card.rules has entries for:

1. "drm": i915, radeon, nouveau, vmwgfx
2. "graphics": amdgpu, i915, radeon, nouveau, efifb, efi-framebuffer, vesa-framebuffer

I just edited the rules file to include amdgpu in both sections to see what happens. So far I have rebooted only once and Xorg didn't crash. I'll monitor how it goes, and if the crashes stop I'll close this ticket and report it to Ubuntu.
OK, today it happened again, so changing 78-graphics-card.rules did not fix it.

I just found this:
https://bbs.archlinux.org/viewtopic.php?id=260525

Which leads me to this:
https://github.com/sddm/sddm/issues/1316

Apparently SDDM was having the same issue and the "fix" was to add QThread::sleep(1);

Does the kernel have an interface to find out whether a GPU driver will be or is being loaded, and to get notified when it's done? I assumed there was one, but looking at those threads it appears there is not, and graphical initialization is basically just YOLO?
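As a point of comparison with the fixed `QThread::sleep(1)`, here is a rough sketch of a bounded wait that polls for the device node instead of sleeping blindly. This is polling, not a notification interface, so it does not answer the question above. The function name `wait_for_drm_node`, the default path and the 10 second default timeout are all assumptions for illustration, and it treats "the node exists" as a good enough signal that the driver is up, which may not hold in every case.

```shell
# Hypothetical helper, not an existing tool: wait until a DRM device node
# exists, or give up after a timeout.
# Usage: wait_for_drm_node [node] [timeout_seconds]
wait_for_drm_node() {
    node="${1:-/dev/dri/card0}"   # assumed default; the node from the Xorg log
    timeout="${2:-10}"            # assumed default, in seconds
    waited=0
    while [ ! -e "$node" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "gave up waiting for $node after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Something like `wait_for_drm_node /dev/dri/card0 15 && sudo service lightdm restart` could replace the manual ssh workaround, but it only hides the race rather than fixing it.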
Adding amdgpu to the initramfs seems to have worked around the problem; I have not experienced it since. I can also visibly see the boot process is slightly different (the splash becomes 1920x1080 a bit sooner).

If anyone is having the same issue, the workaround is (Ubuntu):

echo "amdgpu" | sudo tee --append /etc/initramfs-tools/modules
sudo update-initramfs -c -k $(uname -r)

If done properly, then running:

lsinitramfs /boot/initrd.img-$(uname -r) | grep amdgpu

should return multiple hits. Then reboot.

This ticket can be closed, but a new one should probably be created to track an interface that notifies when the kernel is done loading all video interfaces.
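One caveat with the `tee --append` step: running it twice adds the module name twice. A minimal sketch of a variant that only appends when the name is missing, assuming the same /etc/initramfs-tools/modules file as above. The function name `add_initramfs_module` is hypothetical, and it writes the file directly, so it needs to run as root for the real path.

```shell
# Hypothetical helper: add a module name to an initramfs-tools modules file
# only if it is not already listed on a line of its own.
# Usage (as root): add_initramfs_module amdgpu /etc/initramfs-tools/modules
add_initramfs_module() {
    module="$1"
    file="${2:-/etc/initramfs-tools/modules}"
    # -x matches whole lines, -F treats the name as a fixed string, -q is silent.
    if grep -qxF "$module" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$module" >> "$file"
}
```

The `update-initramfs` and `lsinitramfs` steps from the workaround still apply afterwards.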