Bug 194559
| Summary: | amdgpu problems loading 2 firmwares on multi-smp system | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Janpieter Sollie (janpieter.sollie) |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | RESOLVED PATCH_ALREADY_AVAILABLE | | |
| Severity: | normal | CC: | deathsimple, fin4478, janpieter.sollie |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| See Also: | https://bugzilla.kernel.org/show_bug.cgi?id=194731 https://bugzilla.kernel.org/show_bug.cgi?id=194899 | | |
| Kernel Version: | 4.9.9 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | dmesg.txt, lspci.txt and .config; config of working drm-next kernel | | |
Stock kernels have very little amdgpu code; see kernel.org and click diff. Use the command:

git clone -b drm-next-4.11-wip git://people.freedesktop.org/~agd5f/linux

The kernel configuration files of the official Debian kernels are available in /boot, named after the kernel release. Copy the .config file to the linux directory. Connect all your devices and run the command: make localmodconfig. You can also use make defconfig to create an initial .config file.

Run make xconfig and check that you have enabled: Reroute Broken IRQ, Virtualization KVM and 300Hz CPU timer. I also disabled Swap, Kernel Debug, CPU Freq scaling and CPU handling in ACPI, and used the BIOS to control the CPU and devices. Under drivers->graphics->amdgpu, enable CIK support for a GCN 1.1 GPU and SI support for a GCN 1.0 GPU.

Create the Debian kernel package:

export CONCURRENCY_LEVEL=4
fakeroot make-kpkg --initrd kernel_image

Install the kernel package with Gdebi. To make a custom kernel boot, add this line to /etc/initramfs-tools/modules:

unix

And run:

sudo update-initramfs -u

Reboot.

Dear fin4478, thank you for the tips. I will try them ASAP, but I am confused: I do not use Debian, and the system is headless (running as an OpenCL accelerator). Does this matter to you?

It works! Attached is my config file for your drm-next kernel. I don't know what needs to be done for you developers to integrate drm-next into the mainline kernel, but thank you!!!

Created attachment 254741
config of working drm-next kernel
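The advice above about enabling CIK and SI support can be checked against a .config file before building. The following is a minimal sketch, not part of the thread: it assumes the relevant Kconfig symbols are CONFIG_DRM_AMDGPU, CONFIG_DRM_AMDGPU_SI and CONFIG_DRM_AMDGPU_CIK, and the helper names parse_kconfig and missing_options are hypothetical.

```python
import re

# Assumption: these are the Kconfig symbols behind the amdgpu driver and
# its SI (GCN 1.0) and CIK (GCN 1.1) support in the kernel tree being built.
REQUIRED = ("CONFIG_DRM_AMDGPU", "CONFIG_DRM_AMDGPU_SI", "CONFIG_DRM_AMDGPU_CIK")


def parse_kconfig(text):
    """Return {symbol: value} for the contents of a kernel .config file.

    Lines of the form '# CONFIG_X is not set' are recorded as 'n'.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if match:
            options[match.group(1)] = "n"
    return options


def missing_options(text, required=REQUIRED):
    """List the required symbols that are neither built in (y) nor modular (m)."""
    options = parse_kconfig(text)
    return [name for name in required if options.get(name, "n") not in ("y", "m")]
```

To use it, pass the text of the .config in the kernel source directory, for example missing_options(open(".config").read()). An empty list means all three symbols are enabled.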
(In reply to Janpieter Sollie from comment #3)
> it works! attached my config file of your drm-next kernel
> I don't know what needs to be done for you developers to integrate drm-next
> into the mainline kernel, but thank you!!!

AMD should warn against using stock kernels and explain how to use the ~agd5f wip kernel and the latest Mesa git. Here is the page for you, dear AMD: http://support.amd.com/en-us/download/linux

This and many other amdgpu bug reports prove my point.

(In reply to fin4478 from comment #5)
> This and many other amdgpu bug reports prove my point.

Your bug report comments like this one rather indicate that you don't understand how the kernel development process works.

(In reply to Michel Dänzer from comment #6)
> (In reply to fin4478 from comment #5)
> > This and many other amdgpu bug reports prove my point.
>
> Your bug report comments like this one rather indicate that you don't
> understand how the kernel development process works.

You do not see how the agd5f wip kernel solved this and many other problems. AMD should warn against using stock kernels and explain how to use the ~agd5f wip kernel and the latest Mesa git. Here is the page for you, dear AMD: http://support.amd.com/en-us/download/linux

You clearly want a bad reputation for AMD GPUs, so I will stop giving this info.

(In reply to fin4478 from comment #7)
> You clearly want bad reputation for Amd gpus so I stop giving this info.

Well, as an AMD employee I can only advise you to stop giving incorrect information. Alex's branches only contain additional features that are not upstream yet, so they are far less stable than the upstream kernel driver.

Additional comment: it works on 4.10-rc8, so the necessary patch is already integrated. Thank you, kernel developers!
Created attachment 254705
dmesg.txt, lspci.txt and .config

System: opteron 2876 (*2), 128GB RAM, x86_64.
VGA1: SI, Cape Verde Pro.
VGA2: R9 Nano.
VGA3: onboard mgag200.
Distribution: Gentoo.
Kernel: vanilla-sources-4.9.9.
Kernel loader: lilo 24.
Firmware: linux-firmware-20170126.

Bug: loading amdgpu on this system causes a reboot. Even with panic disabled, the kernel does not wait for me to reset the system. The issue also occurs when booting the system with init=/bin/bash and then running modprobe amdgpu.

Workarounds:
1) removing /lib/firmware/radeon.
2) removing /lib/firmware/amdgpu.
3) booting the kernel with nosmp.
Each of these workarounds avoids the reboot, but leaves hardware uninitialized.

Tried without success:
1) using the radeon module instead of amdgpu.
2) using amdgpu-pro.
3) using a different kernel version (4.4.39).
4) booting with iommu=soft.
5) first loading drm (works), then loading amdgpu (crashes).

I suspect a thread-safety bug in the kernel's drm code.

In attachment:
1) dmesg of the nosmp boot.
2) lspci output.
3) config of the current 4.9.9 kernel.
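Since two of the reported workarounds depend on which firmware directories are installed, it can help to record that state alongside each test boot. The sketch below is illustrative only: the function names firmware_inventory and both_sets_present are hypothetical, and the /lib/firmware/radeon and /lib/firmware/amdgpu paths are taken from the report above.

```python
from pathlib import Path


def firmware_inventory(root="/lib/firmware", subdirs=("radeon", "amdgpu")):
    """Map each firmware subdirectory to the sorted file names found in it.

    A subdirectory that does not exist is reported as an empty list.
    """
    inventory = {}
    for name in subdirs:
        directory = Path(root) / name
        if directory.is_dir():
            inventory[name] = sorted(p.name for p in directory.iterdir() if p.is_file())
        else:
            inventory[name] = []
    return inventory


def both_sets_present(inventory):
    """True when every listed firmware directory contains at least one file.

    In the report, this is the state in which loading amdgpu triggered the reboot.
    """
    return all(len(files) > 0 for files in inventory.values())
```

Calling firmware_inventory() with no arguments inspects the live system, while passing a different root makes it possible to check a chroot or an unpacked initramfs.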