Bug 89661
| Summary: | Kernel panic when trying use amdkfd driver on Kaveri | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Bernd Steinhauser (linux) |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | normal | CC: | oded.gabbay |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 3.18.0 + drm-next branch | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Attachments:
Picture of the kernel panic output
Print errors in case of NULL pointers and don't dereference them
More checks on pointers being used
workaround for the module order problem
hacky workaround for module order problem
Description
Bernd Steinhauser
2014-12-13 09:32:19 UTC
Created attachment 160441 [details]
Picture of the kernel panic output
Does it also happen with CONFIG_HSA_AMD=m?

Created attachment 160721 [details]
Print errors in case of NULL pointers and don't dereference them
Hi,
Please try the attached patch.

(In reply to Michel Dänzer from comment #2)
> Does it also happen with CONFIG_HSA_AMD=m?
Only tried CONFIG_HSA_AMD=n, not module, but this happens so early that I'm confident it does not matter.
Will try the patch, thanks.

Tried the patch, exactly the same result.

Created attachment 160751 [details]
More checks on pointers being used
Hi,
Three things, please:
1. Please try the attached patch. It tries to verify more pointers before using them.
2. You said CONFIG_HSA_AMD=y. What's the value of CONFIG_DRM_RADEON? If it's "m", could you change it to "y"?
3. I would still like to ask if you could check with the following config:
CONFIG_DRM_RADEON="m"
CONFIG_HSA_AMD="m"
Thanks
Oded

One more thing: I'm trying to understand the exact tree you are using so that we look at the same code. Did you just take drm-next, or did you manually merge between trees? If you did a manual merge, could you try just taking drm-next instead? It's already based on 3.18.0.

Hi,
So I managed to recreate the bug on my setup. This is happening because you compiled all the modules into the kernel. I need to address that, but for now, if you compile them as "m", everything is supposed to work.

Hm, ok. So should I still try the steps above? Because trying to use drm_radeon as a module would require me to do some testing with that setup first.

(In reply to Oded Gabbay from comment #8)
> 2. You said CONFIG_HSA_AMD=y. What's the value of CONFIG_DRM_RADEON? If it's
> "m", could you change it to "y"?
I'm using a static initrd (only a basic system, which doesn't contain any kernel modules), so all drivers necessary to start the system (including drm_radeon) are compiled in.
Regarding the tree:
I took plain 3.18 (b2776b) and then merged the drm-next branch from the repo mentioned above. IIRC, it was a fast-forward.

As I said, there is definitely a bug when compiling both radeon and amdkfd into the kernel. I'm working on fixing it, but that could take a few days. In the meantime, the only way to make it work without touching the code is to compile either both drivers as modules or just radeon as a module.
No need for further experiments.

Created attachment 160951 [details]
workaround for the module order problem
I attached a new patch which should solve the problem for you when compiling all the drivers into the kernel image. This is a hacky workaround, so it is not the final solution, but I hope it will help you continue with your setup.

Created attachment 160961 [details]
hacky workaround for module order problem
Thanks, I'll give it a try.

Ok, it does now boot and seems to work.

At some point (didn't have a closer look), this was fixed and does now work as expected without workarounds.
(Tested: 4.5.1)
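
For reference, the interim configuration suggested in the thread (both drivers built as modules) would look roughly like the fragment below in a kernel .config. The option names are taken from the comments above; .config itself writes tristate values without quotes. This is only a sketch of the temporary workaround, not the eventual fix, and it assumes a setup that can load modules at boot, which the reporter's static initrd did not provide.

```
CONFIG_DRM_RADEON=m
CONFIG_HSA_AMD=m
```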