Bug 212619

Summary: Linux Kernel build bug with AMD_IOMMU_V2=M and HSA_AMD=Y
Product: Other
Component: Modules
Status: NEW
Severity: normal
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Reporter: deference
Assignee: other_modules
CC: smf-linux
Attachments: My .config in case you can't reproduce.

Description deference 2021-04-08 23:04:17 UTC
Created attachment 296301
My .config in case you can't reproduce.

When compiling Linux 5.11.12 with the AMDGPU GPU driver and HSA_AMD
enabled, I get the errors below. As a workaround, I have to change
AMD_IOMMU_V2 from M to Y. Linux 5.6 is not affected, because it requires
AMD_IOMMU_V2 to be Y when HSA_AMD is enabled.
I would bisect and ask for the relevant patch to be reverted, but it is
possible that the kernel is meant to build in this configuration, in which
case a fix, not a revert, is what should be issued.
I'm attaching my kernel config for 5.11.

Thanks,
David

drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: In function `kfd_iommu_bind_process_to_device':
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:120: undefined reference to `amd_iommu_bind_pasid'
drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: In function `kfd_iommu_unbind_process':
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:138: undefined reference to `amd_iommu_unbind_pasid'
drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: In function `kfd_iommu_suspend':
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:292: undefined reference to `amd_iommu_set_invalidate_ctx_cb'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:293: undefined reference to `amd_iommu_set_invalid_ppr_cb'
drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: In function `kfd_iommu_resume':
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:312: undefined reference to `amd_iommu_init_device'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:316: undefined reference to `amd_iommu_set_invalidate_ctx_cb'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:318: undefined reference to `amd_iommu_set_invalid_ppr_cb'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:323: undefined reference to `amd_iommu_set_invalidate_ctx_cb'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:324: undefined reference to `amd_iommu_set_invalid_ppr_cb'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:325: undefined reference to `amd_iommu_free_device'
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:232: undefined reference to `amd_iommu_bind_pasid'
drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: In function `kfd_iommu_suspend':
/root/working/linux-5.11.12/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_iommu.c:294: undefined reference to `amd_iommu_free_device'
Makefile:1166: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1
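
To check whether a given .config has the combination described in this report, a small script along these lines might help. It is only a sketch: the function names are made up for illustration, and the DRM_AMDGPU=y condition is inferred from the log (kfd_iommu.o is being linked into vmlinux), not stated by the reporter.

```python
import re

def parse_config(text):
    """Return {symbol: value} for lines like CONFIG_FOO=y in a kernel .config."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"^CONFIG_([A-Za-z0-9_]+)=(.*)$", line.strip())
        if match:
            values[match.group(1)] = match.group(2)
    return values

def has_builtin_kfd_with_modular_iommu(config):
    """True when amdgpu and HSA_AMD are built in but AMD_IOMMU_V2 is a module."""
    return (
        config.get("DRM_AMDGPU") == "y"   # assumption inferred from the link log
        and config.get("HSA_AMD") == "y"
        and config.get("AMD_IOMMU_V2") == "m"
    )

# Hypothetical excerpt of a .config; "is not set" lines are ignored.
sample = """\
CONFIG_DRM_AMDGPU=y
CONFIG_HSA_AMD=y
CONFIG_AMD_IOMMU_V2=m
# CONFIG_FOO is not set
"""

print(has_builtin_kfd_with_modular_iommu(parse_config(sample)))  # True
```

To check a real tree, pass the contents of the .config file to parse_config() instead of the sample string.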
Comment 1 Stuart Foster 2021-05-03 09:18:58 UTC
Just built 5.12.1 and got:

+ ld -m elf_x86_64 --emit-relocs --discard-none -z max-page-size=0x200000 --build-id=sha1 --orphan-handling=warn --strip-debug -o .tmp_vmlinux.kallsyms1 -T ./arch/x86/kernel/vmlinux.lds --whole-archive arch/x86/kernel/head_64.o arch/x86/kernel/head64.o arch/x86/kernel/ebda.o arch/x86/kernel/platform-quirks.o init/built-in.a usr/built-in.a arch/x86/built-in.a kernel/built-in.a certs/built-in.a mm/built-in.a fs/built-in.a ipc/built-in.a security/built-in.a crypto/built-in.a block/built-in.a lib/built-in.a arch/x86/lib/built-in.a lib/lib.a arch/x86/lib/lib.a drivers/built-in.a sound/built-in.a net/built-in.a virt/built-in.a arch/x86/pci/built-in.a arch/x86/power/built-in.a arch/x86/video/built-in.a --no-whole-archive --start-group --end-group
ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
kfd_iommu.c:(.text+0x27d): undefined reference to `amd_iommu_bind_pasid'
ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
kfd_iommu.c:(.text+0x2de): undefined reference to `amd_iommu_unbind_pasid'
ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
kfd_iommu.c:(.text+0x3a3): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
ld: kfd_iommu.c:(.text+0x3ae): undefined reference to `amd_iommu_set_invalid_ppr_cb'
ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
kfd_iommu.c:(.text+0x40b): undefined reference to `amd_iommu_init_device'
ld: kfd_iommu.c:(.text+0x429): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
ld: kfd_iommu.c:(.text+0x439): undefined reference to `amd_iommu_set_invalid_ppr_cb'
ld: kfd_iommu.c:(.text+0x4aa): undefined reference to `amd_iommu_bind_pasid'
ld: kfd_iommu.c:(.text+0x50c): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
ld: kfd_iommu.c:(.text+0x517): undefined reference to `amd_iommu_set_invalid_ppr_cb'
ld: kfd_iommu.c:(.text+0x520): undefined reference to `amd_iommu_free_device'
ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
kfd_iommu.c:(.text+0x3bf): undefined reference to `amd_iommu_free_device'

This looks like the same issue.
Comment 2 Stuart Foster 2021-05-03 09:46:29 UTC
Found this patch, which seems to fix things:

---
 drivers/gpu/drm/amd/amdkfd/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
index f02c938f75da..91f85dfb7ba6 100644
--- a/drivers/gpu/drm/amd/amdkfd/Kconfig
+++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
@@ -5,8 +5,9 @@
 
 config HSA_AMD
 	bool "HSA kernel driver for AMD GPU devices"
-	depends on DRM_AMDGPU && (X86_64 || ARM64 || PPC64)
-	imply AMD_IOMMU_V2 if X86_64
+	depends on DRM_AMDGPU && ((X86_64 && IOMMU_SUPPORT && ACPI) || ARM64 || PPC64)
+	select AMD_IOMMU if X86_64
+	select AMD_IOMMU_V2 if X86_64
 	select HMM_MIRROR
 	select MMU_NOTIFIER
 	select DRM_AMDGPU_USERPTR
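
For context on why swapping `imply` for `select` would change the outcome, here is a rough Python model of the two keywords as the Kconfig documentation describes them. It is a simplification that ignores `depends on` and the `if X86_64` condition, and the function names are invented for this sketch; it is not how Kconfig is implemented.

```python
# Kconfig tristate values, ordered n < m < y.
ORDER = {"n": 0, "m": 1, "y": 2}

def with_select(user_choice, selector_value):
    """select: the selector's value is a lower limit for the selected symbol."""
    return max(user_choice, selector_value, key=ORDER.__getitem__)

def with_imply(user_choice, implier_value):
    """imply: the implier only supplies a default; an explicit choice wins."""
    return implier_value if user_choice is None else user_choice

# HSA_AMD is a bool set to y; the user asks for AMD_IOMMU_V2=m.
print(with_select("m", "y"))  # y: the module choice is raised to built-in
print(with_imply("m", "y"))   # m: the user's choice stands
print(with_imply(None, "y"))  # y: used only as the default
```

Under this model, the `imply` line leaves AMD_IOMMU_V2=m possible while HSA_AMD=y, which is the combination in this report; the `select` lines in the patch rule it out.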
Comment 3 deference 2021-05-04 16:19:53 UTC
Originally, a bit of code just like the above was used. It was later removed. Therefore, the question still remains whether the kernel should be able to build with the HSA driver enabled and AMD_IOMMU_V2 built as a module.
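
As background to that question: built-in code cannot link against symbols that only exist in a module, which is what the undefined references above show. The kernel has an IS_REACHABLE() macro in include/linux/kconfig.h for this kind of check. Below is a minimal Python sketch of that rule; whether the driver should rely on it here is exactly the open question, and the function is illustrative only.

```python
def is_reachable(provider, consumer):
    """Mimic the rule behind IS_REACHABLE(): built-in code can call only
    built-in providers, while modular code can call built-in or modular ones.

    provider: tristate of the option providing the symbols ("y", "m" or "n")
    consumer: "y" if the calling code is built in, "m" if it is a module
    """
    if provider == "y":
        return True
    return provider == "m" and consumer == "m"

# The case in this report: amdgpu built in, AMD_IOMMU_V2 as a module.
print(is_reachable("m", "y"))  # False
# The reporter's workaround: AMD_IOMMU_V2 built in.
print(is_reachable("y", "y"))  # True
```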