Bug 208811
Summary: | AMDGPU on-load null pointer dereference | ||
---|---|---|---|
Product: | Drivers | Reporter: | R0b0t1 (sid) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | NEW --- | ||
Severity: | high | CC: | alexdeucher |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 5.8.0 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
kernel config
dmesg 5.8 APU dmesg 5.4.48 GPU dmesg 5.4.48 APU |
5.4.48 [ 77.383336] [drm] amdgpu kernel modesetting enabled. [ 77.383382] amdgpu 0000:03:00.0: remove_conflicting_pci_framebuffers: bar 0: 0xe0000000 -> 0xefffffff [ 77.383383] amdgpu 0000:03:00.0: remove_conflicting_pci_framebuffers: bar 2: 0xf0000000 -> 0xf01fffff [ 77.383384] amdgpu 0000:03:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfe700000 -> 0xfe77ffff [ 77.383385] checking generic (e0000000 7f0000) vs hw (e0000000 10000000) [ 77.383386] fb0: switching to amdgpudrmfb from EFI VGA [ 77.383521] Console: switching to colour dummy device 80x25 [ 77.383557] amdgpu 0000:03:00.0: vgaarb: deactivate vga console [ 77.383604] amdgpu 0000:03:00.0: enabling device (0006 -> 0007) [ 77.383777] [drm] initializing kernel modesetting (RAVEN 0x1002:0x15D8 0x1043:0x1B71 0xC1). [ 77.383882] [drm] register mmio base: 0xFE700000 [ 77.383883] [drm] register mmio size: 524288 [ 77.383898] [drm] add ip block number 0 <soc15_common> [ 77.383899] [drm] add ip block number 1 <gmc_v9_0> [ 77.383900] [drm] add ip block number 2 <vega10_ih> [ 77.383900] [drm] add ip block number 3 <psp> [ 77.383901] [drm] add ip block number 4 <gfx_v9_0> [ 77.383902] [drm] add ip block number 5 <sdma_v4_0> [ 77.383902] [drm] add ip block number 6 <powerplay> [ 77.383903] [drm] add ip block number 7 <dm> [ 77.383904] [drm] add ip block number 8 <vcn_v1_0> [ 77.383921] ATOM BIOS: 113-PICASSO-116 [ 77.383931] [drm] VCN decode is enabled in VM mode [ 77.383931] [drm] VCN encode is enabled in VM mode [ 77.383932] [drm] VCN jpeg decode is enabled in VM mode [ 77.383962] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit [ 77.383968] amdgpu 0000:03:00.0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used) [ 77.383970] amdgpu 0000:03:00.0: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF [ 77.383971] amdgpu 0000:03:00.0: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF [ 77.383974] [drm] Detected VRAM RAM=2048M, BAR=2048M [ 77.383975] [drm] RAM width 
128bits DDR4 [ 77.384061] [TTM] Zone kernel: Available graphics memory: 7134728 KiB [ 77.384062] [TTM] Zone dma32: Available graphics memory: 2097152 KiB [ 77.384062] [TTM] Initializing pool allocator [ 77.384065] [TTM] Initializing DMA pool allocator [ 77.384127] [drm] amdgpu: 2048M of VRAM memory ready [ 77.384129] [drm] amdgpu: 3072M of GTT memory ready. [ 77.384138] software IO TLB: Memory encryption is active and system is using DMA bounce buffers [ 77.384139] [drm] GART: num cpu pages 262144, num gpu pages 262144 [ 77.385195] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000). [ 77.393379] [drm] use_doorbell being set to: [true] [ 77.393452] amdgpu: [powerplay] hwmgr_sw_init smu backed is smu10_smu [ 77.397860] [drm] Found VCN firmware Version ENC: 1.9 DEC: 1 VEP: 0 Revision: 28 [ 77.397865] [drm] PSP loading VCN firmware [ 77.431850] [drm] reserve 0x400000 from 0xf47f800000 for PSP TMR [ 77.464806] [drm] psp command failed and response status is (0x34) [ 77.467806] [drm] failed to load ucode id (0) [ 77.467806] [drm] psp command failed and response status is (0x300F) [ 77.470808] [drm] failed to load ucode id (8) [ 77.470808] [drm] psp command failed and response status is (0x300F) [ 77.473807] [drm] failed to load ucode id (9) [ 77.473808] [drm] psp command failed and response status is (0x300F) [ 77.476807] [drm] failed to load ucode id (10) [ 77.476808] [drm] psp command failed and response status is (0x300F) [ 77.479809] [drm] failed to load ucode id (11) [ 77.479810] [drm] psp command failed and response status is (0x300F) [ 77.482808] [drm] failed to load ucode id (12) [ 77.482808] [drm] psp command failed and response status is (0x300F) [ 77.485807] [drm] failed to load ucode id (13) [ 77.485808] [drm] psp command failed and response status is (0x300F) [ 77.488799] [drm] failed to load ucode id (14) [ 77.488800] [drm] psp command failed and response status is (0x300F) [ 77.491807] [drm] failed to load ucode id (17) [ 77.491808] [drm] psp 
command failed and response status is (0xF) [ 77.494807] [drm] failed to load ucode id (18) [ 77.494807] [drm] psp command failed and response status is (0x300F) [ 77.497807] [drm] failed to load ucode id (19) [ 77.497807] [drm] psp command failed and response status is (0xF) [ 77.500808] [drm] failed to load ucode id (20) [ 77.500809] [drm] psp command failed and response status is (0x300F) [ 77.503807] [drm] failed to load ucode id (26) [ 77.503807] [drm] psp command failed and response status is (0x300F) [ 77.506807] [drm] failed to load ucode id (28) [ 77.506808] [drm] psp command failed and response status is (0xF) [ 77.509807] [drm] failed to load ucode id (29) [ 77.509808] [drm] psp command failed and response status is (0xF) [ 77.512201] amdgpu 0000:03:00.0: [gfxhub0] no-retry page fault (src_id:0 ring:221 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.512203] amdgpu 0000:03:00.0: in page starting at address 0x0000000000000000 from client 27 [ 77.512204] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000BBA [ 77.512205] amdgpu 0000:03:00.0: MORE_FAULTS: 0x0 [ 77.512206] amdgpu 0000:03:00.0: WALKER_ERROR: 0x5 [ 77.512207] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0xb [ 77.512207] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x1 [ 77.512208] amdgpu 0000:03:00.0: RW: 0x0 [ 77.512265] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.512266] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.512268] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.512269] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.512270] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.512271] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.512272] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.512273] amdgpu 0000:03:00.0: RW: 0x0 [ 77.514181] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.514183] amdgpu 
0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.514184] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.514185] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.514185] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.514186] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.514187] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.514188] amdgpu 0000:03:00.0: RW: 0x0 [ 77.515233] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.515235] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.515236] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.515237] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.515238] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.515238] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.515239] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.515240] amdgpu 0000:03:00.0: RW: 0x0 [ 77.516270] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.516272] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.516273] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.516273] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.516274] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.516275] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.516276] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.516276] amdgpu 0000:03:00.0: RW: 0x0 [ 77.517321] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.517322] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.517323] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.517324] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.517324] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.517325] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.517326] amdgpu 
0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.517327] amdgpu 0000:03:00.0: RW: 0x0 [ 77.518381] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.518382] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.518383] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.518384] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.518385] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.518385] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.518386] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.518387] amdgpu 0000:03:00.0: RW: 0x0 [ 77.519430] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.519432] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.519432] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.519433] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.519434] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.519435] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.519435] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.519436] amdgpu 0000:03:00.0: RW: 0x0 [ 77.520469] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.520471] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.520472] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.520472] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.520473] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.520474] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.520474] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.520475] amdgpu 0000:03:00.0: RW: 0x0 [ 77.521524] amdgpu 0000:03:00.0: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 77.521525] amdgpu 0000:03:00.0: in page starting at address 0x0000000000001000 from client 27 [ 77.521526] amdgpu 0000:03:00.0: 
VM_L2_PROTECTION_FAULT_STATUS:0x00000A91 [ 77.521527] amdgpu 0000:03:00.0: MORE_FAULTS: 0x1 [ 77.521527] amdgpu 0000:03:00.0: WALKER_ERROR: 0x0 [ 77.521528] amdgpu 0000:03:00.0: PERMISSION_FAULTS: 0x9 [ 77.521529] amdgpu 0000:03:00.0: MAPPING_ERROR: 0x0 [ 77.521529] amdgpu 0000:03:00.0: RW: 0x0 [ 77.749194] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110) [ 77.749272] [drm:gfx_v9_0_hw_init [amdgpu]] *ERROR* KCQ enable failed [ 77.749355] [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110 [ 77.749356] amdgpu 0000:03:00.0: amdgpu_device_ip_init failed [ 77.749358] amdgpu 0000:03:00.0: Fatal error during GPU init [ 77.749359] [drm] amdgpu: finishing device. [ 77.798015] ------------[ cut here ]------------ [ 77.798016] Memory manager not clean during takedown. [ 77.798030] WARNING: CPU: 2 PID: 2926 at drivers/gpu/drm/drm_mm.c:939 drm_mm_takedown+0x1e/0x30 [ 77.798031] Modules linked in: amdgpu(+) mfd_core gpu_sched ttm iwlmvm kvm_amd kvm uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev irqbypass ax88179_178a mc snd_hda_codec_realtek usbnet snd_hda_codec_hdmi iwlwifi efivarfs [ 77.798043] CPU: 2 PID: 2926 Comm: modprobe Not tainted 5.4.48-gentoo #2 [ 77.798044] Hardware name: ASUSTeK COMPUTER INC. 
ZenBook UX434DA_UM433DA/UX434DA, BIOS UX434DA_UM433DA.302 09/05/2019 [ 77.798045] RIP: 0010:drm_mm_takedown+0x1e/0x30 [ 77.798047] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 53 48 89 fb 48 83 c3 38 48 8b 03 48 39 c3 75 02 5b c3 48 c7 c7 50 ff 2e 9b e8 3b 28 99 ff <0f> 0b 5b c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 41 57 49 89 [ 77.798048] RSP: 0018:ffffa3f2c07d7980 EFLAGS: 00010282 [ 77.798049] RAX: 0000000000000000 RBX: ffff9d514a5ccc38 RCX: 0000000000000006 [ 77.798050] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff9d5150c964d0 [ 77.798051] RBP: ffff9d510a5a4f50 R08: 0000000000000001 R09: 00000000000003e5 [ 77.798051] R10: 0000000000014a10 R11: 0000000000000001 R12: ffff9d514a5ccc00 [ 77.798052] R13: 0000000000000000 R14: ffff9d510a5a50c0 R15: 0000000000000170 [ 77.798053] FS: 00007f8cb2a2b740(0000) GS:ffff9d5150c80000(0000) knlGS:0000000000000000 [ 77.798054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 77.798054] CR2: 00007ffe84b50ff0 CR3: 00008003c9eae000 CR4: 00000000003406e0 [ 77.798055] Call Trace: [ 77.798122] amdgpu_vram_mgr_fini+0x23/0x90 [amdgpu] [ 77.798128] ttm_bo_clean_mm+0xab/0xc0 [ttm] [ 77.798188] amdgpu_ttm_fini+0x6e/0xc0 [amdgpu] [ 77.798249] amdgpu_bo_fini+0xc/0x30 [amdgpu] [ 77.798314] gmc_v9_0_sw_fini+0x11a/0x180 [amdgpu] [ 77.798376] ? amdgpu_sa_bo_manager_fini+0x7a/0x90 [amdgpu] [ 77.798446] amdgpu_device_fini+0x24a/0x46f [amdgpu] [ 77.798505] amdgpu_driver_unload_kms+0x45/0x90 [amdgpu] [ 77.798575] amdgpu_driver_load_kms.cold+0x39/0x5b [amdgpu] [ 77.798577] drm_dev_register+0x10c/0x150 [ 77.798636] amdgpu_pci_probe+0xe9/0x150 [amdgpu] [ 77.798639] ? __pm_runtime_resume+0x54/0x70 [ 77.798642] local_pci_probe+0x3d/0x70 [ 77.798644] pci_device_probe+0xd0/0x150 [ 77.798647] really_probe+0xd9/0x2a0 [ 77.798648] driver_probe_device+0x4b/0xc0 [ 77.798649] device_driver_attach+0x4e/0x60 [ 77.798650] __driver_attach+0x4d/0xc0 [ 77.798651] ? 
device_driver_attach+0x60/0x60 [ 77.798654] bus_for_each_dev+0x75/0xc0 [ 77.798656] bus_add_driver+0x172/0x1c0 [ 77.798657] driver_register+0x68/0xc0 [ 77.798659] ? 0xffffffffc04a8000 [ 77.798661] do_one_initcall+0x44/0x1df [ 77.798664] ? _cond_resched+0x10/0x20 [ 77.798667] ? kmem_cache_alloc_trace+0x196/0x220 [ 77.798669] do_init_module+0x56/0x200 [ 77.798671] load_module+0x2380/0x2600 [ 77.798674] ? __do_sys_finit_module+0xc6/0xe0 [ 77.798675] __do_sys_finit_module+0xc6/0xe0 [ 77.798677] do_syscall_64+0x46/0x110 [ 77.798678] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 77.798680] RIP: 0033:0x7f8cb2b24789 [ 77.798682] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d7 06 0c 00 f7 d8 64 89 01 48 [ 77.798683] RSP: 002b:00007ffe84b54018 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 77.798684] RAX: ffffffffffffffda RBX: 000055e41a8b0a80 RCX: 00007f8cb2b24789 [ 77.798684] RDX: 0000000000000000 RSI: 000055e418c14390 RDI: 0000000000000006 [ 77.798685] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000 [ 77.798685] R10: 0000000000000006 R11: 0000000000000246 R12: 000055e418c14390 [ 77.798686] R13: 0000000000000000 R14: 000055e41a8b0b00 R15: 000055e41a8b0a80 [ 77.798687] ---[ end trace 3c1c3b84380fb311 ]--- [ 77.798697] [TTM] Finalizing pool allocator [ 77.798700] [TTM] Finalizing DMA pool allocator [ 77.798827] [TTM] Zone kernel: Used memory at exit: 1 KiB [ 77.798831] [TTM] Zone dma32: Used memory at exit: 0 KiB [ 77.798833] [drm] amdgpu: ttm finalized [ 77.798851] ------------[ cut here ]------------ [ 77.798852] sysfs group 'fw_version' not found for kobject '0000:03:00.0' [ 77.798860] WARNING: CPU: 2 PID: 2926 at fs/sysfs/group.c:278 sysfs_remove_group+0x70/0x80 [ 77.798860] Modules linked in: amdgpu(+) mfd_core gpu_sched ttm iwlmvm kvm_amd kvm uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev 
irqbypass ax88179_178a mc snd_hda_codec_realtek usbnet snd_hda_codec_hdmi iwlwifi efivarfs [ 77.798868] CPU: 2 PID: 2926 Comm: modprobe Tainted: G W 5.4.48-gentoo #2 [ 77.798869] Hardware name: ASUSTeK COMPUTER INC. ZenBook UX434DA_UM433DA/UX434DA, BIOS UX434DA_UM433DA.302 09/05/2019 [ 77.798871] RIP: 0010:sysfs_remove_group+0x70/0x80 [ 77.798873] Code: ff 5b 48 89 ef 5d 41 5c e9 ed bb ff ff 48 89 ef e8 05 b9 ff ff eb cc 49 8b 14 24 48 8b 33 48 c7 c7 e0 c0 2a 9b e8 19 ad d9 ff <0f> 0b 5b 5d 41 5c c3 66 0f 1f 84 00 00 00 00 00 41 54 49 89 fc 55 [ 77.798873] RSP: 0018:ffffa3f2c07d7a50 EFLAGS: 00010282 [ 77.798874] RAX: 0000000000000000 RBX: ffffffffc08afbe0 RCX: 0000000000000425 [ 77.798875] RDX: 0000000000000001 RSI: 0000000000000086 RDI: ffffffff9bcc4248 [ 77.798876] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000425 [ 77.798877] R10: 0000000000015b24 R11: 0000000000000001 R12: ffff9d514e5650b0 [ 77.798877] R13: ffff9d510a5b4da8 R14: ffff9d5109d5cb60 R15: 0000000000000000 [ 77.798878] FS: 00007f8cb2a2b740(0000) GS:ffff9d5150c80000(0000) knlGS:0000000000000000 [ 77.798880] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 77.798880] CR2: 00007ffe84b50ff0 CR3: 00008003c9eae000 CR4: 00000000003406e0 [ 77.798881] Call Trace: [ 77.798979] amdgpu_device_fini+0x43b/0x46f [amdgpu] [ 77.799062] amdgpu_driver_unload_kms+0x45/0x90 [amdgpu] [ 77.799159] amdgpu_driver_load_kms.cold+0x39/0x5b [amdgpu] [ 77.799162] drm_dev_register+0x10c/0x150 [ 77.799244] amdgpu_pci_probe+0xe9/0x150 [amdgpu] [ 77.799246] ? __pm_runtime_resume+0x54/0x70 [ 77.799248] local_pci_probe+0x3d/0x70 [ 77.799250] pci_device_probe+0xd0/0x150 [ 77.799252] really_probe+0xd9/0x2a0 [ 77.799253] driver_probe_device+0x4b/0xc0 [ 77.799255] device_driver_attach+0x4e/0x60 [ 77.799256] __driver_attach+0x4d/0xc0 [ 77.799257] ? device_driver_attach+0x60/0x60 [ 77.799259] bus_for_each_dev+0x75/0xc0 [ 77.799261] bus_add_driver+0x172/0x1c0 [ 77.799263] driver_register+0x68/0xc0 [ 77.799264] ? 
0xffffffffc04a8000 [ 77.799265] do_one_initcall+0x44/0x1df [ 77.799268] ? _cond_resched+0x10/0x20 [ 77.799269] ? kmem_cache_alloc_trace+0x196/0x220 [ 77.799271] do_init_module+0x56/0x200 [ 77.799273] load_module+0x2380/0x2600 [ 77.799276] ? __do_sys_finit_module+0xc6/0xe0 [ 77.799277] __do_sys_finit_module+0xc6/0xe0 [ 77.799279] do_syscall_64+0x46/0x110 [ 77.799281] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 77.799282] RIP: 0033:0x7f8cb2b24789 [ 77.799284] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d7 06 0c 00 f7 d8 64 89 01 48 [ 77.799285] RSP: 002b:00007ffe84b54018 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 77.799286] RAX: ffffffffffffffda RBX: 000055e41a8b0a80 RCX: 00007f8cb2b24789 [ 77.799287] RDX: 0000000000000000 RSI: 000055e418c14390 RDI: 0000000000000006 [ 77.799288] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000 [ 77.799289] R10: 0000000000000006 R11: 0000000000000246 R12: 000055e418c14390 [ 77.799289] R13: 0000000000000000 R14: 000055e41a8b0b00 R15: 000055e41a8b0a80 [ 77.799291] ---[ end trace 3c1c3b84380fb312 ]--- [ 77.799529] amdgpu: probe of 0000:03:00.0 failed with error -110

Is this a regression? Can you attach the full dmesg output?

If this is a new system, does your BIOS have an option like Pre-boot DMA protection or UEFI DMA protection? If so, does disabling it fix the issue?

Created attachment 290773 [details]
dmesg 5.8 APU
Created attachment 290775 [details]
dmesg 5.4.48 GPU
Created attachment 290777 [details]
dmesg 5.4.48 APU
To clarify, there is one machine with an APU (3700U) and one machine with a GPU (RX 5700XT). I have a machine with an older Picasso APU that works on 5.4.48. I am unsure if it is a regression, but I have changed several settings to rule out a configuration issue. As I understand it, these devices work for most users.

BIOS has no settings I can recognize as DMA protection. Both devices are ASUS platforms: the desktop is a TUF X570-PLUS (WiFi) and the laptop is a ZenBook UM433D.

Related Bug 204181, related https://bugzilla.redhat.com/show_bug.cgi?id=1851855.

(In reply to R0b0t1 from comment #7)
> Related Bug 204181, related
> https://bugzilla.redhat.com/show_bug.cgi?id=1851855.

Those are unrelated.

Does disabling memory encryption fix the issue?

Yep, seems obvious in retrospect. |
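Since the thread resolves on memory encryption, it can help to check a captured dmesg for the same combination of symptoms quoted in this report: the "Memory encryption is active" notice together with PSP firmware-load failures. The following is a minimal sketch, not part of the original report; `summarize_dmesg` and the pattern names are hypothetical helpers, and the patterns are copied from the log lines above and may not match other kernel versions.

```python
import re

# Patterns copied from the log lines quoted in this report (assumption:
# other kernel versions may word these messages differently).
SME_ACTIVE = re.compile(r"Memory encryption is active")
UCODE_FAIL = re.compile(r"failed to load ucode id \((\d+)\)")
PSP_FAIL = re.compile(
    r"psp command (?:\(0x[0-9A-Fa-f]+\) )?"
    r"failed and response status is \((0x[0-9A-Fa-f]+)\)"
)


def summarize_dmesg(text):
    """Summarize memory-encryption and PSP firmware-load symptoms in a log."""
    return {
        # True if the kernel reported active memory encryption.
        "memory_encryption_active": bool(SME_ACTIVE.search(text)),
        # Microcode ids the PSP failed to load, in log order.
        "failed_ucode_ids": [int(m) for m in UCODE_FAIL.findall(text)],
        # Distinct PSP response status codes seen in the log.
        "psp_status_codes": sorted(set(PSP_FAIL.findall(text))),
    }
```

For example, `summarize_dmesg(open("dmesg.txt").read())` on a saved log (the file name is illustrative) would flag the case seen here, where encryption is active and many ucode ids fail to load. The thread does not say how memory encryption was disabled; on kernels built with AMD SME support, the documented `mem_encrypt=off` boot parameter is one way to do it.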
Created attachment 290771 [details]
kernel config

Originally encountered on 5.4.48; updated to 5.8 to see if it persists.

03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c1)

I receive a similar error with a NAVI10 device. Possibly related to #206519.

[ 104.139682] [drm] amdgpu kernel modesetting enabled. [ 104.139754] checking generic (e0000000 7f0000) vs hw (e0000000 10000000) [ 104.139755] fb0: switching to amdgpudrmfb from EFI VGA [ 104.139952] Console: switching to colour dummy device 80x25 [ 104.139985] amdgpu 0000:03:00.0: vgaarb: deactivate vga console [ 104.140073] amdgpu 0000:03:00.0: enabling device (0006 -> 0007) [ 104.140237] [drm] initializing kernel modesetting (RAVEN 0x1002:0x15D8 0x1043:0x1B71 0xC1). [ 104.140241] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default) [ 104.140330] [drm] register mmio base: 0xFE700000 [ 104.140331] [drm] register mmio size: 524288 [ 104.140345] [drm] add ip block number 0 <soc15_common> [ 104.140346] [drm] add ip block number 1 <gmc_v9_0> [ 104.140347] [drm] add ip block number 2 <vega10_ih> [ 104.140348] [drm] add ip block number 3 <psp> [ 104.140349] [drm] add ip block number 4 <gfx_v9_0> [ 104.140349] [drm] add ip block number 5 <sdma_v4_0> [ 104.140350] [drm] add ip block number 6 <powerplay> [ 104.140351] [drm] add ip block number 7 <dm> [ 104.140351] [drm] add ip block number 8 <vcn_v1_0> [ 104.141800] amdgpu: ATOM BIOS: 113-PICASSO-116 [ 104.142877] [drm] VCN decode is enabled in VM mode [ 104.142878] [drm] VCN encode is enabled in VM mode [ 104.142878] [drm] JPEG decode is enabled in VM mode [ 104.142907] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit [ 104.142913] amdgpu 0000:03:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used) [ 104.142915] amdgpu 0000:03:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF [ 104.142916] amdgpu 0000:03:00.0: amdgpu:
AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF [ 104.142919] [drm] Detected VRAM RAM=2048M, BAR=2048M [ 104.142920] [drm] RAM width 128bits DDR4 [ 104.142974] [TTM] Zone kernel: Available graphics memory: 7134942 KiB [ 104.142975] [TTM] Zone dma32: Available graphics memory: 2097152 KiB [ 104.142975] [TTM] Initializing pool allocator [ 104.142979] [TTM] Initializing DMA pool allocator [ 104.143038] [drm] amdgpu: 2048M of VRAM memory ready [ 104.143041] [drm] amdgpu: 3072M of GTT memory ready. [ 104.143042] software IO TLB: Memory encryption is active and system is using DMA bounce buffers [ 104.143044] [drm] GART: num cpu pages 262144, num gpu pages 262144 [ 104.143290] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000). [ 104.162120] amdgpu: hwmgr_sw_init smu backed is smu10_smu [ 104.166452] [drm] Found VCN firmware Version ENC: 1.9 DEC: 1 VEP: 0 Revision: 28 [ 104.166465] [drm] PSP loading VCN firmware [ 104.207801] [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR [ 104.219109] [drm] failed to load ucode id (0) [ 104.219110] [drm] psp command (0x6) failed and response status is (0xFFFF300F) [ 104.222108] [drm] failed to load ucode id (8) [ 104.222109] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.225109] [drm] failed to load ucode id (9) [ 104.225110] [drm] psp command (0x6) failed and response status is (0xFFFF300F) [ 104.228108] [drm] failed to load ucode id (10) [ 104.228109] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.234110] [drm] failed to load ucode id (11) [ 104.234111] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.237109] [drm] failed to load ucode id (12) [ 104.237110] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.243111] [drm] failed to load ucode id (13) [ 104.243112] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.246109] [drm] failed to load ucode id (14) [ 104.246110] [drm] psp command 
(0x6) failed and response status is (0xFFFF300F) [ 104.249109] [drm] failed to load ucode id (17) [ 104.249110] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.252110] [drm] failed to load ucode id (18) [ 104.252111] [drm] psp command (0x6) failed and response status is (0xFFFF300F) [ 104.255109] [drm] failed to load ucode id (19) [ 104.255110] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.258109] [drm] failed to load ucode id (20) [ 104.258110] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.264104] [drm] failed to load ucode id (26) [ 104.264105] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.267109] [drm] failed to load ucode id (28) [ 104.267110] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.270109] [drm] failed to load ucode id (29) [ 104.270110] [drm] psp command (0x6) failed and response status is (0xFFFF000F) [ 104.318109] [drm] psp command (0x4) failed and response status is (0x34) [ 104.318113] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available [ 104.324110] [drm] psp command (0x1) failed and response status is (0x34) [ 104.327106] [drm] psp command (0x1) failed and response status is (0x34) [ 104.328567] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:221 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 104.328569] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 27 [ 104.328570] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000BBA [ 104.328571] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: 0x5 [ 104.328572] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0 [ 104.328572] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5 [ 104.328573] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0xb [ 104.328574] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1 [ 104.328574] amdgpu 0000:03:00.0: amdgpu: RW: 0x0 [ 104.328630] amdgpu 
0000:03:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:72 vmid:0 pasid:0, for process pid 0 thread pid 0) [ 104.328631] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000001000 from client 27 [ 104.328632] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000A90 [ 104.328633] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: 0x5 [ 104.328633] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0 [ 104.328634] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 104.328635] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x9 [ 104.328635] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 104.328636] amdgpu 0000:03:00.0: amdgpu: RW: 0x0 [ 104.328687] [drm] kiq ring mec 2 pipe 1 q 0 [ 105.364077] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110) [ 105.364154] [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed [ 105.364233] [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110 [ 105.364235] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_init failed [ 105.364237] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init [ 105.364256] [drm] amdgpu: finishing device. 
[ 105.367111] [drm] psp command (0x2) failed and response status is (0x11A) [ 105.370114] [drm] psp command (0x2) failed and response status is (0x11A) [ 105.373116] [drm] psp command (0x2) failed and response status is (0x11A) [ 105.373117] [drm] free PSP TMR buffer [ 105.459143] [TTM] Finalizing pool allocator [ 105.459147] [TTM] Finalizing DMA pool allocator [ 105.459250] [TTM] Zone kernel: Used memory at exit: 0 KiB [ 105.459253] [TTM] Zone dma32: Used memory at exit: 0 KiB [ 105.459254] [drm] amdgpu: ttm finalized [ 105.459414] [drm] Initialized amdgpu 3.38.0 20150101 for 0000:03:00.0 on minor 0 [ 105.459420] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 105.459425] #PF: supervisor read access in kernel mode [ 105.459427] #PF: error_code(0x0000) - not-present page [ 105.459429] PGD 0 P4D 0 [ 105.459432] Oops: 0000 [#1] SMP NOPTI [ 105.459435] CPU: 3 PID: 2936 Comm: modprobe Not tainted 5.8.0-gentoo-r1 #1 [ 105.459437] Hardware name: ASUSTeK COMPUTER INC. ZenBook UX434DA_UM433DA/UX434DA, BIOS UX434DA_UM433DA.302 09/05/2019 [ 105.459528] RIP: 0010:amdgpu_debugfs_init+0x7/0x2b0 [amdgpu] [ 105.459531] Code: 74 04 85 db 74 b2 48 89 84 dd 28 03 00 00 83 fa 08 75 c4 5b 31 c0 5d 41 5c c3 66 0f 1f 84 00 00 00 00 00 41 54 55 48 89 fd 53 <48> 8b 45 08 48 89 e9 49 c7 c0 40 fd 9c c0 be 80 01 00 00 48 c7 c7 [ 105.459536] RSP: 0018:ffffa2a742a57af0 EFLAGS: 00010246 [ 105.459538] RAX: 0000000000000000 RBX: ffffffffc09ccf00 RCX: 0000000000000000 [ 105.459541] RDX: ffff9c6a8957aac0 RSI: 0000000000000092 RDI: 0000000000000000 [ 105.459543] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000000003be [ 105.459546] R10: 0000000000013754 R11: 0000000000000001 R12: 0000000000000000 [ 105.459549] R13: 0000000000000000 R14: ffff9c6a8a3c5000 R15: ffffffffc0b99fd0 [ 105.459553] FS: 00007f07d2140740(0000) GS:ffff9c6a90ec0000(0000) knlGS:0000000000000000 [ 105.459556] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.459559] CR2: 0000000000000008 CR3: 
00008003c613a000 CR4: 00000000003406e0 [ 105.459562] Call Trace: [ 105.459660] amdgpu_pci_probe+0x14e/0x170 [amdgpu] [ 105.459668] local_pci_probe+0x3d/0x70 [ 105.459673] pci_device_probe+0xd0/0x150 [ 105.459678] really_probe+0xd9/0x2a0 [ 105.459682] driver_probe_device+0x4a/0xa0 [ 105.459685] device_driver_attach+0x4e/0x60 [ 105.459689] __driver_attach+0x4d/0xc0 [ 105.459692] ? device_driver_attach+0x60/0x60 [ 105.459695] bus_for_each_dev+0x75/0xc0 [ 105.459699] bus_add_driver+0x172/0x1c0 [ 105.459703] driver_register+0x68/0xc0 [ 105.459706] ? 0xffffffffc05c3000 [ 105.459711] do_one_initcall+0x44/0x1e0 [ 105.459715] ? _cond_resched+0x10/0x20 [ 105.459720] ? kmem_cache_alloc_trace+0x196/0x220 [ 105.459724] do_init_module+0x56/0x200 [ 105.459728] load_module+0x2424/0x26b0 [ 105.459733] ? __do_sys_finit_module+0xc6/0xe0 [ 105.459737] __do_sys_finit_module+0xc6/0xe0 [ 105.459742] do_syscall_64+0x42/0x70 [ 105.459746] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 105.459750] RIP: 0033:0x7f07d2239789 [ 105.459753] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d7 06 0c 00 f7 d8 64 89 01 48 [ 105.459760] RSP: 002b:00007fff5e8f9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 105.459764] RAX: ffffffffffffffda RBX: 000055dac6747a80 RCX: 00007f07d2239789 [ 105.459767] RDX: 0000000000000000 RSI: 000055dac504d390 RDI: 0000000000000006 [ 105.459770] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000 [ 105.459773] R10: 0000000000000006 R11: 0000000000000246 R12: 000055dac504d390 [ 105.459775] R13: 0000000000000000 R14: 000055dac6747b00 R15: 000055dac6747a80 [ 105.459779] Modules linked in: amdgpu(+) mfd_core gpu_sched ttm iwlmvm uvcvideo kvm_amd videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common kvm videodev ax88179_178a mc irqbypass usbnet snd_hda_codec_realtek snd_hda_codec_hdmi iwlwifi efivarfs [ 105.459797] CR2: 
0000000000000008 [ 105.459800] ---[ end trace 2525194e592c9997 ]--- [ 105.459904] RIP: 0010:amdgpu_debugfs_init+0x7/0x2b0 [amdgpu] [ 105.459908] Code: 74 04 85 db 74 b2 48 89 84 dd 28 03 00 00 83 fa 08 75 c4 5b 31 c0 5d 41 5c c3 66 0f 1f 84 00 00 00 00 00 41 54 55 48 89 fd 53 <48> 8b 45 08 48 89 e9 49 c7 c0 40 fd 9c c0 be 80 01 00 00 48 c7 c7 [ 105.459914] RSP: 0018:ffffa2a742a57af0 EFLAGS: 00010246 [ 105.459917] RAX: 0000000000000000 RBX: ffffffffc09ccf00 RCX: 0000000000000000 [ 105.459920] RDX: ffff9c6a8957aac0 RSI: 0000000000000092 RDI: 0000000000000000 [ 105.459922] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000000003be [ 105.459924] R10: 0000000000013754 R11: 0000000000000001 R12: 0000000000000000 [ 105.459927] R13: 0000000000000000 R14: ffff9c6a8a3c5000 R15: ffffffffc0b99fd0 [ 105.459930] FS: 00007f07d2140740(0000) GS:ffff9c6a90ec0000(0000) knlGS:0000000000000000 [ 105.459933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.459935] CR2: 0000000000000008 CR3: 00008003c613a000 CR4: 00000000003406e0
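
The 5.8.0 oops reports a read at address 0x8 inside `amdgpu_debugfs_init+0x7`, with RDI and RBP both zero. One reading, offered as a sketch and not as a root cause confirmed in the thread, is that the function received a NULL pointer as its first argument and read a field at offset 8 from it. The Python below decodes the faulting instruction bytes marked `<48> 8b 45 08` in the "Code:" line and checks that base plus displacement equals CR2; `decode_mov_disp8` is a hypothetical helper that handles only this one instruction form.

```python
# Values copied from the 5.8.0 oops in this report.
RBP = 0x0000000000000000  # equals RDI; the preceding bytes 48 89 fd are `mov rbp, rdi`
CR2 = 0x0000000000000008  # faulting address reported by the CPU

# Faulting instruction bytes, marked <48> 8b 45 08 in the "Code:" line.
FAULT_BYTES = bytes([0x48, 0x8B, 0x45, 0x08])

REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_disp8(insn):
    """Decode REX.W `mov r64, [base + disp8]`; return (dest, base, disp)."""
    rex, opcode, modrm, disp = insn
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("only REX.W MOV r64, r/m64 is handled")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the [base + disp8] form without SIB is handled")
    # disp8 is a signed byte.
    if disp >= 0x80:
        disp -= 0x100
    return REG_NAMES[reg], REG_NAMES[rm], disp


dest, base, disp = decode_mov_disp8(FAULT_BYTES)
# The instruction loads from [rbp + 8]; with rbp == 0 that is address 0x8.
fault_address = RBP + disp
```

Under these assumptions the computed `fault_address` matches the CR2 value in the oops, which is consistent with a NULL structure pointer being dereferenced at a small field offset.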