Bug 214029 - [bisected] [amdgpu] Several memory leaks in amdgpu and ttm
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-10 18:34 UTC by Erhard F.
Modified: 2021-11-04 07:44 UTC

See Also:
Kernel Version: 5.14-rc5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel dmesg (kernel 5.14-rc5, AMD FX-8370) (48.27 KB, text/plain)
2021-08-10 18:34 UTC, Erhard F.
output of kmemleak (kernel 5.14-rc5, AMD FX-8370) (20.69 KB, text/plain)
2021-08-10 18:35 UTC, Erhard F.
kernel .config (kernel 5.14-rc5, AMD FX-8370) (107.75 KB, text/plain)
2021-08-10 18:36 UTC, Erhard F.
kernel .config (kernel 5.14, AMD FX-8370) (107.80 KB, text/plain)
2021-08-30 13:45 UTC, Erhard F.
kernel dmesg (kernel 5.14, AMD FX-8370) (72.93 KB, text/plain)
2021-08-30 13:45 UTC, Erhard F.
output of kmemleak (kernel 5.14, AMD FX-8370) (32.00 KB, text/plain)
2021-08-30 13:49 UTC, Erhard F.
output of kmemleak (kernel 5.15-rc1, AMD FX-8370) (14.62 KB, text/plain)
2021-09-13 19:57 UTC, Erhard F.
kernel dmesg (kernel 5.15-rc1, AMD FX-8370) (79.27 KB, text/plain)
2021-09-13 19:58 UTC, Erhard F.
kernel .config (kernel 5.15-rc1, AMD FX-8370) (108.98 KB, text/plain)
2021-09-13 20:00 UTC, Erhard F.
kernel dmesg (kernel 5.15-rc2, AMD FX-8370) (67.52 KB, text/plain)
2021-09-20 16:29 UTC, Erhard F.
kernel .config (kernel 5.15-rc2, AMD FX-8370) (108.34 KB, text/plain)
2021-09-20 16:30 UTC, Erhard F.
bisect.log (5.18 KB, text/plain)
2021-09-20 16:35 UTC, Erhard F.
kernel dmesg (kernel 5.14.6, AMD Opteron 6386 SE) (66.80 KB, text/plain)
2021-09-22 22:05 UTC, Erhard F.
final bisect.log (2.84 KB, text/plain)
2021-09-28 15:25 UTC, Erhard F.
kernel .config (kernel 5.15-rc5, AMD FX-8370) (108.49 KB, text/plain)
2021-10-15 22:55 UTC, Erhard F.
Potential fix (1.00 KB, application/mbox)
2021-10-20 17:46 UTC, Christian König

Description Erhard F. 2021-08-10 18:34:59 UTC
Created attachment 298265
kernel dmesg (kernel 5.14-rc5, AMD FX-8370)

kmemleak reports the following leaks on kernel 5.14-rc5 with my Radeon RX 5500.

unreferenced object 0xffff888169af1b40 (size 216):
  comm "lightdm-gtk-gre", pid 662, jiffies 4294902381 (age 13444.937s)
  hex dump (first 32 bytes):
    d0 1b af 69 81 88 ff ff 60 cb b9 c0 ff ff ff ff  ...i....`.......
    80 73 48 e1 13 00 00 00 58 7d c1 0b 00 c9 ff ff  .sH.....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff888263377700 (size 72):
  comm "sdma0", pid 345, jiffies 4294902381 (age 13444.937s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    f2 a0 4c e1 13 00 00 00 58 28 9b c9 81 88 ff ff  ..L.....X(......
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff88811314b9c0 (size 216):
  comm "mate-session-ch", pid 768, jiffies 4294905408 (age 13434.854s)
  hex dump (first 32 bytes):
    50 ba 14 13 81 88 ff ff 60 cb b9 c0 ff ff ff ff  P.......`.......
    dc 7a c1 3a 16 00 00 00 58 7d c1 0b 00 c9 ff ff  .z.:....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff888167ffc340 (size 72):
  comm "sdma0", pid 345, jiffies 4294905408 (age 13434.854s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    ac b5 c5 3a 16 00 00 00 58 e4 a7 01 81 88 ff ff  ...:....X.......
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff888113b6d240 (size 216):
  comm "mate-screensave", pid 57770, jiffies 4295052030 (age 12946.214s)
  hex dump (first 32 bytes):
    d0 d2 b6 13 81 88 ff ff 60 cb b9 c0 ff ff ff ff  ........`.......
    a2 85 ff 05 88 00 00 00 58 7d c1 0b 00 c9 ff ff  ........X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8881c85d6e80 (size 72):
  comm "sdma0", pid 345, jiffies 4295052030 (age 12946.217s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    0c a0 03 06 88 00 00 00 58 34 14 75 82 88 ff ff  ........X4.u....
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff888119b78940 (size 216):
  comm "mate-screensave", pid 98610, jiffies 4295149755 (age 12620.510s)
  hex dump (first 32 bytes):
    d0 89 b7 19 81 88 ff ff 60 cb b9 c0 ff ff ff ff  ........`.......
    08 db 28 de d3 00 00 00 58 7d c1 0b 00 c9 ff ff  ..(.....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8882589af700 (size 72):
  comm "sdma0", pid 345, jiffies 4295149755 (age 12620.514s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    17 3c 2d de d3 00 00 00 58 b4 df 67 81 88 ff ff  .<-.....X..g....
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff8881274ccac0 (size 216):
  comm "mate-screensave", pid 98731, jiffies 4295150087 (age 12619.460s)
  hex dump (first 32 bytes):
    50 cb 4c 27 81 88 ff ff 60 cb b9 c0 ff ff ff ff  P.L'....`.......
    7e bc 18 20 d4 00 00 00 58 7d c1 0b 00 c9 ff ff  ~.. ....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff888255796940 (size 72):
  comm "sdma0", pid 345, jiffies 4295150087 (age 12619.464s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    0f be 1c 20 d4 00 00 00 58 70 a1 be 81 88 ff ff  ... ....Xp......
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff88823ef75540 (size 216):
  comm "glxinfo", pid 173188, jiffies 4298442862 (age 1643.630s)
  hex dump (first 32 bytes):
    d0 55 f7 3e 82 88 ff ff 60 cb b9 c0 ff ff ff ff  .U.>....`.......
    d7 bb 9c a7 cf 0a 00 00 58 7d c1 0b 00 c9 ff ff  ........X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff88826dfee1c0 (size 72):
  comm "sdma0", pid 345, jiffies 4298442862 (age 1643.630s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    91 e3 a0 a7 cf 0a 00 00 58 04 01 14 81 88 ff ff  ........X.......
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff88823ef756c0 (size 216):
  comm "glxinfo:sh0", pid 173194, jiffies 4298442879 (age 1643.620s)
  hex dump (first 32 bytes):
    50 57 f7 3e 82 88 ff ff 60 cb b9 c0 ff ff ff ff  PW.>....`.......
    3a 18 f8 aa cf 0a 00 00 58 7d c1 0b 00 c9 ff ff  :.......X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8882a950cb80 (size 72):
  comm "sdma0", pid 345, jiffies 4298442879 (age 1643.620s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    27 7b fc aa cf 0a 00 00 58 cc ec 19 81 88 ff ff  '{......X.......
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff888227171840 (size 216):
  comm "glxinfo", pid 173188, jiffies 4298442879 (age 1643.620s)
  hex dump (first 32 bytes):
    d0 18 17 27 82 88 ff ff 60 cb b9 c0 ff ff ff ff  ...'....`.......
    f0 c7 0c ab cf 0a 00 00 58 7d c1 0b 00 c9 ff ff  ........X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8882a950cac0 (size 72):
  comm "sdma0", pid 345, jiffies 4298442879 (age 1643.620s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    f1 fe 10 ab cf 0a 00 00 58 9c ec 19 81 88 ff ff  ........X.......
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff88816aaca940 (size 216):
  comm "glxinfo", pid 173247, jiffies 4298445099 (age 1636.294s)
  hex dump (first 32 bytes):
    d0 a9 ac 6a 81 88 ff ff 60 cb b9 c0 ff ff ff ff  ...j....`.......
    4d 52 2f 64 d1 0a 00 00 58 7d c1 0b 00 c9 ff ff  MR/d....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff88825fe8d700 (size 72):
  comm "sdma0", pid 345, jiffies 4298445099 (age 1636.294s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    63 72 33 64 d1 0a 00 00 58 64 7b 47 81 88 ff ff  cr3d....Xd{G....
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff8881433c2940 (size 216):
  comm "glxinfo:sh0", pid 173253, jiffies 4298445116 (age 1636.240s)
  hex dump (first 32 bytes):
    d0 29 3c 43 81 88 ff ff 60 cb b9 c0 ff ff ff ff  .)<C....`.......
    1b 8a 79 67 d1 0a 00 00 58 7d c1 0b 00 c9 ff ff  ..yg....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff88825fe8d580 (size 72):
  comm "sdma0", pid 345, jiffies 4298445116 (age 1636.240s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    99 20 7e 67 d1 0a 00 00 58 a8 28 8b 82 88 ff ff  . ~g....X.(.....
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff8881433c24c0 (size 216):
  comm "glxinfo", pid 173247, jiffies 4298445116 (age 1636.314s)
  hex dump (first 32 bytes):
    50 25 3c 43 81 88 ff ff 60 cb b9 c0 ff ff ff ff  P%<C....`.......
    57 37 94 67 d1 0a 00 00 58 7d c1 0b 00 c9 ff ff  W7.g....X}......
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc138dd37>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc0f7ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f7b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f7ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0a79897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc0a7d297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc0f83444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f93313>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffab55c3b3>] __do_fault+0xf3/0x3e0
    [<ffffffffab56e5ab>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffab56f5ca>] handle_mm_fault+0x12a/0x490
    [<ffffffffab0908b9>] do_user_addr_fault+0x259/0xb70
    [<ffffffffac7b6935>] exc_page_fault+0x55/0xb0
    [<ffffffffac800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8881dbf6b340 (size 72):
  comm "sdma0", pid 345, jiffies 4298445116 (age 1636.314s)
  hex dump (first 32 bytes):
    f0 f3 5c 69 81 88 ff ff 80 8a cf c1 ff ff ff ff  ..\i............
    f9 60 98 67 d1 0a 00 00 58 b4 28 8b 82 88 ff ff  .`.g....X.(.....
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc138d09e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc0b9792e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffab12fda2>] kthread+0x342/0x410
    [<ffffffffab0030d2>] ret_from_fork+0x22/0x30
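
A dump like the one above is easier to triage when the records are counted per allocating function (the first backtrace frame) and object size. The following is a minimal Python sketch of that idea, not part of any kernel tooling: the function name `summarize_kmemleak`, the two regular expressions, and the shortened `SAMPLE` text are assumptions made for illustration, based only on the record layout visible in this report.

```python
import re
from collections import Counter

# Header line of each kmemleak record, e.g.
# "unreferenced object 0xffff888169af1b40 (size 216):"
OBJECT_RE = re.compile(r"^unreferenced object 0x[0-9a-f]+ \(size (\d+)\):")
# One backtrace frame, e.g.
# "[<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]"
FRAME_RE = re.compile(r"^\s*\[<[0-9a-f]+>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_kmemleak(text):
    """Return a Counter mapping (first_backtrace_function, size) to a record count."""
    counts = Counter()
    size = None          # size of the record currently being parsed
    want_frame = False   # True until the first frame of the current record is seen
    for line in text.splitlines():
        m = OBJECT_RE.match(line)
        if m:
            size = int(m.group(1))
            want_frame = True
            continue
        if want_frame:
            m = FRAME_RE.match(line)
            if m:
                counts[(m.group(1), size)] += 1
                want_frame = False
    return counts


# Shortened excerpt in the same layout as the report above (illustrative only).
SAMPLE = """\
unreferenced object 0xffff888169af1b40 (size 216):
  comm "lightdm-gtk-gre", pid 662, jiffies 4294902381 (age 13444.937s)
  backtrace:
    [<ffffffffc0b9914f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc0b944de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
unreferenced object 0xffff888263377700 (size 72):
  comm "sdma0", pid 345, jiffies 4294902381 (age 13444.937s)
  backtrace:
    [<ffffffffc0f70521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc0fdd4bb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
"""

if __name__ == "__main__":
    for (func, size), n in sorted(summarize_kmemleak(SAMPLE).items()):
        print(f"{func}: {n} object(s) of {size} bytes")
```

Feeding the full attached kmemleak output to `summarize_kmemleak` instead of `SAMPLE` should make it easy to see whether the 216-byte `drm_sched_fence_create` records and the 72-byte `amdgpu_fence_emit` records always appear in pairs.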


 $ inxi -bZ
System:    Kernel: 5.14.0-rc5-bdver2 x86_64 bits: 64 Desktop: MATE 1.24.1 Distro: Gentoo Base System release 2.7 
Machine:   Type: Desktop System: Gigabyte product: N/A v: N/A serial: <superuser/root required> 
           Mobo: Gigabyte model: 970-GAMING v: x.x serial: <superuser/root required> UEFI: American Megatrends v: F2 
           date: 04/06/2016 
CPU:       Info: 8-Core AMD FX-8370 [MCP] speed: 1727 MHz min/max: 1400/4000 MHz 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Navi 14 [Radeon RX 5500/5500M / Pro 5500M] driver: amdgpu v: kernel 
           Display: x11 server: X.Org 1.20.11 driver: amdgpu,ati unloaded: fbdev,modesetting,radeon resolution: 1920x1080~60Hz 
           OpenGL: renderer: Radeon RX 5500 XT (NAVI14 DRM 3.42.0 5.14.0-rc5-bdver2 LLVM 12.0.1) v: 4.6 Mesa 21.1.4 
Network:   Device-1: Qualcomm Atheros Killer E2400 Gigabit Ethernet driver: alx 

 $ lspci 
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD9x0/RX980 Host Bridge (rev 02)
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD890S/RD990 I/O Memory Management Unit (IOMMU)
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GFX port 0)
00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 0)
00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 2)
00:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 4)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42)
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0)
00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c5)
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 14 [Radeon RX 5500/5500M / Pro 5500M] (rev c5)
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio
04:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)
05:00.0 Non-Volatile memory controller: Shenzhen Longsys Electronics Co., Ltd. Device 2263 (rev 03)
06:00.0 USB controller: ASMedia Technology Inc. ASM1143 USB 3.1 Host Controller
08:00.0 Ethernet controller: Qualcomm Atheros Killer E2400 Gigabit Ethernet Controller (rev 10)

 # lspci -vv -s 03:00.0
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 14 [Radeon RX 5500/5500M / Pro 5500M] (rev c5) (prog-if 00 [VGA controller])
	Subsystem: ASRock Incorporation Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 65
	IOMMU group: 17
	Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at d0000000 (64-bit, prefetchable) [size=2M]
	Region 4: I/O ports at e000 [size=256]
	Region 5: Memory at fe500000 (32-bit, non-prefetchable) [size=512K]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [64] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 16GT/s (ok), Width x16 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-
			 AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee01004  Data: 0025
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [200 v1] Physical Resizable BAR
		BAR 0: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB
		BAR 2: current size: 2MB, supported: 2MB 4MB 8MB 16MB 32MB 64MB 128MB 256MB
	Capabilities: [240 v1] Power Budgeting <?>
	Capabilities: [270 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [2a0 v1] Access Control Services
		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable-, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] Page Request Interface (PRI)
		PRICtl: Enable- Reset-
		PRISta: RF- UPRGI- Stopped+
		Page Request Capacity: 00000100, Page Request Allocation: 00000000
	Capabilities: [2d0 v1] Process Address Space ID (PASID)
		PASIDCap: Exec+ Priv+, Max PASID Width: 10
		PASIDCtl: Enable- Exec- Priv-
	Capabilities: [320 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [400 v1] Data Link Feature <?>
	Capabilities: [410 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [440 v1] Lane Margining at the Receiver <?>
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
Comment 1 Erhard F. 2021-08-10 18:35:42 UTC
Created attachment 298267 [details]
output of kmemleak (kernel 5.14-rc5, AMD FX-8370)
Comment 2 Erhard F. 2021-08-10 18:36:25 UTC
Created attachment 298269 [details]
kernel .config (kernel 5.14-rc5, AMD FX-8370)
Comment 3 Erhard F. 2021-08-30 13:45:06 UTC
Created attachment 298525 [details]
kernel .config (kernel 5.14, AMD FX-8370)
Comment 4 Erhard F. 2021-08-30 13:45:37 UTC
Created attachment 298527 [details]
kernel dmesg (kernel 5.14, AMD FX-8370)
Comment 5 Erhard F. 2021-08-30 13:49:29 UTC
Created attachment 298529 [details]
output of kmemleak (kernel 5.14, AMD FX-8370)

[...]
unreferenced object 0xffff8881dd96f0c0 (size 216):
  comm "lxdm-greeter-gt", pid 619, jiffies 4294888177 (age 3729.577s)
  hex dump (first 32 bytes):
    50 f1 96 dd 81 88 ff ff 20 8b 72 c0 ff ff ff ff  P....... .r.....
    c0 3a 59 da 08 00 00 00 58 fd a6 08 00 c9 ff ff  .:Y.....X.......
  backtrace:
    [<ffffffffc072514f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc07204de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc159e317>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc118ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc118b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc118ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0887897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc088b297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc1193444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc11a3323>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffb255c853>] __do_fault+0xf3/0x3e0
    [<ffffffffb256ea4b>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffb256fa6a>] handle_mm_fault+0x12a/0x490
    [<ffffffffb2090909>] do_user_addr_fault+0x259/0xb70
    [<ffffffffb37ad665>] exc_page_fault+0x55/0xb0
    [<ffffffffb3800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff888120f77100 (size 72):
  comm "sdma0", pid 344, jiffies 4294888177 (age 3729.577s)
  hex dump (first 32 bytes):
    30 f4 ec 3a 81 88 ff ff 20 8a f0 c1 ff ff ff ff  0..:.... .......
    a3 49 5d da 08 00 00 00 58 84 a3 0c 82 88 ff ff  .I].....X.......
  backtrace:
    [<ffffffffc1180521>] amdgpu_fence_emit+0x91/0x790 [amdgpu]
    [<ffffffffc11ed4cb>] amdgpu_ib_schedule+0x8cb/0x12f0 [amdgpu]
    [<ffffffffc159d67e>] amdgpu_job_run+0x35e/0x790 [amdgpu]
    [<ffffffffc072392e>] drm_sched_main+0x64e/0xc60 [gpu_sched]
    [<ffffffffb212fdf2>] kthread+0x342/0x410
    [<ffffffffb20030d2>] ret_from_fork+0x22/0x30
unreferenced object 0xffff888246342640 (size 216):
  comm "mate-session-ch", pid 718, jiffies 4294890392 (age 3722.200s)
  hex dump (first 32 bytes):
    d0 26 34 46 82 88 ff ff 20 8b 72 c0 ff ff ff ff  .&4F.... .r.....
    03 71 65 92 0a 00 00 00 58 fd a6 08 00 c9 ff ff  .qe.....X.......
  backtrace:
    [<ffffffffc072514f>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc07204de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc159e317>] amdgpu_job_submit+0x27/0x2d0 [amdgpu]
    [<ffffffffc118ae6e>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc118b6ca>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc118ce06>] amdgpu_bo_move+0x356/0x2180 [amdgpu]
    [<ffffffffc0887897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc088b297>] ttm_bo_validate+0x2c7/0x450 [ttm]
    [<ffffffffc1193444>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc11a3323>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffffb255c853>] __do_fault+0xf3/0x3e0
    [<ffffffffb256ea4b>] __handle_mm_fault+0x1bcb/0x2ac0
    [<ffffffffb256fa6a>] handle_mm_fault+0x12a/0x490
    [<ffffffffb2090909>] do_user_addr_fault+0x259/0xb70
    [<ffffffffb37ad665>] exc_page_fault+0x55/0xb0
    [<ffffffffb3800acb>] asm_exc_page_fault+0x1b/0x20
[...]
Comment 6 xiehuanjun 2021-08-31 09:39:37 UTC
Hi,

If I build a kernel image from your .config (with CONFIG_DEBUG_KMEMLEAK=y), install it, and reboot the system, will the issue be reproduced?

Thanks.
Comment 7 Erhard F. 2021-08-31 10:20:18 UTC
(In reply to xiehuanjun from comment #6)
> Hi,
> 
> If I build a kernel image from your .config (with CONFIG_DEBUG_KMEMLEAK=y),
> install it, and reboot the system, will the issue be reproduced?
This debugging kernel is already built with CONFIG_DEBUG_KMEMLEAK=y. And yes, the issue is reproducible; it happens every time after some desktop usage.
Comment 8 Erhard F. 2021-09-13 19:57:12 UTC
Created attachment 298783 [details]
output of kmemleak (kernel 5.15-rc1, AMD FX-8370)

Seems unchanged in kernel 5.15-rc1.

 # cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff88810830b400 (size 1024):
  comm "lxdm-greeter-gt", pid 624, jiffies 4294887923 (age 1566.300s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 08 b4 30 08 81 88 ff ff  ..........0.....
    08 b4 30 08 81 88 ff ff 30 f5 10 9b 81 88 ff ff  ..0.....0.......
  backtrace:
    [<ffffffffc1352a88>] amdgpu_job_alloc+0x38/0x2f0 [amdgpu]
    [<ffffffffc1352d67>] amdgpu_job_alloc_with_ib+0x27/0xf0 [amdgpu]
    [<ffffffffc0f37323>] amdgpu_copy_buffer+0x1d3/0x700 [amdgpu]
    [<ffffffffc0f37e4a>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f39586>] amdgpu_bo_move+0x356/0x2050 [amdgpu]
    [<ffffffffc06fa897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc06fe403>] ttm_bo_validate+0x2b3/0x3b0 [ttm]
    [<ffffffffc0f3fa84>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f4f903>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffff97568963>] __do_fault+0xf3/0x3e0
    [<ffffffff9757a5f5>] __handle_mm_fault+0x16e5/0x2aa0
    [<ffffffff9757bada>] handle_mm_fault+0x12a/0x490
    [<ffffffff9708e449>] do_user_addr_fault+0x259/0xb70
    [<ffffffff988137a5>] exc_page_fault+0x55/0xb0
    [<ffffffff98a00acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8881fe3ca4c0 (size 216):
  comm "lxdm-greeter-gt", pid 624, jiffies 4294887923 (age 1566.300s)
  hex dump (first 32 bytes):
    50 a5 3c fe 81 88 ff ff 20 ab 74 c0 ff ff ff ff  P.<..... .t.....
    e0 ea 04 ac 08 00 00 00 50 fd c5 08 00 c9 ff ff  ........P.......
  backtrace:
    [<ffffffffc07471df>] drm_sched_fence_create+0x1f/0x1d0 [gpu_sched]
    [<ffffffffc07424de>] drm_sched_job_init+0x10e/0x240 [gpu_sched]
    [<ffffffffc13538a5>] amdgpu_job_submit+0x25/0x100 [amdgpu]
    [<ffffffffc0f375ee>] amdgpu_copy_buffer+0x49e/0x700 [amdgpu]
    [<ffffffffc0f37e4a>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f39586>] amdgpu_bo_move+0x356/0x2050 [amdgpu]
    [<ffffffffc06fa897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc06fe403>] ttm_bo_validate+0x2b3/0x3b0 [ttm]
    [<ffffffffc0f3fa84>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f4f903>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffff97568963>] __do_fault+0xf3/0x3e0
    [<ffffffff9757a5f5>] __handle_mm_fault+0x16e5/0x2aa0
    [<ffffffff9757bada>] handle_mm_fault+0x12a/0x490
    [<ffffffff9708e449>] do_user_addr_fault+0x259/0xb70
    [<ffffffff988137a5>] exc_page_fault+0x55/0xb0
    [<ffffffff98a00acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8881cdbb7000 (size 1024):
  comm "mate-session-ch", pid 722, jiffies 4294890054 (age 1559.204s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 08 70 bb cd 81 88 ff ff  .........p......
    08 70 bb cd 81 88 ff ff 30 f5 10 9b 81 88 ff ff  .p......0.......
  backtrace:
    [<ffffffffc1352a88>] amdgpu_job_alloc+0x38/0x2f0 [amdgpu]
    [<ffffffffc1352d67>] amdgpu_job_alloc_with_ib+0x27/0xf0 [amdgpu]
    [<ffffffffc0f37323>] amdgpu_copy_buffer+0x1d3/0x700 [amdgpu]
    [<ffffffffc0f37e4a>] amdgpu_ttm_copy_mem_to_mem+0x5fa/0xf00 [amdgpu]
    [<ffffffffc0f39586>] amdgpu_bo_move+0x356/0x2050 [amdgpu]
    [<ffffffffc06fa897>] ttm_bo_handle_move_mem+0x1c7/0x620 [ttm]
    [<ffffffffc06fe403>] ttm_bo_validate+0x2b3/0x3b0 [ttm]
    [<ffffffffc0f3fa84>] amdgpu_bo_fault_reserve_notify+0x2a4/0x640 [amdgpu]
    [<ffffffffc0f4f903>] amdgpu_gem_fault+0x123/0x2d0 [amdgpu]
    [<ffffffff97568963>] __do_fault+0xf3/0x3e0
    [<ffffffff9757a5f5>] __handle_mm_fault+0x16e5/0x2aa0
    [<ffffffff9757bada>] handle_mm_fault+0x12a/0x490
    [<ffffffff9708e449>] do_user_addr_fault+0x259/0xb70
    [<ffffffff988137a5>] exc_page_fault+0x55/0xb0
    [<ffffffff98a00acb>] asm_exc_page_fault+0x1b/0x20
[...]
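
The reports above repeat the same few allocation sites, so it can help to group them. The following is a minimal sketch, not part of the original report: `kmemleak_summary` is a hypothetical helper name, and it assumes the report format shown in this comment (an "unreferenced object" header, a "backtrace:" line, then frames of the form `[<address>] symbol+offset/size [module]`). It counts reports by the innermost backtrace frame.

```shell
# Summarise a kmemleak report by the innermost backtrace frame.
# Hypothetical helper; reads the report on stdin.
kmemleak_summary() {
    awk '
        /^unreferenced object/  { want = 0 }
        /^[ \t]*backtrace:/     { want = 1; next }
        want && $1 ~ /^\[</ {
            # Field 2 looks like: drm_sched_fence_create+0x1f/0x1d0
            sym = $2
            sub(/\+.*$/, "", sym)   # keep only the symbol name
            count[sym]++
            want = 0                # only the first frame of each report
        }
        END { for (sym in count) printf "%d %s\n", count[sym], sym }
    ' | sort -k1,1nr -k2,2
}
```

A possible use is `kmemleak_summary < /sys/kernel/debug/kmemleak` as root, or running it on a saved attachment. Writing `scan` to the same debugfs file first triggers an immediate scan instead of waiting for the periodic one.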
Comment 9 Erhard F. 2021-09-13 19:58:48 UTC
Created attachment 298785 [details]
kernel dmesg (kernel 5.15-rc1, AMD FX-8370)
Comment 10 Erhard F. 2021-09-13 20:00:32 UTC
Created attachment 298787 [details]
kernel .config (kernel 5.15-rc1, AMD FX-8370)
Comment 11 Erhard F. 2021-09-20 16:29:23 UTC
Created attachment 298891 [details]
kernel dmesg (kernel 5.15-rc2, AMD FX-8370)
Comment 12 Erhard F. 2021-09-20 16:30:03 UTC
Created attachment 298893 [details]
kernel .config (kernel 5.15-rc2, AMD FX-8370)
Comment 13 Erhard F. 2021-09-20 16:35:10 UTC
Created attachment 298897 [details]
bisect.log

Verified that the issue still exists in the latest v5.15-rc2 and v5.14.6, and did a bisect:

# possible first bad commit: [355b60296143a090039211c5f0e1463f84aab65a] Merge drm/drm-next into drm-misc-next
# possible first bad commit: [91185d55b32e7e377f15fb46a62b216f8d3038d4] drm: Remove DRM_KMS_FB_HELPER Kconfig option
# possible first bad commit: [a50e74bec1d17e95275909660c6b43ffe11ebcf0] drm/zte: Don't select DRM_KMS_FB_HELPER
# possible first bad commit: [13b29cc3a722c2c0bc9ab9f72f9047d55d08a2f9] drm/mxsfb: Don't select DRM_KMS_FB_HELPER
# possible first bad commit: [5dbf2fc587cb79cb366bd6e79ac6b52269d64fc5] drm/vmwgfx: Make console emulation depend on DRM_FBDEV_EMULATION
# possible first bad commit: [c777dc9e793342ecdfc95045d2127a3ea32791a0] drm/ttm: move the page_alignment into the BO v2
# possible first bad commit: [65747ded86b4608387d5618d14f0fe9dc88e17ea] drm/ttm: minor range manager coding style clean ups
# possible first bad commit: [d02117f8efaa5fbc37437df1ae955a147a2a424a] drm/ttm: remove special handling for non GEM drivers
# possible first bad commit: [13ea9aa1e7d891e950230e82f1dd2c84e5debcff] drm/ttm: fix error handling if no BO can be swapped out v4
# possible first bad commit: [ae053fa234f42b4abc582372af7410ad0e3e1dad] drm: bridge: adv7511: Support I2S IEC958 encoded PCM format

I had to skip a few commits because kmemleak did not work properly on them due to:
[...]
[    0.706509] kmemleak: Cannot insert 0xffff953c1ba02f40 into the object search tree (overlaps existing)
[    0.706514] CPU: 0 PID: 6 Comm: kthreadd Not tainted 5.12.0-rc3-bdver2+ #24
[    0.706518] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970-GAMING, BIOS F2 04/06/2016
[    0.706521] Call Trace:
[    0.706524]  dump_stack+0x69/0x8e
[    0.706531]  create_object.isra.0.cold+0x3b/0x5d
[    0.706536]  ? kthread+0x35/0x130
[    0.706537]  kmem_cache_alloc+0x15a/0x4a0
[    0.706537]  ? rescuer_thread+0x380/0x380
[    0.706537]  kthread+0x35/0x130
[    0.706537]  ? __kthread_bind_mask+0x60/0x60
[    0.706537]  ret_from_fork+0x22/0x30
[    0.706537] kmemleak: Kernel memory leak detector disabled
[    0.706537] kmemleak: Object 0xffff953c1ba00000 (size 2097152):
[    0.706537] kmemleak:   comm "swapper", pid 0, jiffies 4294877296
[    0.706537] kmemleak:   min_count = 0
[    0.706537] kmemleak:   count = 0
[    0.706537] kmemleak:   flags = 0x1
[    0.706537] kmemleak:   checksum = 0
[    0.706537] kmemleak:   backtrace:
[    0.706537]      memblock_alloc_internal+0xb8/0x152
[    0.706537]      memblock_alloc_try_nid+0xa0/0xf3
[    0.706537]      kfence_alloc_pool+0x59/0xbf
[    0.706537]      start_kernel+0x2b3/0x61a
[    0.706537]      secondary_startup_64_no_verify+0xb0/0xbb
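
The skip decision described in this comment can be written down as a small classifier. This is only a sketch under assumptions: `bisect_verdict` is a hypothetical helper, it works on saved copies of dmesg and of the kmemleak report rather than on a live system, and it says nothing about how the kernel is built and booted for each step. The exit codes follow the `git bisect run` convention, where 125 means the commit cannot be tested.

```shell
# Classify one bisect step from saved logs (hypothetical helper).
# $1: file holding dmesg output, $2: file holding the kmemleak report.
# Exit status: 0 = good, 1 = bad, 125 = untestable (skip).
bisect_verdict() {
    dmesg_log=$1
    leak_report=$2
    # kmemleak switched itself off, so an empty report proves nothing.
    if grep -q 'kmemleak: Kernel memory leak detector disabled' "$dmesg_log"; then
        return 125
    fi
    # At least one leak was reported.
    if grep -q '^unreferenced object' "$leak_report"; then
        return 1
    fi
    return 0
}
```

When bisecting by hand, a status of 125 corresponds to `git bisect skip`, 1 to `git bisect bad`, and 0 to `git bisect good`. A clean report is only meaningful after the desktop has been used for a while and a kmemleak scan has run.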
Comment 14 Erhard F. 2021-09-22 22:05:07 UTC
Created attachment 298927 [details]
kernel dmesg (kernel 5.14.6, AMD Opteron 6386 SE)

Does not seem to be Navi-specific after all, as the leaks also happen with the Radeon R7 360 in my Opteron box.

[...]
unreferenced object 0xffff8afeddd0c2c0 (size 176):
  comm "Web Content", pid 1830253, jiffies 4302445561 (age 2701.157s)
  hex dump (first 32 bytes):
    50 c3 d0 dd fe 8a ff ff 80 51 3a c0 ff ff ff ff  P........Q:.....
    0f 89 14 e9 f1 16 00 00 48 fe b6 09 41 a7 ff ff  ........H...A...
  backtrace:
    [<ffffffffc03a347d>] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
    [<ffffffffc03a20d0>] drm_sched_job_init+0x58/0xa0 [gpu_sched]
    [<ffffffffc10fb711>] amdgpu_job_submit+0x21/0xe0 [amdgpu]
    [<ffffffffc0feef6a>] amdgpu_copy_buffer+0x1ea/0x290 [amdgpu]
    [<ffffffffc0fef292>] amdgpu_ttm_copy_mem_to_mem+0x282/0x5b0 [amdgpu]
    [<ffffffffc0fefad8>] amdgpu_bo_move+0x130/0x7d8 [amdgpu]
    [<ffffffffc0609e49>] ttm_bo_handle_move_mem+0x89/0x178 [ttm]
    [<ffffffffc060b1ba>] ttm_bo_validate+0xba/0x140 [ttm]
    [<ffffffffc0ff13ae>] amdgpu_bo_fault_reserve_notify+0xb6/0x160 [amdgpu]
    [<ffffffffc0ff62f8>] amdgpu_gem_fault+0x78/0x100 [amdgpu]
    [<ffffffff9b166941>] __do_fault+0x31/0xe8
    [<ffffffff9b16dc4a>] __handle_mm_fault+0xe1a/0x1290
    [<ffffffff9b16e175>] handle_mm_fault+0xb5/0x218
    [<ffffffff9b6ca347>] exc_page_fault+0x177/0x5d0
    [<ffffffff9b800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8b01f00bd0c0 (size 72):
  comm "sdma0", pid 403, jiffies 4302445561 (age 2701.157s)
  hex dump (first 32 bytes):
    e0 c7 64 13 ff 8a ff ff 00 1c 30 c1 ff ff ff ff  ..d.......0.....
    65 59 16 e9 f1 16 00 00 58 28 b9 86 03 8b ff ff  eY......X(......
  backtrace:
    [<ffffffffc0febecb>] amdgpu_fence_emit+0x2b/0x1f0 [amdgpu]
    [<ffffffffc100945b>] amdgpu_ib_schedule+0x2e3/0x4e8 [amdgpu]
    [<ffffffffc10fb34b>] amdgpu_job_run+0x8b/0x1e8 [amdgpu]
    [<ffffffffc03a2ad7>] drm_sched_main+0x1b7/0x3d8 [gpu_sched]
    [<ffffffff9b05f9e2>] kthread+0x122/0x140
    [<ffffffff9b001102>] ret_from_fork+0x22/0x30
unreferenced object 0xffff8b02ec1796c0 (size 176):
  comm "Renderer", pid 108402, jiffies 4302694486 (age 1871.424s)
  hex dump (first 32 bytes):
    50 97 17 ec 02 8b ff ff 80 51 3a c0 ff ff ff ff  P........Q:.....
    4f 9c 02 1a b3 17 00 00 48 fe b6 09 41 a7 ff ff  O.......H...A...
  backtrace:
    [<ffffffffc03a347d>] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
    [<ffffffffc03a20d0>] drm_sched_job_init+0x58/0xa0 [gpu_sched]
    [<ffffffffc10fb711>] amdgpu_job_submit+0x21/0xe0 [amdgpu]
    [<ffffffffc0feef6a>] amdgpu_copy_buffer+0x1ea/0x290 [amdgpu]
    [<ffffffffc0fef292>] amdgpu_ttm_copy_mem_to_mem+0x282/0x5b0 [amdgpu]
    [<ffffffffc0fefad8>] amdgpu_bo_move+0x130/0x7d8 [amdgpu]
    [<ffffffffc0609e49>] ttm_bo_handle_move_mem+0x89/0x178 [ttm]
    [<ffffffffc060b1ba>] ttm_bo_validate+0xba/0x140 [ttm]
    [<ffffffffc0ff13ae>] amdgpu_bo_fault_reserve_notify+0xb6/0x160 [amdgpu]
    [<ffffffffc0ff62f8>] amdgpu_gem_fault+0x78/0x100 [amdgpu]
    [<ffffffff9b166941>] __do_fault+0x31/0xe8
    [<ffffffff9b16dc4a>] __handle_mm_fault+0xe1a/0x1290
    [<ffffffff9b16e175>] handle_mm_fault+0xb5/0x218
    [<ffffffff9b6ca347>] exc_page_fault+0x177/0x5d0
    [<ffffffff9b800acb>] asm_exc_page_fault+0x1b/0x20


 # lspci -s 01:00.0 -v
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81) (prog-if 00 [VGA controller])
	Subsystem: PC Partner Limited / Sapphire Technology Tobago PRO [Radeon R7 360 / R9 360 OEM]
	Flags: bus master, fast devsel, latency 0, IRQ 47, IOMMU group 11
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at cf800000 (64-bit, prefetchable) [size=8M]
	I/O ports at c000 [size=256]
	Memory at fdc80000 (32-bit, non-prefetchable) [size=256K]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Capabilities: [200] Physical Resizable BAR
	Capabilities: [270] Secondary PCI Express
	Capabilities: [2b0] Address Translation Service (ATS)
	Capabilities: [2c0] Page Request Interface (PRI)
	Capabilities: [2d0] Process Address Space ID (PASID)
	Kernel driver in use: amdgpu
	Kernel modules: radeon, amdgpu
Comment 15 Erhard F. 2021-09-28 15:21:20 UTC
I got around skipping commits by cherry-picking 9551158069ba8fcc893798d42dc4f978b62ef60f (kfence: make compatible with kmemleak) and finally was able to complete the bisect. The offending commit was:

 # git bisect good
d02117f8efaa5fbc37437df1ae955a147a2a424a is the first bad commit
commit d02117f8efaa5fbc37437df1ae955a147a2a424a
Author: Christian König <christian.koenig@amd.com>
Date:   Sat Apr 17 19:09:30 2021 +0200

    drm/ttm: remove special handling for non GEM drivers
    
    vmwgfx is the only driver actually using this. Move the handling into
    the driver instead.
    
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Acked-by: Huang Rui <ray.huang@amd.com>
    Reviewed-by: Zack Rusin <zackr@vmware.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20210419092853.1605-1-christian.koenig@amd.com

 drivers/gpu/drm/ttm/ttm_bo.c       | 11 -----------
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 10 ++++++++++
 include/drm/ttm/ttm_bo_api.h       | 19 -------------------
 3 files changed, 10 insertions(+), 30 deletions(-)
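
For readers who want to repeat the cherry-pick approach, one way to do it is sketched below. The helper names `apply_fix` and `drop_fix` are hypothetical, and the exact commands used for this bisect are not recorded in the report; this is the usual pattern of applying a fix without committing it and discarding it before the next bisect step.

```shell
# Commit named in this comment (kfence: make compatible with kmemleak).
FIX=9551158069ba8fcc893798d42dc4f978b62ef60f

# Apply the fix to the index and working tree without creating a commit,
# so HEAD stays on the commit that git bisect checked out.
apply_fix() {
    git cherry-pick --no-commit "${1:-$FIX}"
}

# Discard the fix again. Run this before reporting the verdict, because
# git bisect needs a clean tree to check out the next candidate.
drop_fix() {
    git reset --hard --quiet
}
```

A step would then look like `apply_fix`, build and test the kernel, `drop_fix`, and finally `git bisect good` or `git bisect bad`. On commits that already contain the fix, or where it does not apply cleanly, the cherry-pick needs manual handling.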
Comment 16 Erhard F. 2021-09-28 15:25:50 UTC
Created attachment 299007 [details]
final bisect.log
Comment 17 Erhard F. 2021-10-15 22:53:50 UTC
v5.15-rc5 is still affected.

However, I reverted d02117f8efaa5fbc37437df1ae955a147a2a424a on top of v5.15-rc5 and can confirm that this fixes the leak.
Comment 18 Erhard F. 2021-10-15 22:55:44 UTC
Created attachment 299219 [details]
kernel .config (kernel 5.15-rc5, AMD FX-8370)
Comment 19 Christian König 2021-10-20 17:46:51 UTC
Created attachment 299277 [details]
Potential fix
Comment 20 Erhard F. 2021-10-21 07:37:13 UTC
(In reply to Christian König from comment #19)
> Created attachment 299277 [details]
> Potential fix
Fixes the leak as it does in bug #214447. Thanks!
Comment 21 Christian König 2021-10-21 08:44:04 UTC
No problem. It just took me a while to realize what the issue is.

The bisected patches didn't cause it, but rather just made it more likely to appear.

Can I add your mail as Tested-by? (you potentially get a bit more spam with that).
Comment 22 Erhard F. 2021-10-21 12:39:41 UTC
(In reply to Christian König from comment #21)
> Can I add your mail as Tested-by? (you potentially get a bit more spam with
> that).
Yes, I'm fine with that. Getting spam anyhow. ;)
Comment 23 Erhard F. 2021-11-03 09:45:44 UTC
The fix landed in kernel 5.15, 5.14.16 and affected LTS kernels.

Closing.
Comment 24 Jan Steffens 2021-11-03 15:07:37 UTC
Looks like this was mistakenly picked into LTS 5.10, which does not contain d02117f8efaa5fbc37437df1ae955a147a2a424a.
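
Whether a given tree contains a mainline commit can be checked with an ancestry test. The sketch below assumes a local clone that has both the commit and the tag in question; `tree_contains` is a hypothetical helper name.

```shell
# Succeeds (exit 0) when commit $1 is an ancestor of ref $2, that is,
# when the history of $2 contains that exact commit.
tree_contains() {
    git merge-base --is-ancestor "$1" "$2"
}
```

For example, `tree_contains d02117f8efaa5fbc37437df1ae955a147a2a424a v5.10 || echo "not in v5.10"`. This only detects the original mainline hash; a change backported to a stable branch gets a new hash there and has to be looked up by its subject or upstream reference instead.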
Comment 25 Erhard F. 2021-11-03 17:54:30 UTC
(In reply to Jan Steffens from comment #24)
> Looks like this was mistakenly picked into LTS 5.10, which does not contain
> d02117f8efaa5fbc37437df1ae955a147a2a424a.
As Christian wrote in comment 21, the bisected patches didn't cause the memleak but rather just made it more likely to appear. So the patch (0db55f9a1bafbe3dac750ea669de9134922389b5) was most probably picked correctly into 5.10 LTS and 5.4 LTS.
Comment 26 Christian König 2021-11-04 07:16:27 UTC
It most likely won't hurt to have the patch in older kernels as well, yes.

The only possible problem I can see is that we then have a double free on older kernels, which would mean that we need to go back to the drawing board.
Comment 27 Christian König 2021-11-04 07:44:45 UTC
I've just finished reading my mail this morning and found a crash report for this patch when it is backported to 5.10.

So please do NOT apply this patch to 5.10!

The memory leak is potentially there as well, just much, much less likely, and a double free certainly crashes the kernel.
