Bug 199425

Summary: BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_flip_done+0x247/0x260
Product: Drivers
Reporter: Johannes Hirte (johannes.hirte)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: NEW
Severity: normal
CC: daniel, harry.wentland, mikita.lipski
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.17-rc1
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Patch to either duplicate or reuse an existing crtc state that might prevent use-after-free error in race condition
Patch to either duplicate or reuse an existing crtc state that might prevent use-after-free error in race condition

Description Johannes Hirte 2018-04-17 08:01:51 UTC
With dc enabled, I get the following use-after-free on my Carrizo:

[53213.875800] ==================================================================
[53213.875826] BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_flip_done+0x247/0x260
[53213.875835] Read of size 8 at addr ffff8801063aaa88 by task kworker/u8:3/9911

[53213.875848] CPU: 3 PID: 9911 Comm: kworker/u8:3 Not tainted 4.17.0-rc1-00001-g9e7729e9a66c #566
[53213.875855] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.12 12/19/2017
[53213.875864] Workqueue: events_unbound commit_work
[53213.875870] Call Trace:
[53213.875881]  dump_stack+0x5b/0x8b
[53213.875890]  ? drm_atomic_helper_wait_for_flip_done+0x247/0x260
[53213.875899]  print_address_description+0x65/0x270
[53213.875907]  ? drm_atomic_helper_wait_for_flip_done+0x247/0x260
[53213.875913]  kasan_report+0x232/0x350
[53213.875920]  drm_atomic_helper_wait_for_flip_done+0x247/0x260
[53213.875930]  amdgpu_dm_atomic_commit_tail+0x1b19/0x4010
[53213.875940]  ? _raw_spin_unlock_irq+0x35/0x50
[53213.875946]  ? wait_for_completion_timeout+0x215/0x2b0
[53213.875953]  ? btrfs_rmap_block+0x9c0/0x9c0
[53213.875959]  ? dm_update_crtcs_state+0xcb0/0xcb0
[53213.875966]  ? _raw_spin_unlock_irqrestore+0x3a/0x70
[53213.875973]  ? try_to_wake_up+0xa1/0xf90
[53213.875980]  ? drm_atomic_helper_wait_for_dependencies+0x3de/0x7d0
[53213.875986]  ? normal_work_helper+0x273/0xa70
[53213.875993]  commit_tail+0x95/0xf0
[53213.876000]  process_one_work+0x7c8/0x1330
[53213.876006]  ? _raw_spin_lock_irq+0x1c/0x40
[53213.876013]  worker_thread+0xc9/0xef0
[53213.876021]  ? process_one_work+0x1330/0x1330
[53213.876026]  kthread+0x2d6/0x390
[53213.876032]  ? kthread_create_worker+0xd0/0xd0
[53213.876038]  ret_from_fork+0x22/0x40

[53213.876049] Allocated by task 508:
[53213.876056]  kasan_kmalloc+0xa0/0xd0
[53213.876063]  kmem_cache_alloc_trace+0xf3/0x1f0
[53213.876068]  dm_crtc_duplicate_state+0x73/0x130
[53213.876075]  drm_atomic_get_crtc_state+0x142/0x400
[53213.876080]  page_flip_common+0x52/0x220
[53213.876086]  drm_atomic_helper_page_flip+0xa1/0x100
[53213.876093]  drm_mode_page_flip_ioctl+0xbe3/0xff0
[53213.876100]  drm_ioctl_kernel+0x13d/0x1d0
[53213.876106]  drm_ioctl+0x63d/0x920
[53213.876112]  amdgpu_drm_ioctl+0xc7/0x1a0
[53213.876120]  do_vfs_ioctl+0x173/0xde0
[53213.876125]  ksys_ioctl+0x6b/0x80
[53213.876130]  __x64_sys_ioctl+0x6a/0xb0
[53213.876137]  do_syscall_64+0x95/0x2f0
[53213.876142]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[53213.876149] Freed by task 637:
[53213.876154]  __kasan_slab_free+0x130/0x180
[53213.876159]  kfree+0x8b/0x1c0
[53213.876164]  drm_atomic_state_default_clear+0x2c5/0xa00
[53213.876169]  __drm_atomic_state_free+0x30/0xc0
[53213.876174]  drm_atomic_helper_update_plane+0xb6/0x350
[53213.876179]  __setplane_internal+0x48c/0x7f0
[53213.876184]  drm_mode_cursor_universal+0x2e7/0x970
[53213.876189]  drm_mode_cursor_common+0x493/0x860
[53213.876194]  drm_mode_cursor_ioctl+0x7a/0xa0
[53213.876199]  drm_ioctl_kernel+0x13d/0x1d0
[53213.876203]  drm_ioctl+0x63d/0x920
[53213.876207]  amdgpu_drm_ioctl+0xc7/0x1a0
[53213.876212]  do_vfs_ioctl+0x173/0xde0
[53213.876216]  ksys_ioctl+0x6b/0x80
[53213.876221]  __x64_sys_ioctl+0x6a/0xb0
[53213.876225]  do_syscall_64+0x95/0x2f0
[53213.876230]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[53213.876239] The buggy address belongs to the object at ffff8801063aa880
                which belongs to the cache kmalloc-1024 of size 1024
[53213.876247] The buggy address is located 520 bytes inside of
                1024-byte region [ffff8801063aa880, ffff8801063aac80)
[53213.876252] The buggy address belongs to the page:
[53213.876258] page:ffffea000418ea00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
[53213.876268] flags: 0x2000000000008100(slab|head)
[53213.876278] raw: 2000000000008100 0000000000000000 0000000000000000 00000001801c001c
[53213.876284] raw: dead000000000100 dead000000000200 ffff8803f3402c40 0000000000000000
[53213.876288] page dumped because: kasan: bad access detected

[53213.876294] Memory state around the buggy address:
[53213.876300]  ffff8801063aa980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[53213.876305]  ffff8801063aaa00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[53213.876310] >ffff8801063aaa80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[53213.876313]                       ^
[53213.876319]  ffff8801063aab00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[53213.876324]  ffff8801063aab80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[53213.876327] ==================================================================
[53213.876331] Disabling lock debugging due to kernel taint
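
As a quick sanity check on the report above, the offset and region size it states can be recomputed from the addresses it prints. This is only a sketch: the addresses are copied from the log, and the assumption that one shadow byte covers 8 bytes of memory applies to generic KASAN and may not hold for other KASAN modes.

```python
# Addresses copied from the KASAN report above.
access_addr = 0xffff8801063aaa88   # "Read of size 8 at addr ..."
object_start = 0xffff8801063aa880  # "belongs to the object at ..."
object_end = 0xffff8801063aac80    # end of the reported 1024-byte region
shadow_row_start = 0xffff8801063aaa80  # the row marked with '>' in the dump

offset = access_addr - object_start
size = object_end - object_start

# Assumption: each shadow byte describes 8 bytes of memory (generic KASAN).
shadow_index = (access_addr - shadow_row_start) // 8

print(f"offset into object: {offset} bytes (0x{offset:x})")
print(f"object size: {size} bytes")
print(f"shadow byte index within the marked row: {shadow_index}")
```

The computed offset and size agree with the "520 bytes inside of 1024-byte region" line. Every shadow byte in the dumped rows is `fb`, which, as far as I know, is the value KASAN uses for freed kmalloc memory.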


I've already observed this with kernels 4.14, 4.15 and 4.16.
Comment 1 Johannes Hirte 2018-04-25 21:08:39 UTC
(gdb) list *(drm_atomic_helper_wait_for_flip_done+0x247)
0xffffffff82043447 is in drm_atomic_helper_wait_for_flip_done (drivers/gpu/drm/drm_atomic_helper.c:1381).
1376            struct drm_crtc_state *new_crtc_state;
1377            struct drm_crtc *crtc;
1378            int i;
1379
1380            for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
1381                    struct drm_crtc_commit *commit = new_crtc_state->commit;
1382                    int ret;
1383
1384                    if (!commit)
1385                            continue;
(gdb)
Comment 2 Johannes Hirte 2018-05-22 16:34:02 UTC
ping? We have rc6, a use-after-free and no developer cares?
Comment 3 mikita.lipski@amd.com 2018-05-23 17:22:13 UTC
Hi Johannes,

We have started investigating the issue. 

What's the scenario to reproduce the issue?
Comment 4 Johannes Hirte 2018-05-23 17:30:26 UTC
Sadly, I don't have a reproducer for this. I just start the system, and after some time I get the KASAN warning. Sometimes it happens soon after boot; sometimes it takes several hours.
Comment 5 mikita.lipski@amd.com 2018-05-24 18:33:07 UTC
Created attachment 276171 [details]
Patch to either duplicate or reuse an existing crtc state that might prevent use-after-free error in race condition

I wasn't able to reproduce the issue, but could you please try applying this patch and see if it makes any difference?

Also, could you add a dmesg log with drm.debug=0x6 so we can see the chain of events that caused the issue?

Thanks
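
For readers unfamiliar with the `drm.debug` bitmask requested above, the following sketch decodes a mask into category names. The bit-to-name table follows the drm module's `debug` parameter description as I understand it; treat it as an assumption and check it against the kernel version in use. The helper name `decode_drm_debug` is hypothetical.

```python
# Assumed bit assignments for the drm module's "debug" parameter;
# verify against the kernel source for the version being debugged.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


print(decode_drm_debug(0x6))
```

Under that table, `0x6` enables the DRIVER and KMS categories.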
Comment 6 mikita.lipski@amd.com 2018-05-24 20:47:21 UTC
Created attachment 276173 [details]
Patch to either duplicate or reuse an existing crtc state that might prevent use-after-free error in race condition

Sorry, the previous patch is irrelevant and was attached by mistake! Please try the one above. Thanks
Comment 7 Johannes Hirte 2018-05-25 12:00:10 UTC
dmesg output with drm.debug=0x6 and without your patch:

May 25 13:40:54 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1a010000 
May 25 13:40:54 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:54 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1de10000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: ==================================================================
May 25 13:40:55 probook kernel: BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_flip_done+0x247/0x260
May 25 13:40:55 probook kernel: Read of size 8 at addr ffff8801aa0a0688 by task kworker/u8:10/7828
May 25 13:40:55 probook kernel: 
May 25 13:40:55 probook kernel: CPU: 2 PID: 7828 Comm: kworker/u8:10 Not tainted 4.17.0-rc6-00152-ga048a07d7f45 #606
May 25 13:40:55 probook kernel: Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.15 03/26/2018
May 25 13:40:55 probook kernel: Workqueue: events_unbound commit_work
May 25 13:40:55 probook kernel: Call Trace:
May 25 13:40:55 probook kernel:  dump_stack+0x5b/0x8b
May 25 13:40:55 probook kernel:  ? drm_atomic_helper_wait_for_flip_done+0x247/0x260
May 25 13:40:55 probook kernel:  print_address_description+0x65/0x270
May 25 13:40:55 probook kernel:  ? drm_atomic_helper_wait_for_flip_done+0x247/0x260
May 25 13:40:55 probook kernel:  kasan_report+0x232/0x350
May 25 13:40:55 probook kernel:  drm_atomic_helper_wait_for_flip_done+0x247/0x260
May 25 13:40:55 probook kernel:  amdgpu_dm_atomic_commit_tail+0x1b19/0x4010
May 25 13:40:55 probook kernel:  ? _raw_spin_unlock_irq+0x35/0x50
May 25 13:40:55 probook kernel:  ? wait_for_completion_timeout+0x215/0x2b0
May 25 13:40:55 probook kernel:  ? dm_update_crtcs_state+0xde0/0xde0
May 25 13:40:55 probook kernel:  ? _raw_spin_unlock_irq+0x35/0x50
May 25 13:40:55 probook kernel:  ? finish_task_switch+0x12f/0x770
May 25 13:40:55 probook kernel:  ? drm_atomic_helper_wait_for_dependencies+0x3de/0x7d0
May 25 13:40:55 probook kernel:  commit_tail+0x95/0xf0
May 25 13:40:55 probook kernel:  process_one_work+0x7c8/0x1330
May 25 13:40:55 probook kernel:  ? _raw_spin_lock_irq+0x1c/0x40
May 25 13:40:55 probook kernel:  worker_thread+0xc9/0xef0
May 25 13:40:55 probook kernel:  ? process_one_work+0x1330/0x1330
May 25 13:40:55 probook kernel:  kthread+0x2d6/0x390
May 25 13:40:55 probook kernel:  ? kthread_create_worker+0xd0/0xd0
May 25 13:40:55 probook kernel:  ret_from_fork+0x22/0x40
May 25 13:40:55 probook kernel:
May 25 13:40:55 probook kernel: Allocated by task 614:
May 25 13:40:55 probook kernel:  kasan_kmalloc+0xa0/0xd0
May 25 13:40:55 probook kernel:  kmem_cache_alloc_trace+0xf3/0x1f0
May 25 13:40:55 probook kernel:  dm_crtc_duplicate_state+0x73/0x130
May 25 13:40:55 probook kernel:  drm_atomic_get_crtc_state+0x142/0x400
May 25 13:40:55 probook kernel:  page_flip_common+0x52/0x220
May 25 13:40:55 probook kernel:  drm_atomic_helper_page_flip+0xa1/0x100
May 25 13:40:55 probook kernel:  drm_mode_page_flip_ioctl+0xbe3/0xff0
May 25 13:40:55 probook kernel:  drm_ioctl_kernel+0x13d/0x1d0
May 25 13:40:55 probook kernel:  drm_ioctl+0x63d/0x920
May 25 13:40:55 probook kernel:  amdgpu_drm_ioctl+0xc7/0x1a0
May 25 13:40:55 probook kernel:  do_vfs_ioctl+0x173/0xde0
May 25 13:40:55 probook kernel:  ksys_ioctl+0x6b/0x80
May 25 13:40:55 probook kernel:  __x64_sys_ioctl+0x6a/0xb0
May 25 13:40:55 probook kernel:  do_syscall_64+0x95/0x2f0
May 25 13:40:55 probook kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 25 13:40:55 probook kernel:
May 25 13:40:55 probook kernel: Freed by task 626:
May 25 13:40:55 probook kernel:  __kasan_slab_free+0x130/0x180
May 25 13:40:55 probook kernel:  kfree+0x8b/0x1c0
May 25 13:40:55 probook kernel:  drm_atomic_state_default_clear+0x30f/0xbd0
May 25 13:40:55 probook kernel:  __drm_atomic_state_free+0x30/0xc0
May 25 13:40:55 probook kernel:  drm_atomic_helper_update_plane+0xb6/0x350
May 25 13:40:55 probook kernel:  __setplane_internal+0x48c/0x7f0
May 25 13:40:55 probook kernel:  drm_mode_cursor_universal+0x2e7/0x970
May 25 13:40:55 probook kernel:  drm_mode_cursor_common+0x493/0x860
May 25 13:40:55 probook kernel:  drm_mode_cursor_ioctl+0x7a/0xa0
May 25 13:40:55 probook kernel:  drm_ioctl_kernel+0x13d/0x1d0
May 25 13:40:55 probook kernel:  drm_ioctl+0x63d/0x920
May 25 13:40:55 probook kernel:  amdgpu_drm_ioctl+0xc7/0x1a0
May 25 13:40:55 probook kernel:  do_vfs_ioctl+0x173/0xde0
May 25 13:40:55 probook kernel:  ksys_ioctl+0x6b/0x80
May 25 13:40:55 probook kernel:  __x64_sys_ioctl+0x6a/0xb0
May 25 13:40:55 probook kernel:  do_syscall_64+0x95/0x2f0
May 25 13:40:55 probook kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
May 25 13:40:55 probook kernel: 
May 25 13:40:55 probook kernel: The buggy address belongs to the object at ffff8801aa0a0480
 which belongs to the cache kmalloc-1024 of size 1024
May 25 13:40:55 probook kernel: The buggy address is located 520 bytes inside of
 1024-byte region [ffff8801aa0a0480, ffff8801aa0a0880)
May 25 13:40:55 probook kernel: The buggy address belongs to the page:
May 25 13:40:55 probook kernel: page:ffffea0006a82800 count:1 mapcount:0 mapping:0000000000000000 index:0xffff8801aa0a7080 compound_mapcount: 0
May 25 13:40:55 probook kernel: flags: 0x2000000000008100(slab|head)
May 25 13:40:55 probook kernel: raw: 2000000000008100 0000000000000000 ffff8801aa0a7080 00000001801c001a
May 25 13:40:55 probook kernel: raw: 0000000000000000 0000000100000001 ffff8803f3402c40 0000000000000000
May 25 13:40:55 probook kernel: page dumped because: kasan: bad access detected
May 25 13:40:55 probook kernel: 
May 25 13:40:55 probook kernel: Memory state around the buggy address:
May 25 13:40:55 probook kernel:  ffff8801aa0a0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
May 25 13:40:55 probook kernel:  ffff8801aa0a0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
May 25 13:40:55 probook kernel: >ffff8801aa0a0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
May 25 13:40:55 probook kernel:                       ^
May 25 13:40:55 probook kernel:  ffff8801aa0a0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
May 25 13:40:55 probook kernel:  ffff8801aa0a0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
May 25 13:40:55 probook kernel: ==================================================================
May 25 13:40:55 probook kernel: Disabling lock debugging due to kernel taint
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1a010000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1de10000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1a010000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1de10000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1a010000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1de10000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] amdgpu_dm_do_flip Flipping to hi: 0xf4, low: 0x1a010000 
May 25 13:40:55 probook kernel: [drm:dm_pflip_high_irq] dm_pflip_high_irq - crtc :0[00000000bae227b0], pflip_stat:AMDGPU_FLIP_NONE
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] handle_cursor_update: crtc_id=0 with size 128 to 128
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_atomic_commit_tail] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0
May 25 13:40:55 probook kernel: [drm:amdgpu_dm_do_flip] crtc:0, pflip_stat:AMDGPU_FLIP_SUBMITTED


I'll now try your patch to see if it makes any difference.
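
The "Allocated by" and "Freed by" stacks in the report above can be separated mechanically to see which ioctl path did what. The sketch below works on a few lines copied from that report with the syslog prefix removed; the helper name `split_stacks` and the regular expressions are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Lines copied from the KASAN report above (syslog prefix stripped).
REPORT = """\
Allocated by task 614:
 kasan_kmalloc+0xa0/0xd0
 kmem_cache_alloc_trace+0xf3/0x1f0
 dm_crtc_duplicate_state+0x73/0x130
 drm_atomic_get_crtc_state+0x142/0x400
 page_flip_common+0x52/0x220
 drm_atomic_helper_page_flip+0xa1/0x100
 drm_mode_page_flip_ioctl+0xbe3/0xff0

Freed by task 626:
 __kasan_slab_free+0x130/0x180
 kfree+0x8b/0x1c0
 drm_atomic_state_default_clear+0x30f/0xbd0
 __drm_atomic_state_free+0x30/0xc0
 drm_atomic_helper_update_plane+0xb6/0x350
 __setplane_internal+0x48c/0x7f0
 drm_mode_cursor_universal+0x2e7/0x970
 drm_mode_cursor_common+0x493/0x860
 drm_mode_cursor_ioctl+0x7a/0xa0
"""

HEADER = re.compile(r"^(Allocated|Freed) by task (\d+):$")
FRAME = re.compile(r"^\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+$")


def split_stacks(report):
    """Map 'Allocated'/'Freed' to (task id, list of function names)."""
    stacks = {}
    current = None
    for line in report.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            stacks[current] = (int(header.group(2)), [])
            continue
        frame = FRAME.match(line)
        if frame and current:
            stacks[current][1].append(frame.group(1))
        else:
            current = None  # a blank or unrelated line ends the stack
    return stacks


stacks = split_stacks(REPORT)
for kind, (task, frames) in stacks.items():
    ioctl = next(f for f in frames if f.endswith("_ioctl"))
    print(f"{kind} by task {task} via {ioctl}")
```

In this report the CRTC state was allocated on the page-flip ioctl path and freed on the cursor ioctl path, while the commit worker was still reading it.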
Comment 8 Johannes Hirte 2018-05-28 16:01:49 UTC
(In reply to mikita.lipski@amd.com from comment #6)
> Created attachment 276173 [details]
> Patch to either duplicate or reuse an existing crtc state that might prevent
> use-after-free error in race condition
> 
> Sorry, the previous patch is irrelevant and was attached by mistake! Please
> try the one above. Thanks

The patch seems to help. I have been running the system for the last few days without any use-after-free.
Comment 9 Michel Dänzer 2018-06-15 14:24:51 UTC
Mikita, can you send this patch to the dri-devel mailing list for review?
Comment 10 mikita.lipski@amd.com 2018-06-21 13:10:53 UTC
Johannes,
My patch wasn't merged into DRM, but Daniel Vetter proposed another patch that might remove legacy code that causes the issue. Could you remove my patch from your tree and apply the following patch:

https://patchwork.freedesktop.org/patch/230355/

Could you please check whether it fixes the KASAN issue for you? Thanks.
Comment 11 Harry Wentland 2018-06-27 15:04:21 UTC
Should be fixed by https://patchwork.freedesktop.org/patch/230831/ which is merged into amd-staging-drm-next
Comment 12 Johannes Hirte 2018-07-03 20:57:08 UTC
(In reply to mikita.lipski@amd.com from comment #10)
> Johannes,
> My patch wasn't merged into DRM, but Daniel Vetter proposed another patch
> that might remove legacy code that causes the issue. Could you remove my
> patch from your tree and apply the following patch:
> 
> https://patchwork.freedesktop.org/patch/230355/
> 
> Could you please check whether it fixes the KASAN issue for you? Thanks.

Sorry, I don't have access to the Carrizo system at the moment. I hope to have it back in a week or so and will test it as soon as possible.
Comment 13 Johannes Hirte 2018-07-24 19:38:14 UTC
(In reply to Harry Wentland from comment #11)
> Should be fixed by https://patchwork.freedesktop.org/patch/230831/ which is
> merged into amd-staging-drm-next

Just tested with 4.18-rc6, which has this patch applied. I'm still getting the use-after-free. Looking at the patch, it seems to fix a different bug:

BUG: KASAN: use-after-free in amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]

whereas this one is:

BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_flip_done+0x212/0x270

I'll try the patch from Daniel Vetter now.
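
To make the distinction between the two reports explicit, the header lines quoted above can be parsed into function, offset and module. This is a minimal sketch; the pattern and the helper name `parse_header` are assumptions made for illustration and only cover headers shaped like these two.

```python
import re

# The two report headers quoted in the comment above.
HEADERS = [
    "BUG: KASAN: use-after-free in amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]",
    "BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_flip_done+0x212/0x270",
]

PATTERN = re.compile(
    r"BUG: KASAN: (?P<kind>\S+) in (?P<func>\S+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_header(line):
    """Split a KASAN header line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError(f"not a KASAN header: {line!r}")
    return {
        "kind": m.group("kind"),
        "func": m.group("func"),
        "offset": int(m.group("off"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None when no [module] suffix is present
    }


first, second = (parse_header(h) for h in HEADERS)
print(first["func"], first["module"])
print(second["func"], second["module"])
print("same faulting function:", first["func"] == second["func"])
```

The two headers name different faulting functions, which supports treating them as separate bugs.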
Comment 14 Johannes Hirte 2018-07-25 13:02:11 UTC
(In reply to mikita.lipski@amd.com from comment #10)
> Johannes,
> My patch wasn't merged into DRM, but Daniel Vetter proposed another patch
> that might remove legacy code that causes the issue. Could you remove my
> patch from your tree and apply the following patch:
> 
> https://patchwork.freedesktop.org/patch/230355/
> 
> Could you please check whether it fixes the KASAN issue for you? Thanks.

Doesn't avoid the use-after-free. Tested with the patch on top of 4.18-rc6.
Comment 15 mikita.lipski@amd.com 2018-07-25 14:12:08 UTC
Lyude Paul fixed this issue, please try his patch:

https://patchwork.kernel.org/patch/10480569/

Thanks
Comment 16 Johannes Hirte 2018-07-25 16:40:04 UTC
(In reply to mikita.lipski@amd.com from comment #15)
> Lyude Paul fixed this issue, please try his patch:
> 
> https://patchwork.kernel.org/patch/10480569/
> 
> Thanks

As written in https://bugzilla.kernel.org/show_bug.cgi?id=199425#c13, this patch fixes a different bug. It doesn't help with this use-after-free. Your patch is still needed.
Comment 17 Daniel Vetter 2018-08-17 09:31:53 UTC
Can you please attach a new KASAN backtrace with my patch

https://patchwork.freedesktop.org/patch/230355/

applied? Just want to double check nothing has moved, and also whether some other peculiarities of the stacktraces are invariant.
Comment 18 Johannes Hirte 2018-08-20 06:28:02 UTC
[183309.195913] ==================================================================
[183309.195937] BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_flip_done+0x212/0x270
[183309.195944] Read of size 8 at addr ffff880115b906a8 by task kworker/u8:1/12462

[183309.195956] CPU: 1 PID: 12462 Comm: kworker/u8:1 Not tainted 4.18.0-00001-g61b0dd9978b0 #14
[183309.195961] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.15 03/26/2018
[183309.195968] Workqueue: events_unbound commit_work
[183309.195973] Call Trace:
[183309.195985]  dump_stack+0x5b/0x90
[183309.195993]  print_address_description+0x60/0x229
[183309.195999]  ? drm_atomic_helper_wait_for_flip_done+0x212/0x270
[183309.196005]  kasan_report.cold.5+0x241/0x2ff
[183309.196011]  drm_atomic_helper_wait_for_flip_done+0x212/0x270
[183309.196020]  amdgpu_dm_atomic_commit_tail+0x2718/0x4040
[183309.196029]  ? _raw_spin_unlock_irq+0x35/0x50
[183309.196034]  ? wait_for_completion_timeout+0x214/0x2d0
[183309.196040]  ? commit_planes_to_stream.constprop.47+0x13b0/0x13b0
[183309.196047]  ? finish_task_switch+0x1a0/0x700
[183309.196052]  ? drm_atomic_helper_wait_for_dependencies+0x478/0x7e0
[183309.196058]  commit_tail+0x91/0xe0
[183309.196064]  process_one_work+0x866/0x1460
[183309.196071]  worker_thread+0x82/0xf60
[183309.196076]  ? _raw_spin_unlock_irqrestore+0x3a/0x70
[183309.196081]  ? __kthread_parkme+0x7d/0xf0
[183309.196086]  ? rescuer_thread+0xcd0/0xcd0
[183309.196090]  kthread+0x2cf/0x380
[183309.196095]  ? kthread_create_worker+0xd0/0xd0
[183309.196100]  ret_from_fork+0x22/0x40

[183309.196109] Allocated by task 570:
[183309.196116]  kasan_kmalloc+0xbf/0xe0
[183309.196123]  kmem_cache_alloc_trace+0xf3/0x1f0
[183309.196128]  dm_crtc_duplicate_state+0x73/0x130
[183309.196134]  drm_atomic_get_crtc_state+0x142/0x400
[183309.196138]  page_flip_common+0x52/0x220
[183309.196142]  drm_atomic_helper_page_flip+0xa1/0x100
[183309.196148]  drm_mode_page_flip_ioctl+0xc46/0x1090
[183309.196152]  drm_ioctl_kernel+0x192/0x210
[183309.196156]  drm_ioctl+0x3ea/0x850
[183309.196161]  amdgpu_drm_ioctl+0xc7/0x1a0
[183309.196165]  do_vfs_ioctl+0x18e/0xed0
[183309.196169]  ksys_ioctl+0x5b/0x90
[183309.196173]  __x64_sys_ioctl+0x6a/0xb0
[183309.196177]  do_syscall_64+0x95/0x2f0
[183309.196183]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[183309.196188] Freed by task 634:
[183309.196193]  __kasan_slab_free+0x125/0x170
[183309.196197]  kfree+0x8b/0x1c0
[183309.196202]  drm_atomic_state_default_clear+0x310/0xc40
[183309.196206]  __drm_atomic_state_free+0x30/0xc0
[183309.196210]  drm_atomic_helper_update_plane+0xa7/0x350
[183309.196214]  __setplane_internal+0x2d1/0x820
[183309.196218]  drm_mode_cursor_universal+0x2f0/0x910
[183309.196222]  drm_mode_cursor_common+0x49a/0x880
[183309.196226]  drm_mode_cursor_ioctl+0x81/0xb0
[183309.196229]  drm_ioctl_kernel+0x192/0x210
[183309.196233]  drm_ioctl+0x3ea/0x850
[183309.196237]  amdgpu_drm_ioctl+0xc7/0x1a0
[183309.196241]  do_vfs_ioctl+0x18e/0xed0
[183309.196244]  ksys_ioctl+0x5b/0x90
[183309.196248]  __x64_sys_ioctl+0x6a/0xb0
[183309.196252]  do_syscall_64+0x95/0x2f0
[183309.196256]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[183309.196263] The buggy address belongs to the object at ffff880115b90480
                 which belongs to the cache kmalloc-1024 of size 1024
[183309.196269] The buggy address is located 552 bytes inside of
                 1024-byte region [ffff880115b90480, ffff880115b90880)
[183309.196274] The buggy address belongs to the page:
[183309.196279] page:ffffea000456e400 count:1 mapcount:0 mapping:ffff8803ef002c40 index:0x0 compound_mapcount: 0
[183309.196286] flags: 0x2000000000008100(slab|head)
[183309.196294] raw: 2000000000008100 ffffea000ceba800 0000000200000002 ffff8803ef002c40
[183309.196300] raw: 0000000000000000 00000000801c001c 00000001ffffffff 0000000000000000
[183309.196303] page dumped because: kasan: bad access detected

[183309.196308] Memory state around the buggy address:
[183309.196312]  ffff880115b90580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[183309.196317]  ffff880115b90600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[183309.196321] >ffff880115b90680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[183309.196324]                                   ^
[183309.196328]  ffff880115b90700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[183309.196332]  ffff880115b90780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[183309.196335] ==================================================================
[183309.196338] Disabling lock debugging due to kernel taint


This is with kernel 4.18.0 and your patch on top.
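
For anyone cross-checking the report above, the offset KASAN prints can be reproduced from the two addresses in the log. This is only a rough sketch: the addresses are copied from the trace, the helper name offset_in_object is made up for illustration, and the 8-byte shadow granule is an assumption that holds for generic KASAN on x86_64.

```python
# Addresses copied from the KASAN report above.
ACCESS_ADDR = 0xffff880115b906a8    # "Read of size 8 at addr ..."
OBJECT_START = 0xffff880115b90480   # start of the kmalloc-1024 object
OBJECT_SIZE = 1024                  # size of the kmalloc-1024 region
SHADOW_ROW_BASE = 0xffff880115b90680  # row marked with '>' in the memory dump
KASAN_GRANULE = 8                   # assumption: generic KASAN, 8 bytes per shadow byte


def offset_in_object(addr, start, size):
    """Return the byte offset of addr inside [start, start + size), or raise."""
    offset = addr - start
    if not 0 <= offset < size:
        raise ValueError("address is outside the object")
    return offset


# Offset of the faulting read inside the freed object.
offset = offset_in_object(ACCESS_ADDR, OBJECT_START, OBJECT_SIZE)

# Index of the shadow byte within the marked row that covers the access.
shadow_index = (ACCESS_ADDR - SHADOW_ROW_BASE) // KASAN_GRANULE

print(f"offset in object: {offset} ({offset:#x})")
print(f"shadow byte index in marked row: {shadow_index}")
```

The computed offset matches the "552 bytes inside of 1024-byte region" line. The read lands in the same freed kmalloc-1024 object that was allocated in dm_crtc_duplicate_state and released via drm_atomic_state_default_clear.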