LINUX KERNEL 5.4.0-rc2 - __blk_add_trace use-after-free 0x01 - Introduction === # Product: Linux Kernel # Version: 5.4.0-rc2 (and probably other versions) # Bug: UAF (Read) # Tested on: GNU/Linux Debian 9 x86_64 0x02 - Details === There is a UAF read in the __blk_add_trace function which is used to fill out a blk_io_trace structure and places it in a per-cpu subbuffer. Code analysis (kernel/trace/blktrace.c): static void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes, int op, int op_flags, u32 what, int error, int pdu_len, void *pdu_data, union kernfs_node_id *cgid) { struct task_struct *tsk = current; struct ring_buffer_event *event = NULL; struct ring_buffer *buffer = NULL; struct blk_io_trace *t; unsigned long flags = 0; unsigned long *sequence; pid_t pid; int cpu, pc = 0; bool blk_tracer = blk_tracer_enabled; ssize_t cgid_len = cgid ? sizeof(*cgid) : 0; if (unlikely(bt->trace_state != Blktrace_running && !blk_tracer)) // UAF read here return; what |= ddir_act[op_is_write(op) ? WRITE : READ]; what |= MASK_TC_BIT(op_flags, SYNC); ../.. 0x03 - Crash report === BUG: KASAN: use-after-free in __blk_add_trace.constprop.27+0x801/0xbb0 kernel/trace/blktrace.c:228 Read of size 4 at addr ffff88803019be00 by task jbd2/sda-8/100 CPU: 1 PID: 100 Comm: jbd2/sda-8 Not tainted 5.4.0-rc2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x75/0xae lib/dump_stack.c:113 print_address_description.constprop.6+0x16/0x220 mm/kasan/report.c:374 __kasan_report.cold.9+0x1a/0x40 mm/kasan/report.c:506 kasan_report+0xe/0x20 mm/kasan/common.c:634 __blk_add_trace.constprop.27+0x801/0xbb0 kernel/trace/blktrace.c:228 blk_add_trace_bio.isra.19+0xb1/0x100 kernel/trace/blktrace.c:865 trace_block_bio_queue include/trace/events/block.h:357 [inline] generic_make_request_checks+0xd05/0x1300 block/blk-core.c:966 generic_make_request+0x7e/0x910 block/blk-core.c:1018 submit_bio+0xa7/0x3c0 block/blk-core.c:1190 submit_bh_wbc.isra.56+0x504/0x6e0 fs/buffer.c:3095 jbd2_journal_commit_transaction+0x2639/0x6365 fs/jbd2/commit.c:727 kjournald2+0x1fd/0x7f0 fs/jbd2/journal.c:210 kthread+0x308/0x3c0 kernel/kthread.c:255 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352 Allocated by task 10077: // <--- where it has been allocated save_stack+0x19/0x80 mm/kasan/common.c:69 set_track mm/kasan/common.c:77 [inline] __kasan_kmalloc mm/kasan/common.c:510 [inline] __kasan_kmalloc.constprop.6+0xc1/0xd0 mm/kasan/common.c:483 kmalloc include/linux/slab.h:556 [inline] kzalloc include/linux/slab.h:690 [inline] do_blk_trace_setup+0x157/0xaa0 kernel/trace/blktrace.c:497 __blk_trace_setup+0xa8/0x140 kernel/trace/blktrace.c:571 blk_trace_setup+0x41/0x60 kernel/trace/blktrace.c:589 sg_ioctl+0xd50/0x2540 drivers/scsi/sg.c:1112 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1a0/0x1000 fs/ioctl.c:696 ksys_ioctl+0x77/0xa0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:718 do_syscall_64+0x9a/0x330 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 340: // <--- where it has been freed save_stack+0x19/0x80 mm/kasan/common.c:69 set_track mm/kasan/common.c:77 [inline] kasan_set_free_info mm/kasan/common.c:332 [inline] __kasan_slab_free+0x125/0x170 mm/kasan/common.c:471 slab_free_hook mm/slub.c:1424 [inline] slab_free_freelist_hook mm/slub.c:1475 [inline] slab_free mm/slub.c:3018 [inline] kfree+0x90/0x240 mm/slub.c:3967 blk_trace_cleanup kernel/trace/blktrace.c:339 [inline] __blk_trace_remove+0x5f/0x80 kernel/trace/blktrace.c:352 blk_trace_remove+0x21/0x30 kernel/trace/blktrace.c:362 sg_ioctl+0x66f/0x2540 drivers/scsi/sg.c:1121 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1a0/0x1000 fs/ioctl.c:696 ksys_ioctl+0x77/0xa0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:718 do_syscall_64+0x9a/0x330 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff88803019be00 which belongs to the cache kmalloc-128 of size 128 // <--- the object's cache The buggy address is located 0 bytes inside of 128-byte region [ffff88803019be00, ffff88803019be80) The buggy address belongs to the page: page:ffffea0000c066c0 refcount:1 mapcount:0 mapping:ffff888035c01640 index:0xffff88803019bf00 flags: 0x100000000000200(slab) raw: 0100000000000200 ffffea0000d51788 ffff888035c00d10 ffff888035c01640 raw: ffff88803019bf00 0000000000100006 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88803019bd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc ffff88803019bd80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88803019be00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88803019be80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88803019bf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================
It seems to me that to properly resolve this race, we need to protect q->blk_trace with RCU and do the teardown and freeing of the structure only after the RCU period expires... Any better idea?
Created attachment 287171 [details] blktrace: Protect q->blktrace with RCU I've submitted this patch upstream.
Tristan, I'm not sure how reproducible this is but if you can reasonably reproduce, I'd be happy if you checked whether this patch fixes the problem for you.
(In reply to Jan Kara from comment #3) > Tristan, I'm not sure how reproducible this is but if you can reasonably > reproduce, I'd be happy if you checked whether this patch fixes the problem > for you. Hi, Jan Kara. I am also trying to fix this problem and and I am now trying to reproduce it. I think the two functions: sysfs_blk_trace_attr_store() and sysfs_blk_trace_attr_show() are okay and do not need to be modified because mutex_lock(&q->blk_trace_mutex) provides sufficient protection for this. The other modifications are basically similar to your patch.
(In reply to Jan Kara from comment #1) > It seems to me that to properly resolve this race, we need to protect > q->blk_trace with RCU and do the teardown and freeing of the structure only > after the RCU period expires... Any better idea? IMO, RCU might be the best approach. It should be fixed by allocating 'struct blk_trace' for the request queue statically, or using percpu_ref too. Looks RCU is still better and simpler.
The patches are committed - c780e86dd48 ("blktrace: Protect q->blk_trace with RCU"). Closing the bug.