In kernel lots of objects are freed by rcu, in such cases free stack is close to useless, e.g.: https://groups.google.com/forum/#!topic/syzkaller-bugs/2pLLqWLYweo We should extend KASAN to be able to remember 1 or several aux stacks for each object in FIFO order, insert hooks that note aux stacks to call_rcu (and maybe timer/workqueue functions), and print aux stacks in reports. Ideally, we don't increase alloc_info size for this. I think we need to get back to storing free_info inside of the object.
Hi Dmitry, Interesting. It looks like to record free stack which are freed with RCU and print them in report? Walter
We already record free stack, the idea is to record call_rcu and maybe some timer/workqueue stacks for heap objects. The tricky part is how to not increase the header size.
Yes, I know to record call_rcu and print a backtrace to get where call call_rcu. My original thought is simultaneously to print two stack(free stack and call_rcu call stack) in report. If we only want to print one of stack, then we need to know who should be replaced?
We should still remember and print the free stack. We don't know if the bug is anyhow related to rcu/times/workqueue/etc, so it should be only an additional info, not replace anything we memorize/print now.
I agree you meaning. but if we don't replace someone info, then the additional call_rcu info must increase the header size? The tricky part is not easy to implement it. ~^.^~
If it would be trivial, it would be already done :) Current header is: struct kasan_alloc_meta { struct kasan_track alloc_track; struct kasan_track free_track; }; This is 16 bytes. This is a good size because this is the minimal redzone size (see optimal_redzone) and a good number of alignment. We can remove free_track by restoring kasan_free_meta (storing free meta data in the freed object). This gives us space to memorize 2 more stacks (without pid, just depot_stack_handle_t). A simple version may memorize 2 stacks, and after that if we need to memorize more either (1) overwrite a random slot, (2) always overwrite the second slot. (2) may be better because it deterministically gives first and last aux stacks, and we know which one is which (can say in the report that this is the "last" one). A more elaborate version could also store a bit somewhere which would say what of the 2 stacks is the oldest one, and then we can overwrite the oldest one, so we always have 2 latest stacks and we know which one is which.
Thank you for good suggestion. It seems to be a working solution, we will try to implement it. Thanks
The following commit by Walter Wu addresses the main part: 26e760c9a7c8 rcu: kasan: record and print call_rcu() call stack There are also these patches mailed, but not in linux-next yet: [PATCH v4 0/6] kasan: add workqueue and timer stack for generic KASAN [PATCH v2] rcu: kasan: record and print kvfree_call_rcu call stack Once they are in linux-next, I think this issues can be closed.
I think this can be considered fixed now, we have support for RCU and workqueue. Works like a charm. The credit goes to Walter Wu.
I think it also makes sense to memorize task_work_add() stacks as aux stacks otherwise there UAF reports like this: Freed by task 93: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357 ____kasan_slab_free mm/kasan/common.c:360 [inline] ____kasan_slab_free mm/kasan/common.c:325 [inline] __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367 kasan_slab_free include/linux/kasan.h:199 [inline] slab_free_hook mm/slub.c:1562 [inline] slab_free_freelist_hook+0x92/0x210 mm/slub.c:1600 slab_free mm/slub.c:3161 [inline] kmem_cache_free+0x8a/0x740 mm/slub.c:3177 rcu_do_batch kernel/rcu/tree.c:2559 [inline] rcu_core+0x74a/0x12f0 kernel/rcu/tree.c:2794 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345 Last potentially related work creation: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38 kasan_record_aux_stack+0xe5/0x110 mm/kasan/generic.c:345 __call_rcu kernel/rcu/tree.c:3039 [inline] call_rcu+0xb1/0x740 kernel/rcu/tree.c:3114 task_work_run+0xdd/0x1a0 kernel/task_work.c:140 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:174 [inline] exit_to_user_mode_prepare+0x249/0x250 kernel/entry/common.c:208 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline] syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x44/0xae This communicates almost nothing. And there is a number of these: https://groups.google.com/g/syzkaller-bugs/search?q=kasan%20use-after-free%20task_work_run task_work_add() is called in few places: https://elixir.bootlin.com/linux/v5.12-rc2/C/ident/task_work_add so it should not produce lots of aux stacks, but most notably the file closing stack seems to be critical to understand lots of reports. Walter, what do you think? Do you mind sending such patch?
It seems to be needed if we want to solve the UAF. I will send the patch. Thanks for your information.
This seems resolved: task_work_add() has kasan_record_aux_stack() as of [1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23f61f0fe106da8c9f6a883965439ecc2838f116