Bug 198437

Summary: KASAN: memorize and print call_rcu stack
Product: Memory Management Reporter: Dmitry Vyukov (dvyukov)
Component: SanitizersAssignee: Dmitry Vyukov (dvyukov)
Status: RESOLVED CODE_FIX    
Severity: normal CC: andreyknvl, kasan-dev, walter-zh.wu
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: ALL Subsystem:
Regression: No Bisected commit-id:

Description Dmitry Vyukov 2018-01-10 14:26:17 UTC
In kernel lots of objects are freed by rcu, in such cases free stack is close to useless, e.g.:
https://groups.google.com/forum/#!topic/syzkaller-bugs/2pLLqWLYweo

We should extend KASAN to be able to remember 1 or several aux stacks for each object in FIFO order, insert hooks that note aux stacks to call_rcu (and maybe timer/workqueue functions), and print aux stacks in reports.

Ideally, we don't increase alloc_info size for this. I think we need to get back to storing free_info inside of the object.
Comment 1 Walter Wu 2020-04-24 06:59:44 UTC
Hi Dmitry,

Interesting.
It looks like to record free stack which are freed with RCU and print them in report?

Walter
Comment 2 Dmitry Vyukov 2020-04-24 07:44:24 UTC
We already record free stack, the idea is to record call_rcu and maybe some timer/workqueue stacks for heap objects. The tricky part is how to not increase the header size.
Comment 3 Walter Wu 2020-04-24 08:10:53 UTC
Yes, I know to record call_rcu and print a backtrace to get where call call_rcu. 
My original thought is simultaneously to print two stack(free stack and call_rcu call stack) in report. If we only want to print one of stack, then we need to know who should be replaced?
Comment 4 Dmitry Vyukov 2020-04-24 08:29:12 UTC
We should still remember and print the free stack. We don't know if the bug is anyhow related to rcu/times/workqueue/etc, so it should be only an additional info, not replace anything we memorize/print now.
Comment 5 Walter Wu 2020-04-24 08:48:53 UTC
I agree you meaning. but if we don't replace someone info, then the additional call_rcu info must increase the header size? The tricky part is not easy to implement it. ~^.^~
Comment 6 Dmitry Vyukov 2020-04-24 08:59:48 UTC
If it would be trivial, it would be already done :)

Current header is:

struct kasan_alloc_meta {
	struct kasan_track alloc_track;
	struct kasan_track free_track;
};

This is 16 bytes. This is a good size because this is the minimal redzone size (see optimal_redzone) and a good number of alignment.

We can remove free_track by restoring kasan_free_meta (storing free meta data in the freed object). This gives us space to memorize 2 more stacks (without pid, just depot_stack_handle_t).

A simple version may memorize 2 stacks, and after that if we need to memorize more either (1) overwrite a random slot, (2) always overwrite the second slot. (2) may be better because it deterministically gives first and last aux stacks, and we know which one is which (can say in the report that this is the "last" one).

A more elaborate version could also store a bit somewhere which would say what of the 2 stacks is the oldest one, and then we can overwrite the oldest one, so we always have 2 latest stacks and we know which one is which.
Comment 7 Walter Wu 2020-04-27 04:01:41 UTC
Thank you for good suggestion. It seems to be a working solution, we will try to implement it. 

Thanks
Comment 8 Dmitry Vyukov 2020-12-01 08:01:10 UTC
The following commit by Walter Wu addresses the main part:

26e760c9a7c8 rcu: kasan: record and print call_rcu() call stack

There are also these patches mailed, but not in linux-next yet:

[PATCH v4 0/6] kasan: add workqueue and timer stack for generic KASAN
[PATCH v2] rcu: kasan: record and print kvfree_call_rcu call stack

Once they are in linux-next, I think this issues can be closed.
Comment 9 Dmitry Vyukov 2021-02-03 11:00:00 UTC
I think this can be considered fixed now, we have support for RCU and workqueue. Works like a charm. The credit goes to Walter Wu.
Comment 10 Dmitry Vyukov 2021-03-10 12:58:27 UTC
I think it also makes sense to memorize task_work_add() stacks as aux stacks otherwise there UAF reports like this:

Freed by task 93:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357
 ____kasan_slab_free mm/kasan/common.c:360 [inline]
 ____kasan_slab_free mm/kasan/common.c:325 [inline]
 __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367
 kasan_slab_free include/linux/kasan.h:199 [inline]
 slab_free_hook mm/slub.c:1562 [inline]
 slab_free_freelist_hook+0x92/0x210 mm/slub.c:1600
 slab_free mm/slub.c:3161 [inline]
 kmem_cache_free+0x8a/0x740 mm/slub.c:3177
 rcu_do_batch kernel/rcu/tree.c:2559 [inline]
 rcu_core+0x74a/0x12f0 kernel/rcu/tree.c:2794
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345

Last potentially related work creation:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_record_aux_stack+0xe5/0x110 mm/kasan/generic.c:345
 __call_rcu kernel/rcu/tree.c:3039 [inline]
 call_rcu+0xb1/0x740 kernel/rcu/tree.c:3114
 task_work_run+0xdd/0x1a0 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
 exit_to_user_mode_prepare+0x249/0x250 kernel/entry/common.c:208
 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x44/0xae

This communicates almost nothing.
And there is a number of these:

https://groups.google.com/g/syzkaller-bugs/search?q=kasan%20use-after-free%20task_work_run

task_work_add() is called in few places:

https://elixir.bootlin.com/linux/v5.12-rc2/C/ident/task_work_add

so it should not produce lots of aux stacks, but most notably the file closing stack seems to be critical to understand lots of reports.

Walter, what do you think? Do you mind sending such patch?
Comment 11 Walter Wu 2021-03-11 02:14:28 UTC
It seems to be needed if we want to solve the UAF. I will send the patch. 

Thanks for your information.
Comment 12 Andrey Konovalov 2022-02-28 00:31:33 UTC
This seems resolved: task_work_add() has kasan_record_aux_stack() as of [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23f61f0fe106da8c9f6a883965439ecc2838f116