Bug 203497
Summary: | KASAN (sw-tags): support stack instrumentation | ||
---|---|---|---|
Product: | Memory Management | Reporter: | Andrey Konovalov (andreyknvl) |
Component: | Sanitizers | Assignee: | MM/Sanitizers virtual assignee (mm_sanitizers) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | dvyukov, kasan-dev, melver, walter-zh.wu |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | upstream | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Andrey Konovalov
2019-05-03 16:03:25 UTC
Hi Andrey, I try to make -hwasan-instrument-stack=1 and KASAN_OUTLINE=y, then it gets many false positive case as you saying. We break down the case which executed kasan_stack_oob(), the following is shown. a). Modify the following testcase is not oob case to see why it is triggered. static noinline void kasan_stack_oob(void) { char stack_array[80]; char *p = &stack_array[17]; // it should not trigger *(volatile char *)p; pr_info("out-of-bounds on stack 0x%lx\n",p); } b). trigger by kasan_stack_oob+0x48 [ 36.096929] ================================================================== [ 36.097867] BUG: KASAN: invalid-access in kasan_stack_oob+0x48/0x94 [ 36.098465] Read of size 1 at addr 00ff0000704875b1 by task cat/179 [ 36.098996] Pointer tag: [00], memory tag: [04] [ 36.099421] [ 36.099996] CPU: 3 PID: 179 Comm: cat Tainted: G B 5.6.0-next-20200408-dirty #13 [ 36.100617] Hardware name: linux,dummy-virt (DT) [ 36.101049] Call trace: [ 36.101489] dump_backtrace+0x0/0x260 [ 36.101976] show_stack+0x14/0x1c [ 36.102451] dump_stack+0xe0/0x150 [ 36.102944] print_address_description+0x8c/0x398 [ 36.103485] __kasan_report+0x14c/0x22c [ 36.103991] kasan_report+0x3c/0x58 [ 36.104470] check_memory_region+0x98/0xa0 [ 36.104981] __hwasan_load1_noabort+0x18/0x20 [ 36.105501] kasan_stack_oob+0x48/0x94 ... [ 36.118039] Memory state around the buggy address: [ 36.118593] ffff000070487300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 36.119247] ffff000070487400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 36.119894] >ffff000070487500: 00 00 00 00 00 00 00 00 00 00 04 04 04 04 04 00 [ 36.120473] ^ [ 36.121067] ffff000070487600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 36.121712] ffff000070487700: 00 00 00 00 00 00 00 00 00 00 34 34 34 f4 74 74 [ 36.122271] ================================================================== c). We think the wrong is "and x20, x10, x9", x20 doesn't get correct tag which will store in shadow memory, x20[56:63] and x1[0:7] should have the same? so the next tag comparing may wrong. ffff9000104a331c <kasan_stack_oob>: ffff9000104a331c: d10203ff sub sp, sp, #0x80 ffff9000104a3320: a9067bfd stp x29, x30, [sp, #96] ffff9000104a3324: a9074ff4 stp x20, x19, [sp, #112] ffff9000104a3328: 910183fd add x29, sp, #0x60 ffff9000104a332c: d000eec8 adrp x8, ffff90001227d000 <page_wait_table+0x14c0> ffff9000104a3330: f944a508 ldr x8, [x8, #2376] ffff9000104a3334: ca5d53a1 eor x1, x29, x29, lsr #20 ffff9000104a3338: 92ffe009 mov x9, #0xffffffffffffff // #72057594037927935 ffff9000104a333c: 910003ea mov x10, sp ffff9000104a3340: b3481c29 bfi x9, x1, #56, #8 ffff9000104a3344: 910003e0 mov x0, sp ffff9000104a3348: 52800a02 mov w2, #0x50 // #80 ffff9000104a334c: f81f83a8 stur x8, [x29, #-8] ffff9000104a3350: 8a090154 and x20, x10, x9 ffff9000104a3354: 97fb24c7 bl ffff90001036c670 <__hwasan_tag_memory> ffff9000104a3358: 91004693 add x19, x20, #0x11 ffff9000104a335c: aa1303e0 mov x0, x19 ffff9000104a3360: 97fb2466 bl ffff90001036c4f8 <__hwasan_load1_noabort> AFAIU the issue here is that HWASAN instrumentation doesn't expect the sp register to be tagged, but with KASAN it might happen if the stack is allocated in slab memory.
This patch should fix the issue:
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..c9c76a4d1180 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -261,7 +261,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
THREAD_SIZE_ORDER);
if (likely(page)) {
- tsk->stack = page_address(page);
+ tsk->stack = kasan_reset_tag(page_address(page));
return tsk->stack;
}
return NULL;
@@ -307,6 +307,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
{
unsigned long *stack;
stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+ stack = kasan_reset_tag(stack);
tsk->stack = stack;
return stack;
}
However even with this change there's something else that's wrong, I still see the following crash during boot:
==================================================================
BUG: KASAN: invalid-access in start_kernel+0xd0/0x568
Read of size 8 at addr 63ff900012337f90 by task swapper/0
Pointer tag: [63], memory tag: [ff]
CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-rc1-15086-gb3a9e3b9622a-dirty #64
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2e4 arch/arm64/kernel/time.c:51
show_stack+0x1c/0x28 arch/arm64/kernel/traps.c:142
__dump_stack lib/dump_stack.c:77
dump_stack+0xf0/0x16c lib/dump_stack.c:118
print_address_description+0x7c/0x308 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513
kasan_report+0x19c/0x26c mm/kasan/report.c:530
check_memory_region+0xa4/0xac mm/kasan/tags.c:127
__hwasan_load8_noabort+0x44/0x50 mm/kasan/tags.c:144
start_kernel+0xd0/0x568 init/main.c:854
Memory state around the buggy address:
ffff900012337d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff900012337e00: 00 00 00 00 00 00 00 00 00 00 00 ff 00 00 ff ff
>ffff900012337f00: ff 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff900012338000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff900012338100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
If I understand correctly, I originally think this issue can fixed by adding an additional instruction "bfi x20, x1, #56, #8" after this instruction "and x20, x10, x9". After this modification, x20[56:63] can get the x1[0:7]. Its purpose is shown below: 1) x20[56:63] is pointer tag. It will needed by 3). 2) x1[0:7] is random tag, it generate from "eor x1, x29, x29, lsr #20". This tag will store in shadow memory. It will needed by 3). 3) __hwasan_load1_noabort() will check whether x20[56:63] and x1[0:7] are the same. --- assemble code: ffff9000104a331c <kasan_stack_oob>: ffff9000104a331c: d10203ff sub sp, sp, #0x80 ffff9000104a3320: a9067bfd stp x29, x30, [sp, #96] ffff9000104a3324: a9074ff4 stp x20, x19, [sp, #112] ffff9000104a3328: 910183fd add x29, sp, #0x60 ffff9000104a332c: d000eec8 adrp x8, ffff90001227d000 <page_wait_table+0x14c0> ffff9000104a3330: f944a508 ldr x8, [x8, #2376] ffff9000104a3334: ca5d53a1 eor x1, x29, x29, lsr #20 ffff9000104a3338: 92ffe009 mov x9, #0xffffffffffffff // #72057594037927935 ffff9000104a333c: 910003ea mov x10, sp ffff9000104a3340: b3481c29 bfi x9, x1, #56, #8 ffff9000104a3344: 910003e0 mov x0, sp ffff9000104a3348: 52800a02 mov w2, #0x50 // #80 ffff9000104a334c: f81f83a8 stur x8, [x29, #-8] ffff9000104a3350: 8a090154 and x20, x10, x9 ffff9000104a3354: 97fb24c7 bl ffff90001036c670 <__hwasan_tag_memory> ffff9000104a3358: 91004693 add x19, x20, #0x11 ffff9000104a335c: aa1303e0 mov x0, x19 ffff9000104a3360: 97fb2466 bl ffff90001036c4f8 <__hwasan_load1_noabort> This can be fixed by changing instrumentation, but I prefer the solution with untagging the kernel stacks. Userspace HWASAN can't have sp register tagged right now, and I think we should make the kernel do the same. Agree, It makes sense that kernel and user have the same instrumentation. I try to trace code and reproduce this issue after only modify untagging kernel stack, I found below code need to untag the sp? --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2575,7 +2576,7 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, .child_tid = child_tidptr, .parent_tid = parent_tidptr, .exit_signal = (clone_flags & CSIGNAL), - .stack = newsp, + .stack = kasan_reset_tag(newsp), .tls = tls, }; After apply your and my patch, it always trigger the following one report during boot. I will see why it always is triggered. ================================================================== [ 0.002285] BUG: KASAN: invalid-access in format_decode+0x90/0x10fc [ 0.002306] Read of size 8 at addr 74ff900015447a00 by task swapper/0 [ 0.002324] Pointer tag: [74], memory tag: [08] [ 0.002347] [ 0.002375] CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-next-20200408-dirty #4 [ 0.002395] Hardware name: linux,dummy-virt (DT) [ 0.002412] Call trace: [ 0.002430] dump_backtrace+0x0/0x578 [ 0.002447] show_stack+0x14/0x1c [ 0.002464] dump_stack+0x188/0x260 [ 0.002481] print_address_description+0x8c/0x398 [ 0.002498] __kasan_report+0x14c/0x1dc [ 0.002515] kasan_report+0x3c/0x58 [ 0.002533] check_memory_region+0x98/0xa0 [ 0.002551] __hwasan_loadN_noabort+0x14/0x1c [ 0.002568] format_decode+0x90/0x10fc [ 0.002585] vsnprintf+0x184/0x31e4 [ 0.002602] vscnprintf+0x80/0xd4 [ 0.002619] vprintk_store+0x98/0x93c [ 0.002636] vprintk_emit+0x168/0x79c [ 0.002653] vprintk_default+0x78/0xa8 [ 0.002670] vprintk_func+0x918/0x9a0 [ 0.002687] printk+0xb8/0xf0 [ 0.002704] kasan_init+0x2b8/0x2d8 [ 0.002721] setup_arch+0x460/0xbc8 [ 0.002738] start_kernel+0xe4/0xb88 [ 0.002755] [ 0.002771] [ 0.002789] Memory state around the buggy address: [ 0.002807] ffff900015447800: 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff [ 0.002826] ffff900015447900: ff ff ff ff ff 08 ff ff ff ff ff ff ff ff ff ff [ 0.002844] >ffff900015447a00: 08 ff ff ff ff ff ff ff e4 e4 ff ff ff ff ff ff [ 0.002861] ^ [ 0.002879] ffff900015447b00: ff 14 14 ff ff ff ff ff ff ff ff ff a4 a4 ff ff [ 0.002898] ffff900015447c00: ff ff ff ff ff d4 d4 ff ff ff ff ff ff 94 94 d4 [ 0.002916] ================================================================== newsp is a userspace pointer, it's incorrect to use it with kasan_reset_tag(). Why do you need to do anything with it anyway? I forget it is syscall, we should not need modify this sp. I always see below the report on qemu, the stack variable is 'spec', but it should not have corruption. ================================================================== [ 0.002089] BUG: KASAN: invalid-access in format_decode+0x90/0x10fc [ 0.002110] Read of size 8 at addr 74ff900015447a00 by task swapper/0 [ 0.002127] Pointer tag: [74], memory tag: [08] [ 0.002148] [ 0.002175] CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-next-20200408-dirty #23 [ 0.002193] Hardware name: linux,dummy-virt (DT) [ 0.002209] Call trace: [ 0.002225] dump_backtrace+0x0/0x578 [ 0.002242] show_stack+0x14/0x1c [ 0.002258] dump_stack+0x188/0x260 [ 0.002274] print_address_description+0x8c/0x398 [ 0.002291] __kasan_report+0x14c/0x1dc [ 0.002307] kasan_report+0x3c/0x58 [ 0.002323] check_memory_region+0x98/0xa0 [ 0.002339] __hwasan_loadN_noabort+0x14/0x1c [ 0.002356] format_decode+0x90/0x10fc [ 0.002372] vsnprintf+0x184/0x31e4 [ 0.002388] vscnprintf+0x80/0xd4 [ 0.002404] vprintk_store+0x98/0x93c [ 0.002420] vprintk_emit+0x168/0x79c [ 0.002436] vprintk_default+0x78/0xa8 [ 0.002452] vprintk_func+0x918/0x9a0 [ 0.002468] printk+0xb8/0xf0 [ 0.002484] kasan_init+0x2b8/0x2d8 [ 0.002500] setup_arch+0x460/0xbc8 [ 0.002517] start_kernel+0xe4/0xb88 [ 0.002532] [ 0.002548] [ 0.002564] Memory state around the buggy address: [ 0.002582] ffff900015447800: 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff [ 0.002599] ffff900015447900: ff ff ff ff ff 08 ff ff ff ff ff ff ff ff ff ff [ 0.002617] >ffff900015447a00: 08 ff ff ff ff ff ff ff e4 e4 ff ff ff ff ff ff [ 0.002633] ^ [ 0.002650] ffff900015447b00: ff 14 14 ff ff ff ff ff ff ff ff ff a4 a4 ff ff [ 0.002668] ffff900015447c00: ff ff ff ff ff d4 d4 ff ff ff ff ff ff 94 94 d4 [ 0.002685] ================================================================== I try to add below patch then I see that "BUG: KASAN: invalid-access in start_kernel". I am not sure whether you need this information. --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -2278,7 +2278,7 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr, * @precision: precision of a number * @qualifier: qualifier of a number (long, size_t, ...) */ -static noinline_for_stack +static int format_decode(const char *fmt, struct printf_spec *spec) { const char *start = fmt; My environment: - clang-r377782d - linux-next-20200408 ================================================================== [ 0.000000] BUG: KASAN: invalid-access in start_kernel+0x718/0xb88 [ 0.000000] Read of size 8 at addr 74ff900015447f70 by task swapper/0 [ 0.000000] Pointer tag: [74], memory tag: [ff] [ 0.000000] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-next-20200408-dirty #4 [ 0.000000] Hardware name: linux,dummy-virt (DT) [ 0.000000] Call trace: [ 0.000000] dump_backtrace+0x0/0x578 [ 0.000000] show_stack+0x14/0x1c [ 0.000000] dump_stack+0x188/0x260 [ 0.000000] print_address_description+0x8c/0x398 [ 0.000000] __kasan_report+0x14c/0x1dc [ 0.000000] kasan_report+0x3c/0x58 [ 0.000000] kasan_handler+0x88/0x22c [ 0.000000] early_brk64+0x1c/0x38 [ 0.000000] do_debug_exception+0x4c4/0x814 [ 0.000000] el1_sync_handler+0x40/0x244 [ 0.000000] el1_sync+0x7c/0x100 [ 0.000000] start_kernel+0x718/0xb88 [ 0.000000] --- It looks like that the KASAN report is triggered by start_kernel(), as I remember that process 0 execute start_kernel(), the process 0 should be init_task, the tag of init_task.stack should be 0xff? below is init_task structure. struct task_struct init_task #ifdef CONFIG_ARCH_TASK_STRUCT_ON_STACK __init_task_data #endif = { ... .stack = init_stack, No, init_stack is a global and it's not tagged. The problem here is that in start_kernel() we tag stack variables before kasan_init(), and those tags end up in temporary shadow memory. Later we enable KASAN, switch to normal shadow, and start checking tags, which are missing in normal shadow. We could disable KASAN instrumentation for start_kernel() (and probably for arm64's setup_arch() even though it doesn't have any stack variables right now). Dmitry, WDYT? This won't work, as there's a number of functions that are inlined into start_kernel(), and we'll need to mark those with __no_sanitize_address too. We could disable instrumentation of kernel/init.c, but it seems quite harsh. Maybe we can only disable stack instrumentation for tag-based mode for those files via something like KASAN_SANITIZE_STACK_init.o := n. (In reply to Andrey Konovalov from comment #11) > This won't work, as there's a number of functions that are inlined into > start_kernel(), and we'll need to mark those with __no_sanitize_address too. > We could disable instrumentation of kernel/init.c, but it seems quite harsh. Does tagged ASAN's inlining behaviour differ to ASAN? AFAIK, they don't, and inlined functions still "inherit" whether or not they're instrumented from the parent function. Same for __always_inline functions. So there is, AFAIK, no need to explicitly mark the inlined functions __no_sanitize. Hm, right, then the issue I'm getting is probably unrelated. Let me debug further. Thanks! Walter, please pass the stack through the scripts/decode_stacktrace.sh script to add line numbers. It would be useful to understand what exactly variable causes the report. In comments 8 and 9 you posted different stacks, which of these do you see and when? I would try KASAN_SANITIZE_STACK_init.o := n anyway b/c it's just cheap to try. One thing I would is the following. At the end of kasan_init we do: /* At this point kasan is fully initialized. Enable error messages */ init_task.kasan_depth = 0; I would try to move it way later, say to beginning of rest_init. Regardless of result it will give some additional information: is the problem related to start_kernel only? or it affects later execution as well? (In reply to Dmitry Vyukov from comment #15) > Walter, please pass the stack through the scripts/decode_stacktrace.sh > script to add line numbers. It would be useful to understand what exactly > variable causes the report. > In comments 8 and 9 you posted different stacks, which of these do you see > and when? > I can reproduce both reports. Please refer below the KASAN report in Comment 8. The corruption stack variable is "spec", but it should be a false positive, because it is declared by vsnprintf() and should be valid. I see the call stack, this report should be triggered by "pr_info("KernelAddressSanitizer initialized\n");". Unfortunately, it should be first false positive case after finish KASAN initialization. format_decode() trigger many KASAN reports during booting. But if I remove the noinline_for_stack in format_decode(), I will not see any KASAN reports which trigger by format_decode(), then I only see the report in Comment 9. ================================================================== [ 0.002077] BUG: KASAN: invalid-access in format_decode (outfolder/../lib/vsprintf.c:2288) [ 0.002098] Read of size 8 at addr 74ff900015447a00 by task swapper/0 [ 0.002116] Pointer tag: [74], memory tag: [08] [ 0.002138] [ 0.002167] CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-next-20200408-dirty #6 [ 0.002187] Hardware name: linux,dummy-virt (DT) [ 0.002203] Call trace: [ 0.002220] dump_backtrace (outfolder/../arch/arm64/kernel/traps.c:87) [ 0.002236] show_stack (outfolder/../arch/arm64/kernel/traps.c:143) [ 0.002253] dump_stack (outfolder/../lib/dump_stack.c:121) [ 0.002269] print_address_description (outfolder/../mm/kasan/report.c:383) [ 0.002286] __kasan_report (outfolder/../mm/kasan/report.c:512) [ 0.002303] kasan_report (outfolder/../mm/kasan/common.c:625) [ 0.002319] check_memory_region (outfolder/../mm/kasan/tags.c:128) [ 0.002336] __hwasan_loadN_noabort (outfolder/../mm/kasan/tags.c:151) [ 0.002353] format_decode (outfolder/../lib/vsprintf.c:2288) [ 0.002369] vsnprintf (outfolder/../lib/vsprintf.c:2530) [ 0.002386] vscnprintf (outfolder/../lib/vsprintf.c:2679) [ 0.002411] vprintk_store (outfolder/../kernel/printk/printk.c:1918) [ 0.002429] vprintk_emit (outfolder/../kernel/printk/printk.c:1977) [ 0.002445] vprintk_default (outfolder/../kernel/printk/printk.c:2023) [ 0.002462] vprintk_func (outfolder/../kernel/printk/printk_safe.c:386) [ 0.002478] printk (outfolder/../kernel/printk/printk.c:2057) [ 0.002494] kasan_init (outfolder/../arch/arm64/mm/kasan_init.c:266) [ 0.002510] setup_arch (outfolder/../arch/arm64/kernel/setup.c:338) [ 0.002526] start_kernel (outfolder/../init/main.c:817) [ 0.002543] [ 0.002558] [ 0.002575] Memory state around the buggy address: [ 0.002593] ffff900015447800: 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff [ 0.002611] ffff900015447900: ff ff ff ff ff 08 ff ff ff ff ff ff ff ff ff ff [ 0.002628] >ffff900015447a00: 08 ff ff ff ff ff ff ff e4 e4 ff ff ff ff ff ff [ 0.002645] ^ [ 0.002663] ffff900015447b00: ff 14 14 ff ff ff ff ff ff ff ff ff a4 a4 ff ff [ 0.002680] ffff900015447c00: ff ff ff ff ff d4 d4 ff ff ff ff ff ff 94 94 d4 [ 0.002698] ================================================================== ================================================================== [ 0.000000] BUG: KASAN: invalid-access in start_kernel (outfolder/../init/main.c:817) [ 0.000000] Read of size 8 at addr 74ff900015447f70 by task swapper/0 [ 0.000000] Pointer tag: [74], memory tag: [ff] [ 0.000000] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-next-20200408-dirty #5 [ 0.000000] Hardware name: linux,dummy-virt (DT) [ 0.000000] Call trace: [ 0.000000] dump_backtrace (outfolder/../arch/arm64/kernel/traps.c:87) [ 0.000000] show_stack (outfolder/../arch/arm64/kernel/traps.c:143) [ 0.000000] dump_stack (outfolder/../lib/dump_stack.c:121) [ 0.000000] print_address_description (outfolder/../mm/kasan/report.c:383) [ 0.000000] __kasan_report (outfolder/../mm/kasan/report.c:512) [ 0.000000] kasan_report (outfolder/../mm/kasan/common.c:625) [ 0.000000] kasan_handler (outfolder/../arch/arm64/kernel/traps.c:1019) [ 0.000000] early_brk64 (outfolder/../arch/arm64/kernel/traps.c:1045) [ 0.000000] do_debug_exception (outfolder/../arch/arm64/mm/fault.c:873) [ 0.000000] el1_sync_handler (outfolder/../arch/arm64/kernel/entry-common.c:98) [ 0.000000] el1_sync (outfolder/../arch/arm64/kernel/entry.S:588) [ 0.000000] start_kernel (outfolder/../init/main.c:817) [ 0.000000] [ 0.000000] [ 0.000000] Memory state around the buggy address: [ 0.000000] ffff900015447d00: 00 00 00 00 00 00 00 00 00 ff ff ff ff 00 ff ff [ 0.000000] ffff900015447e00: 00 00 00 00 00 00 00 00 ff 00 ff 00 00 ff ff 00 [ 0.000000] >ffff900015447f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 0.000000] ^ [ 0.000000] ffff900015448000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 0.000000] ffff900015448100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 0.000000] ================================================================== I think I fixed those, patches are here: https://github.com/xairy/linux/commits/up-kasan-stack-tags With these patches and outline instrumentation the kernel without any false-positives for me, but there are still some issues with inline instrumentation. I found inline instrumentation root cause. I am not sure if there are still other issues besides this. First, I write one testcase which always reproduces false positive case and it is simpler, please refer below information. struct foo { unsigned int t1:8; unsigned int t2:8; unsigned int t3:8; unsigned int t4:8; unsigned int t5:8; }; static noinline_for_stack void fun(struct foo *spec) { spec->t1 = 1; // always trigger kasan report, but it is false positives } static noinline void kasan_stack_oob(void) { struct foo spec = {0}; fun(&spec); } [ 10.508952] BUG: KASAN: invalid-access in fun+0x30/0x40 [ 10.509222] Write of size 8 at addr 4bff000070b2f520 by task cat/179 [ 10.509487] Pointer tag: [4b], memory tag: [08] [ 10.509676] [ 10.509859] CPU: 3 PID: 179 Comm: cat Tainted: G B 5.6.0-next-20200408-dirty #25 [ 10.510175] Hardware name: linux,dummy-virt (DT) [ 10.510362] Call trace: [ 10.510632] dump_backtrace+0x0/0x578 [ 10.510880] show_stack+0x14/0x1c [ 10.511147] dump_stack+0x188/0x260 [ 10.511353] print_address_description+0x8c/0x398 [ 10.511588] __kasan_report+0x14c/0x1dc [ 10.511796] kasan_report+0x3c/0x58 [ 10.511997] check_memory_region+0x98/0xa0 [ 10.512214] __hwasan_storeN_noabort+0x14/0x1c [ 10.512439] fun+0x30/0x40 [ 10.512621] kasan_stack_oob+0x70/0xcc Second, see the below assembly code, I found the shadow memory which map the address, it store #8 value, please see that "mov w12, #0x8 // #8". It should store the tag of address? So we see many false positive KASAN reports always show "memory tag: [08]" after finish KASAN initialization. This is kernel or clang bug? ffff900010d69c7c <kasan_stack_oob>: ffff900010d69c7c: d10103ff sub sp, sp, #0x40 ffff900010d69c80: a9027bfd stp x29, x30, [sp, #32] ffff900010d69c84: a9034ff4 stp x20, x19, [sp, #48] ffff900010d69c88: 910083fd add x29, sp, #0x20 ffff900010d69c8c: 90023728 adrp x8, ffff90001544d000 <page_wait_table+0x14c0> ffff900010d69c90: f944a508 ldr x8, [x8, #2376] ffff900010d69c94: ca5d53a9 eor x9, x29, x29, lsr #20 ffff900010d69c98: 92ffe00a mov x10, #0xffffffffffffff // #72057594037927935 ffff900010d69c9c: 910003eb mov x11, sp ffff900010d69ca0: b3481d2a bfi x10, x9, #56, #8 ffff900010d69ca4: d2d20013 mov x19, #0x900000000000 // #158329674399744 ffff900010d69ca8: 8a0a0160 and x0, x11, x10 ffff900010d69cac: f2fdfff3 movk x19, #0xefff, lsl #48 ffff900010d69cb0: 5280010c mov w12, #0x8 // #8 ; w12 should be the value of shadow memory ffff900010d69cb4: d344fd74 lsr x20, x11, #4 ffff900010d69cb8: f81f83a8 stur x8, [x29, #-8] ffff900010d69cbc: b2481c08 orr x8, x0, #0xff00000000000000 ffff900010d69cc0: 38336a8c strb w12, [x20, x19] ; w12 write into shadow memory ffff900010d69cc4: 39003fe9 strb w9, [sp, #15] ffff900010d69cc8: d344fd09 lsr x9, x8, #4 ffff900010d69ccc: 3873692a ldrb w10, [x9, x19] ffff900010d69cd0: d378fc09 lsr x9, x0, #56 ffff900010d69cd4: 6b0a013f cmp w9, w10 ffff900010d69cd8: 54000060 b.eq ffff900010d69ce4 <kasan_stack_oob+0x68> // b.none ffff900010d69cdc: 7103fd3f cmp w9, #0xff ffff900010d69ce0: 540001a1 b.ne ffff900010d69d14 <kasan_stack_oob+0x98> // b.any ffff900010d69ce4: f900001f str xzr, [x0] ffff900010d69ce8: 94000018 bl ffff900010d69d48 <fun> ... ffff900010d69d48 <fun>: ffff900010d69d48: a9be7bfd stp x29, x30, [sp, #-32]! ffff900010d69d4c: a9014ff4 stp x20, x19, [sp, #16] ffff900010d69d50: 910003fd mov x29, sp ffff900010d69d54: 52800101 mov w1, #0x8 // #8 ffff900010d69d58: aa0003f3 mov x19, x0 ffff900010d69d5c: 97f0647f bl ffff900010982f58 <__hwasan_loadN_noabort> ffff900010d69d60: f9400268 ldr x8, [x19] ffff900010d69d64: 52800101 mov w1, #0x8 // #8 ffff900010d69d68: aa1303e0 mov x0, x19 ffff900010d69d6c: 9278dd08 and x8, x8, #0xffffffffffffff00 ffff900010d69d70: b2400114 orr x20, x8, #0x1 ffff900010d69d74: 97f06480 bl ffff900010982f74 <__hwasan_storeN_noabort> ffff900010d69d78: f9000274 str x20, [x19] ffff900010d69d7c: a9414ff4 ldp x20, x19, [sp, #16] ffff900010d69d80: a8c27bfd ldp x29, x30, [sp], #32 ffff900010d69d84: d65f03c0 ret Hi Walter, Thanks for narrowing down the issue. I'll be likely busy this week with other stuff, but I'll take a look on the next one. OK, finally got to working on this. It looks like the issue is caused by HWASAN short granules [1], which the kernel doesn't account for. Let me look more closely into this. [1] https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html Hi Walter, I've updated the series with fixes here: https://github.com/xairy/linux/commits/up-kasan-stack-tags Could you try them and see if they fix the false positive with inline instrumentation for you? Thanks! Hi Andrey, Thanks for your help. It is workable for me. It looks like no false-positive issue, and I think below patch should not need it. because it is fixed by hwasan-use-short-granules=0. https://github.com/xairy/linux/commit/a48d20d2397d084af3790dfc5ddfc83b788ddd9a After apply patches, I try to execute KASAN UT to verify them. The stack oob test should need to be fixed. Could you send below patch together? Thanks. --- lib/test_kasan.c: fix the stack OOB test for tag-based KASAN With tag based KASAN, the stack OOB test doesn't trigger out-of-bounds memory access. This test need to be fixed. Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -488,7 +488,7 @@ static noinline void __init kasan_global_oob(void) static noinline void __init kasan_stack_oob(void) { char stack_array[10]; - volatile int i = 0; + volatile int i = OOB_TAG_OFF; char *p = &stack_array[ARRAY_SIZE(stack_array) + i]; pr_info("out-of-bounds on stack\n"); Yeah, we have KASAN_SANITIZE:=n for arm64 kasan_init.c, so that patch isn't needed indeed. I've mailed the series, thanks! Basic stack instrumentation support has been merged. No short granule support for now, filed: https://bugzilla.kernel.org/show_bug.cgi?id=208865 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8dcc1d34661d58a7889fb06517c8738d1412d1bc https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c547f9da0539ad1f7ef7f08c8c82036d61b011a https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f9409d58e972cada2c524b7f1e54631bb8fa176f https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cae9dc35ed9ff82a99754e51d57ff6c332e1f7e4 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=51dcc81c282dc401dfd8460a7e59546bc1b30e32 |