Bug 206525 - BUG: KASAN: stack-out-of-bounds in test_bit+0x30/0x44 (kernel 5.6-rc1)
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: platform_ppc-32
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-13 20:07 UTC by Erhard F.
Modified: 2020-02-26 21:44 UTC

See Also:
Kernel Version: 5.6.0-rc1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg (5.6.0-rc1 + v2 Fix DSI and ISI... patch, PowerMac G4 DP) (61.71 KB, text/plain)
2020-02-13 20:07 UTC, Erhard F.
kernel .config (5.6.0-rc1, PowerMac G4 DP) (95.95 KB, text/plain)
2020-02-13 20:12 UTC, Erhard F.

Description Erhard F. 2020-02-13 20:07:46 UTC
Created attachment 287357
dmesg (5.6.0-rc1 + v2 Fix DSI and ISI... patch, PowerMac G4 DP)

[...]
Feb 13 20:18:53 T600 kernel: ==================================================================
Feb 13 20:18:53 T600 kernel: BUG: KASAN: stack-out-of-bounds in test_bit+0x30/0x44
Feb 13 20:18:53 T600 kernel: Read of size 4 at addr ee8bddac by task systemd/1
Feb 13 20:18:53 T600 kernel: 
Feb 13 20:18:53 T600 kernel: CPU: 0 PID: 1 Comm: systemd Tainted: G        W         5.6.0-rc1-PowerMacG4+ #20
Feb 13 20:18:53 T600 kernel: Call Trace:
Feb 13 20:18:53 T600 kernel: [ee8bdc38] [c078cf18] dump_stack+0xbc/0x118 (unreliable)
Feb 13 20:18:53 T600 kernel: [ee8bdc68] [c0249f94] print_address_description.isra.0+0x3c/0x420
Feb 13 20:18:53 T600 kernel: [ee8bdcf8] [c024a554] __kasan_report+0x138/0x180
Feb 13 20:18:53 T600 kernel: [ee8bdd38] [c0249718] kasan_report+0x7c/0x104
Feb 13 20:18:53 T600 kernel: [ee8bdd58] [c06526b4] test_bit+0x30/0x44
Feb 13 20:18:53 T600 kernel: [ee8bdd78] [c0657c6c] netlink_bind+0x24c/0x33c
Feb 13 20:18:53 T600 kernel: [ee8bde18] [c05c0c3c] __sys_bind+0xd4/0x120
Feb 13 20:18:53 T600 kernel: [ee8bdf38] [c001a278] ret_from_syscall+0x0/0x34
Feb 13 20:18:53 T600 kernel: --- interrupt: c01 at 0x4f3ea8
                                 LR = 0x8f5b80
Feb 13 20:18:53 T600 kernel: 
Feb 13 20:18:53 T600 kernel: The buggy address belongs to the page:
Feb 13 20:18:53 T600 kernel: page:ef460a94 refcount:0 mapcount:0 mapping:00000000 index:0x0
Feb 13 20:18:53 T600 kernel: flags: 0x0()
Feb 13 20:18:53 T600 kernel: raw: 00000000 ef460a98 ef460a98 00000000 00000000 00000000 ffffffff 00000000
Feb 13 20:18:53 T600 kernel: raw: 00000000
Feb 13 20:18:53 T600 kernel: page dumped because: kasan: bad access detected
Feb 13 20:18:53 T600 kernel: 
Feb 13 20:18:53 T600 kernel: addr ee8bddac is located in stack of task systemd/1 at offset 36 in frame:
Feb 13 20:18:53 T600 kernel:  netlink_bind+0x0/0x33c
Feb 13 20:18:53 T600 kernel: 
Feb 13 20:18:53 T600 kernel: this frame has 1 object:
Feb 13 20:18:53 T600 kernel:  [32, 36) 'groups'
Feb 13 20:18:53 T600 kernel: 
Feb 13 20:18:53 T600 kernel: Memory state around the buggy address:
Feb 13 20:18:53 T600 kernel:  ee8bdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Feb 13 20:18:53 T600 kernel:  ee8bdd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Feb 13 20:18:53 T600 kernel: >ee8bdd80: 00 f1 f1 f1 f1 04 f3 f3 f3 00 00 00 00 00 00 00
Feb 13 20:18:53 T600 kernel:                           ^
Feb 13 20:18:53 T600 kernel:  ee8bde00: 00 00 00 00 00 f1 f1 f1 f1 04 f2 04 f2 00 00 00
Feb 13 20:18:53 T600 kernel:  ee8bde80: 00 00 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3
Feb 13 20:18:53 T600 kernel: ==================================================================

This happens during boot on my G4 DP with kernel 5.6.0-rc1 and KASAN enabled (outline). The kernel is patched with Christophe's '[v2] powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK' (https://patchwork.ozlabs.org/patch/1237387/), but CONFIG_VMAP_STACK was not enabled here.
Comment 1 Erhard F. 2020-02-13 20:12:49 UTC
Created attachment 287359
kernel .config (5.6.0-rc1, PowerMac G4 DP)
Comment 2 Christophe Leroy 2020-02-14 10:04:34 UTC
Probably a bug in or around netlink_bind() in net/netlink/af_netlink.c https://elixir.bootlin.com/linux/v5.6-rc1/source/net/netlink/af_netlink.c#L1017

Could you print the value of nlk->ngroups just before the loop that calls test_bit()? It should be 32 or less.
Comment 3 Christophe Leroy 2020-02-15 17:52:44 UTC
Bug introduced by commit cf5bddb95cbe ("net: bridge: vlan: add rtnetlink group and notify support")

RTNLGRP_MAX is now 33.

'unsigned long groups' is 32 bits long on PPC32

Following loop in netlink_bind() overflows.


		for (group = 0; group < nlk->ngroups; group++) {
			if (!test_bit(group, &groups))
				continue;
			err = nlk->netlink_bind(net, group + 1);
			if (!err)
				continue;
			netlink_undo_bind(group, groups, sk);
			goto unlock;
		}


Should 'groups' be changed to 'unsigned long long'?
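
To make the arithmetic behind the overflow concrete, here is a minimal userspace sketch. It is not the kernel code: bitmap_word_index() and loop_stays_in_one_word() are hypothetical helpers that only model the nr / BITS_PER_LONG word lookup done by the generic test_bit(), with the word width passed in as a parameter so that 32-bit and 64-bit builds can be compared.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): index of the word that holds
 * bit 'nr' when a bitmap is stored as an array of words that are
 * 'bits_per_word' bits wide. Models the nr / BITS_PER_LONG lookup.
 */
static size_t bitmap_word_index(unsigned int nr, unsigned int bits_per_word)
{
	assert(bits_per_word != 0);
	return nr / bits_per_word;
}

/*
 * Returns 1 if a loop testing bits 0 .. ngroups - 1 only ever touches
 * word 0, i.e. stays inside a single 'unsigned long' on the stack.
 */
static int loop_stays_in_one_word(unsigned int ngroups,
				  unsigned int bits_per_word)
{
	if (ngroups == 0)
		return 1;
	return bitmap_word_index(ngroups - 1, bits_per_word) == 0;
}
```

With ngroups = 33 and 32-bit words, the last iteration tests bit 32, which lives in word 1, so loop_stays_in_one_word(33, 32) returns 0. That matches the report: 'groups' occupies [32, 36) in the frame and the bad 4-byte read is at offset 36. With 64-bit words the same loop stays in word 0.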
Comment 4 Christophe Leroy 2020-02-16 08:26:55 UTC
Feedback from Nikolay:

I think we can just cap these at min(BITS_PER_TYPE(u32), nlk->ngroups) since "groups" is coming from sockaddr_nl's "nl_groups" which is a u32, for any groups beyond u32 one has to use setsockopt().
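
A rough userspace sketch of that capping idea follows. It is only an illustration under assumptions, not the patch that was posted: count_selected_groups() and U32_BITS are made-up names, U32_BITS stands in for the kernel's BITS_PER_TYPE(u32), and the body just counts selected groups instead of calling a per-protocol bind function.

```c
#include <assert.h>
#include <stdint.h>

#define U32_BITS 32u /* stand-in for BITS_PER_TYPE(u32); an assumption here */

/*
 * Hypothetical model of the capped loop: walk the groups selected in a
 * 32-bit mask, never looking past bit 31 even when ngroups is larger.
 */
static unsigned int count_selected_groups(uint32_t groups, unsigned int ngroups)
{
	/* min(BITS_PER_TYPE(u32), ngroups) from the suggestion above */
	unsigned int limit = ngroups < U32_BITS ? ngroups : U32_BITS;
	unsigned int group, selected = 0;

	assert(limit <= U32_BITS);

	for (group = 0; group < limit; group++) {
		/* Same role as !test_bit(group, &groups) in the kernel loop */
		if (!(groups & (UINT32_C(1) << group)))
			continue;
		selected++;
	}
	return selected;
}
```

With ngroups = 33 the loop still stops after bit 31, so every access stays inside the 32-bit mask regardless of the width of 'unsigned long'.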
Comment 5 Christophe Leroy 2020-02-17 10:53:47 UTC
That's not a PPC32 bug but a networking bug affecting all 32-bit architectures.
Comment 6 Nikolay Aleksandrov 2020-02-20 12:19:51 UTC
Note that the bug wasn't introduced by my commit, but instead has been there since:
 commit 4f520900522f
 Author: Richard Guy Briggs <rgb@redhat.com>
 Date:   Tue Apr 22 21:31:54 2014 -0400

    netlink: have netlink per-protocol bind function return an error code.

which moved the ngroups test_bit() to a local variable. My commit only exposed the bug since it added the 33rd group. I'm currently preparing a fix and will post it to netdev after verifying and testing it.
Comment 7 Erhard F. 2020-02-26 21:44:31 UTC
Fix landed in 5.6-rc3, works now as expected. Thanks!
