Bug 214867 - UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
Status: RESOLVED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: PPC-64
Hardware: PPC-64 Linux
Importance: P1 normal
Assignee: platform_ppc-64
Reported: 2021-10-29 13:59 UTC by Erhard F.
Modified: 2022-02-04 13:20 UTC
Kernel Version: 5.15-rc7
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel dmesg (kernel 5.15-rc7, Talos II) (60.94 KB, text/plain)
2021-10-29 13:59 UTC, Erhard F.
kernel .config (kernel 5.15-rc7, Talos II) (109.20 KB, text/plain)
2021-10-29 14:00 UTC, Erhard F.

Description Erhard F. 2021-10-29 13:59:02 UTC
Created attachment 299361
kernel dmesg (kernel 5.15-rc7, Talos II)

UBSAN catches this at boot on my Talos II.

[...]
### dt-test ### EXPECT / : GPIO line <<int>> (line-C-input) hogged as input
================================================================================
UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
shift exponent -1 is negative
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII #1
Call Trace:
[c000000004163700] [c0000000008ffaa8] .dump_stack_lvl+0xa4/0x100 (unreliable)
[c000000004163790] [c0000000008fb46c] .ubsan_epilogue+0x10/0x70
[c000000004163800] [c0000000008fb270] .__ubsan_handle_shift_out_of_bounds+0x1f0/0x34c
[c000000004163910] [c000000000ad94a0] .of_unittest_untrack_overlay+0x6c/0xe0
[c0000000041639a0] [c000000002098ff8] .of_unittest+0x4c50/0x59f8
[c000000004163b60] [c000000000011b5c] .do_one_initcall+0x7c/0x4f0
[c000000004163c50] [c00000000200300c] .kernel_init_freeable+0x704/0x858
[c000000004163d90] [c000000000012730] .kernel_init+0x20/0x190
[c000000004163e10] [c00000000000ce78] .ret_from_kernel_thread+0x58/0x60
================================================================================
### dt-test ### EXPECT \ : OF: overlay: WARNING: memory leak will occur if overlay removed, property: /testcase-data-2/substation@100/status
[...]
Comment 1 Erhard F. 2021-10-29 14:00:20 UTC
Created attachment 299363
kernel .config (kernel 5.15-rc7, Talos II)

 # lspci 
0000:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Turks XT [Radeon HD 6670/7670]
0000:01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Turks HDMI Audio [Radeon HD 6500/6600 / 6700M Series]
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation Device 5008 (rev 01)
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
0005:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
Comment 2 Arnd Bergmann 2021-10-29 14:06:48 UTC
This is the function that triggers it:

static void of_unittest_untrack_overlay(int id)
{
        if (overlay_first_id < 0)
                return;
        id -= overlay_first_id;
        if (WARN_ON(id >= MAX_UNITTEST_OVERLAYS))
                return;
        overlay_id_bits[BIT_WORD(id)] &= ~BIT_MASK(id);
}

My guess is that 'id' is negative here, which means it fails to trigger the
WARN_ON() but still ends up being out of range.

Can you try changing it to 'unsigned int id'?
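
For illustration only, here is a minimal userspace sketch of the effect guessed at above. It is not the kernel code and not necessarily the fix that was merged. The helper names are hypothetical, and the values of MAX_UNITTEST_OVERLAYS and BITS_PER_LONG are assumed stand-ins for the kernel definitions. It shows that a signed 'id' of -1 passes the upper-bound check and gives a shift exponent of -1, while the same value as 'unsigned int' is rejected.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed stand-ins for the kernel definitions; values are illustrative. */
#define MAX_UNITTEST_OVERLAYS 256
#define BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))

/* Mirrors the quoted check: only the upper bound is tested on a signed int. */
static bool signed_check_accepts(int id)
{
	return !(id >= MAX_UNITTEST_OVERLAYS);
}

/* Suggested variant: as unsigned, a negative value wraps to a huge number. */
static bool unsigned_check_accepts(unsigned int id)
{
	return !(id >= MAX_UNITTEST_OVERLAYS);
}

/* Shift count a BIT_MASK()-style macro would compute for a signed id.
 * The shift itself is never performed here, so nothing undefined happens. */
static int shift_exponent(int id)
{
	return id % BITS_PER_LONG;
}

/* Hypothetical safe helper: only valid for ids that passed the unsigned check. */
static unsigned long overlay_bit_mask(unsigned int id)
{
	assert(id < MAX_UNITTEST_OVERLAYS);
	return 1UL << (id % (unsigned int)BITS_PER_LONG);
}
```

With these helpers, signed_check_accepts(-1) is true and shift_exponent(-1) is -1, which matches the "shift exponent -1 is negative" line in the report, whereas unsigned_check_accepts((unsigned int)-1) is false.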
Comment 3 Frank Rowand 2021-10-30 00:01:50 UTC
I forwarded my email notification of this bug to the mail lists.  I prefer
discussion to occur there:

  https://lore.kernel.org/all/c474a371-b524-1da8-4a67-e72cf8f2b0f7@gmail.com/

Thank you for the report.
Comment 4 Erhard F. 2022-02-04 13:20:08 UTC
The fix has landed in mainline in the meantime. At least I can no longer reproduce this on v5.17-rc2.

Thanks!
