Bug 207873 - BUG at swapops + rcu stall + soft lockup at running btrfs test suite (TEST=013\* ./misc-tests.sh)
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: PPC-32
Hardware: PPC-32 Linux
Importance: P1 normal
Assignee: platform_ppc-32
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-23 22:21 UTC by Erhard F.
Modified: 2020-05-24 17:04 UTC
CC List: 2 users

See Also:
Kernel Version: 5.7-rc6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel .config (5.7-rc6, PowerMac G4 DP) (98.00 KB, text/plain)
2020-05-23 22:21 UTC, Erhard F.
kernel dmesg (5.7-rc6, PowerMac G4 DP) (55.10 KB, text/plain)
2020-05-23 22:22 UTC, Erhard F.
screenshot 01 (1007.66 KB, image/jpeg)
2020-05-23 22:23 UTC, Erhard F.
screenshot 02 (952.57 KB, image/jpeg)
2020-05-23 22:23 UTC, Erhard F.
transcript of both screenshots (3.46 KB, text/plain)
2020-05-23 22:24 UTC, Erhard F.

Description Erhard F. 2020-05-23 22:21:55 UTC
Created attachment 289253
kernel .config (5.7-rc6, PowerMac G4 DP)

The bug is triggered by running "TEST=013\* ./misc-tests.sh" of btrfs-progs test suite, built from git master:

 # git clone https://github.com/kdave/btrfs-progs && cd btrfs-progs/
 # ./autogen.sh && ./configure --disable-documentation
 # make && make fssum
 # cd tests/
 # TEST=013\* ./misc-tests.sh

The G4 crashes and the reboot timer kicks in. Before that, it shows a series of stack traces, starting with the "kernel BUG at include/linux/swapops.h:197!" part from bug #207221, followed by an RCU stall and a soft lockup. For the full stack trace, see the transcript of both screenshots.

[...]
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 	1-....: (7799 ticks this GP) idle=a06/1/0x40000002 softirq=11075/11075 fqs=2599
	(t=7804 jiffies g=21629 q=59)
Task dump for CPU 1:
dd              R  running task        0  2200    394 0x0000000c
Call Trace:
[f49fb458] [c00fcddc] sched_show_task+0x3bc/0x3fe (unreliable)
[f49fb498] [c01c650c] rcu_dump_cpu_stacks+0x228/0x23c
[f49fb4e8] [c01c2e18] rcu_sched_clock_irq+0x81c/0x1360
[f49fb568] [c01d8940] update_process_times+0x2c/0x98
[f49fb588] [c02027d4] tick_sched_timer+0x128/0x1d8
[f49fb5a8] [c01dc49c] __hrtimer_run_queues+0x490/0xae8
[f49fb698] [c01dd788] hrtimer_interrupt+0x278/0x520
[f49fb6f8] [c001710c] timer_interrupt+0x374/0xb4c
[f49fb738] [c002c5e4] ret_from_except+0x0/0x14
--- interrupt: 901 at do_raw_spin_lock+0x1c8/0x2cc
    LR = do_raw_spin_lock+0x1a4/0x2cc
[f49fb800] [c0180e0c] do_raw_spin_lock+0x188/0x2cc (unreliable)
[f49fb830] [c0428890] unmap_page_range+0x244/0xb08
[f49fb910] [c0429610] unmap_vmas+0x94/0xdc
[f49fb930] [c043c25c] exit_mmap+0x340/0x46c
[f49fba20] [c0078260] __mmput+0x78/0x360
[f49fba50] [c0090514] do_exit+0x9c4/0x21fc
[f49fbb20] [c0019d38] user_single_step_report+0x0/0x74
[f49fbb70] [c002c5e0] ret_from_except+0x0/0x4
--- interrupt: 700 at __migration_entry_wait+0x13c/0x198
    LR = __migration_entry_wait+0xf0/0x198
[f49fbc58] [c042c0f0] do_swap_page+0x1f0/0x198
[f49fbd28] [c042e7e4] handle_mm_fault+0x794/0x16f4
[f49fbe48] [c0039868] do_page_fault+0xf50/0x12f8
[f49fbf38] [c002c468] handle_page_fault+0x10/0x3c
--- interrupt: 301 at 0x87e378
    LR = 0x87e33c
[...]

I don't know whether this is a btrfs bug or a bug that is only triggered by this specific test. I am filing it as platform specific because I have not seen it on x86 yet.

Unlike in bug #207221, KASAN is enabled here, so the stack trace looks slightly different.
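
To check a saved log for the three symptoms described above (the kernel BUG, the RCU stall and the soft lockup), something along these lines may help. This is only a rough sketch: the function name scan_log and the file name dmesg.txt are made up for illustration, and the search strings are assumed to match the usual kernel wording rather than taken from a complete log.

```shell
#!/bin/sh
# Sketch only: count the three failure signatures from this report in a log.
# scan_log is a hypothetical helper name, not part of any existing tool.
scan_log() {
    log=$1
    for pattern in 'kernel BUG at' 'self-detected stall' 'soft lockup'; do
        # grep -c prints the number of matching lines; it exits non-zero
        # when there are none, so "|| true" keeps the loop going.
        count=$(grep -c -- "$pattern" "$log" || true)
        printf '%s: %s\n' "$pattern" "$count"
    done
}
```

Possible usage, assuming the log was saved beforehand with "dmesg > dmesg.txt": run "scan_log dmesg.txt" and look for non-zero counts.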
Comment 1 Erhard F. 2020-05-23 22:22:27 UTC
Created attachment 289255
kernel dmesg (5.7-rc6, PowerMac G4 DP)
Comment 2 Erhard F. 2020-05-23 22:23:14 UTC
Created attachment 289257
screenshot 01
Comment 3 Erhard F. 2020-05-23 22:23:40 UTC
Created attachment 289259
screenshot 02
Comment 4 Erhard F. 2020-05-23 22:24:35 UTC
Created attachment 289261
transcript of both screenshots
Comment 6 Erhard F. 2020-05-24 17:04:46 UTC
(In reply to Christophe Leroy from comment #5)
> Try
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40bb0e904212cf7d6f041a98c58c8341b2016670
Thanks for the hint! That patch did the trick. The btrfs test suite completes fine now and building larger projects works unremarkably. 

Will close here as the fix seems to be going into -rc7.
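
For anyone who wants to check whether their own kernel tree already contains that commit, a minimal sketch could look like the following. The helper name has_commit and the tree path in the usage note are illustrative assumptions; the check itself relies on "git merge-base --is-ancestor".

```shell
#!/bin/sh
# Sketch only: succeed if commit $2 is reachable from HEAD of the git
# working tree at $1. has_commit is a hypothetical helper name.
has_commit() {
    # merge-base --is-ancestor exits 0 when $2 is an ancestor of HEAD and
    # non-zero otherwise (including when the commit is unknown to the repo).
    git -C "$1" merge-base --is-ancestor "$2" HEAD 2>/dev/null
}
```

Possible usage, assuming the kernel source lives in ~/linux: has_commit ~/linux 40bb0e904212cf7d6f041a98c58c8341b2016670 && echo "fix present"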
