Bug 217738

Summary: crash/hang in mm/swapfile.c:718 add_to_avail_list when exercising stress-ng
Product: Linux
Reporter: Colin Ian King (colin.i.king)
Component: Kernel
Assignee: Virtual assignee for kernel bugs (linux-kernel)
Status: NEW
Severity: blocking
CC: regressions
Priority: P3
Hardware: Intel
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  console image of hang
  crash with 6.1 kernel
  crash with 6.1 kernel ppc64el

Description Colin Ian King 2023-07-31 14:26:50 UTC
Created attachment 304744 [details]
console image of hang

How to reproduce:

Test system: a 24-CPU Alder Lake machine with 16GB of RAM, running Debian 12 with the default kernel (from makeconfig) on 6.5-rc4, exercised with no swap to start with.

Using stress-ng tip commit 0f2ef02e9bc5abb3419c44be056d5fa3c97e0137
(see https://github.com/ColinIanKing/stress-ng )

Build and run stress-ng for, say, 60 minutes:

./stress-ng --cpu-online 50 --brk 50 --swap 50 --vmstat 1 -t 60m

The system will hang at mm/swapfile.c:718, add_to_avail_list+0x93/0xa0.

See attached file for an image of the console on the hang (I'm trying to get the full stack dump).
Comment 1 Colin Ian King 2023-07-31 14:28:15 UTC
Hitting the WARN_ON in the following:

static void add_to_avail_list(struct swap_info_struct *p)
{
        int nid;

        spin_lock(&swap_avail_lock);
        for_each_node(nid) {
                WARN_ON(!plist_node_empty(&p->avail_lists[nid]));
                plist_add(&p->avail_lists[nid], &swap_avail_heads[nid]);
        }
        spin_unlock(&swap_avail_lock);
}
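
To make the condition behind that WARN_ON concrete, here is a minimal userspace sketch. It is a toy model only, not the kernel's plist implementation: the names model_node, model_node_empty, model_add and model_demo are hypothetical, there is no priority ordering, and a single list stands in for the per-node lists. It only shows that a check of the form WARN_ON(!plist_node_empty(...)) trips when a node that is already linked gets added a second time without being removed first. It makes no claim about how the kernel reaches that state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for a list node; NOT the kernel's struct plist_node. */
struct model_node {
    struct model_node *prev, *next;
};

/* Counts how often the modelled warning fired. */
static int model_warnings;

/* An initialised, unlinked node points at itself. */
static void model_node_init(struct model_node *n)
{
    n->prev = n->next = n;
}

/* Same idea as plist_node_empty(): true when the node is on no list. */
static bool model_node_empty(const struct model_node *n)
{
    return n->next == n;
}

/*
 * Add n at the tail of head. Like the code quoted above, the check only
 * warns and the insertion still happens; after a double add the links of
 * this toy list are no longer consistent.
 */
static void model_add(struct model_node *n, struct model_node *head)
{
    if (!model_node_empty(n)) {
        model_warnings++;
        fprintf(stderr, "WARN: node is already on a list\n");
    }
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Adds the same node twice and returns how many warnings fired. */
static int model_demo(void)
{
    struct model_node head, node;

    model_node_init(&head);
    model_node_init(&node);
    model_warnings = 0;

    model_add(&node, &head);      /* first add: node was empty, no warning */
    assert(model_warnings == 0);
    model_add(&node, &head);      /* second add without removal: warning */
    return model_warnings;
}
```

Calling model_demo() from a small main() shows one warning for the second add and none for the first.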
Comment 2 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-08-02 09:28:12 UTC
Reminder: the right developers to fix this are unlikely to see this here; for details, search for "bugzilla.kernel.org" in https://docs.kernel.org/admin-guide/reporting-issues.html
Comment 3 Colin Ian King 2023-08-02 13:06:57 UTC
Note one needs to run this as root.

Turns out one can reproduce this with:

sudo ./stress-ng --brk 50 --swap 50 --vmstat 1 -t 60m
Comment 4 Colin Ian King 2023-08-02 13:08:10 UTC
...and running swapoff on all existing swap before starting the reproducer is useful.
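
One way to confirm that no swap is active before starting is to look at /proc/swaps, which lists one active swap area per line after a header line. The helper below is a hedged sketch of that check; the function name count_active_swaps is hypothetical, and it takes an already opened stream so it can be pointed at /proc/swaps or at a saved copy.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Counts swap areas in a /proc/swaps-style stream: the first line is
 * treated as the header, and every later non-blank line counts as one
 * active swap area. fp must be a valid stream opened for reading.
 */
static int count_active_swaps(FILE *fp)
{
    char buf[512];
    int count = 0;
    int lineno = 0;
    int at_line_start = 1;

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp)) {
        size_t len = strlen(buf);

        if (at_line_start) {
            lineno++;
            /* Skip the header (line 1) and whitespace-only lines. */
            if (lineno > 1 && buf[strspn(buf, " \t\r\n")] != '\0')
                count++;
        }
        /* A chunk without a trailing newline continues the same line. */
        at_line_start = (len > 0 && buf[len - 1] == '\n');
    }
    return count;
}
```

Opening /proc/swaps with fopen("/proc/swaps", "r") and passing the stream in should give 0 once all swap has been turned off.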
Comment 5 Colin Ian King 2023-08-02 13:41:45 UTC
Created attachment 304757 [details]
crash with 6.1 kernel
Comment 6 Colin Ian King 2023-08-02 14:03:10 UTC
Created attachment 304758 [details]
crash with 6.1 kernel ppc64el