Bug 213299 - kswapd Regression as of 5.13-rc3, eats 100% of CPU for no reason
Summary: kswapd Regression as of 5.13-rc3, eats 100% of CPU for no reason
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Slab Allocator
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-31 18:09 UTC by Matt McDonald
Modified: 2021-06-04 00:00 UTC
0 users

See Also:
Kernel Version: 5.13-rc3, 5.13-rc4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
attachment-5394-0.html (3.87 KB, text/html)
2021-05-31 21:35 UTC, Matt McDonald
attachment-743-0.html (926 bytes, text/html)
2021-06-04 00:00 UTC, Matt McDonald

Description Matt McDonald 2021-05-31 18:09:38 UTC
I've seen reports of kswapd0 eating up CPU, but this doesn't fit those descriptions. 

 - The bug is not present on 5.12.

 - It happens immediately on boot: CPU usage goes to 100% on all 24 threads of a Ryzen 9 5900X.

 - The only solution is to run `sudo pkill -9 kswapd0`. swapoff does nothing, nor does `echo 1 > /proc/sys/vm/drop_caches`. 

 - I have a swap partition and 32GB of RAM. Like I said, this happens on boot, so nowhere *near* 32GB of RAM is being taken. And no swap is being used. This is clearly an error in kswapd, it's like it goes completely bonkers. 5.13-rc2 didn't have it, nor did rc1, and 5.12 doesn't have it either. 


System Info:

Arch Linux
Kernel: 5.13-rc3, 5.13-rc4
Ryzen 9 5900X
32GB DDR4-3600
40GB Swap partition on SSD

`sudo dmesg | grep -i swap` only gives these three lines so far (after already killing the process and about 5 min. after boot):

[    0.093277] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[    0.334800] zswap: loaded using pool lz4/z3fold
[    2.941371] Adding 41502716k swap on /dev/sdb3.  Priority:-2 extents:1 across:41502716k SSFS
Comment 1 Andrew Morton 2021-05-31 21:33:31 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 31 May 2021 18:09:38 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=213299
> 
>             Bug ID: 213299
>            Summary: kswapd Regression as of 5.13-rc3, eats 100% of CPU for
>                     no reason
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.13-rc3, 5.13-rc4
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Slab Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: gardotd426@gmail.com
>         Regression: No
Comment 2 Matt McDonald 2021-05-31 21:35:33 UTC
Created attachment 297091
attachment-5394-0.html

Can do. Happy to provide any logs and test any patches as needed, I build
my own kernels.

Comment 3 Johannes Weiner 2021-06-03 20:33:00 UTC
On Mon, May 31, 2021 at 02:33:29PM -0700, Andrew Morton wrote:
> >  - I have a swap partition and 32GB of RAM. Like I said, this happens on
> > boot, so nowhere *near* 32GB of RAM is being taken. And no swap is being
> > used. This is clearly an error in kswapd, it's like it goes completely
> > bonkers. 5.13-rc2 didn't have it, nor did rc1, and 5.12 doesn't have it
> > either.

Does it trigger (or not trigger) reliably with these versions?

There aren't any obvious mm changes in that window. If it reproduces
reliably during boot, can you please git bisect it?
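
For reference, the bisection requested above could look roughly like the sketch below. It assumes a mainline git checkout and takes v5.13-rc2 as good and v5.13-rc3 as bad, as reported. The helper names `bisect_start` and `verdict`, the 50% CPU threshold, and the use of `ps` to sample kswapd0 are illustrative assumptions, not part of any existing tooling.

```shell
#!/bin/sh
# Hypothetical helpers for bisecting the kswapd regression.

# Start a bisection: $1 = known-good revision, $2 = known-bad revision.
bisect_start() {
    git bisect start &&
    git bisect bad "$2" &&
    git bisect good "$1"
}

# Print "bad" if the CPU percentage in $1 is at or above the threshold in $2
# (default 50, an arbitrary assumption), otherwise print "good".
verdict() {
    awk -v cpu="$1" -v limit="${2:-50}" \
        'BEGIN { print ((cpu + 0 >= limit + 0) ? "bad" : "good") }'
}

# Typical workflow (run by hand, one kernel build and boot per step):
#   bisect_start v5.13-rc2 v5.13-rc3
#   ... build, install and boot the kernel git checked out ...
#   cpu=$(ps -C kswapd0 -o %cpu= | tr -d ' ')
#   git bisect "$(verdict "$cpu")"
#   ... repeat until git names the first bad commit, then:
#   git bisect log      # record of the steps taken, useful to attach here
#   git bisect reset
```

Since the problem is said to show up immediately on boot, each step should only need a short look at kswapd0 before marking the commit good or bad.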
Comment 4 chris 2021-06-03 22:22:12 UTC
Can you also please show /proc/<pid>/stack for the offending kswapd, and what
your watermark settings are?
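
One possible way to collect this, as a sketch only: the function names `kswapd_stacks` and `zone_watermarks` are made up for illustration, and the awk filter assumes the usual /proc/zoneinfo layout ("Node N, zone NAME" headers followed by indented min/low/high lines).

```shell
#!/bin/sh
# Hypothetical helpers for gathering kswapd stacks and zone watermarks.

# Dump the kernel stack of every kswapd thread (needs root).
kswapd_stacks() {
    for pid in $(pgrep '^kswapd'); do
        echo "== kswapd pid $pid =="
        cat "/proc/$pid/stack"
    done
}

# Read /proc/zoneinfo-style text on stdin and print one line per watermark:
# "<node> <zone> <min|low|high> <pages>".
zone_watermarks() {
    awk '
        $1 == "Node" && $3 == "zone" { node = $2; sub(/,/, "", node); zone = $4 }
        zone != "" && ($1 == "min" || $1 == "low" || $1 == "high") { print node, zone, $1, $2 }
    '
}

# Example invocation (as root, while kswapd0 is spinning):
#   kswapd_stacks
#   zone_watermarks < /proc/zoneinfo
#   sysctl vm.min_free_kbytes vm.watermark_scale_factor vm.watermark_boost_factor
```

Running `kswapd_stacks` a few times in a row may help show whether the thread is stuck in one place or looping.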
Comment 5 Matt McDonald 2021-06-04 00:00:29 UTC
Created attachment 297141
attachment-743-0.html

Sure thing. I'll get on it.

