kswapd0 randomly loads one CPU core at 100%.

Linux localhost 3.12.0-1-ARCH #1 SMP PREEMPT Wed Nov 6 09:06:27 CET 2013 x86_64 GNU/Linux

No swap enabled.

Before this, Ubuntu 12.04 with a 32-bit PAE 3.2 kernel was installed on the same laptop, and it had no such problem.

[root@localhost ~]# free -mh
             total       used       free     shared    buffers     cached
Mem:          3.8G       2.4G       1.3G         0B       150M       508M
-/+ buffers/cache:       1.8G       2.0G
Swap:           0B         0B         0B

[root@localhost ~]# cat /proc/meminfo
MemTotal: 3935792 kB
MemFree: 1381360 kB
Buffers: 154216 kB
Cached: 533096 kB
SwapCached: 0 kB
Active: 1958896 kB
Inactive: 438004 kB
Active(anon): 1740916 kB
Inactive(anon): 136292 kB
Active(file): 217980 kB
Inactive(file): 301712 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 2064 kB
Writeback: 0 kB
AnonPages: 1709628 kB
Mapped: 196696 kB
Shmem: 167620 kB
Slab: 81516 kB
SReclaimable: 61312 kB
SUnreclaim: 20204 kB
KernelStack: 1696 kB
PageTables: 13088 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1967896 kB
Committed_AS: 3498576 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 361304 kB
VmallocChunk: 34359300731 kB
HardwareCorrupted: 0 kB
AnonHugePages: 157696 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 18476 kB
DirectMap2M: 4059136 kB

And I can't kill it. I heard that killing it is not a good idea, but I tried just for lulz)
(In reply to nleo from comment #0)
> kswapd0 randomly loads one CPU core at 100%

You cannot send SIGKILL to 'kswapd', since it is a kernel thread.

> CommitLimit: 1967896 kB
> Committed_AS: 3498576 kB
                ^^^^^^^
The system seems to be overcommitting memory.
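For readers who want to repeat this check on their own machine, here is a minimal Python sketch. It assumes the documented heuristic in which CommitLimit is SwapTotal plus `overcommit_ratio` percent of RAM (minus hugetlb pages), with `overcommit_ratio` at its default of 50; the function names are made up for illustration. Note that CommitLimit is only enforced when `vm.overcommit_memory` is 2, so a Committed_AS above it is not by itself an error under the default policy.

```python
import re

def parse_meminfo(text):
    """Return {field: integer value} from /proc/meminfo-style text."""
    return {name: int(value) for name, value in re.findall(r"(\S+):\s+(\d+)", text)}

def commit_limit_kb(info, overcommit_ratio=50):
    """CommitLimit as the kernel derives it, assuming the given ratio."""
    huge_kb = info.get("HugePages_Total", 0) * info.get("Hugepagesize", 0)
    return info["SwapTotal"] + (info["MemTotal"] - huge_kb) * overcommit_ratio // 100

def overcommit_factor(info):
    """How far committed address space exceeds the commit limit."""
    return info["Committed_AS"] / info["CommitLimit"]

# Values copied from the report in comment #0.
SAMPLE = """MemTotal: 3935792 kB
SwapTotal: 0 kB
CommitLimit: 1967896 kB
Committed_AS: 3498576 kB
HugePages_Total: 0
Hugepagesize: 2048 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)  # on a live system: open("/proc/meminfo").read()
    print("computed CommitLimit:", commit_limit_kb(info), "kB")
    print("reported CommitLimit:", info["CommitLimit"], "kB")
    print("Committed_AS / CommitLimit: %.2f" % overcommit_factor(info))
```

With the reporter's numbers, the computed limit matches the reported one (half of MemTotal, no swap), and the committed address space is a bit under 1.8 times that limit.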
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Tue, 19 Nov 2013 19:40:40 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=65201
>
> Bug ID: 65201
> Summary: kswapd0 randomly high cpu load
> Product: Memory Management
> Version: 2.5
> Kernel Version: 3.12
> Hardware: x86-64
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> Assignee: akpm@linux-foundation.org
> Reporter: nleo@nm.ru
> Regression: No
>
> kswapd0 randomly loads one CPU core at 100%
>
> Linux localhost 3.12.0-1-ARCH #1 SMP PREEMPT Wed Nov 6 09:06:27 CET 2013
> x86_64 GNU/Linux
>
> No swap enabled
>
> Before this, Ubuntu 12.04 with a 32-bit PAE 3.2 kernel was installed on the
> same laptop, and it had no such problem.
>
> [root@localhost ~]# free -mh
>              total       used       free     shared    buffers     cached
> Mem:          3.8G       2.4G       1.3G         0B       150M       508M
> -/+ buffers/cache:       1.8G       2.0G
> Swap:           0B         0B         0B

Hm, I wonder what kswapd is up to. Could you please make it happen again and then run

    dmesg -n 7
    dmesg -c
    echo m > /proc/sysrq-trigger
    echo t > /proc/sysrq-trigger
    dmesg -s 1000000 > foo

then send us foo?
Created attachment 174671 [details]
kmsg dump

I sometimes have the same problem. I don't have swap; I run kernel 3.19.0 (i686) compiled without CONFIG_SWAP.
My Acer C720 suffers occasionally too. Turning swap on/off doesn't help. Dropping caches *does* help:

    # echo 3 > /proc/sys/vm/drop_caches   # 1 isn't enough

My next guess would be to try deactivating zswap.
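For reference, the kernel's sysctl documentation describes the three values: 1 frees the page cache, 2 frees reclaimable slab objects (dentries and inodes), and 3 frees both. The Python sketch below wraps that workaround; the function name and the `path` parameter are hypothetical (the parameter exists only so the helper can be tried against an ordinary file), and writing to the real file requires root. One possible reading of "1 isn't enough", not confirmed anywhere in this thread, is that the pressure comes from the slab side rather than from the page cache.

```python
import os

# Meaning of each value, as described in the kernel's sysctl documentation.
DROP_MODES = {
    1: "page cache",
    2: "reclaimable slab objects (dentries and inodes)",
    3: "page cache and reclaimable slab objects",
}

def drop_caches(mode=3, path="/proc/sys/vm/drop_caches"):
    """Write `mode` to drop_caches and return a description of what was dropped."""
    if mode not in DROP_MODES:
        raise ValueError("mode must be 1, 2 or 3")
    os.sync()  # dirty objects are not freeable, so flush them first
    with open(path, "w") as f:
        f.write(f"{mode}\n")
    return DROP_MODES[mode]
```

This is non-destructive to data but throws away cached state, so expect a short slowdown afterwards while caches refill.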
Zswap isn't to blame, and dropping caches may or may not help. Here is the output of `sudo perf top`:

    26,24%  [kernel]               [k] _raw_spin_lock
    14,72%  [kernel]               [k] _raw_spin_unlock
     6,62%  [kernel]               [k] super_cache_count
     4,97%  [kernel]               [k] shrink_slab.part.12
     4,92%  [kernel]               [k] list_lru_count_one
     2,15%  [i2c_designware_core]  [k] 0x0000000000000099
     1,86%  [kernel]               [k] shrink_lruvec
     1,74%  [kernel]               [k] mem_cgroup_iter
     1,61%  [kernel]               [k] native_read_tsc
     1,55%  [kernel]               [k] delay_tsc
     1,52%  [kernel]               [k] kswapd
(In reply to Anatoli Sakhnik from comment #4)
> My Acer C720 suffers occasionally too. Turning swap on/off doesn't help.

I have the same hardware. After a system upgrade (currently running kernel 4.2.0) I get high CPU usage after opening a "heavy" web site. If the suggested workaround (dropping caches) doesn't help, I just quit the web browser and everything returns to normal.
Same here, also on an Acer C720 running Arch. kswapd0 takes up a whole core whenever swap is being used. I run the Arch kernel, with a small patch to the chromeos_laptop driver to enable my trackpad. The weird thing is that neither memory nor swap is that full: memory is at 50% utilization and swap is only at 8%, according to xfce4-taskmanager. It seems like Google Docs is the worst offender for triggering this issue.
I had this bug, and for me the cause turned out to be my /tmp directory, which is a tmpfs (to gain speed and save my SSD).

df /tmp gave

    tmpfs 3880480 2449036 1431444 95% /tmp

After removing junk from /tmp the system returned to normal. In my case, too, I had no swap and sufficient free memory. I would be interested to know if this works for you.
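Some background on why a full tmpfs matters: tmpfs contents live in memory (they are counted under Shmem in /proc/meminfo), and without swap they cannot be written out anywhere, so they stay resident until the files are deleted. The following Python sketch lists tmpfs mounts and how much each one holds. It is only an illustration; the function names are made up, and it assumes the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, ...).

```python
import os

def tmpfs_mounts(mounts_text):
    """Return mount points whose filesystem type is exactly 'tmpfs'."""
    points = []
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[2] == "tmpfs":
            points.append(parts[1])
    return points

def used_kib(path):
    """Space in use on the filesystem containing `path`, in KiB."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize // 1024

if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        for mount_point in tmpfs_mounts(f.read()):
            try:
                print(f"{mount_point}: {used_kib(mount_point)} KiB used")
            except OSError as exc:
                # e.g. permission denied, or a mount point with escaped characters
                print(f"{mount_point}: could not stat ({exc})")
```

Comparing the totals against MemTotal gives a rough idea of how much RAM is pinned by tmpfs files.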
Same problem here on a C720P Chromebook; it happens on several different distros such as Arch, Ubuntu, and Xubuntu. I downgraded to the 4.1.x kernel and the issue became less frequent (it needs much more memory pressure to trigger). Then I downgraded to the 3.17 kernel and the issue is gone completely. None of the previous suggestions and workarounds worked for me; only downgrading the kernel did.
Same problem here on Acer C720 Chromebook. I have 2GB of swap space on the SSD (I replaced the original 16GB M2 SSD with a 256GB version) and whenever swap is used I get this problem.

Linux localhost 4.2.0-27-generic #32-Ubuntu SMP Fri Jan 22 04:49:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 15.10
Release: 15.10
Codename: wily

    echo 3 > /proc/sys/vm/drop_caches   # 1 isn't enough

works around the issue for me too.
I haven't suffered from the bug since I compiled the kernel myself: https://aur.archlinux.org/packages/linux-c720/ . Apparently, I compiled out something that was causing the trouble, but I didn't try to bisect what the culprit was.
(In reply to Anatoli Sakhnik from comment #11)
> I haven't suffered from the bug since I compiled the kernel myself:
> https://aur.archlinux.org/packages/linux-c720/ . Apparently, I compiled out
> something that was causing the trouble, but I didn't try to bisect what the
> culprit was.

This bug seems to affect 2GB models only. Do you have the 2GB or the 4GB version? What changes did you make to your kernel?
Mine is 2GB. I didn't change anything in the kernel source code, but switched off many options in the config file: https://aur.archlinux.org/cgit/aur.git/tree/config.x86_64?h=linux-c720 .

Even today, if I boot the stock Arch kernel, the bug comes back; if I boot linux-c720, kswapd0 stays still. In theory, I could experiment with configurations in between the stock one and mine to narrow down the issue.
Perhaps you removed something related to http://lkml.iu.edu//hypermail/linux/kernel/1601.2/03564.html ?

Also relevant: https://github.com/GalliumOS/galliumos-distro/issues/52#issuecomment-174261443
I have no idea yet.
To avoid this bug I installed ChromeOS on my C720 (with 2GB RAM). I was happy with performance. Until today. I noticed lags. For some reason this bug appeared suddenly. There was no update. Kernel version is 3.8.11. Stock ChromeOS kernel.
(In reply to Anatoli Sakhnik from comment #13)
> Mine is 2GB. I didn't change anything in the kernel source code, but switched
> off many options in the config file:
> https://aur.archlinux.org/cgit/aur.git/tree/config.x86_64?h=linux-c720 .
>
> Even today, if I boot the stock Arch kernel, the bug comes back; if I boot
> linux-c720, kswapd0 stays still. In theory, I could experiment with
> configurations in between the stock one and mine to narrow down the issue.

Could you please share your kernel configuration so I can try your AUR package and solve this issue once and for all :) ? Thanks in advance.
There it is: https://aur.archlinux.org/cgit/aur.git/tree/config.x86_64?h=linux-c720
We encounter this regularly on AWS, but only on t2.small instances, which indeed are the only ones we run which have 2GB of RAM. We use the latest Ubuntu 15.10 AMIs as found here https://cloud-images.ubuntu.com/locator/ec2/. Please let me know if we can do anything to help track this down.
The workaround suggested above (echo 3 > /proc/sys/vm/drop_caches) doesn't work consistently for me on kernel 4.2.0 (Ubuntu 15.10) on an Acer C720 Chromebook.

I've found another workaround that works well for me so far: create a file /etc/sysctl.d/60-workaround-kswapd-allcpu.conf with the following contents and reboot:

    vm.min_free_kbytes=67584

The idea behind this workaround comes from a post by Kirill A. Shutemov on LKML (http://lkml.iu.edu//hypermail/linux/kernel/1601.2/03564.html) and this Gallium OS bug report: https://github.com/GalliumOS/galliumos-distro/issues/52

It would be interesting to know if this helps others.
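To put the suggested number in context, here is a small Python sketch that reads the current setting, renders the sysctl.d line, and expresses the reserve as a share of RAM. The helper names are hypothetical, and the 2 GiB figure in the usage lines is just the nominal RAM size of the affected C720 models, not a measured MemTotal.

```python
SUGGESTED_KB = 67584  # value proposed in the comment above; 66 * 1024 kB = 66 MiB

def read_min_free_kbytes(path="/proc/sys/vm/min_free_kbytes"):
    """Current vm.min_free_kbytes as an integer (kB)."""
    with open(path) as f:
        return int(f.read().split()[0])

def sysctl_conf_line(kbytes=SUGGESTED_KB):
    """Line to place in a file under /etc/sysctl.d/."""
    return f"vm.min_free_kbytes={kbytes}\n"

def reserve_share(kbytes, mem_total_kb):
    """Fraction of RAM that the kernel will try to keep free."""
    return kbytes / mem_total_kb

if __name__ == "__main__":
    print(sysctl_conf_line(), end="")
    # Nominal 2 GiB machine (assumption for illustration only).
    print("share of 2 GiB: %.1f%%" % (100 * reserve_share(SUGGESTED_KB, 2 * 1024 * 1024)))
```

On a nominal 2 GiB machine the suggested value reserves a little over 3% of RAM, so the trade-off is a modest loss of usable memory in exchange for higher zone watermarks.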
Same problem here:
- No swap machine
- Wily (U15.10)
- 4.2.0-19-generic #23-Ubuntu SMP Wed Nov 11 11:39:30 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
- 1GB RAM
- `meminfo` - Should have enough RAM to not swap, though buffers do seem high

MemTotal: 1014932 kB
MemFree: 231296 kB
MemAvailable: 871180 kB
Buffers: 580684 kB
Cached: 47812 kB
SwapCached: 0 kB
Active: 547952 kB
Inactive: 164364 kB
Active(anon): 84280 kB
Inactive(anon): 4288 kB
Active(file): 463672 kB
Inactive(file): 160076 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 224 kB
Writeback: 0 kB
AnonPages: 83800 kB
Mapped: 39688 kB
Shmem: 4768 kB
Slab: 48008 kB
SReclaimable: 31172 kB
SUnreclaim: 16836 kB
KernelStack: 1936 kB
PageTables: 3844 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 507464 kB
Committed_AS: 314640 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 13524 kB
VmallocChunk: 34359717628 kB
HardwareCorrupted: 0 kB
AnonHugePages: 49152 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 53248 kB
DirectMap2M: 1126400 kB

- kernel config: https://gist.github.com/sgnn7/cbb41ce21d3a927eca27
- strace shows nothing interesting
- `perf` report:

Samples: 12K of event 'cpu-clock', Event count (approx.): 3245250000
Overhead  Command  Shared Object      Symbol
  19.34%  kswapd0  [kernel.kallsyms]  [k] shrink_lruvec
  17.04%  kswapd0  [kernel.kallsyms]  [k] mem_cgroup_iter
   8.60%  kswapd0  [kernel.kallsyms]  [k] mem_cgroup_zone_lruvec
   6.57%  kswapd0  [kernel.kallsyms]  [k] shrink_slab
   5.47%  kswapd0  [kernel.kallsyms]  [k] global_dirty_limits
   4.18%  kswapd0  [kernel.kallsyms]  [k] domain_dirty_limits
   3.71%  kswapd0  [kernel.kallsyms]  [k] mem_cgroup_get_lru_size
   3.59%  kswapd0  [kernel.kallsyms]  [k] super_cache_count
   3.27%  kswapd0  [kernel.kallsyms]  [k] get_lru_size
   3.26%  kswapd0  [kernel.kallsyms]  [k] throttle_vm_writeout
   2.20%  kswapd0  [kernel.kallsyms]  [k] css_next_descendant_pre
   2.15%  kswapd0  [kernel.kallsyms]  [k] blk_flush_plug_list
   1.96%  kswapd0  [kernel.kallsyms]  [k] shrink_zone
   1.73%  kswapd0  [kernel.kallsyms]  [k] _raw_spin_lock
   1.59%  kswapd0  [kernel.kallsyms]  [k] __list_lru_count_one.isra.2
   1.43%  kswapd0  [kernel.kallsyms]  [k] list_lru_count_one
   1.37%  kswapd0  [kernel.kallsyms]  [k] memcg_kmem_is_active
   1.27%  kswapd0  [kernel.kallsyms]  [k] __raw_callee_save___pv_queued_spin_unlock
...

I'm going to try gdb, changing swappiness, changing vm.min_free_kbytes, and reducing buffer limits, in that order, and report back, but most likely I'll have one shot before the bug goes away for the next few days.
Cont'd from previous post.

In order of attempts on a live system:
- gdb didn't work at all since the kernel wasn't built with debugging flags
- hotload of 10 and 0 swappiness (from 60) didn't make the kswapd process reduce cpu usage
- hotload of vm.min_free_kbytes=64K (from 4K) didn't make the process reduce cpu usage
- hotload of vm.dirty_background_ratio=5 (from 10) didn't make the process reduce cpu usage
- hotload of vm.dirty_ratio=10 (from 20) didn't make the process reduce cpu usage
- hotload of vm.dirty_background_ratio=15 (from 5) didn't make the process reduce cpu usage
- hotload of vm.dirty_ratio=25 (from 10) didn't make the process reduce cpu usage
- live swapon on a new 256MB swapfile didn't reduce process use
- live swapoff and swapon after that also didn't drop cpu usage

Sidenote: We're using Docker, so I'm not sure if that is contributing to the situation.
Good news! I was able to get rid of the bug completely by setting the `mem` kernel parameter to a value slightly less than physical memory. I own an Acer C720 (2GB model), and setting `mem=1920M` does the job. The idea came to me after reading the aforementioned bug report on GitHub [1]. I hope this gives some clue to the issue.

[1]: https://github.com/GalliumOS/galliumos-distro/issues/52
Created attachment 208411 [details] ftrace (function_graph)
Created attachment 208421 [details] ftrace (vmscan tracepoints)
Created attachment 208431 [details] /proc/vmstat (time 0)
Created attachment 208441 [details] /proc/vmstat (time 5s)
Created attachment 208451 [details] /proc/zoneinfo
Created attachment 208461 [details] /proc/pagetypeinfo
Created attachment 208471 [details] /proc/buddyinfo
Created attachment 208481 [details] vmstat -m (time 0)
Created attachment 208491 [details] vmstat -m (time 5s)
I am able to semi-reliably reproduce this (or a very similar?) problem on a setup very close to the one in comment #21:
- kernel: 4.2.0-30-generic (ubuntu 15.10)
- 2 GB RAM, 1 CPU, running under Xen (EC2 t2.small instance)
- docker with LVM thin-pool storage backend, running 3 containers, no memory limits set for their memcg's
- server is mostly idling (load average 0.0-0.1)

To reproduce it I have to:

1. set vm.overcommit_memory=1
2. initiate some disk activity:

    find / -xdev -type f |xargs -P10 -n1 md5sum &>/dev/null &
    find /var/lib/docker -type f |xargs -P10 -n1 md5sum &>/dev/null &

3. run some memory allocations until you hit OOM:

    for x in {1..200}; do ./memalloc & : ; done

memalloc above is a simple C program which allocates about 100MB and memsets it with 'x':

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int block_mb = 100;
        /* 1024 * 1000 bytes per "MB", as in the original test */
        size_t size = (size_t)block_mb * 1024 * 1000;
        char *buf;

        printf("allocing %dMB: ", block_mb);
        fflush(stdout);
        buf = malloc(size);
        if (!buf) {
            printf("FAILED!\n");
            exit(EXIT_FAILURE);
        }
        printf("ok\n");
        /* touch every byte so the pages are really allocated */
        memset(buf, 'x', size);
        sleep(180);
        return 0;
    }

Once you hit OOM and the console slows down, it is time to press CTRL+C, run pkill memalloc, and then check top. Many times `kswapd0` spins and then recovers within tens of seconds, but once in a while it stays there for hours (I didn't have the patience to check for longer).

Once I triggered the bug, I tried to get as much information as possible from the running system. I am attaching /proc/*info files (some taken 5 s apart), ftrace output for the event tracer (vmscan events only), and ftrace output for the function_graph tracer. Let me know if you need more information.

To recover from the situation you need to free enough memory in a short period of time. Sometimes dropping caches helps, sometimes I needed to close applications/containers as well, but I never had to reboot to recover.
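For anyone who would rather not compile the C helper, a rough Python stand-in for step 3 could look like the sketch below. It is not the reporter's program: the function names are made up, sizes are in MiB (1024 * 1024 bytes) rather than the 1024 * 1000 used above, and the interpreter's own overhead makes each process somewhat larger.

```python
import time

def alloc_and_touch(size_mb):
    """Allocate size_mb MiB filled with 'x', so every page is actually written."""
    return bytearray(b"x") * (size_mb * 1024 * 1024)

def main(size_mb=100, hold_seconds=180):
    """Allocate, then hold the memory for a while, like the C helper's sleep()."""
    print(f"allocing {size_mb}MB: ", end="", flush=True)
    try:
        buf = alloc_and_touch(size_mb)
    except MemoryError:
        print("FAILED!")
        raise SystemExit(1)
    print("ok")
    time.sleep(hold_seconds)
    return len(buf)
```

Nothing runs on import; add a call to `main()` at the end of the file (or call it from a shell loop like the one in step 3) to use it as a memory hog.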
It would be very helpful if there were a way to get output similar to the ftrace function_graph tracer, but with function arguments and return values. From the look of it, `pgdat_balanced` for some reason keeps returning false even though /proc/zoneinfo shows that the number of free pages is much higher than any watermark.

The problem description and recovery method very closely resemble the discussion around kernel 3.7 (https://lkml.org/lkml/2012/11/28/88):

> The zonelist reclaim in kswapd would do
> nothing because all high watermarks are met, but the compaction logic
> would find its own requirements unmet and loop over the zones again.
> Indefinitely, until some third party would free enough memory to help
> meet the higher compaction watermark.
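The "free pages versus watermarks" comparison can be scripted. The Python sketch below parses /proc/zoneinfo-style text and reports, per zone, whether free pages exceed the high watermark. It is a simplification under stated assumptions: the kernel's own balance check also accounts for allocation order, lowmem reserves, and compaction, none of which are modelled here; the helper names are hypothetical; and the sample numbers are invented for the demonstration, not taken from the attachments.

```python
import re

ZONE_HEADER = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")

def parse_zoneinfo(text):
    """Return one dict per zone with its free page count and min/low/high watermarks."""
    zones, zone = [], None
    for line in text.splitlines():
        header = ZONE_HEADER.match(line)
        if header:
            zone = {"node": int(header.group(1)), "zone": header.group(2)}
            zones.append(zone)
            continue
        tok = line.split()
        if zone is None or not tok or not tok[-1].isdigit():
            continue
        if tok[:2] == ["pages", "free"] and len(tok) == 3:
            zone["free"] = int(tok[2])
        elif len(tok) == 2 and tok[0] in ("min", "low", "high") and tok[0] not in zone:
            # per-cpu pageset lines read "high:" with a colon, so they do not match
            zone[tok[0]] = int(tok[1])
    return zones

def above_high(zone):
    """True if the zone has more free pages than its high watermark."""
    return zone["free"] > zone["high"]

# Invented sample in the usual /proc/zoneinfo layout.
SAMPLE = """Node 0, zone   Normal
  pages free     12000
        min      1500
        low      1875
        high     2250
        spanned  65536
        present  65536
        managed  60000
  pagesets
    cpu: 0
              count: 50
              high:  186
              batch: 31
"""

if __name__ == "__main__":
    for z in parse_zoneinfo(SAMPLE):  # on a live system: open("/proc/zoneinfo").read()
        print(z["node"], z["zone"], "free", z["free"], "high", z["high"],
              "above high" if above_high(z) else "below high")
```

If every zone reports "above high" while kswapd keeps spinning, that matches the symptom described in this comment.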
(In reply to Anatoli Sakhnik from comment #4)
> My Acer C720 suffers occasionally too. Turning swap on/off doesn't help.
> Dropping caches *does* help:
>
> # echo 3 > /proc/sys/vm/drop_caches # 1 isn't enough
>
> My next guess would be to try deactivating zswap.

The above workaround works for me on kernel 4.4.2, Debian jessie. The bug happens randomly after heavy web browsing on kernel 4.5. After downgrading to the stable jessie 3.16 kernel, the bug was gone. After upgrading to 4.4.2, the bug came back.
Same thing on Thinkpad X220 with 8 GB RAM running Ubuntu 14.04, with Ubuntu's Kernel 3.16.0-77-generic. Swap is disabled. kswapd0 runs on high CPU and the HD light is on all the time during this (no idea why). After 20 (!) minutes the OOM killer manages to kill a process to resolve the situation.
Same problem on Amazon's t2.nano instance (512MB of RAM). It seemed to be triggered by doing a bunch of file IO. This is a brand new install of Ubuntu 16.04. I have no swap enabled, and yet:

top - 06:42:57 up 1:58, 1 user, load average: 2.43, 2.66, 2.31
Tasks: 125 total, 3 running, 122 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.1 us, 6.9 sy, 0.0 ni, 0.0 id, 0.9 wa, 0.0 hi, 0.0 si, 90.1 st
KiB Mem : 498416 total, 348096 free, 49772 used, 100548 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 411900 avail Mem

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
   29 root  20   0     0    0    0 R  65.0  0.0 103:16.64 kswapd0
14343 root  20   0     0    0    0 R   2.9  0.0   0:00.82 python

Running "echo 1 > /proc/sys/vm/drop_caches" didn't fix the problem, but "3" fixed it immediately.

Also, my /tmp isn't full at all (6.5GB / 85% left on root).
A workaround for machines running under Xen has been found over on Ubuntu's bug tracker, see comment #69: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518457

The workaround is to disable hot-add of memory:

    touch /etc/udev/rules.d/40-vm-hotadd.rules
    reboot
I tried the same Ubuntu-inspired "disable hot-add of memory" (and CPU) workaround under AWS EC2 HVM, CentOS 7.x with the mainline (elrepo) 4.4.15 kernel: no such luck, I still see this occasionally.
I detailed why this bug happens here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518457/comments/126

This appears to be fixed by Mel Gorman's patch series that changes memory reclaim from "per zone" to "per node": https://marc.info/?l=linux-mm&m=146797052519026

So this bug should be fixed with the latest kernel.
(In reply to Dan Streetman from comment #40)
> I detailed why this bug happens here:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518457/comments/126
>
> So this bug should be fixed with the latest kernel.

Can you clarify? The link you mention seems to talk mainly about Xen. Do you think the latest kernel will also fix it for non-Xen machines?
(In reply to mail+kernel-bugzilla from comment #41)
> (In reply to Dan Streetman from comment #40)
> > I detailed why this bug happens here:
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518457/comments/126
> >
> > So this bug should be fixed with the latest kernel.
>
> Can you clarify? The link you mention seems to talk mainly about Xen. Do you
> think the latest kernel will also fix it for non-Xen machines?

What does your /proc/zoneinfo look like? Do you have a system with (approx) <= 4G of memory and a Normal zone with few managed pages?
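To answer that question from a shell, something like the following Python sketch can summarise how the managed pages are split between zones. It assumes each zone block in /proc/zoneinfo contains a `managed` line (true for the kernels discussed here, as far as I know); the function name and the sample numbers are invented for illustration, and what counts as "few" is left to the reader.

```python
import re

ZONE_BLOCK = re.compile(
    r"Node\s+(\d+),\s+zone\s+(\S+)(.*?)(?=Node\s+\d+,\s+zone|\Z)", re.S)
MANAGED = re.compile(r"^\s*managed\s+(\d+)", re.M)

def managed_shares(zoneinfo_text):
    """Map (node, zone) to (managed pages, share of all managed pages)."""
    managed = {}
    for node, zone, body in ZONE_BLOCK.findall(zoneinfo_text):
        found = MANAGED.search(body)
        if found:
            managed[(int(node), zone)] = int(found.group(1))
    total = sum(managed.values())
    return {key: (pages, pages / total) for key, pages in managed.items()} if total else {}

# Invented numbers: a machine where almost all memory sits in DMA32.
SAMPLE = """Node 0, zone      DMA
  pages free     3900
        managed  3977
Node 0, zone    DMA32
  pages free     90000
        managed  491000
Node 0, zone   Normal
  pages free     500
        managed  8000
"""

if __name__ == "__main__":
    for (node, zone), (pages, share) in managed_shares(SAMPLE).items():
        print(f"node {node} zone {zone}: {pages} managed pages ({share:.1%})")
```

A Normal zone holding only a percent or two of all managed pages would be the kind of layout the question is asking about.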
(In reply to Dan Streetman from comment #42)
> What does your /proc/zoneinfo look like? Do you have a system with (approx)
> <= 4G of memory and a Normal zone with few managed pages?

My zoneinfo file right now looks like this: https://gist.github.com/nh2/7ba7375d5c8de797714f7a909e6f0c94

(I upgraded from 8 GB to 16 GB memory recently though, after I wrote comment #36.)
(In reply to mail+kernel-bugzilla from comment #43)
> (In reply to Dan Streetman from comment #42)
> > What does your /proc/zoneinfo look like? Do you have a system with (approx)
> > <= 4G of memory and a Normal zone with few managed pages?
>
> My zoneinfo file right now looks like this:
> https://gist.github.com/nh2/7ba7375d5c8de797714f7a909e6f0c94
>
> (I upgraded from 8 GB to 16 GB memory recently though, after I wrote comment
> #36.)

That zoneinfo doesn't look like you're seeing the same problem, so if you are seeing consistent, sustained (not just transient) 100% cpu from kswapd, I think it's a different problem from what I described in comment 40.
I'm assuming by latest kernel you mean 4.8? If so I'm looking forward to Arch pushing it through testing :)
I am having the same issue on Fedora 24 with kernel 4.8.6. So I guess the fix has not been pushed there, or it does not fix anything. It is a serious blocker for me, as I need to transfer many files between two USB disks. kswapd0 appears at the top of the process list after a while and slowly degrades overall performance until I have to hard reboot the machine in the middle of a transfer.
My guess is Fedora didn't put the changes through or something, because 4.8 has DEFINITELY fixed it for me. I used to have to reboot about twice daily due to this, but ever since I upgraded to 4.8 it hasn't happened once.
I'm on openSUSE with 4.8.8 and still have this issue.
I'm on Debian with 4.8.7 and still have this issue.
4.8.13-100.fc23.i686+PAE #1
/dev/sda is Samsung SSD 850 EVO 250GB

    swapoff -va
    sysctl vm.drop_caches=3

Problem: these always cause heavy kswapd0 load:

    cat /dev/sda >> /dev/zero
    hdparm -t /dev/sda
    ddrescue /dev/sda /dev/zero -vf
    hexdump /dev/sda
    dd if=/dev/sda of=/dev/zero
    etc.

No problem (read speed ~500MB/s, except hdparm):

    hdparm --direct -t /dev/sda
    dd iflag=direct if=/dev/sda of=/dev/zero bs=1073741824
    ddrescue --direct /dev/sda /dev/zero -vf -b 4096 -c 8192
I am not sure if this is the same bug, but for me kswapd0 goes high-cpu following a page allocation failure in xhci_segment_alloc and I think that this has been occurring since moving to 4.8 on Fedora 24. I don't remember experiencing it before that. Currently on 4.8.15.

I normally boot with 3 or 4 USB 3.0 disks attached and, after the upgrade to 4.8.x, noticed that kswapd0 was running at 100%. I went back to 4.7.x and had no problem. Searches on this issue frequently referred to USB disks, so I unplugged them and rebooted.

If I unplug all of my USB 3.0 devices I get a normal boot, even with a USB weather station, keyboard, and mouse attached. Sometimes one or two USB 3.0 disks are OK too. If I boot with all of the USB 3.0 disks attached, I get a kworker page allocation failure and after boot kswapd0 is high-cpu, usually split across 2-4 cores. If I boot with two USB 3.0 disks and get a normal boot (no page allocation failure and normal kswapd) and then plug in a hub with the rest of the disks (and a USB 3.0 card reader), I get the page allocation failure at that point and kswapd0 goes high-cpu. I have not checked every occurrence, but whenever I see kswapd0 at high cpu and do look, the page allocation failure is in the log.

The 'perf top' command seems to show different information from time to time, but the top contenders are frequently 'shrink_inactive_list', 'inactive_list_is_low', 'find_next_bit', 'shrink_node_memcg', and '_raw_spin_lock', to name a few. It makes me wonder if the xhci allocation failure is the trigger and fails to clean up on the error exit path, and kswapd0 is just a hapless victim.

There is a stack trace (on an Ubuntu kernel) of the page allocation failure in the dmesg attached to https://bugzilla.redhat.com/show_bug.cgi?id=1395825 on this issue, but I have more if it would help. I have 19GiB free on a 24GiB machine, so there should be no memory shortage to prompt swapping or the page allocation failure.
I had also noticed frequently that not all of my USB disks were mounted after boot and that I had to remove and reinsert a disk to use it. IIRC this affected my USB 2.0 disks too and from before the upgrade to 4.8 too.
> Problem: these always cause heavy kswapd0 load:
> cat /dev/sda >> /dev/zero
> hdparm -t /dev/sda
> ddrescue /dev/sda /dev/zero -vf
> hexdump /dev/sda
> dd if=/dev/sda of=/dev/zero
> etc.

Of course those cause kswapd work: all those commands will fill your page cache, and kswapd is responsible for clearing those pages out.

kswapd running isn't a problem, if it's doing work. kswapd running *without* doing work is the problem. When you stop running those commands, does kswapd catch up and stop using cpu? If so, that's normal. If not, and it never stops using cpu, that's the problem.

> No problem (read speed ~500MB/s, except hdparm):
> hdparm --direct -t /dev/sda
> dd iflag=direct if=/dev/sda of=/dev/zero bs=1073741824
> ddrescue --direct /dev/sda /dev/zero -vf -b 4096 -c 8192

The difference is that those commands bypass the page cache, so the page cache doesn't fill up and kswapd doesn't need to clear it out.
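One way to tell "working" from "spinning" is to compare kswapd's reclaim counters in /proc/vmstat across two snapshots, the same idea as the time 0 and time 5s vmstat attachments earlier in this thread. The Python sketch below is an assumption-laden illustration: it sums every counter whose name starts with `pgscan_kswapd` or `pgsteal_kswapd` (older kernels split these per zone, newer ones have a single counter each), the function names are made up, and the snapshot values are invented.

```python
def kswapd_counters(vmstat_text):
    """Return (pages scanned, pages reclaimed) by kswapd, summed over all variants."""
    scanned = stolen = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        name, value = parts[0], int(parts[1])
        if name.startswith("pgscan_kswapd"):
            scanned += value
        elif name.startswith("pgsteal_kswapd"):
            stolen += value
    return scanned, stolen

def progress(before_text, after_text):
    """Pages scanned and reclaimed by kswapd between two snapshots."""
    s0, r0 = kswapd_counters(before_text)
    s1, r1 = kswapd_counters(after_text)
    return s1 - s0, r1 - r0

# Invented snapshots in /proc/vmstat format (per-zone counter names).
BEFORE = "pgscan_kswapd_dma32 1000\npgscan_kswapd_normal 500\npgsteal_kswapd_dma32 900\npgsteal_kswapd_normal 450\n"
AFTER = "pgscan_kswapd_dma32 1000\npgscan_kswapd_normal 500\npgsteal_kswapd_dma32 900\npgsteal_kswapd_normal 450\n"

if __name__ == "__main__":
    scanned, reclaimed = progress(BEFORE, AFTER)
    print(f"scanned {scanned} pages, reclaimed {reclaimed} pages")
```

If kswapd sits at the top of `top` for the whole interval while both deltas stay at or near zero, it is burning CPU without reclaiming anything, which is the problematic case described above; large deltas suggest it is simply busy.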
> I am not sure if this is the same bug, but for me kswapd0 goes high-cpu
> following a page allocation failure in xhci_segment_alloc and I think that
> this has been occurring since moving to 4.8 on Fedora 24

From your dmesg, it certainly doesn't look like the same bug.
(In reply to Dan Streetman from comment #52)
> Of course those cause kswapd work: all those commands will fill your page
> cache, and kswapd is responsible for clearing those pages out.
>
> kswapd running isn't a problem, if it's doing work. kswapd running
> *without* doing work is the problem. When you stop running those commands,
> does kswapd catch up and stop using cpu? If so, that's normal. If not, and
> it never stops using cpu, that's the problem.

But why does kswapd write to storage so aggressively when there is no data to flush (swap is not set up)?
I reproduced the bug on the most recent kernel. I have extracted sysctl, meminfo and dmesg logs; please see my comments and attachments on the same bug: https://bugzilla.kernel.org/show_bug.cgi?id=110501#c15

I also wrote a simple Python script that eats RAM and reproduces the bug 100% of the time for me.
Reproduced on the latest CentOS kernel, 3.10.0-1160.53. It's so strange that this keeps happening. I tried disabling swap and everything else, but it makes no difference. There is 100 GB of free RAM and yet it happens.
I'm working with CentOS Linux kernel version 3.10.0-1160.49.1 and I also noticed that kswapd0 runs for over 20 seconds and seems to cause a kernel panic. Examining the kswapd() code, I see that it has an infinite loop. It can only break out of this loop if kthread_should_stop() returns true. That function tests whether the KTHREAD_SHOULD_STOP bit is set in the current task's flags. This bit is only set if a call to to_live_kthread() returns a non-NULL pointer to the kernel thread. If the pointer is NULL, the KTHREAD_SHOULD_STOP bit will not be set. This may be the problem behind this bug. Does anyone have a comment?
I believe this bug has been fixed in later versions of the Linux kernel. I tested kswapd0 by writing a C program that creates a large memory map for a file. I then wrapped that program so as to launch several thousand instances of it. My computer is running version 5.17 of the Linux kernel. The computer bogged down, but kswapd0 did not have a problem. I believe this bug should be closed.