Immediately after boot, I observe over 1000 kworker processes on my main Linux test computer (basically an Ubuntu 16.04 server, no GUI). The worker threads appear to be idle, and they do disappear after the nominal 5 minute timeout, depending on whatever else runs in the meantime. However, the number of threads can increase hugely again. The issue looks like it might be a race condition, and it is fairly difficult to reproduce manually once the boot activity settles.

For SLUB, kernel bisection gave:

81ae6d03952c1bfb96e1a716809bd65e7cd14360 "mm/slub.c: replace kick_all_cpus_sync() with synchronize_sched() in kmem_cache_shrink()"

The following monitoring script was used for the examples below (the two numbers after the uptime output are the total process count and the count of bound, non-high-priority kworker threads):

#!/bin/dash

while [ 1 ]; do
  echo $(uptime) ::: $(ps -A --no-headers | wc -l) ::: $(ps aux | grep kworker | grep -v u | grep -v H | wc -l)
  sleep 10.0
done

Example:

After boot:

23:25:10 up 1 min, 2 users, load average: 2.73, 1.01, 0.36 ::: 1364 ::: 1195
23:25:20 up 1 min, 2 users, load average: 2.31, 0.97, 0.35 ::: 1364 ::: 1195
23:25:30 up 1 min, 2 users, load average: 1.95, 0.94, 0.35 ::: 1364 ::: 1195

Do some stuff:

23:28:33 up 4 min, 3 users, load average: 1.23, 0.89, 0.42 ::: 1376 ::: 1215
23:28:43 up 4 min, 3 users, load average: 1.66, 1.01, 0.47 ::: 1384 ::: 1215
23:28:53 up 4 min, 3 users, load average: 1.86, 1.07, 0.50 ::: 1965 ::: 1788
23:29:03 up 4 min, 3 users, load average: 1.58, 1.04, 0.49 ::: 1958 ::: 1788
23:29:13 up 5 min, 3 users, load average: 1.33, 1.00, 0.48 ::: 1958 ::: 1788
23:29:23 up 5 min, 3 users, load average: 1.82, 1.12, 0.53 ::: 1952 ::: 1788
23:29:33 up 5 min, 3 users, load average: 1.54, 1.08, 0.52 ::: 1895 ::: 1744
23:29:44 up 5 min, 3 users, load average: 1.30, 1.05, 0.52 ::: 1906 ::: 1739

Now do a lot (load average is real):

23:33:27 up 9 min, 3 users, load average: 0.48, 0.65, 0.46 ::: 1974 ::: 1807
23:33:37 up 9 min, 3 users, load average: 0.40, 0.63, 0.45 ::: 2084 ::: 1807
23:33:48 up 9 min, 3 users, load average: 0.65, 0.68, 0.47 ::: 2236 ::: 2068
23:33:58 up 9 min, 3 users, load average: 256.45, 57.90, 19.23 ::: 2236 ::: 2069
23:34:08 up 10 min, 3 users, load average: 217.02, 56.00, 19.02 ::: 2235 ::: 2068

And again, after the idle threads timed out:

23:49:54 up 25 min, 3 users, load average: 0.00, 2.37, 6.86 ::: 184 ::: 17
23:50:04 up 25 min, 3 users, load average: 66.62, 16.10, 11.25 ::: 640 ::: 17
23:50:14 up 26 min, 3 users, load average: 132.63, 32.41, 16.64 ::: 2892 ::: 2065
23:50:25 up 26 min, 3 users, load average: 399.02, 94.90, 37.26 ::: 2233 ::: 2066

Example (SLUB) with 81ae6d03952c reverted:

After boot:

09:10:25 up 1 min, 2 users, load average: 0.98, 0.43, 0.16 ::: 195 ::: 26
09:10:35 up 1 min, 2 users, load average: 0.83, 0.41, 0.15 ::: 193 ::: 26
09:10:45 up 1 min, 2 users, load average: 0.70, 0.40, 0.15 ::: 193 ::: 26

Go directly to "do a lot", tried for ~10 minutes (load average is real):

09:21:44 up 12 min, 4 users, load average: 1151.29, 722.18, 314.75 ::: 193 ::: 24
09:21:54 up 12 min, 4 users, load average: 1289.22, 765.83, 333.31 ::: 287 ::: 24
09:22:05 up 12 min, 4 users, load average: 1555.32, 837.68, 361.19 ::: 279 ::: 24
09:22:15 up 13 min, 4 users, load average: 1316.19, 810.10, 357.32 ::: 193 ::: 24
09:22:25 up 13 min, 4 users, load average: 1113.84, 783.43, 353.49 ::: 191 ::: 22
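In case it helps anyone else watching this, here is a possible companion one-liner (not part of the monitor above) for seeing where the excess threads accumulate; it only assumes the usual ps/grep/sed/sort/uniq tools, and the grouping by stripping the ":<instance>" suffix is just my own convention:

#!/bin/dash
# Count kworker threads per CPU (kworker/<cpu>) or per unbound group
# (kworker/u<n>), highest counts first.
ps -e -o comm= | grep '^kworker' | sed 's/:.*$//' | sort | uniq -c | sort -rn | head -n 20

Note that the monitoring script already filters out the unbound (u) and high-priority (H) workers, so the large counts in the examples are all ordinary bound per-CPU pool workers.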
Created attachment 239841
Just a script I use to manually create the issue

This is quite a vicious script. I haven't found a simpler way to reproduce the issue manually.
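For anyone who does not want to download the attachment: the fragment below is NOT that script, just a rough, hypothetical illustration of the kind of memory-cgroup churn that, as far as I understand it, exercises the same kmem_cache_shrink() path the bisected commit changed (cgroup v1 paths assumed, run as root):

#!/bin/dash
# Hypothetical illustration only -- not the attached reproducer.
# Rapidly create, briefly use, and remove memory cgroups; removing a
# memory cgroup eventually shrinks its per-memcg slab caches.
CG=/sys/fs/cgroup/memory
i=0
while [ $i -lt 1000 ]; do
  mkdir "$CG/kworker-test-$i"
  echo $$ > "$CG/kworker-test-$i/cgroup.procs"
  /bin/true                      # fork/exec so some kmem gets charged
  echo $$ > "$CG/cgroup.procs"   # move back to the root group
  rmdir "$CG/kworker-test-$i"
  i=$((i + 1))
done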
As best as I am able to test, the proposed two-patch set:

https://patchwork.kernel.org/patch/9361853
https://patchwork.kernel.org/patch/9359271

resolves this bug.
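For anyone else wanting to test, one way to pull the two patches in on top of a kernel tree (the /mbox/ suffix is, as I understand it, patchwork's plain-mailbox download; swap the order if git am complains):

cd linux
wget -O 0001.patch https://patchwork.kernel.org/patch/9361853/mbox/
wget -O 0002.patch https://patchwork.kernel.org/patch/9359271/mbox/
# Apply both on top of the tree under test, then rebuild as usual.
git am 0001.patch 0002.patch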
I just noticed this same issue when playing with an openSUSE host using mainline kernel 4.8.5, the SLAB allocator, and the memory cgroup controller enabled.

The trigger for me was a pretty simple "systemd-nspawn -D /some/container --boot". That creates about 2000 kworker threads, driving the load to over 100.

If I boot with cgroup_disable=memory, the issue does not occur. In production, where I have been running 4.8.5 for a day on some hosts and VMs without the memory cgroup controller, I have not observed the symptoms either.
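For anyone trying the same workaround, a few quick sanity checks after booting with cgroup_disable=memory (cgroup v1 layout assumed):

# Confirm the parameter actually made it onto the kernel command line:
grep -o 'cgroup_disable=memory' /proc/cmdline
# The "enabled" column (last field) should be 0 for the memory controller:
grep '^memory' /proc/cgroups
# And the memory hierarchy should simply be absent:
ls /sys/fs/cgroup/memory 2>/dev/null || echo 'memory controller not mounted'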
(In reply to Patrick Schaaf from comment #3)
> The trigger for me was a pretty simple "systemd-nspawn -D /some/container
> --boot". That creates about 2000 kworker threads, driving the load to over
> 100.

Slightly more correct: just starting the container that way does not trigger the issue by itself. Logging in inside the container does.
@Patrick: I tried your "cgroup_disable=memory" suggestion, but the issue still occurs for me, perhaps somewhat reduced (kernel 4.9-rc4).

You mentioned SLAB, which is covered in bug 172981 and fixed as of kernel 4.9-rc3 (I think).

Does anybody know why the two-patch set referenced above has not yet been included?
(In reply to Doug Smythies from comment #5)
> @Patrick: I tried your "cgroup_disable=memory" suggestion, but the issue
> still occurs for me, perhaps somewhat reduced (kernel 4.9-rc4).

@Patrick: I made a stupid mistake. Your "ipv6.disable=1" suggestion does work for me.
(In reply to Doug Smythies from comment #6)
> @Patrick: I made a stupid mistake. Your "ipv6.disable=1" suggestion does
> work for me.

I meant to write "cgroup_disable=memory" above.

I originally posted this bug report as a regression and bisected it. While a fix took only a few days, it has yet to be included (as of 4.9-rc6). I ask again: does anybody know why the two-patch set referenced above has not yet been included?
Kernel 4.10-rc1 contains the two patches mentioned above. I tested it, and the problem is resolved.
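For completeness, one way to confirm the patches are in the tree before retesting is to skim the log between v4.9 and v4.10-rc1 for the slab/memcg shrink changes (the path list below is only a guess at where the series landed):

cd linux
git log --oneline v4.9..v4.10-rc1 -- mm/slub.c mm/slab_common.c mm/memcontrol.c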