Bug 172991 - [bisected] SLUB: over 2000 kworker threads
Status: RESOLVED CODE_FIX
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Slab Allocator
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
Reported: 2016-09-27 18:03 UTC by Doug Smythies
Modified: 2017-01-05 22:39 UTC
Kernel Version: 4.7+
Regression: No

Attachments
Just a script I use to manually create the issue (941 bytes, text/plain)
2016-09-27 18:29 UTC, Doug Smythies

Description Doug Smythies 2016-09-27 18:03:52 UTC
Immediately after boot, over 1000 kworker threads are observed on my main Linux test computer (basically an Ubuntu 16.04 server, no GUI). The worker threads appear to be idle and disappear after the nominal 5 minute timeout, depending on whatever else runs in the meantime. However, the number of threads can increase hugely again. The issue looks as though it might be a race condition, and it is fairly difficult to create manually once the post-boot state settles.

For SLUB, kernel bisection gave:
81ae6d03952c1bfb96e1a716809bd65e7cd14360
"mm/slub.c: replace kick_all_cpus_sync() with synchronize_sched() in kmem_cache_shrink()"

The following monitoring script was used for the examples below:

#!/bin/dash

# Every 10 seconds print:
#   uptime ::: total process count ::: filtered kworker count
while [ 1 ];
do
  echo $(uptime) ::: $(ps -A --no-headers | wc -l) ::: $(ps aux | grep kworker | grep -v u | grep -v H | wc -l)
  sleep 10.0
done
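
The two `grep -v` filters in the script above appear intended to leave only per-CPU, normal-priority workers: unbound workers carry a `u` in their name (e.g. `kworker/u8:0`) and high-priority ones end in `H`. As a minimal sketch, assuming that naming convention holds on the kernel in question, the same classification can be applied to the command name alone, so that a `u` or `H` elsewhere on the `ps aux` line (in a user name, say) cannot affect the count. The function name `classify_kworkers` is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Sketch: classify kworker command names read from stdin, one per line.
# Assumed name shapes: kworker/0:1, kworker/0:1H, kworker/u8:2.
classify_kworkers() {
  awk '
    /^kworker\/u/               { unbound++; next }  # unbound pool
    /^kworker\/[0-9]+:[0-9]+H/  { highpri++; next }  # per-CPU, high priority
    /^kworker\//                { normal++ }         # per-CPU, normal priority
    END { printf "normal=%d highpri=%d unbound=%d\n", normal, highpri, unbound }
  '
}

# Usage (prints one summary line):
#   ps -e -o comm= | classify_kworkers
```

The `normal` figure should correspond roughly to the third column printed by the monitoring script.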

Example:

After boot:

23:25:10 up 1 min, 2 users, load average: 2.73, 1.01, 0.36 ::: 1364 ::: 1195
23:25:20 up 1 min, 2 users, load average: 2.31, 0.97, 0.35 ::: 1364 ::: 1195
23:25:30 up 1 min, 2 users, load average: 1.95, 0.94, 0.35 ::: 1364 ::: 1195

Do some stuff:

23:28:33 up 4 min, 3 users, load average: 1.23, 0.89, 0.42 ::: 1376 ::: 1215
23:28:43 up 4 min, 3 users, load average: 1.66, 1.01, 0.47 ::: 1384 ::: 1215
23:28:53 up 4 min, 3 users, load average: 1.86, 1.07, 0.50 ::: 1965 ::: 1788
23:29:03 up 4 min, 3 users, load average: 1.58, 1.04, 0.49 ::: 1958 ::: 1788
23:29:13 up 5 min, 3 users, load average: 1.33, 1.00, 0.48 ::: 1958 ::: 1788
23:29:23 up 5 min, 3 users, load average: 1.82, 1.12, 0.53 ::: 1952 ::: 1788
23:29:33 up 5 min, 3 users, load average: 1.54, 1.08, 0.52 ::: 1895 ::: 1744
23:29:44 up 5 min, 3 users, load average: 1.30, 1.05, 0.52 ::: 1906 ::: 1739

Now do a lot (load average is real):

23:33:27 up 9 min, 3 users, load average: 0.48, 0.65, 0.46 ::: 1974 ::: 1807
23:33:37 up 9 min, 3 users, load average: 0.40, 0.63, 0.45 ::: 2084 ::: 1807
23:33:48 up 9 min, 3 users, load average: 0.65, 0.68, 0.47 ::: 2236 ::: 2068
23:33:58 up 9 min, 3 users, load average: 256.45, 57.90, 19.23 ::: 2236 ::: 2069
23:34:08 up 10 min, 3 users, load average: 217.02, 56.00, 19.02 ::: 2235 ::: 2068

And again, after the idle threads timed out:

23:49:54 up 25 min, 3 users, load average: 0.00, 2.37, 6.86 ::: 184 ::: 17
23:50:04 up 25 min, 3 users, load average: 66.62, 16.10, 11.25 ::: 640 ::: 17
23:50:14 up 26 min, 3 users, load average: 132.63, 32.41, 16.64 ::: 2892 ::: 2065
23:50:25 up 26 min, 3 users, load average: 399.02, 94.90, 37.26 ::: 2233 ::: 2066

Example (SLUB) with 81ae6d03952c reverted:

After boot:

09:10:25 up 1 min, 2 users, load average: 0.98, 0.43, 0.16 ::: 195 ::: 26
09:10:35 up 1 min, 2 users, load average: 0.83, 0.41, 0.15 ::: 193 ::: 26
09:10:45 up 1 min, 2 users, load average: 0.70, 0.40, 0.15 ::: 193 ::: 26

Go directly to "do a lot", tried for ~10 minutes (load average is real):

09:21:44 up 12 min, 4 users, load average: 1151.29, 722.18, 314.75 ::: 193 ::: 24
09:21:54 up 12 min, 4 users, load average: 1289.22, 765.83, 333.31 ::: 287 ::: 24
09:22:05 up 12 min, 4 users, load average: 1555.32, 837.68, 361.19 ::: 279 ::: 24
09:22:15 up 13 min, 4 users, load average: 1316.19, 810.10, 357.32 ::: 193 ::: 24
09:22:25 up 13 min, 4 users, load average: 1113.84, 783.43, 353.49 ::: 191 ::: 22
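
Each sample line above has three fields separated by ` ::: `: the `uptime` output, the total process count, and the filtered kworker count. A small sketch for pulling the peak kworker count out of a saved log follows; the function name `peak_kworkers` and the log file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: read monitoring lines from stdin and print the largest value
# seen in the third " ::: "-separated field (the kworker count).
peak_kworkers() {
  awk -F ' ::: ' '
    NF == 3 { if ($3 + 0 > max) max = $3 + 0 }  # ignore malformed lines
    END     { print max + 0 }                   # prints 0 on empty input
  '
}

# Usage, assuming the monitoring output was saved to monitor.log:
#   peak_kworkers < monitor.log
```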
Comment 1 Doug Smythies 2016-09-27 18:29:25 UTC
Created attachment 239841
Just a script I use to manually create the issue

This is quite a vicious script. I haven't found a simpler way to manually create the issue.
Comment 2 Doug Smythies 2016-10-06 16:03:31 UTC
As best I am able to test, the proposed two-patch set:

https://patchwork.kernel.org/patch/9361853
https://patchwork.kernel.org/patch/9359271

resolves this bug report.
Comment 3 Patrick Schaaf 2016-10-30 14:44:43 UTC
Just noticed this same issue when playing with an openSUSE host using mainline kernel 4.8.5, SLAB allocator, and memory cgroup enabled.

The trigger for me was a pretty simple "systemd-nspawn -D /some/container --boot". That creates about 2000 kworker threads, driving the load to over 100.

If I boot with cgroup_disable=memory, the issue does not occur. In production where I've been running 4.8.5 for a day on some hosts and VMs without memory cgroup controller, I also did not observe the symptoms.
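
To confirm whether the memory cgroup controller is actually active after booting with or without `cgroup_disable=memory`, the `enabled` column of `/proc/cgroups` can be inspected. The following is a minimal sketch, assuming the usual four-column layout of that file (`subsys_name hierarchy num_cgroups enabled`); the function name `memcg_state` is hypothetical.

```shell
#!/bin/sh
# Sketch: report the state of the memory cgroup controller.
# $1: path to a /proc/cgroups-style table (defaults to /proc/cgroups).
memcg_state() {
  awk '
    $1 == "memory" { print ($4 == 1 ? "enabled" : "disabled"); found = 1 }
    END            { if (!found) print "absent" }  # controller not built in
  ' "${1:-/proc/cgroups}"
}

# Usage:
#   memcg_state                  # inspect the running kernel
#   grep -o 'cgroup_disable=[^ ]*' /proc/cmdline   # show the boot parameter, if any
```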
Comment 4 Patrick Schaaf 2016-10-30 14:56:12 UTC
(In reply to Patrick Schaaf from comment #3)
> 
> The trigger for me, was using a pretty simple "systemd-nspawn -D
> /some/container --boot". That creates about 2000 kworker threads, driving
> the load to over 100,

Slightly more correct: just starting the container that way does not by itself trigger the issue. Logging in to the container does.
Comment 5 Doug Smythies 2016-11-06 18:22:00 UTC
@Patrick: I tried your "cgroup_disable=memory" suggestion, but the issue still occurs for me, perhaps somewhat reduced. (kernel 4.9-rc4). You mentioned "SLAB", which is covered in bug 172981, and fixed as of kernel 4.9-rc3 (I think).

Does anybody know why the two-patch set referenced above has not yet been included?
Comment 6 Doug Smythies 2016-11-07 21:36:32 UTC
(In reply to Doug Smythies from comment #5)
> @Patrick: I tried your "cgroup_disable=memory" suggestion, but the issue
> still occurs for me, perhaps somewhat reduced. (kernel 4.9-rc4).

@Patrick: I made a stupid mistake. Your "ipv6.disable=1" suggestion does work for me.
Comment 7 Doug Smythies 2016-11-22 19:08:57 UTC
(In reply to Doug Smythies from comment #6)

> @Patrick: I made a stupid mistake. Your "ipv6.disable=1" suggestion does
> work for me.

I meant to write "cgroup_disable=memory" above.

I originally posted this bug report as a regression and bisected. While a fix only took a few days, it has yet to be included (as of 4.9-rc6). I ask again:

Does anybody know why the two-patch set referenced above has not yet been included?
Comment 8 Doug Smythies 2016-12-27 17:14:05 UTC
Kernel 4.10-rc1 contains the two patches mentioned above. I tested it, and the problem is resolved.
