Bug 124641 - Memory cgroups are not garbage-collected after release to the system
Summary: Memory cgroups are not garbage-collected after release to the system
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: File System
Classification: Unclassified
Component: SysFS
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-12 23:29 UTC by John Garcia
Modified: 2016-07-13 17:49 UTC

See Also:
Kernel Version: 4.5
Subsystem:
Regression: No
Bisected commit-id:



Description John Garcia 2016-07-12 23:29:05 UTC
Looking at MESOS-5836 https://issues.apache.org/jira/browse/MESOS-5836 and patch 9184539 https://patchwork.kernel.org/patch/9184539/, we're concerned about memory cgroup leakage in kernels 4.2, 4.4, and 4.5. This was first seen on CoreOS 835.6 (kernel 4.2), but we've also reproduced it on Ubuntu 16.04 (kernel 4.4) and CoreOS 1010 (kernel 4.5).

When a system allocates more than 65536 cgroups, we see the following in dmesg when de-allocating them:

idr_remove called for id=65536 which is not allocated.

After that point, the memory cgroup subsystem is effectively locked until the system's page caches are dropped using:

echo 1 > /proc/sys/vm/drop_caches

We're working to determine if the patch mentioned above is a fix for this issue and will report back when we have more info.

Reproduction steps:

- Start a new instance using kernel 4.2, 4.4, or 4.5 (CoreOS 766-1010, Ubuntu 16.04)
- ssh to the machine
- Run "cat /proc/cgroups" to determine the number of memory cgroups
- Run several docker containers using the "--memory" or "-m" option to set a memory isolator, either in parallel or in series
- Stop all containers
- Run "cat /proc/cgroups" again to review the number of memory cgroups and compare it to the previous run
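
The before/after comparison in the steps above can be scripted. The following is a minimal sketch, assuming the usual /proc/cgroups layout of whitespace-separated columns (subsys_name, hierarchy, num_cgroups, enabled) with a header line starting with "#". The function names count_cgroups and read_memory_cgroup_count are hypothetical helpers, not part of any existing tool.

```python
def count_cgroups(proc_cgroups_text, subsystem="memory"):
    """Return num_cgroups for `subsystem` from /proc/cgroups content.

    Returns None if the subsystem is not listed.
    """
    for line in proc_cgroups_text.splitlines():
        # Skip blank lines and the "#subsys_name ..." header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Assumed columns: subsys_name, hierarchy, num_cgroups, enabled
        if len(fields) >= 3 and fields[0] == subsystem:
            return int(fields[2])
    return None


def read_memory_cgroup_count(path="/proc/cgroups"):
    """Read the live count of memory cgroups from the running system."""
    with open(path) as f:
        return count_cgroups(f.read())
```

Calling read_memory_cgroup_count() once before starting the containers and once after stopping them gives the two numbers to compare. A count that keeps growing across runs would be consistent with the leak described here.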
Comment 1 Greg Kroah-Hartman 2016-07-13 00:09:03 UTC
On Tue, Jul 12, 2016 at 11:29:05PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=124641
> 
>             Bug ID: 124641
>            Summary: Memory cgroups are not garbage-collected after release
>                     to the system
>            Product: File System
>            Version: 2.5
>     Kernel Version: 4.2

4.2 is old, does this happen on 4.6?

Also, can you take this to email, copying me and Tejun and the cgroups
developers and lkml?

thanks,

greg k-h
Comment 2 John Garcia 2016-07-13 00:34:44 UTC
We've tested up to 4.5, I'll fetch a 4.6 and try it out. Changing the reported version to 4.5 to reflect this.
Comment 3 John Garcia 2016-07-13 17:49:10 UTC
Tejun adds valuable context at LKML:

It's not that memcg doesn't gc the dead csses but that the memory
lying around keeps pinning the memcg struct down.  There's nothing
wrong with it.  As soon as there's memory pressure, the memory will
get reclaimed and the memcg structs will be freed.  The problem is
caused by the memcg struct pinning the memcg id, which is a pretty
limited resource.  The above patch fixes the issue by decoupling the
lifetime of the memcg id from that of the memcg struct.
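
To make the ID-pinning explanation concrete, here is a toy model in Python. It is not kernel code. IdPool, ToyMemcg, churn, and the tiny ID_LIMIT are all hypothetical names and values chosen for illustration. In the "decoupled" mode the toy releases the ID as soon as the group goes offline, which is an assumption about the general idea and not a description of exactly when the actual patch drops the ID.

```python
ID_LIMIT = 4  # illustrative stand-in for the kernel's finite memcg ID space


class IdPool:
    """A finite pool of IDs, handed out and returned explicitly."""

    def __init__(self, limit):
        self.free = list(range(1, limit + 1))

    def alloc(self):
        if not self.free:
            raise RuntimeError("out of memcg IDs")
        return self.free.pop(0)

    def release(self, id_):
        self.free.append(id_)


class ToyMemcg:
    """A group whose struct stays alive while cached pages are charged to it."""

    def __init__(self, pool, decoupled):
        self.pool = pool
        self.decoupled = decoupled
        self.id = pool.alloc()
        self.cached_pages = 1  # leftover page cache pins the struct

    def offline(self):
        # Userspace removed the cgroup; the struct is still pinned.
        if self.decoupled:
            self._put_id()  # ID lifetime no longer tied to the struct

    def reclaim(self):
        # Memory pressure (or drop_caches) frees the pages and the struct.
        self.cached_pages = 0
        self._put_id()

    def _put_id(self):
        if self.id is not None:
            self.pool.release(self.id)
            self.id = None


def churn(decoupled, rounds):
    """Create and remove `rounds` groups without any reclaim in between."""
    pool = IdPool(ID_LIMIT)
    dead = []
    for _ in range(rounds):
        cg = ToyMemcg(pool, decoupled)
        cg.offline()
        dead.append(cg)  # struct lingers, pinned by its cached pages
    return len(dead)
```

With decoupled=False, churn runs out of IDs once more than ID_LIMIT dead groups are lingering, even though nothing is wrong with reclaim itself. With decoupled=True, the same workload completes because each ID returns to the pool when its group goes offline.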

I've tested this on 4.6 and was _not_ able to reproduce the problem. Resolving now.
