Bug 124641

Summary: Memory cgroups are not garbage-collected after release to the system
Product: File System
Reporter: John Garcia (john.garcia)
Component: SysFS
Assignee: Greg Kroah-Hartman (greg)
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: bmahler
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 4.5
Subsystem:
Regression: No
Bisected commit-id:

Description John Garcia 2016-07-12 23:29:05 UTC
Looking at MESOS-5836 (https://issues.apache.org/jira/browse/MESOS-5836) and patch 9184539 (https://patchwork.kernel.org/patch/9184539/), we're concerned about memory cgroup leakage in kernels 4.2/4.4/4.5. This was first seen on CoreOS 835.6/4.2, but we've also reproduced it on Ubuntu 16.04/4.4 and CoreOS 1010/4.5 kernels.

When a system allocates more than 65536 memory cgroups, we see the following in dmesg when de-allocating them:

idr_remove called for id=65536 which is not allocated.

After that point, the memory cgroup subsystem is effectively locked until the system's page caches are dropped using:

echo 1 > /proc/sys/vm/drop_caches
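
To check whether a host has already reached this state, the kernel log can be scanned for the message above. This is only a minimal sketch: the function name count_idr_errors is hypothetical, and it reads log text on stdin so that it can be fed from dmesg or from a saved log file.

```shell
# Count kernel log lines that report a failed idr_remove.
# Reads log text on stdin and prints the number of matching lines.
# count_idr_errors is a hypothetical helper name, not an existing tool.
count_idr_errors() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # function's exit status at 0 while the printed count stays "0".
    grep -c 'idr_remove called for id=[0-9][0-9]* which is not allocated' || true
}
```

Typical use would be something like: dmesg | count_idr_errors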

We're working to determine if the patch mentioned above is a fix for this issue and will report back when we have more info.

Reproduction steps:

- Start a new instance using kernel 4.2, 4.4, or 4.5 (CoreOS 766-1010, Ubuntu 16.04) 
- ssh to the machine
- {{cat /proc/cgroups}} to determine the number of memory cgroups
- Run several docker containers using the {{--memory}} or {{-m}} option to set a memory isolator, either in parallel or in series
- Stop all containers
- {{cat /proc/cgroups}} to review the number of memory cgroups and compare to previous run
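
The before/after comparison in the steps above could be scripted roughly as follows. This is a sketch, not a tested reproducer: memcg_count and memcg_delta are hypothetical helper names, and they assume the usual /proc/cgroups layout (subsys_name, hierarchy, num_cgroups, enabled).

```shell
# Print the num_cgroups column for the memory controller.
# $1: a file in /proc/cgroups format (defaults to /proc/cgroups itself).
memcg_count() {
    awk '$1 == "memory" { print $3 }' "${1:-/proc/cgroups}"
}

# Print how many more memory cgroups the second snapshot holds than the first.
# $1: snapshot taken before the containers ran; $2: snapshot taken after.
memcg_delta() {
    echo $(( $(memcg_count "$2") - $(memcg_count "$1") ))
}
```

Save a snapshot with "cat /proc/cgroups > before.txt" before starting the containers and another into after.txt once they have all stopped, then run "memcg_delta before.txt after.txt". A delta that keeps growing across runs matches the leak described here.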
Comment 1 Greg Kroah-Hartman 2016-07-13 00:09:03 UTC
On Tue, Jul 12, 2016 at 11:29:05PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=124641
> 
>             Bug ID: 124641
>            Summary: Memory cgroups are not garbage-collected after release
>                     to the system
>            Product: File System
>            Version: 2.5
>     Kernel Version: 4.2

4.2 is old, does this happen on 4.6?

Also, can you take this to email, copying me and Tejun and the cgroups
developers and lkml?

thanks,

greg k-h
Comment 2 John Garcia 2016-07-13 00:34:44 UTC
We've tested up to 4.5. I'll fetch a 4.6 kernel and try it out. Changing the reported version to 4.5 to reflect this.
Comment 3 John Garcia 2016-07-13 17:49:10 UTC
Tejun adds valuable context at LKML:

It's not that memcg doesn't gc the dead csses but that the memory
lying around keeps pinning the memcg struct down.  There's nothing
wrong with it.  As soon as there's memory pressure, the memory will
get reclaimed and the memcg structs will be freed.  The problem is
caused by the memcg struct pinning the memcg id, which is a pretty
limited resource.  The above patch fixes the issue by decoupling the
lifetime of the memcg id from that of the memcg struct.

I've tested this in 4.6 and was _not_ able to reproduce the issue. Resolving now.