Bug 206865
| Summary: | OOM kills processes when plenty of memory available | | |
|---|---|---|---|
| Product: | Memory Management | Reporter: | bobbysmith013 |
| Component: | Page Allocator | Assignee: | Andrew Morton (akpm) |
| Status: | NEW | | |
| Severity: | normal | CC: | deshawndylan, siarhei_k_dev_linux |
| Priority: | P1 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | 3.10.0-1062.9.1.el7.x86_64 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
bobbysmith013 2020-03-16 18:16:42 UTC
Why no responses yet? Did I do something wrong?

That's a very old kernel - we're working on 5.6! Can you please take this up with Red Hat?

Red Hat said they won't help me because I'm running CentOS.

siarhei_k_dev_linux (comment #5):

It looks like the systemd process was trying to fork. The copy_process kernel function needs to duplicate task_struct, which is stored in a slab cache. If there is no free task_struct in the slab cache, __alloc_pages_nodemask is called to get 4 physically contiguous memory pages (16 kB) (see order=2 as a parameter of the request).

According to these lines, only 4 kB and 8 kB pages were free in any quantity:

2020-03-16 12:20:22 hostnameRedacted kernel: Node 0 Normal: 6423893*4kB (UE) 2542493*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 46035516kB
2020-03-16 12:20:22 hostnameRedacted kernel: Node 1 Normal: 6869581*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 27478324kB

So kernel memory was fragmented and there was no swap; that is why the oom-killer was invoked. It's difficult to say what is different in kernel 3.10.0-957.27.2.el7.x86_64 that prevents the oom-killer.
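As a rough illustration of the diagnosis in comment #5, here is a minimal Python sketch that reads /proc/buddyinfo and reports whether any zone still has free blocks of order 2 or higher, i.e. the 16 kB contiguous chunks the order=2 fallback allocation needs. It assumes the usual /proc/buddyinfo layout (node, zone, then free-block counts per order starting at order 0); treat it as a diagnostic sketch rather than anything from the original report.

```python
#!/usr/bin/env python3
"""Sketch: check /proc/buddyinfo for availability of order-2 (16 kB) blocks.

Illustrative only. Assumed field layout per line:
  Node <n>, zone <name>  <free order 0> <free order 1> ... <free order 10>
"""

NEEDED_ORDER = 2  # an order-2 request needs 4 contiguous pages (16 kB)

def parse_buddyinfo(path="/proc/buddyinfo"):
    zones = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            # Example: ['Node', '0,', 'zone', 'Normal', '6423893', '2542493', '0', ...]
            node = parts[1].rstrip(",")
            zone = parts[3]
            counts = [int(x) for x in parts[4:]]
            zones.append((node, zone, counts))
    return zones

def main():
    for node, zone, counts in parse_buddyinfo():
        high_order_free = sum(counts[NEEDED_ORDER:])
        status = "OK" if high_order_free else "FRAGMENTED (no order>=2 blocks)"
        print(f"Node {node}, zone {zone:<8} free blocks of order >= {NEEDED_ORDER}: "
              f"{high_order_free:10d}  {status}")

if __name__ == "__main__":
    main()
```

On the affected machine, the Normal zones of both nodes would show zero order>=2 blocks, matching the log lines quoted above.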
bobbysmith013 (comment #6):

(In reply to siarhei_k_dev_linux from comment #5)

Thank you for responding. Any idea of a way I could work around or prevent this? Is adding swap the only way to fix it? If so, how much swap would I have to add?

(In reply to bobbysmith013 from comment #6)

The recommended Linux swap size depends on the size of physical memory; just look through the Internet for best practices regarding swap size.

Here is how the buddy system of the Linux kernel should work: "The first 16k chunk is further split into two halves of 8k (buddies) of which one is allocated for the caller and other is put into the 8k list. The second chunk of 16k is put into the 16k free list, when lower order (8k) buddies become free at some future time, they are coalesced to form a higher-order 16k block. When both 16k buddies become free, they are again coalesced to arrive at a 32k block which is put back into the free list."

It's strange that only small chunks are free in such quantity, but that can happen due to the specific load on your server, or an issue in the Linux kernel. Actually, there is kernel-3.10.0-1062.18.1.el7.x86_64.rpm for CentOS that you can try. You can also monitor the slab cache for "task_struct" through the special file /proc/slabinfo and compare it on different kernel versions.
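To follow up on the /proc/slabinfo suggestion, here is a small sketch that samples the task_struct line of /proc/slabinfo so the active/total object counts can be compared over time or across kernel versions. It assumes the slabinfo 2.x column order (name, active_objs, num_objs, objsize, ...) and typically needs root to read the file; the 10-second interval is an arbitrary choice for illustration.

```python
#!/usr/bin/env python3
"""Sketch: periodically sample the task_struct slab from /proc/slabinfo.

Illustrative only; usually requires root. Assumed slabinfo 2.x columns:
  name  active_objs  num_objs  objsize  objperslab  pagesperslab  ...
"""
import time

def read_task_struct_slab(path="/proc/slabinfo"):
    with open(path) as f:
        for line in f:
            if line.startswith("task_struct"):
                fields = line.split()
                return {
                    "active_objs": int(fields[1]),
                    "num_objs": int(fields[2]),
                    "objsize": int(fields[3]),
                }
    return None

def main(interval=10):
    while True:
        stats = read_task_struct_slab()
        if stats is None:
            print("task_struct slab not found (not root, or cache merged?)")
        else:
            print(f"task_struct: {stats['active_objs']}/{stats['num_objs']} objects "
                  f"in use, {stats['objsize']} bytes each")
        time.sleep(interval)

if __name__ == "__main__":
    main()
```

Logging this alongside /proc/buddyinfo on both the 3.10.0-957 and 3.10.0-1062 kernels would show whether the task_struct cache keeps needing new order-2 slabs under the same workload.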