Bug 206865 - OOM kills processes when plenty of memory available
Summary: OOM kills processes when plenty of memory available
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: Intel Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-16 18:16 UTC by bobbysmith013
Modified: 2020-04-16 20:26 UTC
CC List: 2 users

See Also:
Kernel Version: 3.10.0-1062.9.1.el7.x86_64
Subsystem:
Regression: No
Bisected commit-id:


Description bobbysmith013 2020-03-16 18:16:42 UTC
After upgrading to kernel 3.10.0-1062.9.1.el7.x86_64, the OOM killer is killing processes on my server while there is still plenty of memory left. From /var/log/messages, it even appears that free memory exceeds the "min" watermark in all of the memory zones. It is very important to note that if I switch back to kernel 3.10.0-957.27.2.el7.x86_64, the problem goes away.

              total        used        free      shared  buff/cache   available
Mem:            747         648          98           0           0          97
Swap:             0           0           0


2020-03-16 12:20:21  hostnameRedacted kernel: systemd invoked oom-killer: gfp_mask=0x3000d0, order=2, oom_score_adj=0
2020-03-16 12:20:21  hostnameRedacted kernel: systemd cpuset=/ mems_allowed=0-1
2020-03-16 12:20:21  hostnameRedacted kernel: CPU: 54 PID: 1 Comm: systemd Not tainted 3.10.0-1062.9.1.el7.x86_64 #1
2020-03-16 12:20:21  hostnameRedacted kernel: Hardware name: Amazon EC2 r5.24xlarge/, BIOS 1.0 10/16/2017
2020-03-16 12:20:21  hostnameRedacted kernel: Call Trace:
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87f7ac23>] dump_stack+0x19/0x1b
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87f75ce9>] dump_header+0x90/0x229
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87b0a38b>] ? cred_has_capability+0x6b/0x120
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff879c1714>] oom_kill_process+0x254/0x3e0
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87b0a46e>] ? selinux_capable+0x2e/0x40
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff879c1f66>] out_of_memory+0x4b6/0x4f0
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff879c8a6f>] __alloc_pages_nodemask+0xacf/0xbe0
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87898f6d>] copy_process+0x1dd/0x1a50
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff8789a991>] do_fork+0x91/0x330
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87f88a26>] ? trace_do_page_fault+0x56/0x150
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff8789acb6>] SyS_clone+0x16/0x20
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87f8e2b4>] stub_clone+0x44/0x70
2020-03-16 12:20:21  hostnameRedacted kernel:  [<ffffffff87f8dede>] ? system_call_fastpath+0x25/0x2a
2020-03-16 12:20:21  hostnameRedacted kernel: Mem-Info:
2020-03-16 12:20:21  hostnameRedacted kernel: active_anon:175533498 inactive_anon:6347 isolated_anon:169#012 active_file:0 inactive_file:127 isolated_file:1#012 unevictable:0 dirty:0 writeback:0 unstable:0#012 slab_reclaimable:14109 slab_unreclaimable:53140#012 mapped:3636 shmem:6389 pagetables:365837 bounce:0#012 free:18789586 free_pcp:1577 free_cma:0
2020-03-16 12:20:21  hostnameRedacted kernel: Node 0 DMA free:15908kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
2020-03-16 12:20:21  hostnameRedacted kernel: lowmem_reserve[]: 0 2783 382618 382618
2020-03-16 12:20:21  hostnameRedacted kernel: Node 0 DMA32 free:1518848kB min:324kB low:404kB high:484kB active_anon:1319472kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):4kB isolated(file):0kB present:3129304kB managed:2850744kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:2316kB slab_unreclaimable:2580kB kernel_stack:848kB pagetables:5256kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
2020-03-16 12:20:22  hostnameRedacted kernel: lowmem_reserve[]: 0 0 379834 379834
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 Normal free:46065748kB min:44708kB low:55884kB high:67060kB active_anon:339644252kB inactive_anon:8632kB active_file:0kB inactive_file:424kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:395182080kB managed:388950244kB mlocked:0kB dirty:188kB writeback:0kB mapped:5448kB shmem:8692kB slab_reclaimable:23228kB slab_unreclaimable:105564kB kernel_stack:19808kB pagetables:673280kB unstable:0kB bounce:0kB free_pcp:5400kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
2020-03-16 12:20:22  hostnameRedacted kernel: lowmem_reserve[]: 0 0 0 0
2020-03-16 12:20:22  hostnameRedacted kernel: Node 1 Normal free:27493268kB min:45068kB low:56332kB high:67600kB active_anon:361226552kB inactive_anon:16756kB active_file:0kB inactive_file:3640kB unevictable:0kB isolated(anon):752kB isolated(file):4kB present:398327808kB managed:392079448kB mlocked:0kB dirty:176kB writeback:0kB mapped:9720kB shmem:16864kB slab_reclaimable:30896kB slab_unreclaimable:104416kB kernel_stack:14592kB pagetables:784812kB unstable:0kB bounce:0kB free_pcp:4152kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
2020-03-16 12:20:22  hostnameRedacted kernel: lowmem_reserve[]: 0 0 0 0
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 DMA32: 7852*4kB (UEM) 9354*8kB (UEM) 1425*16kB (UEM) 233*32kB (UEM) 76*64kB (UEM) 37*128kB (UEM) 21*256kB (UEM) 11*512kB (UEM) 8*1024kB (EM) 3*2048kB (UEM) 329*4096kB (M) = 1519024kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 Normal: 6423893*4kB (UE) 2542493*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 46035516kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 1 Normal: 6869581*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 27478324kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
2020-03-16 12:20:22  hostnameRedacted kernel: 9006 total pagecache pages
2020-03-16 12:20:22  hostnameRedacted kernel: 0 pages in swap cache
2020-03-16 12:20:22  hostnameRedacted kernel: Swap cache stats: add 0, delete 0, find 0/0
2020-03-16 12:20:22  hostnameRedacted kernel: Free swap  = 0kB
2020-03-16 12:20:22  hostnameRedacted kernel: Total swap = 0kB
2020-03-16 12:20:22  hostnameRedacted kernel: 199163796 pages RAM
2020-03-16 12:20:22  hostnameRedacted kernel: 0 pages HighMem/MovableOnly
2020-03-16 12:20:22  hostnameRedacted kernel: 3189710 pages reserved
2020-03-16 12:20:22  hostnameRedacted kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
2020-03-16 12:20:22  hostnameRedacted kernel: [13236]     0 13236    13964     4024      31        0             0 systemd-journal
2020-03-16 12:20:22  hostnameRedacted kernel: [13282]     0 13282    12156      766      24        0         -1000 systemd-udevd
2020-03-16 12:20:22  hostnameRedacted kernel: [13450]     0 13450    21139       71      13        0             0 audispd
2020-03-16 12:20:22  hostnameRedacted kernel: [13578]     0 13578     5409      109      15        0             0 irqbalance
2020-03-16 12:20:22  hostnameRedacted kernel: [13580]   999 13580   153607     1888      63        0             0 polkitd
2020-03-16 12:20:22  hostnameRedacted kernel: [13581]    81 13581    17098      185      33        0          -900 dbus-daemon
2020-03-16 12:20:22  hostnameRedacted kernel: [13584]    32 13584    17319      135      38        0             0 rpcbind
2020-03-16 12:20:22  hostnameRedacted kernel: [13593]     0 13593     6595       79      18        0             0 systemd-logind
2020-03-16 12:20:22  hostnameRedacted kernel: [13606]   998 13606     5634       75      17        0             0 chronyd
2020-03-16 12:20:22  hostnameRedacted kernel: [13616]     0 13616    48776      119      34        0             0 gssproxy
2020-03-16 12:20:22  hostnameRedacted kernel: [26275]     0 26275    25724      515      49        0             0 dhclient
2020-03-16 12:20:22  hostnameRedacted kernel: [26364]     0 26364    57204     1449      65        0             0 snmpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26365]    99 26365    13995      107      32        0             0 dnsmasq
2020-03-16 12:20:22  hostnameRedacted kernel: [26366]     0 26366   143551     3420     100        0             0 tuned
2020-03-16 12:20:22  hostnameRedacted kernel: [26367]     0 26367    57633      523     117        0             0 httpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26429]    48 26429    58154      510     116        0             0 httpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26431]    48 26431    58154      493     116        0             0 httpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26432]    48 26432    58154      493     116        0             0 httpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26433]    48 26433    58154      493     116        0             0 httpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26434]    48 26434    58154      493     116        0             0 httpd
2020-03-16 12:20:22  hostnameRedacted kernel: [26503]     0 26503    22946      261      46        0             0 master
2020-03-16 12:20:22  hostnameRedacted kernel: [26519]    89 26519    22989      276      45        0             0 qmgr
2020-03-16 12:20:22  hostnameRedacted kernel: [26530]     0 26530    28230      267      57        0         -1000 sshd
2020-03-16 12:20:22  hostnameRedacted kernel: [26533]     0 26533    62865     3426      55        0             0 rsyslogd
2020-03-16 12:20:22  hostnameRedacted kernel: [26544]     0 26544    32101      199      19        0             0 crond
2020-03-16 12:20:22  hostnameRedacted kernel: [26546]     0 26546    27527       37      11        0             0 agetty
2020-03-16 12:20:22  hostnameRedacted kernel: [26547]     0 26547    27527       37      10        0             0 agetty
2020-03-16 12:20:22  hostnameRedacted kernel: [31726] 180002 31726  2431319   114064     453        0             0 java
2020-03-16 12:20:22  hostnameRedacted kernel: [32818] 180002 32818 11911928  2570030    5781        0             0 java
2020-03-16 12:20:22  hostnameRedacted kernel: [32819] 180002 32819 102221943 86487107  178638        0             0 java
2020-03-16 12:20:22  hostnameRedacted kernel: [32820] 180002 32820 102240764 86244630  178648        0             0 java
2020-03-16 12:20:22  hostnameRedacted kernel: [61879]    89 61879    22972      259      46        0             0 pickup
2020-03-16 12:20:22  hostnameRedacted kernel: [64039]     0 64039    60504      404      70        0             0 crond
2020-03-16 12:20:22  hostnameRedacted kernel: [64050]     0 64050    26989       27      10        0             0 sleep
2020-03-16 12:20:22  hostnameRedacted kernel: Out of memory: Kill process 32819 (java) score 441 or sacrifice child
2020-03-16 12:20:22  hostnameRedacted kernel: Killed process 32819 (java), UID 180002, total-vm:408887772kB, anon-rss:345967476kB, file-rss:368kB, shmem-rss:0kB
Comment 2 bobbysmith013 2020-03-25 19:44:34 UTC
Why no responses yet? Did I do something wrong?
Comment 3 Andrew Morton 2020-03-25 23:59:21 UTC
That's a very old kernel - we're working on 5.6!

Can you please take this up with Red Hat?
Comment 4 bobbysmith013 2020-03-26 00:00:45 UTC
Red Hat said they won’t help me because I’m running CentOS.
Comment 5 siarhei_k_dev_linux 2020-04-15 21:37:03 UTC
It looks like the systemd process was trying to fork.

The copy_process() kernel function needs to duplicate the task_struct, which is stored in a slab cache. If there is no free task_struct in the slab cache, __alloc_pages_nodemask() is called to get 4 physically contiguous pages (16 kB); see the order=2 parameter of the request.

According to these lines, only 4 kB and 8 kB blocks were free in the Normal zones:
2020-03-16 12:20:22  hostnameRedacted kernel: Node 0 Normal: 6423893*4kB (UE) 2542493*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 46035516kB
2020-03-16 12:20:22  hostnameRedacted kernel: Node 1 Normal: 6869581*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 27478324kB

So kernel memory was fragmented and there was no swap; that is why the oom-killer was invoked.
It's difficult to say what is different in kernel 3.10.0-957.27.2.el7.x86_64 that prevents the oom-killer.
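
For reference, here is a minimal sketch (my own, not from this report; it assumes the standard /proc/buddyinfo column layout) that reports how many free blocks of order >= 2 each zone has, so fragmentation can be watched before the oom-killer fires:

#!/usr/bin/env python3
# Minimal check of external fragmentation via /proc/buddyinfo (assumed standard layout).
# Each data line looks like: "Node 0, zone Normal <order-0 count> <order-1 count> ..."
MIN_ORDER = 2  # the failing fork allocation needed an order-2 (16 kB) block

with open("/proc/buddyinfo") as f:
    for line in f:
        parts = line.split()
        node, zone = parts[1].rstrip(","), parts[3]
        counts = [int(c) for c in parts[4:]]
        higher = sum(counts[MIN_ORDER:])
        print(f"Node {node} zone {zone:8s}: {higher} free blocks of order >= {MIN_ORDER}")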
Comment 6 bobbysmith013 2020-04-16 00:49:26 UTC
(In reply to siarhei_k_dev_linux from comment #5)
> So kernel memory was fragmented and there was no swap; that is why the
> oom-killer was invoked.
> It's difficult to say what is different in kernel 3.10.0-957.27.2.el7.x86_64
> that prevents the oom-killer.

Thank you for responding. Any idea how I could work around or prevent this? Is adding swap the only way to fix it? If so, how much swap would I have to add?
Comment 7 siarhei_k_dev_linux 2020-04-16 20:26:42 UTC
(In reply to bobbysmith013 from comment #6)
> Any idea how I could work around or prevent this? Is adding swap the only
> way to fix it? If so, how much swap would I have to add?

The recommended Linux swap size depends on the amount of physical memory; look up current best practices for swap sizing for your distribution.

Here is how the buddy system of the Linux kernel should work:
"The first 16k chunk is further split into two halves of 8k (buddies), of which one is allocated for the caller and the other is put into the 8k list. The second chunk of 16k is put into the 16k free list; when lower-order (8k) buddies become free at some future time, they are coalesced to form a higher-order 16k block. When both 16k buddies become free, they are again coalesced to arrive at a 32k block, which is put back into the free list."

It is strange that only small free chunks remain. That can happen because of the specific load on your server, or because of an issue in the Linux kernel.

Actually, there is a kernel-3.10.0-1062.18.1.el7.x86_64.rpm for CentOS that you can try.

Also, you can monitor the slab cache for "task_struct" through the special file /proc/slabinfo and compare the values on the different kernel versions; a small script like the sketch below would do.
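
A minimal sketch (assuming the standard slabinfo 2.x column order; reading /proc/slabinfo usually requires root) that prints the task_struct slab statistics once a minute:

#!/usr/bin/env python3
# Print the task_struct line of /proc/slabinfo periodically.
# Assumed columns: name active_objs num_objs objsize objperslab pagesperslab ...
import time

def task_struct_stats():
    with open("/proc/slabinfo") as f:
        for line in f:
            if line.startswith("task_struct "):
                fields = line.split()
                return {
                    "active_objs": int(fields[1]),
                    "num_objs": int(fields[2]),
                    "objsize_bytes": int(fields[3]),
                    "pages_per_slab": int(fields[5]),
                }
    return None

while True:
    print(time.strftime("%F %T"), task_struct_stats())
    time.sleep(60)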
