Bug 208085

Summary: page allocation failure: order:0, mode:0x400d0
Product: Memory Management
Component: Page Allocator
Assignee: Andrew Morton (akpm)
Reporter: Chris Murphy (bugzilla)
Status: NEW
Severity: normal
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.7.0-1.fc33.x86_64
Regression: No
Attachments: dmesg, /proc/meminfo

Description Chris Murphy 2020-06-06 07:25:40 UTC
i7-2820QM
MemTotal:        8052740 kB
SwapTotal:      14908572 kB
$ swapon
NAME       TYPE       SIZE USED PRIO
/dev/sda5  partition 10.4G 4.4M   -2
/dev/zram0 partition  3.9G 241M    3


I'm seeing a problem with kernel 5.7.0 that I haven't previously seen. I am testing both swap on /dev/zram and swap on a partition. Two swaps. The workload is compiling webkitgtk with defaults, which slightly overcommits the available resources; I'm actually expecting a better than 50% chance of OOM. But that's not what happens: after about 2 hours the whole thing just splats, and keeps splatting. Pretty weird.
Comment 1 Chris Murphy 2020-06-06 07:27:11 UTC
Created attachment 289541 [details]
dmesg
Comment 2 Chris Murphy 2020-06-06 07:36:12 UTC
Created attachment 289543 [details]
/proc/meminfo

This is after the splats.
Comment 3 Andrew Morton 2020-06-08 00:42:14 UTC
On Sat, 06 Jun 2020 07:25:40 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=208085
> 
>             Bug ID: 208085
>            Summary: page allocation failure: order:0, mode:0x400d0
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.7.0-1.fc33.x86_64
>           Hardware: x86-64
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: bugzilla@colorremedies.com
>         Regression: No
> 
> i7-2820QM
> MemTotal:        8052740 kB
> SwapTotal:      14908572 kB
> $ swapon
> NAME       TYPE       SIZE USED PRIO
> /dev/sda5  partition 10.4G 4.4M   -2
> /dev/zram0 partition  3.9G 241M    3
> 
> 
> I'm seeing a new problem with kernel 5.7.0 I haven't previously seen.

Which previous kernel version(s) were OK?

> I am testing both swap on /dev/zram and swap on a partition. Two swaps.
> The workload is compiling webkitgtk with defaults, which is slightly
> overcommitted for the resources. I'm actually expecting better than 50%
> chance of OOM. But that's not what happens. After about 2 hours the
> whole thing just splats, and keeps splatting. Pretty weird.

Yes, this shouldn't happen.  Lots of anonymous memory and lots of free
swap, yet the kernel declared OOM.

We're in add_to_swap_cache() trying to allocate xarray metadata, so
obviously we can't perform any reclaiming in this context.  I'd have
thought that in this situation we should permit xarray to dip into the
page reserves.  But this is pretty old code (mainly from 2017) and I
don't think we've done much to perturb it in 5.7, hence my question above.
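
For reference, here is a minimal user-space sketch (illustration only, not
kernel code) that decodes the failing mask using the ___GFP_* bit values as
they appear in v5.7 include/linux/gfp.h.  The point is that 0x400d0 carries
neither of the reclaim bits nor __GFP_HIGH, so the slab allocation in the
trace below can neither enter direct reclaim, wake kswapd, nor dip into the
atomic reserves:

/* Illustration only (user-space), assuming the v5.7 ___GFP_* bit values
 * from include/linux/gfp.h.  Build with: cc -o gfpdecode gfpdecode.c */
#include <stdio.h>

#define GFP_RECLAIMABLE    0x10u    /* ___GFP_RECLAIMABLE */
#define GFP_HIGH           0x20u    /* ___GFP_HIGH: may use reserves */
#define GFP_IO             0x40u    /* ___GFP_IO */
#define GFP_FS             0x80u    /* ___GFP_FS */
#define GFP_DIRECT_RECLAIM 0x400u   /* ___GFP_DIRECT_RECLAIM */
#define GFP_KSWAPD_RECLAIM 0x800u   /* ___GFP_KSWAPD_RECLAIM */
#define GFP_COMP           0x40000u /* ___GFP_COMP */

int main(void)
{
	unsigned int mode = 0x400d0;	/* mode= from the trace below */

	/* 0x400d0 == COMP|FS|IO|RECLAIMABLE, matching the decode
	 * printed in the warning itself. */
	printf("io=%d fs=%d comp=%d reclaimable=%d\n",
	       !!(mode & GFP_IO), !!(mode & GFP_FS),
	       !!(mode & GFP_COMP), !!(mode & GFP_RECLAIMABLE));
	/* Neither reclaim bit is set, so the allocator may not reclaim
	 * or wake kswapd; __GFP_HIGH is clear, so it may not touch the
	 * atomic reserves either. */
	printf("may_reclaim=%d may_use_reserves=%d\n",
	       !!(mode & (GFP_DIRECT_RECLAIM | GFP_KSWAPD_RECLAIM)),
	       !!(mode & GFP_HIGH));
	return 0;
}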


[ 5225.501756] gnome-shell: page allocation failure: order:0, mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE), nodemask=(null),cpuset=/,mems_allowed=0
[ 5225.501763] CPU: 2 PID: 1155 Comm: gnome-shell Not tainted 5.7.0-1.fc33.x86_64 #1
[ 5225.501764] Hardware name: Apple Inc. MacBookPro8,2/Mac-94245A3940C91C80, BIOS    MBP81.88Z.0050.B00.1804101331 04/10/18
[ 5225.501765] Call Trace:
[ 5225.501774]  dump_stack+0x64/0x88
[ 5225.501777]  warn_alloc.cold+0x75/0xd9
[ 5225.501781]  __alloc_pages_slowpath.constprop.0+0xcfa/0xd30
[ 5225.501785]  ? alloc_slab_page+0x195/0x310
[ 5225.501786]  __alloc_pages_nodemask+0x2df/0x320
[ 5225.501787]  alloc_slab_page+0x195/0x310
[ 5225.501789]  allocate_slab+0x3c5/0x440
[ 5225.501791]  ___slab_alloc+0x40c/0x5f0
[ 5225.501793]  ? xas_nomem+0x28/0x70
[ 5225.501795]  ? xas_alloc+0x9b/0xc0
[ 5225.501797]  ? xas_nomem+0x28/0x70
[ 5225.501798]  __slab_alloc+0x1c/0x30
[ 5225.501800]  kmem_cache_alloc+0x20e/0x220
[ 5225.501802]  xas_nomem+0x28/0x70
[ 5225.501803]  add_to_swap_cache+0x321/0x400
[ 5225.501806]  __read_swap_cache_async+0x105/0x240
[ 5225.501808]  swap_cluster_readahead+0x22c/0x2e0
[ 5225.501811]  shmem_swapin+0x8e/0xc0
[ 5225.501812]  ? kmem_cache_free+0x174/0x190
[ 5225.501814]  ? xas_store+0x33a/0x5e0
[ 5225.501817]  shmem_swapin_page+0x196/0x740
[ 5225.501819]  shmem_getpage_gfp+0x3a2/0xa60
[ 5225.501822]  shmem_read_mapping_page_gfp+0x32/0x60
[ 5225.501868]  shmem_get_pages+0x155/0x5e0 [i915]
[ 5225.501874]  ? __queue_work+0x1e3/0x410
[ 5225.501897]  ? add_hole+0x115/0x170 [drm]
[ 5225.501927]  ? fence_notify+0x86/0x104 [i915]
[ 5225.501954]  ? __i915_sw_fence_complete+0x10f/0x1c0 [i915]
[ 5225.501987]  __i915_gem_object_get_pages+0x68/0xa0 [i915]
[ 5225.502022]  i915_vma_pin+0x3fe/0x6c0 [i915]
[ 5225.502056]  eb_add_vma+0x10b/0x2c0 [i915]
[ 5225.502089]  i915_gem_do_execbuffer+0x704/0x3430 [i915]
[ 5225.502094]  ? blk_account_io_start+0xda/0x140
[ 5225.502096]  ? bio_attempt_back_merge+0x83/0xe0
[ 5225.502098]  ? blk_attempt_plug_merge+0x10c/0x110
[ 5225.502102]  ? bfq_update_fin_time_enqueue+0x13d/0x1a0
[ 5225.502106]  ? idling_needed_for_service_guarantees+0x3f/0x70
[ 5225.502108]  ? bfq_better_to_idle+0x73/0x90
[ 5225.502110]  ? bfq_dispatch_request+0x6ce/0x1070
[ 5225.502113]  ? bfq_add_bfqq_busy+0xa1/0x162
[ 5225.502117]  ? __blk_mq_run_hw_queue+0x49/0x110
[ 5225.502121]  ? __alloc_pages_slowpath.constprop.0+0x15b/0xd30
[ 5225.502124]  ? generic_end_io_acct+0x91/0xf0
[ 5225.502127]  ? sched_clock+0x5/0x10
[ 5225.502131]  ? sched_clock_cpu+0xc/0xa0
[ 5225.502134]  ? _cond_resched+0x16/0x40
[ 5225.502136]  ? __kmalloc_node+0x204/0x300
[ 5225.502168]  ? i915_gem_execbuffer2_ioctl+0x93/0x3e0 [i915]
[ 5225.502201]  i915_gem_execbuffer2_ioctl+0x1ea/0x3e0 [i915]
[ 5225.502204]  ? reuse_swap_page+0x8c/0x380
[ 5225.502238]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0 [i915]
[ 5225.502253]  drm_ioctl_kernel+0x86/0xd0 [drm]
[ 5225.502257]  ? avc_has_perm+0x3b/0x160
[ 5225.502272]  drm_ioctl+0x206/0x390 [drm]
[ 5225.502306]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0 [i915]
[ 5225.502311]  ksys_ioctl+0x82/0xc0
[ 5225.502313]  __x64_sys_ioctl+0x16/0x20
[ 5225.502317]  do_syscall_64+0x5b/0xf0
[ 5225.502320]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 5225.502323] RIP: 0033:0x7ff76292047b
[ 5225.502326] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[ 5225.502328] RSP: 002b:00007fff75aed128 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 5225.502330] RAX: ffffffffffffffda RBX: 00007fff75aed170 RCX: 00007ff76292047b
[ 5225.502332] RDX: 00007fff75aed170 RSI: 0000000040406469 RDI: 000000000000000d
[ 5225.502334] RBP: 0000000040406469 R08: 000000000000000d R09: 000055ff644b09a0
[ 5225.502335] R10: 0000000000000000 R11: 0000000000000246 R12: 000055ff6446c760
[ 5225.502336] R13: 000000000000000d R14: ffffffffffffffff R15: 00007ff74af22828
[ 5225.502339] Mem-Info:
[ 5225.502345] active_anon:1433763 inactive_anon:207289 isolated_anon:182
                active_file:10333 inactive_file:8393 isolated_file:2
                unevictable:4657 dirty:100 writeback:0 unstable:0
                slab_reclaimable:16672 slab_unreclaimable:38093
                mapped:8919 shmem:4496 pagetables:10454 bounce:0
                free:26161 free_pcp:2054 free_cma:0
[ 5225.502350] Node 0 active_anon:5735052kB inactive_anon:829156kB active_file:41332kB inactive_file:33572kB unevictable:18628kB isolated(anon):728kB isolated(file):8kB mapped:35676kB dirty:400kB writeback:0kB shmem:17984kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 5225.502352] Node 0 DMA free:15344kB min:128kB low:160kB high:192kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15360kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 5225.502357] lowmem_reserve[]: 0 2069 7810 7810 7810
[ 5225.502360] Node 0 DMA32 free:40212kB min:17868kB low:22332kB high:26796kB reserved_highatomic:0KB active_anon:1640016kB inactive_anon:265148kB active_file:9856kB inactive_file:12584kB unevictable:2968kB writepending:136kB present:2255864kB managed:2157904kB mlocked:0kB kernel_stack:48kB pagetables:8524kB bounce:0kB free_pcp:2904kB local_pcp:156kB free_cma:0kB
[ 5225.502365] lowmem_reserve[]: 0 0 5741 5741 5741
[ 5225.502368] Node 0 Normal free:49088kB min:49584kB low:61980kB high:74376kB reserved_highatomic:2048KB active_anon:4098076kB inactive_anon:563004kB active_file:32212kB inactive_file:21312kB unevictable:13680kB writepending:0kB present:6027264kB managed:5879476kB mlocked:1792kB kernel_stack:7312kB pagetables:33292kB bounce:0kB free_pcp:5388kB local_pcp:780kB free_cma:0kB
[ 5225.502373] lowmem_reserve[]: 0 0 0 0 0
[ 5225.502376] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M) = 15344kB
[ 5225.502385] Node 0 DMA32: 1160*4kB (UM) 471*8kB (UME) 84*16kB (UM) 103*32kB (UME) 116*64kB (UME) 59*128kB (UME) 26*256kB (UE) 8*512kB (E) 2*1024kB (E) 0*2048kB 0*4096kB = 40824kB
[ 5225.502394] Node 0 Normal: 4778*4kB (UMH) 1400*8kB (UMEH) 346*16kB (UMH) 270*32kB (UMEH) 20*64kB (UMEH) 20*128kB (MEH) 4*256kB (MEH) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 49352kB
[ 5225.502404] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5225.502405] 114228 total pagecache pages
[ 5225.502408] 90616 pages in swap cache
[ 5225.502410] Swap cache stats: add 52638812, delete 52548141, find 2182667/30436788
[ 5225.502411] Free swap  = 5921200kB
[ 5225.502413] Total swap = 14908572kB
[ 5225.502414] 2074779 pages RAM
[ 5225.502415] 0 pages HighMem/MovableOnly
[ 5225.502416] 61594 pages reserved
[ 5225.502417] 0 pages cma reserved
[ 5225.502418] 0 pages hwpoisoned
[ 5225.502420] SLUB: Unable to allocate memory on node -1, gfp=0xc0(__GFP_IO|__GFP_FS)
[ 5225.502422]   cache: radix_tree_node, object size: 576, buffer size: 584, default order: 2, min order: 0
[ 5225.502424]   node 0: slabs: 654, objs: 11886, free: 0
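
Two things worth noting in the dump above.  First, every zone fails its
watermark check for this allocation: Node 0 Normal has free:49088kB against
min:49584kB, and DMA32's free:40212kB must clear roughly min:17868kB plus
its lowmem_reserve of 5741 pages (about 22964kB) for a Normal-targeted
allocation, i.e. about 40832kB.  With no reclaim bits and no __GFP_HIGH in
the mask, there is no fallback.  Second, the SLUB line shows the
cache-level mask was just gfp=0xc0 (__GFP_IO|__GFP_FS); the extra
__GFP_COMP|__GFP_RECLAIMABLE in mode:0x400d0 are presumably added by SLUB
itself for the radix_tree_node cache, which is created with
SLAB_RECLAIM_ACCOUNT.

The kmem_cache_alloc() in the trace is reached via xas_nomem(), the
XArray's out-of-line node allocation: when an XArray operation fails with
-ENOMEM, the caller drops its lock, lets xas_nomem() allocate a node with
the caller-supplied gfp, and retries.  A simplified sketch of the v5.7
pattern (locking and accounting elided; not verbatim kernel code):

/* Simplified from v5.7 lib/xarray.c; illustration, not verbatim. */
bool xas_nomem(struct xa_state *xas, gfp_t gfp)
{
	if (xas->xa_node != XA_ERROR(-ENOMEM)) {
		xas_destroy(xas);
		return false;
	}
	/* This is the kmem_cache_alloc() that fails in the trace:
	 * no reclaim bits in gfp, all zones below their watermarks. */
	xas->xa_alloc = kmem_cache_alloc(radix_tree_node_cachep, gfp);
	if (!xas->xa_alloc)
		return false;		/* caller sees -ENOMEM and gives up */
	xas->xa_node = XA_ERROR(-ENOMEM);
	return true;			/* caller should retry the operation */
}

/* Callers such as add_to_swap_cache() follow this retry pattern: */
	do {
		xas_lock_irq(&xas);
		/* ... attempt the insertion; may record -ENOMEM ... */
		xas_unlock_irq(&xas);
	} while (xas_nomem(&xas, gfp));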
Comment 4 Chris Murphy 2020-06-08 02:12:43 UTC
With swap on zram or swap on disk separately, 5.6.13 was OK. Same webkitgtk test.

Together? It's been maybe since the 5.3 series that I last tried it.

This took about 2 hours to hit. I know the kernel has supported up to 32 swaps since forever, but I don't know that using more than one is very common, so I wonder if it's related to having two swaps, or to the fact that one of them is zram-based.
Comment 5 Andrew Morton 2020-06-08 02:16:43 UTC
Could you please resend this as an emailed reply-to-all to my earlier
email?  We don't like to work on bugs via bugzilla.

Sorry, I failed to mention this earlier.