Bug 51301

Summary: [-next-20121129] memory leak/page allocator fragmentation?
Product: Memory Management
Reporter: Peter Hurley (peter)
Component: Page Allocator
Assignee: Andrew Morton (akpm)
Status: RESOLVED OBSOLETE
Severity: normal
CC: alan, mtosatti, peter
Priority: P1
Hardware: All
OS: Linux
Kernel Version: -next-20121129
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  kernel log showing OOM condition
  sysrq/show memory without kvm shortly after boot
  sysrq/show memory with kvm running a single 2gb vm
  /proc/slabinfo
  lsmod > modlist.txt

Description Peter Hurley 2012-12-04 16:08:46 UTC
With a single 2 GB/4-core VM on a 10 GB/8-core host, the _host_ ran out of 32 KB page blocks. I can't be certain that it was kvm, because the OOM condition triggered a GP fault in the SLUB allocator. (The other suspect is nouveau.)

I have attached the kernel log and a SysRq Show Memory dump of the machine shortly after boot without running kvm.
Comment 1 Peter Hurley 2012-12-04 16:09:26 UTC
Created attachment 88441
kernel log showing OOM condition
Comment 2 Peter Hurley 2012-12-04 16:11:10 UTC
Created attachment 88451
sysrq/show memory without kvm shortly after boot
Comment 3 Peter Hurley 2012-12-04 16:23:47 UTC
Created attachment 88461
sysrq/show memory with kvm running a single 2gb vm
Comment 4 Peter Hurley 2012-12-04 16:48:22 UTC
Not sure why I picked kvm/nouveau for this OOM ;)

Could be any number of apps over a 3.5 period...
Comment 5 Marcelo Tosatti 2012-12-04 21:42:00 UTC
Dec  4 09:38:25 thor kernel: [322576.251464] firefox: page allocation failure: order:4, mode:0x80c0d0

This is an order-4 allocation failure: the kernel could not allocate a physically contiguous region of memory. Please reassign to the "Memory Management" component.
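For reference, a buddy-allocator "order" is a power-of-two page count, so the size of the failed request can be worked out as below. This is only a minimal sketch: it assumes the usual 4 KiB x86 page size, and the helper name order_to_bytes is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on typical x86 hosts


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one physically contiguous order-N block (2**N pages)."""
    return (1 << order) * page_size


# order:4 from the log line above is 16 contiguous pages
print(order_to_bytes(4) // 1024, "KiB")  # prints: 64 KiB
```

Under the same assumption, an order-3 block is 32 KiB.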
Comment 6 Peter Hurley 2012-12-05 00:57:08 UTC
This is filesystem/block or memory manager-related.

I just built -next-20121204, and by the end of the build a 10 GB machine had < 100 MB free in zone DMA32 and < 70 MB free in zone NORMAL.
Comment 7 Peter Hurley 2012-12-06 19:23:46 UTC
Created attachment 88561
/proc/slabinfo
Comment 8 Peter Hurley 2012-12-06 19:24:18 UTC
Created attachment 88571
lsmod > modlist.txt
Comment 9 Peter Hurley 2012-12-06 19:28:34 UTC
md_raid is gobbling up memory. Looks like buffer_heads aren't being released. Check this out after cloning a single 8 GB VM image straight after boot:

# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
buffer_head       411176 411181    432   37    4 : tunables    0    0    0 : slabdata  11113  11113      0
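To put a number on that slabinfo line, the sketch below estimates how much memory the cache holds. It assumes 4 KiB pages and the column order given in the header line above; the function name slab_footprint is hypothetical, not an existing tool.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# The buffer_head line quoted above, copied verbatim
SAMPLE = ("buffer_head       411176 411181    432   37    4 : tunables    0    0    0"
          " : slabdata  11113  11113      0")


def slab_footprint(line, page_size=PAGE_SIZE):
    """Estimate the memory held by one /proc/slabinfo cache line."""
    fields = line.split()
    name = fields[0]
    active_objs, num_objs, objsize, objperslab, pagesperslab = map(int, fields[1:6])
    # slabdata is followed by <active_slabs> <num_slabs> <sharedavail>
    num_slabs = int(fields[fields.index("slabdata") + 2])
    return {
        "name": name,
        "object_bytes": num_objs * objsize,                  # payload only
        "slab_bytes": num_slabs * pagesperslab * page_size,  # pages actually pinned
    }


info = slab_footprint(SAMPLE)
print(f"{info['name']}: {info['slab_bytes'] / 2**20:.1f} MiB in slabs")
```

With these assumptions the buffer_head cache alone pins roughly 174 MiB, in 11113 slabs of four pages each.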
Comment 10 Peter Hurley 2012-12-06 19:32:27 UTC
Please reassign to IO/Storage, subcomponent LVM/DM
Comment 11 Peter Hurley 2012-12-06 20:33:32 UTC
Sorry, please disregard last comment. This is probably related to the kswapd/flusher mess going on right now.

https://lkml.org/lkml/2012/11/27/486