Bug 51301 - [-next-20121129] memory leak/page allocator fragmentation?
Status: RESOLVED OBSOLETE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
Reported: 2012-12-04 16:08 UTC by Peter Hurley
Modified: 2013-11-19 18:26 UTC
CC List: 3 users

See Also:
Kernel Version: -next-20121129
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel log showing OOM condition (90.85 KB, text/x-log)
2012-12-04 16:09 UTC, Peter Hurley
sysrq/show memory without kvm shortly after boot (5.71 KB, text/x-log)
2012-12-04 16:11 UTC, Peter Hurley
sysrq/show memory with kvm running a single 2gb vm (5.71 KB, text/x-log)
2012-12-04 16:23 UTC, Peter Hurley
/proc/slabinfo (22.27 KB, application/octet-stream)
2012-12-06 19:23 UTC, Peter Hurley
lsmod > modlist.txt (3.08 KB, text/plain)
2012-12-06 19:24 UTC, Peter Hurley

Description Peter Hurley 2012-12-04 16:08:46 UTC
With a single 2 GB/4-core VM on a 10 GB/8-core host, the _host_ ran out of 32 KB page blocks. I can't be certain that it was kvm because the OOM condition triggered a GP fault in the SLUB allocator. (The other suspect is nouveau.)

I have attached the kernel log and a SysRq Show Memory dump of the machine shortly after boot without running kvm.
Comment 1 Peter Hurley 2012-12-04 16:09:26 UTC
Created attachment 88441
kernel log showing OOM condition
Comment 2 Peter Hurley 2012-12-04 16:11:10 UTC
Created attachment 88451
sysrq/show memory without kvm shortly after boot
Comment 3 Peter Hurley 2012-12-04 16:23:47 UTC
Created attachment 88461
sysrq/show memory with kvm running a single 2gb vm
Comment 4 Peter Hurley 2012-12-04 16:48:22 UTC
Not sure why I picked kvm/nouveau for this OOM ;)

Could be any number of apps over a 3.5-day period...
Comment 5 Marcelo Tosatti 2012-12-04 21:42:00 UTC
Dec  4 09:38:25 thor kernel: [322576.251464] firefox: page allocation failure: order:4, mode:0x80c0d0

This is an order-4 allocation failure: the kernel failed to allocate a physically contiguous region of memory (2^4 = 16 contiguous pages). Please reassign to the "Memory Management" component.
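
For reference, the size implied by an allocation order can be worked out from the log line itself. This is a minimal sketch, assuming 4 KiB base pages (typical on x86, but not stated in the log); `parse_alloc_failure` is a hypothetical helper written for this note, not an existing tool. The mode value is only converted to an integer, not decoded into GFP flags.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages; not stated in the log

# Matches lines such as:
#   firefox: page allocation failure: order:4, mode:0x80c0d0
_FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure: "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def parse_alloc_failure(line, page_size=PAGE_SIZE):
    """Return task name, order, and contiguous size for a failure line."""
    m = _FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    pages = 1 << order  # an order-n request needs 2**n contiguous pages
    return {
        "comm": m.group("comm"),
        "order": order,
        "mode": int(m.group("mode"), 16),
        "pages": pages,
        "bytes": pages * page_size,
    }
```

Under the 4 KiB assumption, the order-4 failure quoted above corresponds to 16 contiguous pages, i.e. 64 KiB.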
Comment 6 Peter Hurley 2012-12-05 00:57:08 UTC
This is filesystem/block or memory manager-related.

I just built -next-20121204 and by the end of the build a 10 GB machine had < 100 MB free in zone DMA32 and < 70 MB free in zone NORMAL.
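
To tell low free memory apart from fragmentation, the per-order free block counts in /proc/buddyinfo can be summarized per zone. This is a rough sketch, assuming the usual "Node N, zone NAME count0 count1 ..." line layout; `parse_buddyinfo` and `free_blocks_at_or_above` are hypothetical helper names, and no output from the affected machine is shown here.

```python
def parse_buddyinfo(text):
    """Map (node, zone) -> list of free block counts, indexed by order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected layout: Node 0, zone DMA32 <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_blocks_at_or_above(counts, order):
    """Number of free blocks that can satisfy a request of the given order."""
    return sum(counts[order:])
```

Calling `parse_buddyinfo(open("/proc/buddyinfo").read())` and then `free_blocks_at_or_above(counts, 4)` for each zone would show whether any order-4 or larger blocks remain; a zone with plenty of order-0 pages but none at order 4 and above points to fragmentation rather than exhaustion.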
Comment 7 Peter Hurley 2012-12-06 19:23:46 UTC
Created attachment 88561
/proc/slabinfo
Comment 8 Peter Hurley 2012-12-06 19:24:18 UTC
Created attachment 88571
lsmod > modlist.txt
Comment 9 Peter Hurley 2012-12-06 19:28:34 UTC
md_raid is gobbling up memory. Looks like buffer_heads aren't being released. Check this out after cloning a single 8 GB VM image straight after boot:

# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
buffer_head       411176 411181    432   37    4 : tunables    0    0    0 : slabdata  11113  11113      0
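
To put the buffer_head line in perspective, the slabinfo columns can be turned into byte counts. This is a minimal sketch, assuming 4 KiB pages (not stated in the report) and the slabinfo column order shown in the header line above; `slab_usage` is a hypothetical helper name.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; not stated in the report


def slab_usage(line, page_size=PAGE_SIZE):
    """Estimate memory held by one cache from a /proc/slabinfo line."""
    head, _, rest = line.partition(": tunables")
    fields = head.split()
    name = fields[0]
    # Columns: active_objs num_objs objsize objperslab pagesperslab
    active_objs, num_objs, objsize, objperslab, pagesperslab = map(int, fields[1:6])
    # After ": slabdata": active_slabs num_slabs sharedavail
    slabdata = rest.partition(": slabdata")[2].split()
    num_slabs = int(slabdata[1])
    return {
        "name": name,
        "object_bytes": num_objs * objsize,
        "slab_bytes": num_slabs * pagesperslab * page_size,
    }
```

For the line above this gives 411181 * 432 = 177,630,192 bytes of objects and 11113 * 4 * 4096 = 182,075,392 bytes of slab pages, roughly 174 MiB under the 4 KiB page assumption.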
Comment 10 Peter Hurley 2012-12-06 19:32:27 UTC
Please reassign to IO/Storage, subcomponent LVM/DM
Comment 11 Peter Hurley 2012-12-06 20:33:32 UTC
Sorry, please disregard last comment. This is probably related to the kswapd/flusher mess going on right now.

https://lkml.org/lkml/2012/11/27/486
