Bug 9214 - panic in show_mem when out_of_memory
Summary: panic in show_mem when out_of_memory
Status: CLOSED CODE_FIX
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-10-23 06:10 UTC by Bernd Pfrommer
Modified: 2007-10-24 04:20 UTC

See Also:
Kernel Version: 2.6.21.5
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Bernd Pfrommer 2007-10-23 06:10:08 UTC
Most recent kernel where this bug did not occur: unknown
Distribution: Fedora Core 6 x86_64
Hardware Environment: 2 x Quad Xeon Supermicro with 4GB of memory
Software Environment: 64bit
Problem Description:
When running an application that reads a large file (8GB) into memory (4GB), a kernel panic occurs. No output appeared on the serial port, so the following information was copied manually from the screen.

The process that caused the panic was "init".

RIP: show_mem + 0x8d/0x140


Stack:
out_of_memory + 0x75
__alloc_pages
__do_page_cache_readahead
mntput_no_expire
link_path_walk
filemap_nopage
__handle_mm_fault
do_page_fault
error_exit

This is happening on an 8-way SMP system (2 quad-core Xeon processors), a Supermicro 6015t-tv.
Some kernel config info:

X86_64_ACPI_NUMA is switched on
no forced preemption
NUMA support is on
page migration is on
CC_STACKPROTECTOR is on
processor family is MCORE2


Steps to reproduce:

run an application that reads a large file into memory
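
The reporter's application is not shown in this report. As a rough sketch of the kind of workload described (reading a file into memory and holding on to it), something like the following could be used. The function name, chunk size, and the max_bytes safety cap are all hypothetical and not taken from the original application.

```python
def read_file_into_memory(path, chunk_size=1 << 20, max_bytes=None):
    """Read a file chunk by chunk and keep every chunk in memory.

    Hypothetical stand-in for the reporter's application. max_bytes is an
    assumed safety cap so the sketch can be tried without exhausting RAM;
    pass None to read the whole file.
    """
    chunks = []
    total = 0
    with open(path, "rb") as f:
        while True:
            if max_bytes is not None and total >= max_bytes:
                break
            want = chunk_size
            if max_bytes is not None:
                want = min(chunk_size, max_bytes - total)
            chunk = f.read(want)
            if not chunk:
                break  # end of file
            chunks.append(chunk)  # holding the reference keeps the data resident
            total += len(chunk)
    return chunks, total
```

Calling it without max_bytes on a file larger than physical memory, on a machine with no swap, should put the system under the memory pressure described above. Only do this on a test machine.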
Comment 1 Randy Dunlap 2007-10-23 13:45:21 UTC
How much swap space do you have?
Comment 2 Bernd Pfrommer 2007-10-23 16:06:19 UTC
I had configured 16GB of swap, but when I checked to make sure, swapon -s showed no swap at all. It turns out that the partition labelling by the FC6 installer must not have worked properly: after fixing /etc/fstab and running mkswap, the swap now shows up in /proc/swaps.

In summary, NO swap was configured. I suspect that after the process reached the 4GB physical memory limit, the kernel must have tried to kill the process, at which point show_mem() must have been called.
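
A check like the one done by hand with swapon -s can also be scripted by parsing /proc/swaps. The following is a minimal sketch, assuming the usual layout of that file (one header line, then whitespace-separated Filename, Type, Size, Used and Priority columns, with sizes in KiB). The function names are made up for illustration.

```python
def parse_swaps(text):
    """Parse the contents of /proc/swaps into a list of dicts.

    Assumes one header line followed by rows of
    Filename, Type, Size, Used, Priority (sizes in KiB).
    """
    entries = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore blank or malformed lines
        name, kind, size_kib, used_kib, priority = fields[:5]
        entries.append({
            "name": name,
            "type": kind,
            "size_kib": int(size_kib),
            "used_kib": int(used_kib),
            "priority": int(priority),
        })
    return entries


def total_swap_kib(text):
    """Total configured swap in KiB; 0 means no swap is active."""
    return sum(entry["size_kib"] for entry in parse_swaps(text))


def read_active_swaps(path="/proc/swaps"):
    """Read and parse the live swap table (Linux only)."""
    with open(path) as f:
        return parse_swaps(f.read())
```

An empty list from read_active_swaps() corresponds to the situation described here: swap was set up at install time but never actually enabled.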

I have since found that there was a bug in show_mem() in the 2.6.21.5 kernel, which may have been patched in 2.6.21.6. See http://lkml.org/lkml/2007/6/27/195.

I'm compiling a 2.6.23 kernel now, and will do some testing with it later in the night.
 
Comment 3 Bernd Pfrommer 2007-10-24 04:20:43 UTC
Ran the application again with the latest kernel, 2.6.23. The oom-killer kicked in and killed the offending process, with no kernel panic or the like. Consider this problem resolved, presumably with 2.6.21.6, but definitely with 2.6.23.
