Most recent kernel where this bug did not occur: unknown
Distribution: Fedora Core 6 x86_64
Hardware Environment: 2 x Quad Xeon Supermicro with 4GB of memory
Software Environment: 64bit

Problem Description:
When running an application that reads a large file (8GB) into memory (4GB), a kernel panic occurs. No output occurred on the serial port, so the following info was copied from the screen manually.

The process that caused the panic was "init".

RIP: show_mem + 0x8d/0x140
Stack:
  out_of_memory + 0x75
  __alloc_pages
  __do_page_cache_readahead
  mntput_no_expire
  link_path_walk
  filemap_nopage
  __handle_mm_fault
  do_page_fault
  error_exit

This is happening on an 8-way SMP machine (2 quad Xeon processors), Supermicro 6015t-tv.

Some kernel config info:
  X86_64_ACPI_NUMA is switched on
  no forced preemption
  NUMA support is on
  page migration is on
  CC_STACKPROTECTOR is on
  processor family is MCORE2

Steps to reproduce:
Run an application that reads a large file into memory.
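
The application used in the report is not identified, so the following is only a hypothetical stand-in for the reproduction step, written in Python. It reads a file chunk by chunk and keeps every chunk alive, so resident memory grows toward the file size. The function name and the chunk size are assumptions made for illustration, not details from the report. Pointing it at a file larger than physical memory on a machine without swap is expected to invoke the OOM killer, so it should only be tried on a test machine.

```python
def read_file_into_memory(path, chunk_size=64 * 1024 * 1024):
    """Read the whole file at `path` and hold all of it in memory.

    Returns (chunks, total_bytes). Every chunk stays referenced in the
    `chunks` list, so memory use grows with the size of the file.
    """
    chunks = []
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            chunks.append(chunk)
            total += len(chunk)
    return chunks, total
```

Calling read_file_into_memory("/path/to/large/file") on a suitably large file approximates the workload described above; the path here is a placeholder.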
How much swap space do you have?
I had configured 16GB of swap, but when I checked to make sure, swapon -s showed no swap at all. It turns out that the partition labelling by the FC6 installer must not have worked properly, because after fixing /etc/fstab and running mkswap, the swap now shows up in /proc/swaps. In summary, NO swap was configured when the panic occurred. I suspect that once the process reached the 4GB physical memory limit, the kernel tried to kill the process (hence out_of_memory in the stack), at which point show_mem() must have been called. I have since found that there was a bug in show_mem() in the 2.6.21.5 kernel, which may have been patched in 2.6.21.6. See http://lkml.org/lkml/2007/6/27/195. I'm compiling a 2.6.23 kernel now, and will do some testing with it later tonight.
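
For anyone wanting to script the swap check mentioned above, here is a minimal Python sketch that parses the contents of /proc/swaps and reports how much swap is active. The function names are made up for this example. It assumes the usual layout of that file: one header line, then one line per swap area with the columns Filename, Type, Size, Used and Priority, with sizes in KiB.

```python
def parse_swaps(text):
    """Parse the text of /proc/swaps into a list of dicts, one per swap area."""
    entries = []
    lines = text.strip().splitlines()
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip anything that does not look like a swap entry
        entries.append({
            "filename": fields[0],
            "type": fields[1],
            "size_kib": int(fields[2]),
            "used_kib": int(fields[3]),
            "priority": int(fields[4]),
        })
    return entries


def total_swap_kib(text):
    """Return the total size of all active swap areas in KiB (0 if none)."""
    return sum(entry["size_kib"] for entry in parse_swaps(text))


if __name__ == "__main__":
    with open("/proc/swaps") as f:
        total = total_swap_kib(f.read())
    if total == 0:
        print("no swap is active")
    else:
        print("active swap: %d KiB" % total)
```

A result of 0 corresponds to the situation described above, where swapon -s showed nothing even though a swap partition had been set up at install time.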
I ran the application again with the latest kernel, 2.6.23. The oom-killer kicked in and killed the offending process, with no kernel panic or the like. Consider this problem resolved, presumably with 2.6.21.6, but definitely with 2.6.23.