Bug 8928

Summary: PROBLEM: Kernel oops during interrupt context memory allocation
Product: Memory Management
Reporter: Thomas Jarosch (thomas.jarosch)
Component: Page Allocator
Assignee: Andrew Morton (akpm)
Status: CLOSED CODE_FIX
Severity: high
CC: protasnb
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.23-rc3
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: Patch from the bug description for easier access

Description Thomas Jarosch 2007-08-23 05:16:02 UTC
Hey there,

I've found a problem in the kernel memory handler which can be triggered by netfilter modules allocating memory in interrupt context. The bug results in a hard crash from time to time. It only happens on machines using highmem, so you need at least 2 GB RAM. Here are three easy steps to trigger the problem:

1. Apply the following patch on a box with 2+ GB RAM:

diff -u linux-2.6.23.clean/net/netfilter/xt_comment.c linux-2.6.23/net/netfilter/xt_comment.c
--- linux-2.6.23.clean/net/netfilter/xt_comment.c       Thu Aug 23 11:55:47 2007
+++ linux-2.6.23/net/netfilter/xt_comment.c     Thu Aug 23 11:45:44 2007
@@ -25,6 +25,13 @@
       unsigned int protooff,
       bool *hotdrop)
 {
+       unsigned long mem;
+       if ((mem = get_zeroed_page(GFP_ATOMIC)) == 0) {
+           printk("ipt_comment: Out of memory!\n");
+       } else {
+           free_page(mem);
+       }
+
        /* We always match */
        return true;
 }

2. Insert the following netfilter rules:

   iptables -I INPUT -m comment --comment "crash"
   iptables -I OUTPUT -m comment --comment "crash"

3. Download something big

The box will oops within seconds. Before I figured out this easy way to reproduce it, I wrote an extra netfilter module called ipt_CRASH that calls get_zeroed_page(GFP_ATOMIC)/free_page(). Here's a backtrace of the oops with kernel 2.6.23-rc3 using the default config and ipt_CRASH:

------------[ cut here ]------------
kernel BUG at arch/i386/mm/highmem.c:38!
invalid opcode: 0000 [#1]
Modules linked in: ipt_CRASH iptable_filter ip_tables x_tables deflate 
zlib_deflate twofish twofish_common serpent blowfish des cbc ecb blkcipher 
aes xcbcd
CPU:    0
EIP:    0060:[<c011401f>]    Not tainted VLI
EFLAGS: 00010286   (2.6.23-rc3 #1)
EIP is at kmap_atomic_prot+0x24/0x83
eax: 0000000c   ebx: c16eba60   ecx: c0002edc   edx: 00000003
esi: 00000163   edi: 00000001   ebp: c16eba60   esp: f75f1b9c
ds: 007b   es: 007b   fs: 0000  gs: 0000  ss: 0068
Process info_iponline (pid: 3469, ti=f75f0000 task=f7269ab0 task.ti=f75f0000)
Stack: c16eba60 00000000 c013c695 00000001 00000044 40000000 00000001 00000000
       00028020 c0327a3c c0327648 00000000 00000001 00000000 00000001 00000000
       00008020 00008020 00000000 c0327a38 c013c778 00000044 63d210ac 00000000
Call Trace:
 [<c013c695>] get_page_from_freelist+0x1ed/0x284
 [<c013c778>] __alloc_pages+0x4c/0x282
 [<c013ca31>] get_zeroed_page+0x3c/0x4a
 [<f894e000>] ipt_crash_target+0x0/0x48 [ipt_CRASH]
 [<f894e012>] ipt_crash_target+0x12/0x48 [ipt_CRASH]
 [<f8958331>] ipt_do_table+0x26d/0x2c8 [ip_tables]
 [<c0258484>] nf_iterate+0x38/0x6a
 [<c02585ed>] nf_hook_slow+0x4d/0xb5
 [<c025d1b3>] ip_local_deliver_finish+0x0/0x16b
 [<c025d85c>] ip_local_deliver+0x72/0x1ea
 [<c025d1b3>] ip_local_deliver_finish+0x0/0x16b
 [<c025d7bd>] ip_rcv+0x3fb/0x428
 [<c01287c4>] getnstimeofday+0x2b/0xaf
 [<c0127776>] ktime_get_real+0xf/0x2b
 [<c024540c>] netif_receive_skb+0x219/0x269
 [<f8998fa0>] e1000_clean_rx_irq+0x35a/0x42a [e1000]
 [<f8998c46>] e1000_clean_rx_irq+0x0/0x42a [e1000]
 [<f8997f88>] e1000_clean+0x304/0x4ae [e1000]
 [<c014f745>] get_unused_fd_flags+0x42/0xaa
 [<c0161327>] mntput_no_expire+0x11/0x47
 [<c01580c1>] __link_path_walk+0x83f/0x9de
 [<c01394cc>] do_generic_mapping_read+0x3cd/0x3d5
 [<c02471ff>] net_rx_action+0x52/0xe9
 [<c011b83c>] __do_softirq+0x35/0x75
 [<c011b89e>] do_softirq+0x22/0x26
 [<c0105e8c>] do_IRQ+0x58/0x6c
 [<c0138e42>] find_lock_page+0x12/0x62
 [<c0104573>] common_interrupt+0x23/0x28
 [<c0141996>] __do_fault+0x136/0x2d1
 [<c0142eac>] handle_mm_fault+0x2c5/0x571
 [<c0145665>] vma_merge+0x120/0x178
 [<c0113560>] do_page_fault+0x212/0x58a
 [<c011334e>] do_page_fault+0x0/0x58a
 [<c0292f2a>] error_code+0x6a/0x70
 [<c0290000>] tpacket_rcv+0x142/0x38e
 =======================
Code: 03 05 00 eb 38 c0 c3 56 89 ce 53 89 c3 89 e0 25 00 e0 ff ff ff 40 14 8b 
0d f0 17 38 c0 8d 04 95 00 00 00 00 29 c1 83 39 00 74 04 <0f> 0b eb fe 8b 03
EIP: [<c011401f>] kmap_atomic_prot+0x24/0x83 SS:ESP 0068:f75f1b9c
Kernel panic - not syncing: Fatal exception in interrupt
------------------------------------
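
The trace shows the BUG firing in kmap_atomic_prot(), reached from get_zeroed_page() inside a softirq that interrupted a page fault. One possible reading, stated here as an assumption and not a confirmed diagnosis, is that the check at highmem.c:38 rejects a per-CPU atomic mapping slot that is already in use. The user-space C sketch below models only that slot bookkeeping. All model_* and MODEL_* names are hypothetical, and the enum values are illustrative, not the kernel's.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's km_type slots; values are illustrative. */
enum model_km_type { MODEL_KM_USER0, MODEL_KM_IRQ0, MODEL_KM_TYPE_NR };

#define MODEL_NR_CPUS 2

/* One flag per (cpu, type) pair, standing in for the per-CPU mapping slots. */
static bool model_slot_busy[MODEL_NR_CPUS * MODEL_KM_TYPE_NR];

/*
 * Claim the slot for (cpu, type). Returns the slot index, or -1 if the slot
 * is already mapped. The -1 case is where this sketch assumes the real
 * kernel would hit BUG().
 */
static int model_kmap_atomic(int cpu, enum model_km_type type)
{
    int idx = (int)type + MODEL_KM_TYPE_NR * cpu;

    if (model_slot_busy[idx])
        return -1;
    model_slot_busy[idx] = true;
    return idx;
}

/* Release a slot previously returned by model_kmap_atomic(); -1 is ignored. */
static void model_kunmap_atomic(int idx)
{
    if (idx >= 0)
        model_slot_busy[idx] = false;
}

/*
 * Process context claims MODEL_KM_USER0, then an "interrupt" on the same CPU
 * asks for the same slot before it is released. Returns true if the nested
 * request was rejected.
 */
static bool model_nested_request_conflicts(int cpu)
{
    int outer = model_kmap_atomic(cpu, MODEL_KM_USER0);
    int nested = model_kmap_atomic(cpu, MODEL_KM_USER0);
    bool conflict = (outer >= 0 && nested < 0);

    model_kunmap_atomic(nested);
    model_kunmap_atomic(outer);
    return conflict;
}
```

In this model a nested request succeeds only if it uses a different slot type (MODEL_KM_IRQ0 here) or runs on a different CPU. Whether the real kernel path fails for exactly this reason is not established anywhere in this report.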

Link to my original report on linux-kernel:
http://marc.info/?l=linux-kernel&m=118764032022075

I tried to take a look at the problem myself, but it's beyond my knowledge of the kernel's inner workings.

Thanks in advance,
Thomas
Comment 1 Thomas Jarosch 2007-08-23 05:18:38 UTC
Created attachment 12501
Patch from the bug description for easier access
Comment 2 Andrew Morton 2007-08-23 14:05:53 UTC
I agree.  Please review and test
alloc_pages-permit-get_zeroed_pagegfp_atomic-from-interrupt-context.patch
Comment 3 Thomas Jarosch 2007-08-24 01:15:29 UTC
Thanks for the patch, Andrew!

Now the kernel dies during boot. Here's a backtrace of 2.6.23-rc3:

Memory: 3112512k/3145216k available (1614k kernel code, 31336k reserved, 698k data, 184k init, 2227712k highmem)
virtual kernel memory layout:
    fixmap  : 0xfffa8000 - 0xfffff000   ( 348 kB)
    pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
    vmalloc : 0xf8800000 - 0xff7fe000   ( 111 MB)
    lowmem  : 0xc0000000 - 0xf8000000   ( 896 MB)
      .init : 0xc0346000 - 0xc0374000   ( 184 kB)
      .data : 0xc02938c2 - 0xc03422e4   ( 698 kB)
      .text : 0xc0100000 - 0xc02938c2   (1614 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.

------------[ cut here ]------------
kernel BUG at lib/ioremap.c:25!
invalid opcode: 0000 [#1]
Modules linked in:
CPU:    0
EIP:    0060:[<c019c2fc>]    Not tainted VLI
EFLAGS: 00010282   (2.6.23-rc3 #4)
EIP is at ioremap_page_range+0x94/0xd4
eax: 06500000   ebx: f8800000   ecx: fed00000   edx: dfff7004
esi: c0374f88   edi: f8801000   ebp: f8801000   esp: c0345ee8
ds: 007b   es: 007b   fs: 0000  gs: 0000  ss: 0068
Process swapper (pid: 0, ti=c0344000 task=c031b2c0 task.ti=c0344000)
Stack: fed00000 07800000 00001000 f8800000 fed00000 fed00000 c0113a3b 00000073
       00000073 fed00000 00099800 00000400 007c6007 c0113a6d c0366920 00099800
       c033e000 c0354f01 dfff8000 000000d0 c014dcb3 00000008 000000d0 dffff4c0
Call Trace:
 [<c0113a3b>] __ioremap+0xc0/0xe1
 [<c0113a6d>] ioremap_nocache+0x11/0x74
 [<c0354f01>] hpet_enable+0x2e/0x256
 [<c014dcb3>] cache_alloc_refill+0x2af/0x3d5
 [<c014e033>] do_tune_cpucache+0x100/0x1a5
 [<c014e25e>] enable_cpucache+0x54/0x7b
 [<c0358feb>] kmem_cache_init+0x2df/0x2f0
 [<c034c1f0>] hpet_time_init+0x5/0x13
 [<c034695d>] start_kernel+0x1e2/0x24b
 [<c0346317>] unknown_bootoption+0x0/0x195
 =======================
Code: e2 00 f0 ff ff 01 c2 81 fa 00 00 00 40 74 3e 8b 04 24 81 ea fc ff ff 3f 03 44 24 04 8d 0c 18 81 e1 00 f0 ff ff 83 7a fc 00 74 04 <0f> 0b eb fe 8b 44
EIP: [<c019c2fc>] ioremap_page_range+0x94/0xd4 SS:ESP 0068:c0345ee8
Kernel panic - not syncing: Attempted to kill the idle task!
Comment 4 Anonymous Emailer 2007-08-24 01:37:50 UTC
Reply-To: akpm@linux-foundation.org

Odd.  Does that kernel have any extra patches applied?

Are you able to identify when this started happening?  Was
2.6.22 OK?  2.6.23-rc1?  etc?
Comment 5 Thomas Jarosch 2007-08-24 02:01:21 UTC
It is a vanilla 2.6.23-rc3 + alloc_pages-permit-get_zeroed_pagegfp_atomic-from-interrupt-context.patch.
If I revert the patch it boots fine.

2.6.22.5 dies at the same place if I apply the patch.

Right now I'm testing different alloc functions (kzalloc / __get_free_page + memset) to see whether they are affected by the same problem as get_zeroed_page().
Comment 6 Thomas Jarosch 2007-08-24 02:36:00 UTC
OK, all other combinations of alloc functions work fine; only get_zeroed_page() is affected. I guess Edward A. Murphy Jr. would be smiling in heaven now...
Comment 7 Peter Müller 2007-10-27 11:13:12 UTC
The same problem occurs when calling vmalloc_to_page on kernel 2.6.22.9-61.fc6 (Fedora Core 6). The problem does not exist on kernel 2.6.18-1.2798.fc6 (older Fedora Core 6).


kernel BUG at arch/i386/mm/highmem.c:38!
invalid opcode: 0000 [#1]
SMP 
last sysfs file: /devices/pci0000:00/0000:00:03.0/0000:02:01.0/irq
Modules linked in: vfat fat lirc_serial(F)(U) lirc_dev(F)(U) ipv6 nfs lockd nfs_acl sunrpc dm_mirror dm_mod video sbs buttond
CPU:    1
EIP:    0060:[<c041f971>]    Tainted: PF      VLI
EFLAGS: 00010206   (2.6.22.9-61.fc6 #1)
EIP is at kmap_atomic_prot+0x31/0x80
eax: 000000a8   ebx: c16dd120   ecx: c0004e44   edx: 0000000f
esi: 0000002a   edi: 00000163   ebp: f0371f00   esp: c07cef54
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process irqbalance (pid: 2289, ti=c07ce000 task=c1944600 task.ti=f6e6f000)
Stack: 00000aa8 00000000 f6f78001 c0466ece f8eaa3c8 000000bc f8946f51 00000050 
       f8947a71 f8eaa3c8 f8947c09 f6d50002 f6ee8c00 00000015 f8947d94 00000000 
       00000001 ffffc041 c07cefb4 f6ee8c00 00000000 00000000 f89480f9 ffff0001 
Call Trace:
 [<c0466ece>] vmalloc_to_page+0x36/0x5c
 [<f8946f51>] vmap_to_dma_addr+0x8/0x1e [linuxdvb]
 [<f8947a71>] __end_IWrDebiComPara+0x7/0x42 [linuxdvb]
 [<f8947c09>] Rps1Paket.seiteOk+0x5/0x9 [linuxdvb]
 [<f8947d94>] StartTransAktion.tLoop+0xd/0x2b [linuxdvb]
 [<f89480f9>] DebiIntFkt.p1Ist0+0x7/0x8 [linuxdvb]
 [<f89443f7>] dvb_irq+0xc1/0x167 [linuxdvb]
 [<c0455842>] handle_IRQ_event+0x1a/0x3f
 [<c0456a5f>] handle_fasteoi_irq+0x72/0xa6
 [<c04569ed>] handle_fasteoi_irq+0x0/0xa6
 [<c04071f7>] do_IRQ+0xac/0xd1
 [<c040592b>] common_interrupt+0x23/0x28
 [<c0467af2>] unmap_vmas+0x4d7/0x4ff
 [<c046a6bf>] unmap_region+0x8f/0xf8
 [<c046b0ac>] do_munmap+0x15a/0x1ac
 [<c046b12e>] sys_munmap+0x30/0x3e
 [<c0404f8e>] syscall_call+0x7/0xb
 =======================
Code: c3 89 e0 25 00 f0 ff ff ff 40 14 64 a1 08 30 7a c0 6b c0 1b 8b 0d b0 c2 7f c0 8d 34 10 8d 04 b5 00 00 00 00 29 c1 83 3 
EIP: [<c041f971>] kmap_atomic_prot+0x31/0x80 SS:ESP 0068:c07cef54
Kernel panic - not syncing: Fatal exception in interrupt
Comment 8 Natalie Protasevich 2008-02-07 00:18:10 UTC
Any update on this bug, please? Is it still happening with recent kernels?
Thanks.
Comment 9 Peter Müller 2008-12-26 11:42:33 UTC
Hi,

After a long time I updated to a newer kernel version, 2.6.27.9-159.fc10.i686. With this kernel version I can't reproduce the bug. Seems to be fixed for me :-)

With best regards
peter
Comment 10 Alan 2009-03-23 11:32:38 UTC
Thanks for confirming it is fixed.