Bug 15680 - kswapd NULL pointer dereference
Summary: kswapd NULL pointer dereference
Status: RESOLVED UNREPRODUCIBLE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-02 13:01 UTC by Steinar H. Gunderson
Modified: 2012-06-18 16:30 UTC

See Also:
Kernel Version: 2.6.34-rc2
Subsystem:
Regression: No
Bisected commit-id:



Description Steinar H. Gunderson 2010-04-02 13:01:00 UTC
Hi,

My server, running 2.6.34-rc2 for the occasion, suddenly came under a _ton_ more load than it usually sees (including a few VLC processes that, due to some bug, gobble up several hundred GB of address space; that might be related), and then gave me:

[584163.116507] BUG: unable to handle kernel NULL pointer dereference at (null)
[584163.117456] IP: [<ffffffff810af4ea>] page_referenced+0xef/0x1d5
[584163.117456] PGD 0 
[584163.117456] Oops: 0000 [#1] SMP 
[584163.117456] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[584163.117456] CPU 0 
[584163.117456] Modules linked in: ipt_REJECT iptable_filter ip_tables af_packet tun ext2 ext4 jbd2 crc16 coretemp w83627ehf hwmon_vid psmouse ide_generic ide_gd_mod ide_cd_mod cdrom forcedeth i2c_i801 pcspkr i2c_core rtc_cmos rtc_core rtc_lib evdev ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod usbhid raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 md_mod ide_pci_generic ide_core e1000e uhci_hcd ehci_hcd sd_mod unix [last unloaded: scsi_wait_scan]
[584163.117456] 
[584163.117456] Pid: 320, comm: kswapd0 Not tainted 2.6.34-rc2 #2 C2SBC-Q/C2SBC-Q
[584163.117456] RIP: 0010:[<ffffffff810af4ea>]  [<ffffffff810af4ea>] page_referenced+0xef/0x1d5
[584163.117456] RSP: 0018:ffff88023fe6dc20  EFLAGS: 00010206
[584163.117456] RAX: ffff880169111fc8 RBX: ffffffffffffffe0 RCX: ffff8801f3fa6080
[584163.117456] RDX: ffff880169111fc1 RSI: 0000000000000000 RDI: ffff880169111fc0
[584163.117456] RBP: ffff88023fe6dca0 R08: 0000000000000020 R09: ffff880215c390c0
[584163.117456] R10: ffffffff814afee8 R11: ffffffff814afde8 R12: ffffea0004979968
[584163.117456] R13: 0000000000000000 R14: ffff880169111fc0 R15: ffff88023fe6dd40
[584163.117456] FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
[584163.117456] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[584163.117456] CR2: 0000000000000000 CR3: 00000000014ee000 CR4: 00000000000006f0
[584163.117456] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[584163.117456] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[584163.117456] Process kswapd0 (pid: 320, threadinfo ffff88023fe6c000, task ffff88023fe34380)
[584163.117456] Stack:
[584163.117456]  0000000000000001 ffff880169111fc8 ffff88023402c240 0000000000002000
[584163.117456] <0> 0000000000000000 0000000000000000 ffff88023a940538 00000000000011e2
[584163.117456] <0> 00000000000011e2 00000014fffffffe ffff88023fe6dca0 ffffea0004979968
[584163.117456] Call Trace:
[584163.117456]  [<ffffffff8109a511>] shrink_active_list+0x1be/0x289
[584163.117456]  [<ffffffff81306087>] ? schedule+0x7a0/0x867
[584163.117456]  [<ffffffff8109bbfc>] kswapd+0x41d/0x865
[584163.117456]  [<ffffffff8109972e>] ? isolate_pages_global+0x0/0x1f2
[584163.117456]  [<ffffffff8104e6ae>] ? autoremove_wake_function+0x0/0x38
[584163.117456]  [<ffffffff8109b7df>] ? kswapd+0x0/0x865
[584163.117456]  [<ffffffff8104e252>] kthread+0x7d/0x85
[584163.117456]  [<ffffffff81002cd4>] kernel_thread_helper+0x4/0x10
[584163.117456]  [<ffffffff8104e1d5>] ? kthread+0x0/0x85
[584163.117456]  [<ffffffff81002cd0>] ? kernel_thread_helper+0x0/0x10
[584163.117456] Code: 3b 56 10 73 1e 48 83 fa f2 74 18 4d 89 f8 48 8d 4d cc 4c 89 e7 e8 44 f2 ff ff 41 01 c5 83 7d cc 00 74 19 48 8b 43 20 48 8d 58 e0 <48> 8b 43 20 0f 18 08 48 8d 43 20 48 39 45 88 75 a7 41 fe 06 e9 
[584163.117456] RIP  [<ffffffff810af4ea>] page_referenced+0xef/0x1d5
[584163.117456]  RSP <ffff88023fe6dc20>
[584163.117456] CR2: 0000000000000000
[584163.418575] ---[ end trace 689f7702fb2ed439 ]---

I haven't seen it before, and it's only happened once so far.
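
As a reading aid for the trace above (not part of the original report): the bytes in the Code: line just before and at the faulting instruction appear to decode as `mov 0x20(%rbx),%rax`, `lea -0x20(%rax),%rbx`, and then the marked `mov 0x20(%rbx),%rax` again. If that decoding is right, RBX = ffffffffffffffe0 is simply 0 minus 0x20, and the faulting load touches RBX + 0x20 = 0, which matches CR2. That pattern would be consistent with a linked-list walk that picked up a NULL next pointer, though the trace alone does not prove it. A minimal Python sketch of the arithmetic, where the helper name `effective_address` is hypothetical and the 0x20 displacement is taken from the assumed decoding:

```python
# Sketch only: checks that RBX + displacement wraps to the faulting address.
# The register values are copied from the oops; the 0x20 displacement comes
# from an assumed decoding of the Code: bytes, not from the kernel source.

MASK64 = (1 << 64) - 1  # x86-64 address arithmetic wraps at 64 bits


def effective_address(base: int, disp: int) -> int:
    """Return base + disp as an unsigned 64-bit address."""
    return (base + disp) & MASK64


rbx = 0xFFFFFFFFFFFFFFE0  # RBX from the register dump
cr2 = 0x0000000000000000  # CR2: the address that faulted

fault_addr = effective_address(rbx, 0x20)
print(hex(fault_addr), fault_addr == cr2)
```

The same check can be applied to any oops where the faulting instruction is a simple base-plus-displacement load: substitute the base register value and the displacement read from the decoded instruction.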
Comment 1 Alex Montana 2011-03-03 11:16:28 UTC
Hi,

I've got almost the same issue with the 2.6.35.7 kernel.
The machine simply died with no warning and no load, and after the reboot I found this:

//=====================================================
Mar  3 03:49:30  kernel: BUG: unable to handle kernel NULL pointer dereference at (nil)
Mar  3 03:49:30  kernel: IP: [<ffffffff811939ac>] 
Mar  3 03:49:30  kernel: PGD 205131067 PUD 141261067 PMD 0 
Mar  3 03:49:30  kernel: Oops: 0000 [#1] SMP 
Mar  3 03:49:30  kernel: last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Mar  3 03:49:30  kernel: CPU 0 
Mar  3 03:49:30  kernel: Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_recent xt_owner xt_conntrack iptable_mangle ipt_REJECT ipt_LOG xt_limit xt_multiport xt_state iptable_filter ip_tables ipv6 dm_mirror dm_multipath dm_region_hash dm_log shpchp ohci_hcd
Mar  3 03:49:30  kernel: 
Mar  3 03:49:30  kernel: Pid: 500, comm: kswapd0 Not tainted 2.6.35.7-grsec #1 X8DTL/X8DTL
Mar  3 03:49:30  kernel: RIP: 0010:[<ffffffff811939ac>]  [<ffffffff811939ac>] 
Mar  3 03:49:30  kernel: RSP: 0018:ffff88023e863d30  EFLAGS: 00010297
Mar  3 03:49:30  kernel: RAX: ffff880074775670 RBX: ffffffffffffff18 RCX: ffff880074775660
Mar  3 03:49:30  kernel: RDX: ffff880074775670 RSI: ffff880074775658 RDI: ffff88007477579c
Mar  3 03:49:30  kernel: RBP: ffff88023e863d70 R08: 0000000000000000 R09: 0000000000000000
Mar  3 03:49:30  kernel: R10: 0000000000000000 R11: 00000000ffffff02 R12: ffff880074775670
Mar  3 03:49:30  kernel: R13: ffff8800747756f0 R14: ffff880074775660 R15: 0000000000000012
Mar  3 03:49:30  kernel: FS:  0000000000000000(0000) GS:ffff880001c00000(0000) knlGS:0000000000000000
Mar  3 03:49:30  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar  3 03:49:30  kernel: CR2: 0000000000000000 CR3: 0000000001841000 CR4: 00000000000006f0
Mar  3 03:49:30  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar  3 03:49:30  kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar  3 03:49:30  kernel: Process kswapd0 (pid: 500, threadinfo ffff88023e862000, task ffff88023f459900)
Mar  3 03:49:30  kernel: Stack:
Mar  3 03:49:30  kernel:  ffff8801d805e298 ffff8801d805e298 ffff88023e863d40 0000000000004bc8
Mar  3 03:49:30  kernel: <0> ffffffff818dffe0 0000000000000081 00000000000000d0 0000000000000000
Mar  3 03:49:30  kernel: <0> ffff88023e863dc0 ffffffff8109bcf5 ffff88023e863dc0 000000000018ecb6
Mar  3 03:49:30  kernel: Call Trace:
Mar  3 03:49:30  kernel:  [<ffffffff8109bcf5>] 
Mar  3 03:49:30  kernel:  [<ffffffff8109c780>] 
Mar  3 03:49:30  kernel:  [<ffffffff81055dc0>] ? 
Mar  3 03:49:30  kernel:  [<ffffffff8109c3b0>] ? 
Mar  3 03:49:30  kernel:  [<ffffffff810559ee>] 
Mar  3 03:49:30  kernel:  [<ffffffff81003b94>] 
Mar  3 03:49:30  kernel:  [<ffffffff81492911>] ? 
Mar  3 03:49:30  kernel:  [<ffffffff81055960>] ? 
Mar  3 03:49:30  kernel:  [<ffffffff81003b90>] ? 
Mar  3 03:49:30  kernel: Code: 89 10 4d 89 64 24 08 4c 89 a3 e8 00 00 00 f0 80 a3 90 00 00 00 fb 41 fe 85 ac 00 00 00 48 8b 9b e8 00 00 00 48 81 eb e8 00 00 00 <48> 8b 83 e8 00 00 00 4c 8d a3 e8 00 00 00 0f 18 08 49 81 fc d0 
Mar  3 03:49:30  kernel: RIP  [<ffffffff811939ac>] 
Mar  3 03:49:30  kernel:  RSP <ffff88023e863d30>
Mar  3 03:49:30  kernel: CR2: 0000000000000000
Mar  3 03:49:30  kernel: ---[ end trace 09c41ba9fa71fb72 ]---
Mar  3 03:49:31  kernel: swap_free: Bad swap offset entry 00800000
Mar  3 03:49:31  kernel: BUG: Bad page map in process httpd  pte:100000000 pmd:74775067
Mar  3 03:49:31 cx96 kernel: addr:0000000002af3000 vm_flags:00100073 anon_vma:ffff88006e09cc00 mapping:(nil) index:2af3
Mar  3 03:49:31 kernel: Pid: 3970, comm: httpd Tainted: G      D     2.6.35.7-grsec #1
Mar  3 03:49:31 kernel: Call Trace:
Mar  3 03:49:31 kernel:  [<ffffffff810a492b>] 
Mar  3 03:49:31 kernel:  [<ffffffff810a5bb5>] 
Mar  3 03:49:31 kernel:  [<ffffffff810aacc2>] 
Mar  3 03:49:31 kernel:  [<ffffffff810396f6>] 
Mar  3 03:49:31 kernel:  [<ffffffff810cb516>] 
Mar  3 03:49:31 kernel:  [<ffffffff8110de18>] 
Mar  3 03:49:31 kernel:  [<ffffffff81095336>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff8102a7db>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810989e4>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810afc22>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810989e4>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810a6afe>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810a5442>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810a6e48>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff811cb915>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff81203d61>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff8110b380>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810a7184>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810ca08f>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810bdfe9>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810ca009>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810ca425>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff810ca644>] 
Mar  3 03:49:31 kernel:  [<ffffffff810cc58e>] 
Mar  3 03:49:31 kernel:  [<ffffffff81207f3a>] ? 
Mar  3 03:49:31 kernel:  [<ffffffff8100b4c9>] 
Mar  3 03:49:31 kernel:  [<ffffffff810031ea>] 
//=====================================================

It appears to happen intermittently and I can't find a cause (this error has happened twice in the last 10 days).
Comment 2 Alan 2012-06-18 16:30:09 UTC
Not much we can do with this data against such an old kernel, alas, so closing.
