Bug 59901 - OOM deadlock
Summary: OOM deadlock
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: All
OS: Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-19 08:03 UTC by Bart Van Assche
Modified: 2013-08-27 09:30 UTC
CC List: 1 user

See Also:
Kernel Version: v3.10-rc6
Tree: Mainline
Regression: No


Attachments
Kernel config (82.66 KB, text/plain)
2013-06-19 13:04 UTC, Bart Van Assche

Description Bart Van Assche 2013-06-19 08:03:23 UTC
Apparently, with kernel v3.10-rc6, triggering an OOM condition enough times causes the mm subsystem to lock up (I haven't tried any other kernel versions yet). Leaving the following script running for a few minutes is sufficient to trigger the lockup:

#!/bin/bash
swapoff -a
while true; do
    free="$(free | sed -n 's|.*buffers/cache:[[:blank:]]*[^[:blank:]]*[[:blank:]]*||p')"
    for ((i=free-15000;i<free;i+=1000)); do
        echo "size $i KB ..."
        dd if=/dev/zero of=/dev/null bs=${i}K count=1
    done
done
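For reference, the sed expression above pulls the "free" column out of the "-/+ buffers/cache:" line that older procps versions of free(1) printed. A minimal sketch of the same extraction against a sample line (the sample numbers are made up; only the sed expression comes from the script):

```shell
# A sample "-/+ buffers/cache:" line in the older free(1) output
# format (hypothetical numbers; the real script parses live output).
line='-/+ buffers/cache:    4404920    3770340'
# Delete everything up to and including the "used" column, so only
# the trailing "free" column (in KB) is printed.
echo "$line" | sed -n 's|.*buffers/cache:[[:blank:]]*[^[:blank:]]*[[:blank:]]*||p'
```

With that value in hand, the loop sizes dd's buffer from roughly 15 MB below the free amount up to the full free amount, so each dd allocation pushes the system to the edge of OOM.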

When the lockup occurs sometimes the following kernel messages are generated:

[  840.360097] INFO: task rtkit-daemon:1538 blocked for more than 120 seconds.  
[  840.360117] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
[  840.360122] rtkit-daemon    D ffffffffffffffff     0  1538      1 0x00000000 
[  840.360129]  ffff880029d19dc0 0000000000000046 ffff880029d10000 ffff880029d19
[  840.360136]  ffff880029d19fd8 ffff880029d19fd8 ffff880036dc0000 ffff880029d10
[  840.360142]  ffff880029d10000 ffff88003bc347f0 ffff88003bc347f8 ffff88003bc34
[  840.360149] Call Trace:                                                      
[  840.360158]  [<ffffffff8152b9e9>] schedule+0x29/0x70                         
[  840.360162]  [<ffffffff8152c415>] rwsem_down_write_failed+0xe5/0x190         
[  840.360167]  [<ffffffff8108046d>] ? sched_clock_local+0x1d/0x90              
[  840.360173]  [<ffffffff81245d23>] call_rwsem_down_write_failed+0x13/0x20     
[  840.360177]  [<ffffffff8152a1aa>] ? down_write+0x9a/0xc0                     
[  840.360182]  [<ffffffff811330e4>] ? vm_mmap_pgoff+0x54/0xb0
[  840.360186]  [<ffffffff811330e4>] vm_mmap_pgoff+0x54/0xb0
[  840.360190]  [<ffffffff8114561b>] SyS_mmap_pgoff+0x5b/0x280
[  840.360194]  [<ffffffff81245dbe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  840.360198]  [<ffffffff81006882>] SyS_mmap+0x22/0x30
[  840.360208]  [<ffffffff815367d9>] system_call_fastpath+0x16/0x1b
[  840.360211] 1 lock held by rtkit-daemon/1538:
[  840.360213]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff811330e4>] vm_mmap_p
Comment 1 Bart Van Assche 2013-06-19 13:04:49 UTC
Created attachment 105381 [details]
Kernel config
Comment 2 Andrew Morton 2013-06-27 21:32:23 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 19 Jun 2013 08:03:23 +0000 (UTC) bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=59901
> 
>            Summary: OOM deadlock
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: v3.10-rc6
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: akpm@linux-foundation.org
>         ReportedBy: bvanassche@acm.org
>         Regression: No
Comment 3 Sergey Dyasly 2013-07-02 10:23:18 UTC
I was able to reproduce this bug in QEMU (kvm disabled) with a 3.10 kernel
(config from the attachment) and the following parameters: "-smp 2 -m 128M".
I used both the script from Bart Van Assche and my own OOM killer test;
the results are the same.

I don't get any hung-task messages, but the system locks up.

Looking at the gdb backtraces I always see the prune_super shrinkers at work.
The root FS type is ext3, if that matters.
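For anyone else trying to reproduce, the invocation would look roughly like the following; only "-smp 2 -m 128M" comes from this comment, while the kernel image, root filesystem, and console arguments are placeholders to substitute for your own setup:

```shell
# Hypothetical QEMU invocation; only "-smp 2 -m 128M" is from the
# report, everything else is a placeholder for a typical test setup.
qemu-system-x86_64 \
    -smp 2 -m 128M \
    -kernel bzImage \
    -hda rootfs-ext3.img \
    -append "root=/dev/sda console=ttyS0" \
    -nographic
```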

Example gdb backtrace:

(gdb) thr apply all bt

Thread 2 (Thread 2):
#0  get_current () at /home/dserrg/git/mainline/linux-git/arch/x86/include/asm/current.h:14
#1  lock_acquired (lock=0xffffffff81a5f1f8, ip=18446744071580237963) at kernel/lockdep.c:3826
#2  0xffffffff8155f4c2 in __raw_spin_lock (lock=0xffffffff81a5f1e0) at include/linux/spinlock_api_smp.h:144
#3  _raw_spin_lock (lock=0xffffffff81a5f1e0) at kernel/spinlock.c:137
#4  0xffffffff8115408b in spin_lock (lock=0xffffffff81a5f1e0) at include/linux/spinlock.h:285
#5  grab_super_passive (sb=0xffff8800061cd000) at fs/super.c:371
#6  0xffffffff81154152 in prune_super (shrink=0xffff8800061cd830, sc=0xffff8800063ad9f8) at fs/super.c:71
#7  0xffffffff8110d929 in do_shrinker_shrink (nr_to_scan=0, sc=0xffff8800063ad9f8, 
    shrinker=0xffff8800061cd830) at mm/vmscan.c:185
#8  shrink_slab (shrink=<optimized out>, nr_pages_scanned=32, lru_pages=12) at mm/vmscan.c:234
#9  0xffffffff811101f3 in do_try_to_free_pages (zonelist=0xffff880007ff9e00, sc=0xffff8800063ad9b0, 
    shrink=0xffff8800063ad9f8) at mm/vmscan.c:2195
#10 0xffffffff811105b4 in try_to_free_pages (zonelist=0xffff880007ff9e00, order=0, gfp_mask=<optimized out>, 
    nodemask=<optimized out>) at mm/vmscan.c:2389
#11 0xffffffff81105f78 in __perform_reclaim (nodemask=0x0, zonelist=0xffff880007ff9e00, order=0, 
    gfp_mask=131546) at mm/page_alloc.c:2251
#12 __alloc_pages_direct_reclaim (did_some_progress=0xffff8800063adb48, migratetype=2, 
    preferred_zone=0xffff880007ff8780, alloc_flags=<optimized out>, nodemask=0x0, high_zoneidx=ZONE_MOVABLE, 
    zonelist=0xffff880007ff9e00, order=0, gfp_mask=131546) at mm/page_alloc.c:2272
#13 __alloc_pages_slowpath (migratetype=2, preferred_zone=0xffff880007ff8780, nodemask=0x0, 
    high_zoneidx=ZONE_MOVABLE, zonelist=0xffff880007ff9e00, order=0, gfp_mask=131546) at mm/page_alloc.c:2512
#14 __alloc_pages_nodemask (gfp_mask=<optimized out>, order=0, zonelist=0xffff880007ff9e00, 
    nodemask=<optimized out>) at mm/page_alloc.c:2660
#15 0xffffffff8113d4b4 in alloc_pages_current (gfp=131546, order=0) at mm/mempolicy.c:2053
#16 0xffffffff810fd15f in alloc_pages (order=0, gfp_mask=131546) at include/linux/gfp.h:334
#17 __page_cache_alloc (gfp=131546) at mm/filemap.c:528
#18 0xffffffff810ff510 in page_cache_alloc_cold (x=0xffff880006676598) at include/linux/pagemap.h:235
#19 page_cache_read (offset=<optimized out>, file=0xffff88000633be00) at mm/filemap.c:1510
#20 filemap_fault (vma=0xffff8800062d9ac8, vmf=0xffff8800063adcc8) at mm/filemap.c:1685
#21 0xffffffff81120846 in __do_fault (mm=0xffff880006226f80, vma=0xffff8800062d9ac8, address=4299363, 
    pmd=0xffff880006334010, pgoff=25, flags=<optimized out>, orig_pte=...) at mm/memory.c:3329
#22 0xffffffff8112353d in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, 
    address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>)
    at mm/memory.c:3515
#23 handle_pte_fault (mm=0xffff880006226f80, vma=0xffff8800062d9ac8, address=4299363, pte=0xffff880005c260c8, 
    pmd=0xffff880006334010, flags=<optimized out>) at mm/memory.c:3714
#24 0xffffffff81123e6f in handle_mm_fault (mm=0xffff880006226f80, vma=0xffff8800062d9ac8, address=4299363, 
    flags=<optimized out>) at mm/memory.c:3847
#25 0xffffffff81563ff9 in __do_page_fault (regs=0xffff8800063adf58, error_code=20) at arch/x86/mm/fault.c:1188
#26 0xffffffff8156406e in do_page_fault (regs=<optimized out>, error_code=<optimized out>)
    at arch/x86/mm/fault.c:1230
#27 <signal handler called>
#28 0x0000000000419a63 in ?? ()
#29 0x0000000000000000 in ?? ()

Thread 1 (Thread 1):
#0  __spin_lock_debug (lock=0xffffffff81a5f1e0) at lib/spinlock_debug.c:112
#1  do_raw_spin_lock (lock=0xffffffff81a5f1e0) at lib/spinlock_debug.c:137
#2  0xffffffff8155f4b6 in __raw_spin_lock (lock=0xffffffff81a5f1e0) at include/linux/spinlock_api_smp.h:144
#3  _raw_spin_lock (lock=<optimized out>) at kernel/spinlock.c:137
#4  0xffffffff8115303d in spin_lock (lock=0xffffffff81a5f1e0) at include/linux/spinlock.h:285
#5  put_super (sb=0xffff8800061cc000) at fs/super.c:277
#6  0xffffffff81153132 in drop_super (sb=0xffff8800061cc000) at fs/super.c:504
#7  0xffffffff81154262 in prune_super (shrink=0xffff8800061cc830, sc=<optimized out>) at fs/super.c:108
#8  0xffffffff8110d929 in do_shrinker_shrink (nr_to_scan=0, sc=0xffff880006f817f8, 
    shrinker=0xffff8800061cc830) at mm/vmscan.c:185
#9  shrink_slab (shrink=<optimized out>, nr_pages_scanned=32, lru_pages=12) at mm/vmscan.c:234
#10 0xffffffff811101f3 in do_try_to_free_pages (zonelist=0xffff880007ff9e00, sc=0xffff880006f817b0, 
    shrink=0xffff880006f817f8) at mm/vmscan.c:2195
#11 0xffffffff811105b4 in try_to_free_pages (zonelist=0xffff880007ff9e00, order=0, gfp_mask=<optimized out>, 
    nodemask=<optimized out>) at mm/vmscan.c:2389
#12 0xffffffff81105f78 in __perform_reclaim (nodemask=0x0, zonelist=0xffff880007ff9e00, order=0, gfp_mask=16)
    at mm/page_alloc.c:2251
#13 __alloc_pages_direct_reclaim (did_some_progress=0xffff880006f81948, migratetype=0, 
    preferred_zone=0xffff880007ff8780, alloc_flags=<optimized out>, nodemask=0x0, high_zoneidx=ZONE_NORMAL, 
    zonelist=0xffff880007ff9e00, order=0, gfp_mask=16) at mm/page_alloc.c:2272
#14 __alloc_pages_slowpath (migratetype=0, preferred_zone=0xffff880007ff8780, nodemask=0x0, 
    high_zoneidx=ZONE_NORMAL, zonelist=0xffff880007ff9e00, order=0, gfp_mask=16) at mm/page_alloc.c:2512
#15 __alloc_pages_nodemask (gfp_mask=<optimized out>, order=0, zonelist=0xffff880007ff9e00, 
    nodemask=<optimized out>) at mm/page_alloc.c:2660
#16 0xffffffff8113d4b4 in alloc_pages_current (gfp=16, order=0) at mm/mempolicy.c:2053
#17 0xffffffff81186fe8 in alloc_pages (order=0, gfp_mask=<optimized out>) at include/linux/gfp.h:334
#18 bio_copy_user_iov (q=0xffff8800061d0c20, map_data=0x0, iov=0xffff880006f81a88, iov_count=1, 
    write_to_vm=1, gfp_mask=16) at fs/bio.c:1146
#19 0xffffffff81187205 in bio_copy_user (q=<optimized out>, map_data=<optimized out>, uaddr=<optimized out>, 
    len=<optimized out>, write_to_vm=<optimized out>, gfp_mask=<optimized out>) at fs/bio.c:1208
#20 0xffffffff81187241 in bio_copy_kern (q=<optimized out>, data=0xffff880006f81c28, len=<optimized out>, 
    gfp_mask=<optimized out>, reading=1) at fs/bio.c:1530
#21 0xffffffff8127adb3 in blk_rq_map_kern (q=0xffff8800061d0c20, rq=0xffff880005cf0178, kbuf=<optimized out>, 
    len=<optimized out>, gfp_mask=<optimized out>) at block/blk-map.c:307
#22 0xffffffff81380819 in scsi_execute (sdev=0xffff8800061c9000, cmd=0xffff880006f81c16 "J\001", 
    data_direction=<optimized out>, buffer=0xffff880006f81c28, bufflen=8, sense=0xffff880004995360 "", 
    timeout=3000, retries=3, flags=0, resid=0x0) at drivers/scsi/scsi_lib.c:242
#23 0xffffffff813810cb in scsi_execute_req_flags (sdev=0xffff8800061c9000, cmd=0xffff880006f81c16 "J\001", 
    data_direction=2, buffer=0xffff880006f81c28, bufflen=<optimized out>, sshdr=0xffff880006f81c30, 
    timeout=3000, retries=3, resid=0x0, flags=0) at drivers/scsi/scsi_lib.c:292
#24 0xffffffff8138ff47 in scsi_execute_req (resid=0x0, bufflen=8, buffer=0xffff880006f81c28, 
    cmd=0xffff880006f81c16 "J\001", retries=3, timeout=3000, sshdr=0xffff880006f81c30, data_direction=2, 
    sdev=<optimized out>) at include/scsi/scsi_device.h:406
#25 sr_get_events (sdev=<optimized out>) at drivers/scsi/sr.c:204
#26 sr_check_events (cdi=<optimized out>, clearing=0, slot=<optimized out>) at drivers/scsi/sr.c:242
#27 0xffffffff813cdd8c in cdrom_update_events (clearing=<optimized out>, cdi=0xffff880006f85578)
    at drivers/cdrom/cdrom.c:1415
#28 cdrom_check_events (cdi=0xffff880006f85578, clearing=<optimized out>) at drivers/cdrom/cdrom.c:1425
#29 0xffffffff81390399 in sr_block_check_events (disk=<optimized out>, clearing=<optimized out>)
    at drivers/scsi/sr.c:606
#30 0xffffffff8127e93b in disk_check_events (ev=<optimized out>, clearing_ptr=<optimized out>)
    at block/genhd.c:1619
#31 0xffffffff8127ea4c in disk_events_workfn (work=<optimized out>) at block/genhd.c:1605
#32 0xffffffff8105c902 in process_one_work (worker=0xffff880006f25f80, work=0xffff880006192b20)
    at kernel/workqueue.c:2173
#33 0xffffffff8105d9a5 in worker_thread (__worker=0xffff880006f25f80) at kernel/workqueue.c:2288
#34 0xffffffff81064666 in kthread (_create=0xffff880006cbbc30) at kernel/kthread.c:200
#35 <signal handler called>
#36 0x0000000000000000 in ?? ()
#37 0x0000000000000000 in ?? ()
Comment 4 Bart Van Assche 2013-08-27 09:30:22 UTC
So far I haven't been able to reproduce this with kernel 3.11-rc7.
