Bug 15552

Summary: nfs is blocking processes
Product: File System Reporter: Andreas Radke (andyrtr)
Component: NFSAssignee: Trond Myklebust (trondmy)
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: Tree: Mainline
Regression: Yes
Attachments: NFS: Avoid a deadlock in nfs_release_page

Description Andreas Radke 2010-03-17 05:38:56 UTC
Since various process lockups occour like this:

INFO: task crond:3509 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
crond         D ffff880008c0f948     0  3509      1 0x00000000
 ffff880221ae3498 0000000000000086 ffff880221ae3460 ffff880221ae345c
 0000000000000000 ffff880008c0f948 ffff880008d152c0 ffff88022504df70
 00000001002ef773 ffff880221ae3fd8 ffff880221ae2000 ffff880221ae2000
Call Trace:
 [<ffffffffa0554290>] ? nfs_wait_bit_uninterruptible+0x0/0x10 [nfs]
 [<ffffffff8135970e>] io_schedule+0x6e/0xb0
 [<ffffffffa0554299>] nfs_wait_bit_uninterruptible+0x9/0x10 [nfs]
 [<ffffffff81359e77>] __wait_on_bit+0x57/0x80
 [<ffffffffa0554290>] ? nfs_wait_bit_uninterruptible+0x0/0x10 [nfs]
 [<ffffffff81359f13>] out_of_line_wait_on_bit+0x73/0x90
 [<ffffffff81073090>] ? wake_bit_function+0x0/0x40
 [<ffffffffa0554287>] nfs_wait_on_request+0x27/0x30 [nfs]
 [<ffffffffa05593d0>] nfs_sync_mapping_wait+0x120/0x280 [nfs]
 [<ffffffffa0559ba9>] nfs_wb_page+0x79/0xd0 [nfs]
 [<ffffffffa05488bf>] nfs_release_page+0x5f/0x80 [nfs]
 [<ffffffff810dabfb>] try_to_release_page+0x2b/0x50
 [<ffffffff810e8af3>] shrink_page_list+0x453/0x5f0
 [<ffffffff810e8f59>] shrink_inactive_list+0x2c9/0x770
 [<ffffffff810088ee>] ? __switch_to+0x29e/0x360
 [<ffffffff810e3e45>] ? determine_dirtyable_memory+0x15/0x30
 [<ffffffff810e3ee2>] ? get_dirty_limits+0x22/0x2e0
 [<ffffffff810e9671>] shrink_zone+0x271/0x460
 [<ffffffff810e9a2a>] ? shrink_slab+0x13a/0x170
 [<ffffffff810ea47e>] do_try_to_free_pages+0xde/0x420
 [<ffffffff810ea94b>] try_to_free_pages+0x6b/0x70
 [<ffffffff810e7df0>] ? isolate_pages_global+0x0/0x270
 [<ffffffff810e2b6d>] __alloc_pages_nodemask+0x41d/0x700
 [<ffffffff810e4c90>] __do_page_cache_readahead+0x110/0x280
 [<ffffffff810e4e1c>] ra_submit+0x1c/0x20
 [<ffffffff810dc7df>] filemap_fault+0x37f/0x440
 [<ffffffff810f625f>] __do_fault+0x4f/0x590
 [<ffffffff8107735c>] ? lock_hrtimer_base+0x2c/0x60
 [<ffffffff810f8a0b>] handle_mm_fault+0x19b/0xa20
 [<ffffffff8135af60>] ? do_nanosleep+0x40/0xd0
 [<ffffffff81077a5f>] ? hrtimer_nanosleep+0xbf/0x180
 [<ffffffff81031683>] do_page_fault+0x133/0x320
 [<ffffffff8135c965>] page_fault+0x25/0x30

I can almost always reproduce the lockup when using tab completion in bash on a nfs mount. The nfs server is running latest I haven't seen it in initial 2.6.33 release but did not use it for long so maybe I only missed it there. I'll try to reproduce it there. Maybe it's realted to the stable fix:
Trond Myklebust (2):
      SUNRPC: Handle EINVAL error returns from the TCP connect operation
      NFS: Fix an allocation-under-spinlock bug

fs/nfs/dns_resolve.c                    |   18 
fs/nfsd/nfs4state.c                     |    2 

I'll try to reproduce it in 2.6.33
Comment 1 Trond Myklebust 2010-03-17 13:12:27 UTC
The above just means that the thread is waiting for another write request to complete. In order to figure out why it is hanging, you need to look at the other threads to figure out which one is actually responsible for doing the writeout and see why it is hanging.

That said, there is a known problem with a deadlock in shrink_page_list() and I've got a patch queued up to fix it. Could you perhaps give it a try, and see if it fixes your hang.
Comment 2 Trond Myklebust 2010-03-17 13:13:30 UTC
Created attachment 25567 [details]
NFS: Avoid a deadlock in nfs_release_page
Comment 3 kernel-bug-tracker 2010-03-22 14:56:33 UTC
I may be seeing a similar error.  When copying large amounts (hundreds of GBs) of data over NFS the copying process (rsync, tar, etc.) will get into a D state and then flush and kswapd enter a D state.  Finally any process writing to disk will block.

This occurs with the Debian 2.6.33-1~experimental ( kernel and the Debian 2.6.32-10 ( equivalent) kernels.

I applied the provided patch to the Debian kernel and it did not solve the problem.  Below is the call trace from the blocked process.  This did not kill the process.

INFO: task rsync:17708 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rsync         D 0000000000000002     0 17708  17707 0x00000000
 ffff880c7cfb6200 0000000000000086 0000000000000000 ffff88044b409334
 0000000000000286 000000000000f8e0 ffff88044b409fd8 0000000000015680
 0000000000015680 ffff88067bc6e900 ffff88067bc6ebf0 0000000181066af9
Call Trace:
 [<ffffffff8100f789>] ? read_tsc+0xa/0x20
 [<ffffffff8109411e>] ? delayacct_end+0x74/0x7f
 [<ffffffffa03c3c04>] ? nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
 [<ffffffff812ed1cb>] ? io_schedule+0x73/0xb7
 [<ffffffffa03c3c0d>] ? nfs_wait_bit_uninterruptible+0x9/0xd [nfs]
 [<ffffffff812ed6f6>] ? __wait_on_bit+0x41/0x70
 [<ffffffff81183d3c>] ? __lookup_tag+0xad/0x11b
 [<ffffffffa03c3c04>] ? nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
 [<ffffffff812ed790>] ? out_of_line_wait_on_bit+0x6b/0x77
 [<ffffffff8105f300>] ? wake_bit_function+0x0/0x23
 [<ffffffffa03c7b87>] ? nfs_sync_mapping_wait+0xfa/0x227 [nfs]
 [<ffffffffa03c7d48>] ? nfs_wb_page+0x94/0xc3 [nfs]
 [<ffffffffa03badfc>] ? nfs_release_page+0x3a/0x57 [nfs]
 [<ffffffff810b9382>] ? shrink_page_list+0x48d/0x617
 [<ffffffff8103a9a1>] ? pick_next_task_fair+0xca/0xd5
 [<ffffffff810b843d>] ? isolate_pages_global+0x1a0/0x20f
 [<ffffffff810b9ee3>] ? shrink_zone+0x710/0xae9
 [<ffffffff810b52ba>] ? __alloc_pages_nodemask+0x2a7/0x5e1
 [<ffffffff81279ab1>] ? tcp_established_options+0x2d/0xa9
 [<ffffffff8103dcac>] ? find_busiest_group+0x3b0/0x875
 [<ffffffff8100f789>] ? read_tsc+0xa/0x20
 [<ffffffff810bb121>] ? do_try_to_free_pages+0x1ce/0x31e
 [<ffffffff810bb396>] ? try_to_free_pages+0x72/0x78
 [<ffffffff810b829d>] ? isolate_pages_global+0x0/0x20f
 [<ffffffff810b53cb>] ? __alloc_pages_nodemask+0x3b8/0x5e1
 [<ffffffff81053aaf>] ? lock_timer_base+0x26/0x4b
 [<ffffffff810e1314>] ? new_slab+0x42/0x1ca
 [<ffffffff810e168c>] ? __slab_alloc+0x1f0/0x3a2
 [<ffffffff810b0a58>] ? mempool_alloc+0x55/0x106
 [<ffffffff810b0a58>] ? mempool_alloc+0x55/0x106
 [<ffffffff810e1d66>] ? kmem_cache_alloc+0x7f/0xf0
 [<ffffffff810b0a58>] ? mempool_alloc+0x55/0x106
 [<ffffffff8105f2d2>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffffa03c840e>] ? nfs_writedata_alloc+0x19/0x98 [nfs]
 [<ffffffffa03c84a1>] ? nfs_flush_one+0x14/0xce [nfs]
 [<ffffffffa03c3abc>] ? nfs_pageio_doio+0x2a/0x51 [nfs]
 [<ffffffffa03c3ba8>] ? nfs_pageio_add_request+0xc5/0xd5 [nfs]
 [<ffffffffa03c6e01>] ? nfs_do_writepage+0xe0/0x103 [nfs]
 [<ffffffffa03c72e7>] ? nfs_writepages_callback+0xf/0x21 [nfs]
 [<ffffffff810b5c38>] ? write_cache_pages+0x1be/0x2a4
 [<ffffffffa03c72d8>] ? nfs_writepages_callback+0x0/0x21 [nfs]
 [<ffffffffa03c7297>] ? nfs_writepages+0xde/0x11f [nfs]
 [<ffffffffa03c848d>] ? nfs_flush_one+0x0/0xce [nfs]
 [<ffffffffa03c8336>] ? nfs_write_mapping+0x55/0x8e [nfs]
 [<ffffffffa03bb273>] ? nfs_do_fsync+0x1c/0x3c [nfs]
 [<ffffffff810e820a>] ? filp_close+0x37/0x62
 [<ffffffff810e82c9>] ? sys_close+0x94/0xcd
 [<ffffffff81008ac2>] ? system_call_fastpath+0x16/0x1b
Comment 4 Trond Myklebust 2010-03-22 15:24:13 UTC
Yes. Looking again at the traces, I'm thinking this might be the same bug as Bugzilla entry 15578. Can you try the proposed patch at


and see if that fixes the hang?
Comment 5 kernel-bug-tracker 2010-03-22 20:23:32 UTC
I've installed the proposed patch and have not had any problems yet.  It seems to take many GBs of NFS writing to cause the problem, so I can't yet say if it has fixed the issue.
Comment 6 Andreas Radke 2010-03-22 20:39:54 UTC
My problem occoured in 2.6.33 and I've tried the patch appended here in this bug. So far it seems to solve my problem.
Comment 7 Trond Myklebust 2010-03-23 21:34:13 UTC
OK. Tentatively marking this bug as a duplicate of 15578 due to the similarity of the reported traces.

*** This bug has been marked as a duplicate of bug 15578 ***