Bug 14739

Summary: BUG: scheduling while atomic: nfsd/3369/0x00000002
Product: File System
Reporter: Krzysztof Oledzki (ole)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: CLOSED OBSOLETE
Severity: normal
CC: alan, bfields, dmonakhov, trondmy
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.31.6
Subsystem:
Regression: No
Bisected commit-id:

Description Krzysztof Oledzki 2009-12-05 11:38:21 UTC
BUG: scheduling while atomic: nfsd/3369/0x00000002
 Pid: 3369, comm: nfsd Tainted: G        W  2.6.31.6-o0 #1
 Call Trace:
  [<ffffffff815a49dd>] ? schedule+0xc9/0x90d
  [<ffffffff815a6f8c>] ? _spin_lock_irqsave+0x18/0x34
  [<ffffffff815a6ddd>] ? __down_read+0x8f/0xa9
  [<ffffffff810d3ad6>] ? dquot_reserve_space+0x33/0x6f
  [<ffffffff8111fe44>] ? ext4_da_get_block_prep+0x139/0x277
  [<ffffffff810b53e5>] ? __block_prepare_write+0x1a8/0x387
  [<ffffffff8111fd0b>] ? ext4_da_get_block_prep+0x0/0x277
  [<ffffffff8106b06e>] ? add_to_page_cache_locked+0xa5/0xc8
  [<ffffffff810b572d>] ? block_write_begin+0x7a/0xc8
  [<ffffffff81122581>] ? ext4_da_write_begin+0x166/0x1ed
  [<ffffffff8111fd0b>] ? ext4_da_get_block_prep+0x0/0x277
  [<ffffffff8106ba58>] ? generic_file_buffered_write+0x130/0x30b
  [<ffffffff8106c112>] ? __generic_file_aio_write_nolock+0x349/0x37d
  [<ffffffff8106c8ea>] ? generic_file_aio_write+0x64/0xc4
  [<ffffffff8111ab46>] ? ext4_file_write+0x93/0x115
  [<ffffffff8111aab3>] ? ext4_file_write+0x0/0x115
  [<ffffffff81095ac5>] ? do_sync_readv_writev+0xc0/0x107
  [<ffffffff814bb047>] ? release_sock+0x19/0xb4
  [<ffffffff8104ce98>] ? autoremove_wake_function+0x0/0x2e
  [<ffffffff81187f59>] ? nfsd_acceptable+0x0/0xd2
  [<ffffffff81095951>] ? rw_copy_check_uvector+0x6d/0xe4
  [<ffffffff81096132>] ? do_readv_writev+0xb2/0x198
  [<ffffffff8118e826>] ? nfsd_setuser+0x1c7/0x221
  [<ffffffff815a5cd4>] ? __mutex_lock_slowpath+0x261/0x287
  [<ffffffff8111aa5d>] ? ext4_file_open+0xb4/0xc3
  [<ffffffff81189b4c>] ? nfsd_vfs_write+0x11d/0x336
  [<ffffffff81094899>] ? __dentry_open+0x115/0x1f0
  [<ffffffff8118a364>] ? nfsd_open+0x171/0x1b2
  [<ffffffff8118a699>] ? nfsd_write+0xc5/0xe2
  [<ffffffff81191379>] ? nfsd3_proc_write+0xeb/0x109
  [<ffffffff811853a9>] ? nfsd_dispatch+0xf2/0x1ce
  [<ffffffff81560b91>] ? svc_process+0x40f/0x715
  [<ffffffff8103742e>] ? default_wake_function+0x0/0x9
  [<ffffffff815a713a>] ? _spin_unlock_irq+0x11/0x2b
  [<ffffffff815a6d82>] ? __down_read+0x34/0xa9
  [<ffffffff81185819>] ? nfsd+0x0/0x129
  [<ffffffff81185900>] ? nfsd+0xe7/0x129
  [<ffffffff8104cba1>] ? kthread+0x8b/0x93
  [<ffffffff8100c94a>] ? child_rip+0xa/0x20
  [<ffffffff812ec464>] ? generic_unplug_device+0x0/0x30
  [<ffffffff8104cb16>] ? kthread+0x0/0x93
  [<ffffffff8100c940>] ? child_rip+0x0/0x20
Comment 1 Trond Myklebust 2009-12-05 17:28:12 UTC
Hmm... This looks to me like it is rather an ext4 bug. Reassigning to
the ext4 folks for the moment.
Comment 2 Aneesh Kumar K.V 2009-12-07 17:09:41 UTC
On Sat, Dec 05, 2009 at 05:28:13PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14739
> 
> 
> Trond Myklebust <trond.myklebust@fys.uio.no> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |trond.myklebust@fys.uio.no
>           Component|NFS                         |ext4
>          AssignedTo|trond.myklebust@fys.uio.no  |fs_ext4@kernel-bugs.osdl.or
>                    |                            |g
> 
> 
> 
> 
> --- Comment #1 from Trond Myklebust <trond.myklebust@fys.uio.no>  2009-12-05
> 17:28:12 ---
> Hmm... This looks to me like it is rather an ext4 bug. Reassigning to
> the ext4 folks for the moment.
> 

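Hmm, we are calling dquot_reserve_space() with i_block_reservation_lock held.
That lock is a spinlock, while the quota code takes dqptr_sem with down_read(),
which may sleep. I.e.: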

spin_lock(&EXT4_I(inode)->i_block_reservation_lock);  
- vfs_dq_reserve_block
  - down_read(&sb_dqopt(inode->i_sb)->dqptr_sem)


Adding Mingming to CC.
-aneesh
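
To make the ordering problem above concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: every toy_* name and both reserve_* functions are hypothetical stand-ins invented for illustration, and the second function shows only one possible reordering, not necessarily what the proposed patch does.

```c
#include <assert.h>
#include <stdio.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
static int toy_preempt_count;  /* > 0 means "atomic context" in this model */
static int toy_atomic_sleeps;  /* sleeping calls made while atomic */

/* Stand-ins for spin_lock()/spin_unlock(): holding one makes us atomic. */
static void toy_spin_lock(void)
{
    toy_preempt_count++;
}

static void toy_spin_unlock(void)
{
    assert(toy_preempt_count > 0);
    toy_preempt_count--;
}

/* Stand-in for down_read(): it may sleep, so it must not run while atomic. */
static void toy_down_read(void)
{
    if (toy_preempt_count > 0) {
        toy_atomic_sleeps++;
        fprintf(stderr, "toy: sleeping call in atomic context (count=%d)\n",
                toy_preempt_count);
    }
}

static void toy_up_read(void)
{
}

/* Ordering described in the comment: quota reservation under the spinlock. */
int reserve_under_spinlock(void)
{
    toy_atomic_sleeps = 0;
    toy_spin_lock();    /* models i_block_reservation_lock */
    toy_down_read();    /* models dqptr_sem taken by the quota code */
    toy_up_read();
    toy_spin_unlock();
    return toy_atomic_sleeps;
}

/* One possible reordering (an assumption): do the sleeping part first. */
int reserve_before_spinlock(void)
{
    toy_atomic_sleeps = 0;
    toy_down_read();
    toy_up_read();
    toy_spin_lock();    /* only non-sleeping bookkeeping happens in here */
    toy_spin_unlock();
    return toy_atomic_sleeps;
}
```

In this model reserve_under_spinlock() counts one sleeping call made while atomic and reserve_before_spinlock() counts none. A real fix also has to undo the quota reservation if the later block reservation fails, which the sketch leaves out.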
Comment 3 Dmitry Monakhov 2009-12-11 12:11:33 UTC
Proposed patch: http://patchwork.ozlabs.org/patch/40896/

Tests performed:
 Mounted: ext4 with generic quota and with journalled quota
 1) fsstress -p16 -d /mnt/test -l999999999999 -n9999999999999999
    Note: many tasks are necessary in order to catch the race condition
    introduced in [v2] of the patch
 2) ./write-chown-truncate /mnt/ 9999999999999
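
The source of write-chown-truncate is not included in this report. Going only by its name, one iteration of such a test might look like the C sketch below: a buffered write (which reserves quota under delayed allocation), an ownership change (which moves quota charges), and a truncate (which releases them). The function name, signature, and buffer size are assumptions for illustration, not the actual test program.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical single iteration of a write/chown/truncate stress test.
 * Writes `len` bytes to `path`, changes ownership to uid:gid, then
 * truncates the file to zero. Returns 0 on success, -1 on the first
 * failing call.
 */
int write_chown_truncate_once(const char *path, size_t len,
                              uid_t uid, gid_t gid)
{
    char buf[4096];
    int rc = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        return -1;

    memset(buf, 'x', sizeof(buf));
    while (len > 0 && rc == 0) {
        size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);  /* buffered write path */

        if (n <= 0)
            rc = -1;
        else
            len -= (size_t)n;
    }
    if (rc == 0 && fchown(fd, uid, gid) != 0)  /* ownership change */
        rc = -1;
    if (rc == 0 && ftruncate(fd, 0) != 0)      /* drop the written blocks */
        rc = -1;

    close(fd);
    return rc;
}
```

A real stress test would presumably run this in a loop from several processes, as root, alternating between different owners so that quota is actually transferred between users.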