Bug 4820 - Kernel crash and system hang if file copying continues after no space is left
Status: REJECTED INSUFFICIENT_DATA
Product: File System
Classification: Unclassified
Component: XFS
Hardware: i386 Linux
Importance: P2 high
Assigned To: XFS Guru
Depends on:
Blocks:
Reported: 2005-06-30 01:52 UTC by Francism
Modified: 2007-02-17 11:59 UTC

See Also:
Kernel Version: 2.6.12
Tree: Mainline
Regression: ---


Attachments
generate termcap file on the share (43 bytes, text/plain)
2005-07-01 02:18 UTC, Francism
copy and compare,occupy all space in the volume (145 bytes, text/plain)
2005-07-01 02:21 UTC, Francism
LVM setup (1.38 KB, application/octet-stream)
2005-07-01 02:48 UTC, Francism

Description Francism 2005-06-30 01:52:50 UTC
Distribution: Fedora 3
Hardware Environment: P4 2.4 GHz, 512 MB RAM
Software Environment: LVM, Samba
Problem Description:
The kernel crashes and the system hangs if file copying continues
after no space is left.
Steps to reproduce:
1) Create a share on LVM using XFS.
2) Mount the share on Windows.
3) Run a script on Windows that copies files until the entire space
in the share is used (if you want the script that I used, I can send
it to you).
Run the script until the space is 100% full and the kernel will crash.
Please refer to the kernel ring buffer messages below.


XFS mounting filesystem dm-0
Ending clean XFS mount for filesystem: dm-0
NET: Registered protocol family 5
Unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
c0289b74
*pde = 00000000
Oops: 0002 [#1]
PREEMPT SMP 
Modules linked in: appletalk psnap llc drbd bonding i2c_i801 i2c_dev i2c_core 
nls_cp437 aic7xxx e1000 e100 sym53c8xx
CPU:    0
EIP:    0060:[<c0289b74>]    Not tainted VLI
EFLAGS: 00010286   (2.6.10) 
EIP is at xfs_ail_insert+0x84/0xd0
eax: 00000000   ebx: ffffffff   ecx: fffffc19   edx: ffffffff
esi: c9d9bdfc   edi: d3164818   ebp: cd0631f4   esp: df653e00
ds: 007b   es: 007b   ss: 0068
Process xfslogd/0 (pid: 178, threadinfo=df652000 task=deef3520)
Stack: 00001d3c 00000002 00000000 00000000 cd0631f4 d3164818 d3164800 c7f5cdfc 
       c0289924 d3164818 cd0631f4 00000000 00001d3c 00000002 00000000 cd0631f4 
       00001d3c 00000002 c02894e8 d3164800 cd0631f4 00001d3c 00000002 00000000 
Call Trace:
 [<c0289924>] xfs_trans_update_ail+0x54/0xb0
 [<c02894e8>] xfs_trans_chunk_committed+0x158/0x1f0
 [<c02892cc>] xfs_trans_committed+0x3c/0x100
 [<c027cebe>] xlog_state_do_callback+0x20e/0x2c0
 [<c027cfc3>] xlog_state_done_syncing+0x53/0x70
 [<c027b967>] xlog_iodone+0x47/0xb0
 [<c0296c67>] pagebuf_iodone_work+0x37/0x40
 [<c0127f37>] worker_thread+0x197/0x230
 [<c0296c30>] pagebuf_iodone_work+0x0/0x40
 [<c0115930>] default_wake_function+0x0/0x20
 [<c0115930>] default_wake_function+0x0/0x20
 [<c0127da0>] worker_thread+0x0/0x230
 [<c012bc47>] kthread+0xa7/0xb0
 [<c012bba0>] kthread+0x0/0xb0
 [<c01008dd>] kernel_thread_helper+0x5/0x18
Code: 03 00 00 31 db 89 c8 89 da 85 d2 7e 0a 8b 76 04 eb b5 90 8d 74 26 00 7c 
05 83 f8 00 77 ef 8b 06 89 75 04 89 45 00 89 2e 8b 45 00 <89> 68 04 83 c4 10 5b 
5e 5f 5d c3 90 8b 5c 24 08 8b 0c 24 31 c0 
 <6>note: xfslogd/0[178] exited with preempt_count 1
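
For readers of the trace above, here is a minimal sketch of how the `Oops: 0002` value can be read. It assumes the usual i386 page-fault error-code layout (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name is hypothetical and is not part of any kernel tooling.

```python
def decode_oops_error_code(code):
    """Decode the low three bits of an i386 page-fault error code.

    Assumed layout: bit 0 = protection violation (else page not present),
    bit 1 = write access (else read), bit 2 = user mode (else kernel mode).
    """
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


# The value printed after "Oops:" is hexadecimal.
decoded = decode_oops_error_code(int("0002", 16))
print(decoded)
```

Under that reading, 0002 is a kernel-mode write to a page that is not present, which is consistent with the reported NULL pointer dereference at virtual address 00000004.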
Comment 1 Eric Sandeen 2005-06-30 05:30:17 UTC
Looks similar to an internal SGI bug, nr 927915, originally reported at
http://www.icglink.com/cluster-debug-2.html

How easily can you reproduce this?

Can you see if you can simplify this test case, for example by taking SAMBA out of the equation
& run your copy script locally, and perhaps try a simple filesystem without lvm?

It'd be good to have a simple testcase to reproduce this, thanks.  (And if it does require
lvm/samba, then so be it) :)

Thanks,

-Eric
Comment 2 Francism 2005-06-30 18:34:14 UTC
Dear Eric,

This is easily reproducible on systems using SCSI with LVM and XFS.
I can also reproduce it with Samba out of the equation.

It is the same situation as in the link you attached.


thanks,
francism
Comment 3 Eric Sandeen 2005-06-30 20:07:12 UTC
So samba is not part of the testcase... do you only see it with lvm?  

Please go ahead & include details of your lvm setup, as well as any script you may use to fill the
fs to reproduce the problem.

thanks,

-Eric
Comment 4 Francism 2005-07-01 02:18:39 UTC
Created attachment 5244 [details]
generate  termcap file on the share 

Runme first
Comment 5 Francism 2005-07-01 02:21:02 UTC
Created attachment 5245 [details]
copy and compare,occupy all space in the volume
Comment 6 Francism 2005-07-01 02:48:47 UTC
Created attachment 5246 [details]
LVM setup
Comment 7 Francism 2005-07-01 02:59:48 UTC
Dear Eric,

First create a separate volume with XFS, then
copy and run attachment (id=5244) on the volume,
copy and run attachment (id=5245) on the volume; this will copy
files to occupy all the space. Please ignore any out-of-space messages.
Keep running the script until the system crashes.

My LVM config is also attached.


thanks,
francism
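
The attached scripts are not reproduced in this report. As a rough local equivalent of the steps above, the sketch below copies a source file into a directory repeatedly, compares each copy with the source, and keeps going after out-of-space errors. The function name, its parameters, and the `copy_NNNNNN` file naming are assumptions for illustration, not the contents of attachments 5244 or 5245.

```python
import errno
import filecmp
import os
import shutil


def copy_until_full(src, dest_dir, max_attempts):
    """Copy src into dest_dir up to max_attempts times.

    Each successful copy is compared byte-for-byte with the source.
    Out-of-space errors (ENOSPC) are counted and ignored so that copying
    continues against the full filesystem; any other error is re-raised.
    Returns a tuple (successful_copies, enospc_failures).
    """
    ok = 0
    enospc = 0
    for i in range(max_attempts):
        target = os.path.join(dest_dir, "copy_%06d" % i)
        try:
            shutil.copyfile(src, target)
            if not filecmp.cmp(src, target, shallow=False):
                raise RuntimeError("copy differs from source: " + target)
            ok += 1
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            # Volume is full: note it and keep trying, as described above.
            enospc += 1
    return ok, enospc
```

To follow the report, `src` would be the termcap file generated on the XFS volume and `dest_dir` a directory on the same volume, with `max_attempts` chosen large enough that copying continues well past the point where the volume fills up.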

Comment 8 Adrian Bunk 2006-12-07 07:49:51 UTC
Is this issue still present in kernel 2.6.19?
Comment 9 Adrian Bunk 2007-02-17 11:59:42 UTC
Please reopen this bug if it's still present with kernel 2.6.20.
