Bug 4191 - Process hangs while running file system stress test on XFS
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: File System
Classification: Unclassified
Component: XFS
Hardware: i386 Linux
Importance: P2 normal
Assignee: XFS Guru
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-02-10 01:06 UTC by Sharyathi
Modified: 2006-07-28 02:02 UTC
2 users

See Also:
Kernel Version: 2.6.11-rc3
Tree: Mainline
Regression: ---


Attachments

Description Sharyathi 2005-02-10 01:06:03 UTC
Distribution: SLES 9 
Hardware Environment: x235, 2-way SMP, 1.5 GB memory
Software Environment: 2.6.11-rc3-mm1
Problem Description: We started stress tests on a mounted XFS file system. After
5-6 hours we noticed the rm command hanging; we were not able to delete all of
the files from the XFS file system, and the file system could not be unmounted.
The test process could be killed, but even after that the rm command invoked by
the test process still remained.

The stack trace of the rm process is as follows:
rm            D E5851F94     0  1077  32083                     (NOTLB)
c76dbf38 00000082 c1de6540 e5851f94 f3c12580 f3c125ac e5851f94 e3d5ca60
       c0116efb 00000001 00000001 c1c11020 00000001 0003640e 27a1b000 000f9b95
       00000006 e3d5ca60 e3d5cbb4 00002000 00000001 f348b7cc f348b7d4 00000246
Call Trace:
 [<c0116efb>] do_page_fault+0x1ab/0x57a
 [<c03089eb>] __down+0x7b/0xf0
 [<c011a7c0>] default_wake_function+0x0/0x10
 [<c0169270>] filldir64+0x0/0x100
 [<c0308b87>] __down_failed+0x7/0xc
 [<c0169428>] .text.lock.readdir+0x8/0x20
 [<c01693d7>] sys_getdents64+0x67/0xb0
 [<c01685d5>] do_fcntl+0x115/0x170
 [<c0103e19>] sysenter_past_esp+0x52/0x79

Stack traces of the XFS daemons running at that instant are as follows:

xfslogd/0     S 00000002     0 12979     19         12980  1123 (L-TLB)
xfslogd/1     S C01EAC2A     0 12980     19         12981 12979 (L-TLB)
 [<f928df10>] pagebuf_iodone_work+0x0/0x40 [xfs]
xfslogd/2     S C01EAC2A     0 12981     19         12982 12980 (L-TLB)
 [<f928df10>] pagebuf_iodone_work+0x0/0x40 [xfs]
xfslogd/3     S C0119AD5     0 12982     19         12983 12981 (L-TLB)
xfsdatad/0    S C0123194     0 12983     19         12984 12982 (L-TLB)
xfsdatad/1    S C0119AD5     0 12984     19         12985 12983 (L-TLB)
xfsdatad/2    S C0119AD5     0 12985     19         12986 12984 (L-TLB)
xfsdatad/3    S C0119AD5     0 12986     19         20555 12985 (L-TLB)
xfsbufd       S C0104D9A     0 12987      1         12989 11221 (L-TLB)
 [<f928ec4a>] pagebuf_daemon+0x6a/0x1e0 [xfs]
 [<f928ebe0>] pagebuf_daemon+0x0/0x1e0 [xfs]
xfssyncd      S C1C0AF9C     0 12989      1         32410 12987 (L-TLB)
 [<f9293c8b>] xfssyncd+0x7b/0x1b0 [xfs]
 [<f9293c10>] xfssyncd+0x0/0x1b0 [xfs]
 [<f92734b0>] xlog_grant_push_ail+0x130/0x170 [xfs]
 [<f9269572>] .text.lock.xfs_iget+0x72/0x160 [xfs]
 [<f928775f>] xfs_lock_inodes+0xaf/0x120 [xfs]
 [<f928769a>] xfs_lock_dir_and_entry+0xda/0xf0 [xfs]
 [<f9287927>] xfs_remove+0x157/0x3f0 [xfs]
 [<f9280c00>] xfs_trans_unlocked_item+0x20/0x40 [xfs]
 [<f9280c00>] xfs_trans_unlocked_item+0x20/0x40 [xfs]
 [<f9285e97>] xfs_access+0x37/0x40 [xfs]
 [<f9291bb0>] linvfs_permission+0x0/0x20 [xfs]
 [<f9291913>] linvfs_unlink+0x13/0x40 [xfs]
 [<f926defc>] xfs_ichgtime+0xfc/0x104 [xfs]
 [<f9291bb0>] linvfs_permission+0x0/0x20 [xfs]


Steps to reproduce:
Mount an XFS file system and run a file system stress test on it.
After running it for 5-6 hours we found that we were not able to unmount the
file system.
Note: We encountered this problem after following these steps; we are not sure
whether the defect can be reliably reproduced.
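The steps above can be sketched as a script. This is a minimal sketch under assumptions the report does not state: it uses a loopback image for the XFS volume and xfstests' fsstress as the stress tool (the original report does not name the specific test suite), and the paths, image size, and duration are illustrative only.

```shell
#!/bin/sh
# Hypothetical reproduction sketch -- the report does not name the exact
# stress tool; fsstress (from xfstests) is assumed here. Run as root.

IMG=/var/tmp/xfs-stress.img   # backing file for a loopback XFS volume (assumed path)
MNT=/mnt/xfs-stress           # mount point (assumed path)

dd if=/dev/zero of="$IMG" bs=1M count=1024   # 1 GB test image
mkfs.xfs -f "$IMG"
mkdir -p "$MNT"
mount -o loop "$IMG" "$MNT"

# Run the stress load for several hours; the hang appeared after 5-6 h.
mkdir -p "$MNT/work"
fsstress -d "$MNT/work" -n 1000000 -p 8 &
STRESS_PID=$!
sleep $((6 * 3600))
kill "$STRESS_PID" 2>/dev/null

# The reported failure mode: rm hangs (blocked in sys_getdents64 per the
# stack trace above) and the file system can no longer be unmounted.
rm -rf "$MNT/work"
umount "$MNT"
```

If the hang reproduces, the rm process should show state D (uninterruptible sleep) in ps, matching the trace in the description.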
Comment 1 Sharyathi 2005-02-10 04:43:09 UTC
Please note that the test was conducted on 2.6.11-rc3, not on 2.6.11-rc3-mm1 as
I mentioned earlier.
Comment 2 Adrian Bunk 2006-03-02 08:57:24 UTC
Is this issue still present in recent 2.6 kernels?
Comment 3 Adrian Bunk 2006-07-28 02:02:24 UTC
Please reopen this bug if it's still present in kernel 2.6.17.
