Bug 5299 - Xfs_io hangs up under lvm2 lvcreate snapshot and copying data from remote host to smb share at the same time
Status: REJECTED UNREPRODUCIBLE
Alias: None
Product: File System
Classification: Unclassified
Component: XFS
Hardware: i386 Linux
Importance: P2 high
Assignee: XFS Guru
Reported: 2005-09-23 08:20 UTC by Hubert
Modified: 2006-12-12 07:09 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.14-rc2
Tree: Mainline
Regression: ---



Description Hubert 2005-09-23 08:20:04 UTC
Most recent kernel where this bug did not occur: not stated
Distribution: Debian GNU/Linux 3.0 Sarge
Hardware Environment: 2x Pentium Xeon 2.8, 512MB DDR, 3ware 9xxx SATA RAID
controller, 6x 250GB SATA HDDs, RAID0 = 1.36T
Software Environment: libdevmapper 1.01.04 and lvm 2.01.09 stable (latest dm
1.01.05 and lvm 2.01.15 from cvs tested too), xfsprogs stable .deb, samba 3.0.14a

Problem Description:

When snapshots of an XFS logical volume are created and removed
(lvcreate/lvremove) while data is being copied from a remote host to an smb
share at the same time, xfs_io hangs and stays in memory as a dead process.
It occurs randomly, roughly once in every 20, 10 or 4 iterations.

Steps to reproduce:

pvcreate /dev/sda
vgcreate vg /dev/sda
lvcreate -L 698G -n lv vg /dev/sda
mkfs.xfs /dev/vg/lv
mount -t xfs -o usrquota,grpquota,noatime,nodiratime /dev/vg/lv /mnt/lv

Set up an smb share (/mnt/smb).
Start copying data from the remote host to the smb share and, while the copy
is running, create and remove 2 big snapshots (347G each) with lvcreate /
lvremove, wrapped in xfs_freeze -f/-u:

xfs_freeze -f /mnt/lv
lvcreate -s -l 11168 -n S1 -p rw /dev/vg/lv &
sleep 7
xfs_freeze -u /mnt/lv
mount -t xfs -o noatime,nodiratime,nouuid,ro /dev/vg/S1 /snapshots/S1

sleep 30

Create a second snapshot (named S2) in the same way as S1 above and mount it too.

sleep 3m

xfs_freeze -f /mnt/lv
lvremove /dev/vg/S1 &
sleep 1
xfs_freeze -u /mnt/lv

sleep 30

Remove the second snapshot in the same way as S1.

Repeat the lvcreate/lvremove cycle for the 2 snapshots a few times, as shown above.

Without xfs_freeze -f/-u, lvcreate and lvremove of a snapshot always hang if
data is being copied at the same time.
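
A minimal sketch of the cycle above as a script, in case it helps with
repeating the iterations. The device, snapshot and mount point names come from
the steps above. The DRY_RUN and ITERATIONS variables, the `wait` after each
unfreeze, the umount before lvremove and the -f flag on lvremove are
assumptions that are not part of the original steps. By default the script
only prints the commands.

```shell
#!/bin/sh
# Sketch of the snapshot create/remove cycle described in this report.
# Assumptions (not in the report): DRY_RUN, ITERATIONS, the `wait` calls,
# the umount before lvremove, and `lvremove -f`.

DRY_RUN="${DRY_RUN:-1}"          # 1 = only print commands, 0 = execute them
ITERATIONS="${ITERATIONS:-1}"    # number of create/remove cycles

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create and mount one snapshot; $1 is the snapshot name (S1 or S2).
snapshot_create() {
    run xfs_freeze -f /mnt/lv
    run lvcreate -s -l 11168 -n "$1" -p rw /dev/vg/lv &
    run sleep 7
    run xfs_freeze -u /mnt/lv
    wait    # assumption: let the backgrounded lvcreate finish before mounting
    run mount -t xfs -o noatime,nodiratime,nouuid,ro "/dev/vg/$1" "/snapshots/$1"
}

# Remove one snapshot; $1 is the snapshot name.
snapshot_remove() {
    run umount "/snapshots/$1"    # assumption: not listed in the report
    run xfs_freeze -f /mnt/lv
    run lvremove -f "/dev/vg/$1" &
    run sleep 1
    run xfs_freeze -u /mnt/lv
    wait    # assumption: let the backgrounded lvremove finish
}

# One full iteration: two snapshots created, then both removed.
cycle() {
    snapshot_create S1
    run sleep 30
    snapshot_create S2
    run sleep 180                 # "sleep 3m" in the report
    snapshot_remove S1
    run sleep 30
    snapshot_remove S2
}

i=1
while [ "$i" -le "$ITERATIONS" ]; do
    echo "# iteration $i"
    cycle
    i=$((i + 1))
done
```

Running it with DRY_RUN=0 ITERATIONS=20 as root would execute the commands for
real while the copy to the smb share is in progress. In dry-run mode the line
for a backgrounded command may be printed slightly out of order.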

Thanks for any help
Best Regards,

Hubert
Comment 1 Eric Sandeen 2005-09-23 08:41:18 UTC
If you are so inclined, you could get the cvs kernel from oss.sgi.com, which
has kdb, and when xfs_io hangs you could break into the debugger and issue
a "ps D" to see which threads are in D state, and backtrace them.

This could offer some good clues on where things are hung up.

It might also be interesting to take smb out of the picture; do you see the
same issues with local IO to the filesystem on the lvm device?  That might be
a simpler case.
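
A minimal sketch of such a local write load, which could stand in for the smb
copy while the snapshots are created and removed. The function name
write_load, the file naming and the sizes are hypothetical. Only /mnt/lv comes
from the report.

```shell
#!/bin/sh
# Sketch of a local write load, as a stand-in for the remote smb copy.
# write_load, the file.N naming and the sizes are illustrative assumptions.

# write_load DIR COUNT SIZE_MB
# Writes COUNT files of SIZE_MB megabytes each into DIR.
write_load() {
    load_dir="$1"
    load_count="$2"
    load_mb="$3"
    mkdir -p "$load_dir" || return 1
    load_n=1
    while [ "$load_n" -le "$load_count" ]; do
        # bs is given in bytes (1 MiB) to stay portable across dd versions
        dd if=/dev/zero of="$load_dir/file.$load_n" \
            bs=1048576 count="$load_mb" 2>/dev/null || return 1
        load_n=$((load_n + 1))
    done
}
```

For example, `write_load /mnt/lv/loadtest 100 1024` would write 100 files of
1 GB each to the origin volume. The directory name loadtest is made up.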
Comment 2 Hubert 2005-09-28 11:18:14 UTC
Thanks for your tips, Eric. I did that. When xfs_io hung, I issued "ps D" in
KDB and backtraced the listed processes with "btp process_id", but I don't know
how to write the output to a text file. Do you? Maybe I should use crash or
another tool?

I cannot take smb out of the picture because I need the smb shares for copying
data from the remote host. I noticed some abnormal behaviour of the kernel
pdflush thread under local IO to the filesystem on the lvm device: it sometimes
hangs, then xfssyncd hangs too, and after that no IO to this lv is possible.
Sometimes pdflush also hangs while copying data to the smb share and running
lvcreate or lvremove on a snapshot at the same time. I think it could be a bug.
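
One possible userspace alternative for getting the list of D-state processes
into a text file, sketched under the assumption that the machine still
responds well enough to run ps (procps) when the hang occurs. It does not
capture KDB output or a "btp" backtrace; it only records the process state and
wait channel. The function names and the output path are hypothetical.

```shell
#!/bin/sh
# Sketch: list processes in uninterruptible sleep (state D) from userspace.
# filter_d_state and list_d_state are illustrative names, not existing tools.

# Keep the header line plus every line whose second column (STAT) starts with D.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# PID, state, kernel wait channel and command name of all D-state processes.
list_d_state() {
    ps -eo pid,stat,wchan:32,comm | filter_d_state
}
```

For example, `list_d_state > /tmp/d-state.txt` would save the list to a text
file that can be attached to the bug.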
Comment 3 Adrian Bunk 2006-12-07 07:50:11 UTC
Is this issue still present in kernel 2.6.19?
Comment 4 Hubert 2006-12-12 06:23:07 UTC
Hi Adrian,
Sorry, I don't know, because I no longer work on this at all.
Maybe someone else checked it out or could check it for you.
Best Regards,
Hubert
