Latest working kernel version: 2.6.24
Earliest failing kernel version: 2.6.25-rc5-git4
Hardware Environment: ASUS A8V, Athlon 64 x2 4200+
Software Environment: raid1, cryptsetup (luks), lvm2, xfs
Problem Description: "sometimes" an rm command stalls. The files are deleted, but the xterm or console is frozen.
An strace on the pid stalls as well, printing nothing after the "attached to pid" message.
After that, it is impossible to sync the filesystem (the command hangs) or to umount the device (busy).
I've never seen this problem with 2.6.24 (this doesn't mean it doesn't exist there). It may not have existed in 2.6.25-rc2, but I didn't use that version much.
I hit it once or twice a day on 2.6.25-rc5.
The rm process is not killable. I need to reboot to get rid of it. After the journal is replayed, the filesystem doesn't appear to be corrupted (xfs_check doesn't report any errors).
Steps to reproduce: rm -rf /xxx/xxxx
(I mostly hit it when cleaning a tree via a script after building a Debian package on my machine.)
The filesystem is an xfs filesystem.
It is built on a raid1, encrypted with cryptsetup using luks, and it is an lvm2 logical volume over this raid1.
Does it only happen on luks volumes?
Try "echo w > /proc/sysrq-trigger" to see which tasks are in uninterruptible (blocked) state. If possible, attach the output here.
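If writing to sysrq-trigger is not convenient, the names of blocked processes can also be collected from userspace. The following is only a rough sketch, assuming a Linux /proc layout where each /proc/<pid>/stat line has the form "pid (comm) state ..."; the helper names parse_stat and blocked_tasks are made up for this example. It lists processes in state D but, unlike the sysrq output, shows no kernel stack traces and does not look at individual threads.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')' instead of on whitespace.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    if not os.path.isdir(proc):
        return found  # not a Linux-style /proc, nothing to scan
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling blocked_tasks() while the rm is hung should return the stuck rm among the results, which at least confirms it is blocked in the kernel rather than spinning.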
Created attachment 15267
task blocked, syslog for w to sysrq-trigger
I've never had the problem on a non-luks volume. But the non-luks volumes see little write/delete activity (root filesystem, /usr).
I've also had the problem not while doing an rm but while running a C++ compilation.
I assume this is the known dm-crypt regression. We're working on a patch
(a ref counting bug which means that in certain circumstances the dm-crypt layer holds onto i/o forever and never reports it as completed)
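To illustrate why a ref counting mistake can look like a permanent hang, here is a toy model. It is not the dm-crypt code and makes no claim about where the real bug sits; the PendingIO class, the submit function and the leak_one flag are invented for the example. The only assumption is the general pattern: completion is reported when the last reference is dropped, so a single reference that is never dropped means completion is never reported.

```python
class PendingIO:
    """Toy model of one I/O request that is split into several parts."""

    def __init__(self, on_complete):
        self.pending = 1            # reference held by the submitter
        self.on_complete = on_complete
        self.completed = False

    def get(self):
        """Take one reference for a part that is now in flight."""
        self.pending += 1

    def put(self):
        """Drop one reference; report completion when none are left."""
        self.pending -= 1
        if self.pending == 0:
            self.completed = True
            self.on_complete()


def submit(n_parts, leak_one=False):
    """Run n_parts sub-requests; optionally forget to drop one reference."""
    io = PendingIO(on_complete=lambda: None)
    for _ in range(n_parts):
        io.get()
    finished = n_parts - 1 if leak_one else n_parts
    for _ in range(finished):
        io.put()                    # each finished part drops its reference
    io.put()                        # submitter drops its own reference
    return io
```

With submit(3) the count reaches zero and completed becomes True. With submit(3, leak_one=True) the count stays at 1, completed stays False, and anything waiting on that I/O (rm, sync, umount) would wait indefinitely.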
Is it a duplicate of Bug #10207?
Most likely we won't know for sure whether it's the same as bug #10207 until there's a fix for which Jean-Luc can verify whether or not it fixes the problem for him?
> Most likely we won't know for sure whether it's the same as bug #10207 until
> there's a fix for which Jean-Luc can verify whether or not it fixes the
> problem for him?
Please try the patch in http://lkml.org/lkml/2008/3/14/347
I've tested the patch (on 2.6.25-rc5-git4).
I've stressed the system a bit and haven't seen the problem again so far.
Latest patch for dm-crypt in http://lkml.org/lkml/2008/3/17/214
(the same patch mentioned in bug 10207)
Please test the patch in comment 9 - I think that one's ready to submit.
Patch: http://lkml.org/lkml/2008/3/27/293
*** Bug 10207 has been marked as a duplicate of this bug. ***
fixed by commit 3f1e9070f63b0eecadfa059959bf7c9dbe835962