Bug 16538 - lvm hangs when creating snapshots of the root partition with 2.6.35
Status: CLOSED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: LVM2/DM
Hardware: All Linux
Importance: P1 normal
Assignee: Alasdair G Kergon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-07 14:36 UTC by François Valenduc
Modified: 2011-02-24 15:25 UTC
CC List: 3 users

See Also:
Kernel Version:
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Output of lvmdump (29.27 KB, application/x-gzip)
2010-08-07 16:37 UTC, François Valenduc
Output of strace when running lvcreate (63.00 KB, text/plain)
2010-08-07 20:09 UTC, François Valenduc

Description François Valenduc 2010-08-07 14:36:00 UTC
Hello,
When I run lvcreate -L1G -n backup -s /dev/gentoo/root to create a snapshot of my root partition, the process hangs. I can only shut down my computer with the SysRq keys. After a restart, the volume appears to have been created, but it is not listed as a snapshot. I have tried LVM 2.02.70 and 2.02.72, and the problem occurs with both versions. I am using udev 160. Does anybody know what's happening? With kernel 2.6.34.2, everything works correctly.

Thanks in advance for your help.
Comment 1 Alasdair G Kergon 2010-08-07 15:03:24 UTC
Try to supply some diagnostics -
e.g. the process backtraces sysrq will show and 'lvmdump' after rebooting.
Comment 2 François Valenduc 2010-08-07 16:37:16 UTC
Created attachment 27372
Output of lvmdump

Here is the output of lvmdump, taken after restarting the computer.
Comment 3 François Valenduc 2010-08-07 16:38:51 UTC
So, the logical volume is created, but it is not a snapshot. When I try to mount it, I get the following error:

mount: unknown filesystem type 'DM_snapshot_cow'
zsh: exit 32    LC_ALL='C' mount /dev/gentoo/backup /mnt/backup
Comment 4 François Valenduc 2010-08-07 20:03:53 UTC
I noticed that this problem occurs when I take a snapshot of the root partition.
I also tried git bisect, but it didn't reveal anything conclusive. It reports the following as the first bad commit:

f3b99be19ded511a1bf05a148276239d9f13eefa is the first bad commit
commit f3b99be19ded511a1bf05a148276239d9f13eefa
Author: NeilBrown <neilb@suse.de>
Date:   Thu Jun 24 13:31:03 2010 +1000

    Restore partition detection of newly created md arrays.

However, I don't use RAID, only LVM, so this cannot be the real first bad commit.
Comment 5 François Valenduc 2010-08-07 20:09:56 UTC
Created attachment 27375
Output of strace when running lvcreate

Maybe you will find the output of strace interesting. Unfortunately, it is so long that I could only capture the end. This is thus the result of
Comment 6 François Valenduc 2010-08-07 20:11:16 UTC
I did something wrong. The strace output is the result of
strace lvcreate -L1G -n backup -s /dev/gentoo/root. Since it completely fills the console, I only have the end of it.
Comment 7 François Valenduc 2010-08-11 20:22:52 UTC
I finally found the problematic commit with git bisect. It is the following one:

commit 6b0310fbf087ad6e9e3b8392adca97cd77184084
Author: Eric Sandeen <sandeen@redhat.com>
Date:   Sun May 16 02:00:00 2010 -0400

    ext4: don't return to userspace after freezing the fs with a mutex held

If I revert it, the problem doesn't occur.
Comment 8 François Valenduc 2010-08-11 21:34:47 UTC
This problem is in fact fixed by the following commit:

commit d2aa412a7e270bc679a586dda415344f2b8ad01e
Author: Eric Sandeen <sandeen@sandeen.net>
Date:   Sun Aug 1 17:33:29 2010 -0400

    ext4: fix freeze deadlock under IO
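
To check whether a given kernel source tree already contains this fix, something like the following could be used. This is only a sketch: the helper name has_commit and the checkout path in the usage note are illustrative assumptions, not part of the report. It relies on git merge-base --is-ancestor, which needs a reasonably recent git.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): print "present" if the
# commit given as $2 is an ancestor of ref $3 (default: HEAD) in the git
# tree at $1, and "missing" otherwise. A commit that is unknown to the
# repository is also reported as "missing".
has_commit() {
    repo=$1
    commit=$2
    ref=${3:-HEAD}
    if git -C "$repo" merge-base --is-ancestor "$commit" "$ref" 2>/dev/null; then
        echo "present"
    else
        echo "missing"
    fi
}
```

For example, assuming the kernel tree is checked out in /usr/src/linux, has_commit /usr/src/linux d2aa412a7e270bc679a586dda415344f2b8ad01e would report whether the currently checked-out revision includes the fix.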
