Bug 7040

Summary: Oops when removing full snapshot
Product: IO/Storage
Reporter: Damian Pietras (daper)
Component: LVM2/DM
Assignee: Alasdair G Kergon (agk)
Status: CLOSED CODE_FIX
Severity: normal
CC: gmazyland
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.17.7
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: dmesg with oops when removing a full snapshot

Description Damian Pietras 2006-08-22 04:24:46 UTC
Most recent kernel where this bug did not occur: 2.6.16.27 (only stable tested)
Distribution: custom/Debian
Hardware Environment: P IV, 3ware SATA controller, 1GB RAM
Software Environment: LVM 
Problem Description: Invoking lvremove /dev/vg/snapshot when the snapshot is full
(100% displayed by lvs in the Snap% column) causes an Oops in most cases. The
snapshot can be removed after a reboot.

Steps to reproduce:

1. Create a volume and a snapshot:

lvcreate -n lv -L 2G vg
lvcreate -s -n snap -L 128M /dev/vg/lv

2. Write to the volume more than the snapshot size:

dd if=/dev/zero of=/dev/vg/lv bs=1M count=256

3. Try to remove the snapshot:
lvremove -f /dev/vg/snap

You should now see a segmentation fault and an Oops. The snapshot still exists.
Comment 1 Damian Pietras 2006-08-22 04:26:17 UTC
Created attachment 8848
dmesg with oops when removing a full snapshot
Comment 2 Damian Pietras 2006-09-12 11:35:44 UTC
I found out that reverting this patch:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=76df1c651b66bdf07d60b3d60789feb5f58d73e3
fixes the problem: I can remove the snapshot without Oops and everything works fine.
Comment 3 Alasdair G Kergon 2006-09-13 09:12:27 UTC
Possible patch:
http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-snapshot-fix-freeing-pending-exception.patch
(applying it may require the earlier snapshot patches listed in the series file)
Comment 4 Damian Pietras 2006-09-14 02:43:39 UTC
I've applied the patch to my 2.6.17.7 (I had to do it by hand in some places)
and it resolves the problem. Thanks.
Comment 5 Alasdair G Kergon 2006-09-14 15:12:18 UTC
Would you try the following patch to see if it's sufficient?

Alasdair


Index: linux-2.6.17/drivers/md/dm-snap.c
===================================================================
--- linux-2.6.17.orig/drivers/md/dm-snap.c	2006-09-14 23:05:18.000000000 +0100
+++ linux-2.6.17/drivers/md/dm-snap.c	2006-09-14 23:15:29.000000000 +0100
@@ -691,6 +691,7 @@ static void pending_complete(struct pend
 
 		free_exception(e);
 
+		remove_exception(&pe->e);
 		error_snapshot_bios(pe);
 		goto out;
 	}
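
For readers who want to see why one added line can matter here, the following is a minimal user-space sketch, not the kernel code. It assumes the Oops comes from a pending exception being freed on the error path while still linked into its table, so that a later teardown walk reaches a stale entry; that reading is an assumption drawn from the patch, not something confirmed in this report. All names below (struct pending, fail_pending, count_stale, stale_after_failure) are hypothetical, and a `freed` flag stands in for the real free so the demo stays well-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on the kernel's list_head. */
struct node {
    struct node *prev, *next;
};

/* Hypothetical stand-in for a pending exception linked into a table bucket. */
struct pending {
    struct node link;   /* kept as first member so a node* can be cast to pending* */
    int freed;          /* set instead of calling free(), to avoid real use-after-free */
};

static void list_init(struct node *head) { head->prev = head->next = head; }

static void list_add(struct node *head, struct node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Error path of completion: optionally unlink the entry, then "free" it. */
static void fail_pending(struct pending *pe, int unlink_first)
{
    assert(pe != NULL);
    if (unlink_first)
        list_del(&pe->link);  /* analogue of the added remove_exception() call */
    pe->freed = 1;            /* stand-in for handing the object back to its allocator */
}

/* Count entries still reachable from the bucket that were already freed. */
static int count_stale(const struct node *head)
{
    int stale = 0;
    for (const struct node *n = head->next; n != head; n = n->next)
        if (((const struct pending *)n)->freed)
            stale++;
    return stale;
}

/* Run the scenario once; returns how many dangling entries a teardown walk would meet. */
int stale_after_failure(int unlink_first)
{
    struct node bucket;
    struct pending a = { .freed = 0 }, b = { .freed = 0 };

    list_init(&bucket);
    list_add(&bucket, &a.link);
    list_add(&bucket, &b.link);
    fail_pending(&a, unlink_first);
    return count_stale(&bucket);
}
```

Calling stale_after_failure(0) models the unpatched path and leaves one freed entry on the list; stale_after_failure(1) models the patched path and leaves none.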

Comment 6 Damian Pietras 2006-09-15 03:59:56 UTC
This simple patch also works.
Comment 7 Alasdair G Kergon 2006-09-22 15:09:43 UTC
OK. The fix is already in the queue for 2.6.19, but let's get that one-line
patch queued for -stable.