Bug 13676

Summary: unmount after fsstress on a ramdisk causes orphan inode list corruption
Product: File System
Reporter: Eric Sandeen (sandeen)
Component: ext3
Assignee: Jan Kara (jack)
Status: RESOLVED CODE_FIX
Severity: normal
CC: akpm, jack
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.30-6.fc12
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
	debug messages & oops.
	Fix truncation of a long symlink after we failed to allocate a block for it

Description Eric Sandeen 2009-06-29 18:38:29 UTC
Created attachment 22143 [details]
debug messages & oops.

Running a test like this on 2.6.30-6.fc12:

#!/bin/bash

# Create a fresh ext3 filesystem on the ramdisk.
mkfs.ext3 /dev/ram0

i=0
while true; do
	i=$((i + 1))
	echo -------------------------------------------------------------
	echo "Cycle $i"
	date
	echo Mounting
	sleep 1
	mount -t ext3 /dev/ram0 /mnt/test || exit 1
	echo Removing old fsstress data
	rm -rf /mnt/test/work
	mkdir /mnt/test/work || exit 1
	echo Starting fsstress
	# Run 3 fsstress processes in the background for 30 seconds.
	fsstress -d /mnt/test/work -p 3 -n 100000000 &
	echo Sleeping 30 seconds
	sleep 30
	echo Stopping fsstress
	# Keep killing until no fsstress process is left.
	while ps -e | grep fsstress; do
		pkill fsstress
		sleep 1
	done
	echo Unmounting
	umount /mnt/test || exit 1
	echo Checking
	sleep 1
	# Force a full check; stop at the first cycle that reports errors.
	e2fsck -fvp /dev/ram0 || exit 1
done

I get an assertion failure on the unmount; see attachment.

This testcase was originally reported at http://lkml.org/lkml/2008/11/14/121, though the end result was different: in that case, corruption was found.
Comment 1 Andrew Morton 2009-06-29 18:47:49 UTC
More likely to be a ramdisk bug.

<checks>

yup, according to Adrian's report, it happened after the introduction of brd.

<marks as regression, assigns to Nick>

hm, we don't have a category for ramdisk.  I'll make it IO/Storage, Block layer.
Comment 2 Eric Sandeen 2009-06-30 16:18:31 UTC
Hm, ok.  FWIW, running the same test w/ xfs found no errors, and xfs generally is quite good at letting you know if something got corrupted.  *shrug*

note that Adrian reported a different problem than I'm seeing now ...
Comment 3 Jan Kara 2009-07-15 15:39:37 UTC
It's a genuine ext3/4 bug that appears when allocation of a block for a long symlink fails, and I'm the one who wrote it :(. Anyway, the attached patch should fix it.
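
For context, here is a small sketch of what a "long symlink" is, assuming the usual ext3 layout in which a target shorter than 60 bytes is stored inside the inode itself while a longer one needs a data block allocated for it. The helper name make_target and the length 200 are made up for illustration. This only creates such a symlink in a temporary directory; it does not by itself reproduce the failed allocation, which needs a nearly full filesystem as in the fsstress test above.

```shell
#!/bin/bash

# Hypothetical helper: print a string of "x" characters of the given length.
make_target() {
	local len=$1
	printf '%*s' "$len" '' | tr ' ' 'x'
}

# Scratch directory; on the test setup this would live under /mnt/test.
dir=$(mktemp -d)

# 200 bytes is well past the assumed 60-byte in-inode limit, so on ext3
# this target would have to be written to a separately allocated block.
target=$(make_target 200)
ln -s "$target" "$dir/longlink"

# Read the target back and report its length.
stored=$(readlink "$dir/longlink")
echo "stored target length: ${#stored}"

rm -rf "$dir"
```

The symlink is dangling on purpose: only the length of the target string matters here, not whether it points at anything.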
Comment 4 Jan Kara 2009-07-15 15:41:24 UTC
Created attachment 22356 [details]
Fix truncation of a long symlink after we failed to allocate a block for it
Comment 5 Jan Kara 2009-07-15 15:42:25 UTC
Eric, can you test the patch? It fixes the issue for me...
Comment 6 Eric Sandeen 2009-07-15 16:37:20 UTC
Sure thing, thanks Jan!  This bug had dropped off my radar TBH ...
Comment 7 Jan Kara 2009-08-03 17:54:39 UTC
The patch worked for me and is already upstream... I'm closing this as fixed. Please reopen if you see the bug again. Thanks.