Bug 14472
Summary: | EXT4 corruption | ||
---|---|---|---|
Product: | File System | Reporter: | Rafael J. Wysocki (rjw) |
Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | parag.lkml, sandeen |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32-rc4 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 14230 |
Description
Rafael J. Wysocki
2009-10-26 16:52:41 UTC
On Thursday 29 October 2009, Andrew Lutomirski wrote:
> On Mon, Oct 26, 2009 at 2:55 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.31. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14472
> > Subject : EXT4 corruption
> > Submitter : Shawn Starr <shawn.starr@rogers.com>
> > Date : 2009-10-13 2:07 (14 days old)
> > References : http://marc.info/?l=linux-kernel&m=125539997508256&w=4
> > Handled-By : Theodore Tso <tytso@mit.edu>
> >
>
>
> This bug is *not* fixed. I just triggered it a few minutes ago by
> abusing i915 and drm, which caused a panic. This is slightly newer
> than 2.6.32-rc5, with a couple of i915 bugfixes thrown in.
>
> Photos are here:
> http://web.mit.edu/luto/www/ext4_crashphotos/
>
> This is a very nasty regression, for obvious reasons.
I looked at the fsck pics - I went through this a few days ago. Aneesh suggested applying the patch below, and after applying it and crashing the machine a couple of times I have not observed the corruption.

So I have reason to hope that the patch below, on top of today's git, should improve things. Please try.

commit a8836b1d6f92273e001012c7705ae8f4c3d5fb65
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Tue Oct 27 15:36:38 2009 +0530

    ext4: discard preallocation during truncate

    We need to make sure when we drop and reacquire the inode's
    i_data_sem we discard the inode preallocation. Otherwise we
    could have blocks marked as free in bitmap but still belonging
    to prealloc space.

    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5c5bc5d..a1ef1c3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -209,6 +209,12 @@ static int try_to_extend_transaction(handle_t *handle, struct inode *inode)
 	up_write(&EXT4_I(inode)->i_data_sem);
 	ret = ext4_journal_restart(handle, blocks_for_truncate(inode));
 	down_write(&EXT4_I(inode)->i_data_sem);
+	/*
+	 * We have dropped i_data_sem. So somebody else could have done
+	 * block allocation. So discard the prealloc space created as a
+	 * part of block allocation
+	 */
+	ext4_discard_preallocations(inode);
 
 	return ret;
 }

Lest champagne break out too early, I have still seen corruption with this patch in place, while running my testcase (mentioned in bug #14354).

-Eric

On Tuesday 17 November 2009, Andy Lutomirski wrote:
> I think this was the journal checksumming bug, which is fixed.
>
>
> On Nov 16, 2009, at 5:37 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.31. Please verify if it still should be listed and let me
> > know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=14472
> > Subject : EXT4 corruption
> > Submitter : Shawn Starr <shawn.starr@rogers.com>
> > Date : 2009-10-13 2:07 (35 days old)
> > References : http://marc.info/?l=linux-kernel&m=125539997508256&w=4
> > Handled-By : Theodore Tso <tytso@mit.edu>
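
For readers following the patch discussion above: the inconsistency named in the commit message ("blocks marked as free in bitmap but still belonging to prealloc space") can be illustrated with a small userspace toy model. This is only a sketch under loud assumptions. The struct toy_inode, the toy_* functions, the 16-block bitmap and the fixed preallocation at blocks 8..11 are all hypothetical and are not ext4's real data structures or code paths. The lock drop is simulated by simply letting a "concurrent" preallocation happen in the middle of the truncate.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NBLOCKS 16

/* Toy inode (hypothetical): a block bitmap plus one preallocated range. */
struct toy_inode {
    bool in_use[TOY_NBLOCKS]; /* true = block marked allocated in the bitmap */
    int pa_start;             /* first preallocated block, -1 if none */
    int pa_len;               /* number of preallocated blocks */
};

void toy_init(struct toy_inode *ino)
{
    memset(ino->in_use, 0, sizeof(ino->in_use));
    ino->pa_start = -1;
    ino->pa_len = 0;
}

/* Mark [start, start+len) as in use and remember it as preallocation. */
void toy_preallocate(struct toy_inode *ino, int start, int len)
{
    assert(start >= 0 && len > 0 && start + len <= TOY_NBLOCKS);
    for (int i = start; i < start + len; i++)
        ino->in_use[i] = true;
    ino->pa_start = start;
    ino->pa_len = len;
}

/* Release the preallocated blocks in the bitmap and forget the range. */
void toy_discard_preallocations(struct toy_inode *ino)
{
    if (ino->pa_start < 0)
        return;
    for (int i = ino->pa_start; i < ino->pa_start + ino->pa_len; i++)
        ino->in_use[i] = false;
    ino->pa_start = -1;
    ino->pa_len = 0;
}

/* Truncate step: mark every block from 'from' onward as free. */
void toy_free_from(struct toy_inode *ino, int from)
{
    for (int i = from; i < TOY_NBLOCKS; i++)
        ino->in_use[i] = false;
}

/* True if every preallocated block is still marked in use in the bitmap. */
bool toy_consistent(const struct toy_inode *ino)
{
    if (ino->pa_start < 0)
        return true;
    for (int i = ino->pa_start; i < ino->pa_start + ino->pa_len; i++)
        if (!ino->in_use[i])
            return false;
    return true;
}

/*
 * Simulated truncate. While the (imaginary) lock is dropped, another
 * writer preallocates blocks 8..11. After "reacquiring" the lock the
 * truncate optionally discards preallocations before freeing blocks.
 * Returns whether bitmap and preallocation agree afterwards.
 */
bool toy_truncate(struct toy_inode *ino, int from, bool discard_after_relock)
{
    toy_preallocate(ino, 8, 4);          /* happens while lock is dropped */
    if (discard_after_relock)
        toy_discard_preallocations(ino); /* the step the patch adds */
    toy_free_from(ino, from);
    return toy_consistent(ino);
}
```

In this model, calling toy_truncate(&ino, 4, false) on a freshly initialised inode leaves blocks 8..11 free in the bitmap but still recorded as preallocated, while passing true for the last argument leaves the two views in agreement. As Eric's comment notes, the real patch did not eliminate every corruption he saw, so the sketch should be read as an illustration of one race only.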