Bug 25352 - resizing ext4 will corrupt filesystem
Summary: resizing ext4 will corrupt filesystem
Status: CLOSED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 high
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks: 21782
Reported: 2010-12-20 20:56 UTC by Kees Cook
Modified: 2011-03-31 19:24 UTC

See Also:
Kernel Version: 2.6.37-rc6
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
script that will demo a corrupted ext4 after resize (438 bytes, text/plain)
2010-12-20 20:57 UTC, Kees Cook
Proposed patch (1.41 KB, patch)
2010-12-21 03:33 UTC, Theodore Tso

Description Kees Cook 2010-12-20 20:56:44 UTC
Using resize2fs on an ext4 filesystem results in a corrupted filesystem. This is a regression (obviously).

I would expect "fsck" to be clean on a recently resized filesystem, but it is not:

Pass 5: Checking group summary information
Block bitmap differences:  +(2621440--2621951) +(2654210--2655360) +(2686976--2687487) +(2719744--2720255) +(2752512--2753023) +(2785280--2785791) +(2818048--2818559) +(2850816--2851327) +(2883584--2884095) +(2916352--2916863) +(2949120--2949631) +(2981888--2982399) +(3014656--3015167) +(3047424--3047935) +(3080192--3080703) +(3112960--3113471) +(3145728--3146239) +(3178496--3179007) +(3211264--3211775) +(3244032--3244543) +(3276800--3277311) +(3309568--3310079) +(3342336--3342847) +(3375104--3375615) +(3407872--3408383) +(3440640--3441151) +(3473408--3473919) +(3506176--3506687) +(3538944--3539455) +(3571712--3572223) +(3604480--3604991) +(3637248--3637759) +(3670016--3670527) +(3702784--3703295) +(3735552--3736063) +(3768320--3768831) +(3801088--3801599) +(3833856--3834367) +(3866624--3867135) +(3899392--3899903)

etc
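
To make the ranges above easier to read, here is a small Python sketch (not part of the original report) that maps each reported range to a block group, an offset within that group, and a length. It assumes 32768 blocks per group, which is the usual default for 4 KiB blocks; the report itself does not state the block size, so treat the group numbers as illustrative.

```python
import re

# Assumption: 4 KiB blocks with the default 32768 blocks per group.
BLOCKS_PER_GROUP = 32768

# First few ranges copied from the fsck output above.
fsck_line = ("+(2621440--2621951) +(2654210--2655360) "
             "+(2686976--2687487) +(2719744--2720255)")


def summarize(line, blocks_per_group=BLOCKS_PER_GROUP):
    """Return (group, offset_in_group, length) for every +(a--b) range."""
    rows = []
    for first, last in re.findall(r"\+\((\d+)--(\d+)\)", line):
        first, last = int(first), int(last)
        group, offset = divmod(first, blocks_per_group)
        rows.append((group, offset, last - first + 1))
    return rows


for group, offset, length in summarize(fsck_line):
    print(f"group {group}: {length} blocks starting at offset {offset}")
```

Under that assumption, most ranges are 512 blocks long and begin exactly at the start of a block group.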

Reproducer script attached...
Comment 1 Kees Cook 2010-12-20 20:57:53 UTC
Created attachment 41062
script that will demo a corrupted ext4 after resize

This has already been reported to Ubuntu, but was reproduced with an upstream kernel, so I've opened this report as well.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/692704
Comment 2 Theodore Tso 2010-12-21 03:33:48 UTC
Created attachment 41142
Proposed patch

Yes, this is a regression new to 2.6.37-rc1, which was introduced by
commit a31437b85: ext4: use sb_issue_zeroout in setup_new_group_blocks.

When we replaced the loop zero'ing the inode table blocks with
sb_issue_zeroout, we accidentally also removed this little tidbit:

-               ext4_set_bit(bit, bh->b_data);

... which was responsible for setting the block allocation bitmap to
reserve the block descriptor blocks and inode table blocks.  Oops...

I believe this patch should fix things.
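
As a rough userspace illustration of why the missing bit-setting matters (this is not the kernel code and not the proposed patch), the sketch below models the block bitmap of one newly added group. The layout is a simplifying assumption: two bitmap blocks followed by a 512-block inode table, with 512 chosen to match the length of most ranges in the fsck output in the description. The function names are hypothetical.

```python
# Assumptions for illustration only: 32768 blocks per group, two bitmap
# blocks at the start of the group, then a 512-block inode table.
BLOCKS_PER_GROUP = 32768
ITABLE_BLOCKS = 512


def set_bit(bitmap: bytearray, bit: int) -> None:
    """Mark one block as in use (little-endian bit order within a byte)."""
    bitmap[bit >> 3] |= 1 << (bit & 7)


def count_free(bitmap: bytearray, nbits: int) -> int:
    """Count the blocks whose bit is still clear."""
    return sum(1 for b in range(nbits) if not bitmap[b >> 3] & (1 << (b & 7)))


def setup_new_group(mark_itable: bool) -> bytearray:
    """Hypothetical model of initialising a new group's block bitmap."""
    bitmap = bytearray(BLOCKS_PER_GROUP // 8)
    set_bit(bitmap, 0)  # block bitmap block
    set_bit(bitmap, 1)  # inode bitmap block
    if mark_itable:
        # The step that went missing: reserve the inode table blocks.
        for bit in range(2, 2 + ITABLE_BLOCKS):
            set_bit(bitmap, bit)
    return bitmap


fixed = count_free(setup_new_group(True), BLOCKS_PER_GROUP)
buggy = count_free(setup_new_group(False), BLOCKS_PER_GROUP)
print(fixed, buggy, buggy - fixed)
```

In this model the bitmap without the reservation step reports 512 more free blocks than it should, so the allocator would be free to hand out blocks that actually hold the inode table.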
Comment 4 Kees Cook 2010-12-21 04:26:12 UTC
Thanks for tracking it down! After a fsck, I'm still seeing fs corruption, unfortunately:

[177266.375628] EXT4-fs error (device dm-1): htree_dirblock_to_tree:586: inode #12255304: block 88074025: comm rm: bad entry in directory: rec_len is smaller than minimal - offset=0(4096), inode=0, rec_len=0, name_len=0
[177266.375872] EXT4-fs error (device dm-1): htree_dirblock_to_tree:586: inode #12255304: block 88074026: comm rm: bad entry in directory: rec_len is smaller than minimal - offset=0(8192), inode=0, rec_len=0, name_len=0
[177266.376135] EXT4-fs error (device dm-1): empty_dir:1922: inode #12255304: block 88074025: comm rm: bad entry in directory: rec_len is smaller than minimal - offset=0(4096), inode=0, rec_len=0, name_len=0
[177266.376360] EXT4-fs error (device dm-1): empty_dir:1922: inode #12255304: block 88074026: comm rm: bad entry in directory: rec_len is smaller than minimal - offset=0(8192), inode=0, rec_len=0, name_len=0

fsck didn't notice this problem, but walking the tree seems to trigger it. I've been trying to clean it up by just removing the offending directory, but I figured I'd mention it since it seems to be a problem that fsck -f didn't see.
Comment 5 Lukas Czerner 2010-12-21 12:31:33 UTC
Oops indeed. Ted, thanks for the patch, it seems to fix the problem
completely.

-Lukas
Comment 7 Theodore Tso 2010-12-21 14:19:17 UTC
Kees, was this (comment #4) using your resize-corruption.sh script?  After applying the patch I've enclosed, I've rerun your script, and it showed no problems.  I then mounted the testfs file system, and ran ls -lR on /mnt/test, and still no problems...
Comment 8 Kees Cook 2010-12-21 18:03:21 UTC
Ted, no, sorry; I didn't mean to confuse. That is just leftover corruption from my initial fs hit. I just thought I'd report the fact that fsck didn't notice this when cleaning up from the original corruption.

I.e. here's my timeline for this corruption:

resize
get errors in dmesg
umount
fsck -f (for half a day, cleans up tons)
mount
delete all of lost+found
continue using fs
more dmesg errors
umount
fsck -f (returns without error)
mount
continue using fs
still dmesg errors
rm offending directory completely
no more errors

So, it seemed like a flaw in fsck that it didn't find the bad directory, but since it was related to the corruption introduced by this kernel bug, I thought I'd bring it up in this thread.
Comment 9 Theodore Tso 2010-12-21 19:19:46 UTC
Ah, thanks for the clarification.

Ok, I think I see what's going on.  It's a difference in how e2fsck and the kernel treat a rec_len == 0 entry for block sizes less than 64k.  It's an edge case, but it's one we should definitely fix.  Thanks for pointing it out.
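
For readers unfamiliar with the check behind the "rec_len is smaller than minimal" message in comment 4, here is a simplified Python sketch of directory-entry validation. It is an approximation written for illustration, not the kernel's or e2fsck's actual code: it assumes the on-disk entry header is a little-endian 32-bit inode, 16-bit rec_len, 8-bit name_len and 8-bit file type, and it ignores the special encoding used for 64k blocks. The function names are hypothetical.

```python
import struct


def min_rec_len(name_len: int) -> int:
    """Smallest legal rec_len: 8-byte header plus name, rounded up to 4."""
    return (8 + name_len + 3) & ~3


def check_dir_entry(block: bytes, offset: int):
    """Return an error string for a bad entry, or None if it looks sane."""
    _inode, rec_len, name_len, _ftype = struct.unpack_from("<IHBB", block, offset)
    if rec_len < min_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < min_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset + rec_len > len(block):
        return "directory entry across blocks"
    return None


# An all-zero 4 KiB directory block has inode=0, rec_len=0, name_len=0,
# like the entries in the dmesg output from comment 4.
print(check_dir_entry(bytes(4096), 0))
```

A checker that accepted rec_len == 0 for small block sizes would let such a zeroed block pass, while a stricter check would reject it every time the directory is walked.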
Comment 10 Rafael J. Wysocki 2010-12-21 22:32:29 UTC
Handled-By : Theodore Tso <tytso@mit.edu>
Comment 12 Martin Steigerwald 2010-12-30 13:47:11 UTC
I had a corrupted ext4 filesystem yesterday after I did a ThinkPad T42 BIOS update while the kernel was merely hibernated. The kernel oopsed when resuming after the BIOS update. Whether that was a consequence of the update I cannot say, but it did oops; I made a screenshot of it, some ACPI-related stuff AFAIR. Now I wonder whether the issue was caused by my wanting to save boot time and keep my uptime, or by the online resize a few days before, which I simply did not notice because I had not rebooted since then.

Can you have a short look at the following to see whether that might have been the same online resizing issue? I'd just like to know what might have caused that filesystem issue, because I doubt that my risk-based approach to the BIOS update could have caused such corruption. I will use the shutdown-and-reboot method for any subsequent BIOS updates anyway; that much I have learned.

I already recovered by rsync'ing changed files to my backup as far as possible, then recreating the filesystem from scratch with mkfs.ext4 and restoring from backup. I no longer have the old state available, as I do not have a spare 220 GB to dd the filesystem to.

Thus I would just like to know whether the following hints at this online resizing issue or not. I have full output logs available on request. This is with:

martin@shambhala:~> cat /proc/version 
Linux version 2.6.37-rc7-tp42-ata-eh-dbg-dirty (martin@shambhala) (gcc version 4.4.5 (Debian 4.4.5-8) ) #1 PREEMPT Wed Dec 22 11:41:20 CET 2010

Which is a plain 2.6.37-rc7 + a libata debug patch in order to get to the cause of bug #25392.

Here is an excerpt of the fsck.ext4 output:

sh-4.1# fsck.ext4 -y /dev/shambhala/home 
e2fsck 1.41.12 (17-May-2010)
home contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 13107224 has imagic flag set.  Clear? yes

HTREE directory inode 13107224 has a tree depth (0) which is too big
Fix? yes

Inode 13107225 is in use, but has dtime set.  Fix? yes

Inode 13107225 has imagic flag set.  Clear? yes

HTREE directory inode 13107225 has a tree depth (0) which is too big
Fix? yes

Inode 13107226 is in use, but has dtime set.  Fix? yes

Inode 13107226 has imagic flag set.  Clear? yes

HTREE directory inode 13107226 has a tree depth (0) which is too big
Fix? yes

Inode 13107227 is in use, but has dtime set.  Fix? yes

Inode 13107227 has imagic flag set.  Clear? yes

HTREE directory inode 13107227 has a tree depth (0) which is too big
Fix? yes

Inode 13107228 is in use, but has dtime set.  Fix? yes

Inode 13107228 has imagic flag set.  Clear? yes

HTREE directory inode 13107228 has a tree depth (0) which is too big
Fix? yes

Inode 13107228 should not have EOFBLOCKS_FL set (size 7163927836232864823, lblk -1)
Clear? yes

Inode 13107228, i_size is 7163927836232864823, should be 0.  Fix? yes

Inode 13107228, i_blocks is 54105622007074, should be 0.  Fix? yes

Inode 13107229 is in use, but has dtime set.  Fix? yes



It took too long for me, and I thought the result wouldn't make sense anyway, so I aborted it:

Inode 13111740, i_size is 3695299676852405555, should be 0.  Fix? yes

Inode 13111740, i_blocks is 54353923893113, should be 0.  Fix? yes

Inode 13111683^C has compression flag set on filesystem without compression support.  Clear? yes

Inode 13111683 has illegal block(s).  Clear? yes

Illegal block #0 (1377831437) in inode 13111683.  CLEARED.
Illegal block #1 (1114388856) in inode 13111683.  CLEARED.
Illegal block #2 (1432639345) in inode 13111683.  CLEARED.
Illegal block #3 (1767001172) in inode 13111683.  CLEARED.
Illegal block #4 (810839411) in inode 13111683.  CLEARED.
Illegal block #5 (1869950832) in inode 13111683.  CLEARED.
Illegal block #6 (1917276759) in inode 13111683.  CLEARED.
Illegal block #7 (2051289946) in inode 13111683.  CLEARED.
Illegal block #8 (1516727671) in inode 13111683.  CLEARED.
Illegal block #9 (959854137) in inode 13111683.  CLEARED.
Illegal block #10 (1951023915) in inode 13111683.  CLEARED.
Too many illegal blocks in inode 13111683.
Clear inode? yes

home: e2fsck canceled.

home: ***** FILE SYSTEM WAS MODIFIED *****

home: ********** WARNING: Filesystem still has errors **********

This was from within a KNOPPIX 6.4.



Later on I tried my Debian Squeeze/Sid/Experimental installation. / was unharmed, but it had also not been resized, so this might be a hint. This time fsck aborted with a memory error:

Illegal block #11 (2155151360) in inode 13203817.  CLEARED.
Illegal double indirect block (1460142208) in inode 13203817.  CLEARED.
Illegal block #11546678 (620001054) in inode 13203817.  CLEARED.
Illegal block #11546679 (3841423132) in inode 13203817.  CLEARED.
Illegal block #11546680 (2438523380) in inode 13203817.  CLEARED.
Illegal block #11546681 (3656408572) in inode 13203817.  CLEARED.
Too many illegal blocks in inode 13203817.
Clear inode? yes

Inode 13203847 has a bad extended attribute block 1048576.  Clear? yes

Inode 13203847 has illegal block(s).  Clear? yes

Illegal block #0 (184657162) in inode 13203847.  CLEARED.
Illegal block #3 (3936944128) in inode 13203847.  CLEARED.
Illegal block #5 (2836004992) in inode 13203847.  CLEARED.
Illegal block #7 (1079508992) in inode 13203847.  CLEARED.
Illegal block #9 (2903113856) in inode 13203847.  CLEARED.
Illegal block #11 (2153250816) in inode 13203847.  CLEARED.
Illegal double indirect block (2970222720) in inode 13203847.  CLEARED.
Illegal block #11546682 (1962632602) in inode 13203847.  CLEARED.
Error storing directory block information (inode=13203847, block=0, num=11018850): Memory allocation fae2fsck: aborted


After this Ext4 was not mountable at all:

shambhala:~#8> mount /home
mount: wrong fs type, bad option, bad superblock on /dev/mapper/shambhala-home,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

# 

I think it was this:

Dec 29 21:58:37 shambhala kernel: EXT4-fs (dm-1): ext4_check_descriptors: Checksum for group 0 failed (16277!=0)
Dec 29 21:58:37 shambhala kernel: EXT4-fs (dm-1): group descriptors corrupted!
Dec 29 21:59:50 shambhala kernel: EXT4-fs (dm-1): warning: mounting unchecked fs, running e2fsck is recommended
Dec 29 21:59:50 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)

But in my case it actually didn't mount it - at least I thought so, cause I saw the error on the console.

I used fsck.ext4 again and canceled it right after it repaired those checksums it complained about in dmesg:

shambhala:~#32> fsck.ext4 -y /dev/shambhala/home

Group descriptor 1 checksum is invalid.  FIXED.
Group descriptor 2 checksum is invalid.  FIXED.
Group descriptor 3 checksum is invalid.  FIXED.
Group descriptor 4 checksum is invalid.  FIXED.
Group descriptor 5 checksum is invalid.  FIXED.
Group descriptor 6 checksum is invalid.  FIXED.
Group descriptor 7 checksum is invalid.  FIXED.
Group descriptor 8 checksum is invalid.  FIXED.
Group descriptor 9 checksum is invalid.  FIXED.
Group descriptor 10 checksum is invalid.  FIXED.
Group descriptor 11 checksum is invalid.  FIXED.
Group descriptor 12 checksum is invalid.  FIXED.
Group descriptor 13 checksum is invalid.  FIXED.
Group descriptor 14 checksum is invalid.  FIXED.
Group descriptor 15 checksum is invalid.  FIXED.
Group descriptor 16 checksum is invalid.  FIXED.
Group descriptor 17 checksum is invalid.  FIXED.
Group descriptor 18 checksum is invalid.  FIXED.
Group descriptor 19 checksum is invalid.  FIXED.
Group descriptor 20 checksum is invalid.  FIXED.
Group descriptor 21 checksum is invalid.  FIXED.
Group descriptor 22 checksum is invalid.  FIXED.
Group descriptor 23 checksum is invalid.  FIXED.
Group descriptor 24 checksum is invalid.  FIXED.
Group descriptor 25 checksum is invalid.  FIXED.
Group descriptor 26 checksum is invalid.  FIXED.
Group descriptor 27 checksum is invalid.  FIXED.
Group descriptor 28 checksum is invalid.  FIXED.
Group descriptor 29 checksum is invalid.  FIXED.
Group descriptor 30 checksum is invalid.  FIXED.
Group descriptor 31 checksum is invalid.  FIXED.
Group descriptor 36 checksum is invalid.  FIXED.
Group descriptor 37 checksum is invalid.  FIXED.
Group descriptor 38 checksum is invalid.  FIXED.
Group descriptor 39 checksum is invalid.  FIXED.
Group descriptor 40 checksum is invalid.  FIXED.
Group descriptor 256 checksum is invalid.  FIXED.
Group descriptor 257 checksum is invalid.  FIXED.
Group descriptor 258 checksum is invalid.  FIXED.
Group descriptor 259 checksum is invalid.  FIXED.
Group descriptor 260 checksum is invalid.  FIXED.
Group descriptor 261 checksum is invalid.  FIXED.
Group descriptor 262 checksum is invalid.  FIXED.
Group descriptor 896 checksum is invalid.  FIXED.
Group descriptor 897 checksum is invalid.  FIXED.
Group descriptor 898 checksum is invalid.  FIXED.
Group descriptor 262 checksum is invalid.  FIXED.
Group descriptor 896 checksum is invalid.  FIXED.
Group descriptor 897 checksum is invalid.  FIXED.
Group descriptor 898 checksum is invalid.  FIXED.
Group descriptor 899 checksum is invalid.  FIXED.
home contains a file system with errors, check forced.
Root inode is not a directory.  Clear? yes
Pass 1: Checking inodes, blocks, and sizes
^Chome: e2fsck canceled.

home: ***** FILE SYSTEM WAS MODIFIED *****

home: ********** WARNING: Filesystem still has errors **********

After this I was able to mount it. Then I did the rsync thing as I wanted to save as much of my new data since the last backup as possible. Despite the long fsck.ext4 outputs I think almost all of the new files are accessible.

Hmmm, it seems I had Ext4 issues prior to that BIOS update yesterday:

martin@shambhala:~> grep -i ext4 /var/log/syslog
Dec 27 11:45:27 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160032766 blocks in bitmap, 32254 in gd
Dec 27 11:57:24 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160132766 blocks in bitmap, 32254 in gd
Dec 29 13:59:36 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160232766 blocks in bitmap, 32254 in gd
Dec 29 13:59:43 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160332766 blocks in bitmap, 32254 in gd
Dec 29 13:59:54 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160432766 blocks in bitmap, 32254 in gd
Dec 29 14:00:00 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160532766 blocks in bitmap, 32254 in gd
Dec 29 14:00:09 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160632766 blocks in bitmap, 32254 in gd
Dec 29 14:00:18 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160732766 blocks in bitmap, 32254 in gd
Dec 29 14:00:40 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160832766 blocks in bitmap, 32254 in gd
Dec 29 14:00:49 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 160932766 blocks in bitmap, 32254 in gd
Dec 29 14:00:55 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 161032766 blocks in bitmap, 32254 in gd
Dec 29 14:01:17 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 161132766 blocks in bitmap, 32254 in gd
Dec 29 14:01:26 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 161232766 blocks in bitmap, 32254 in gd
Dec 29 14:02:39 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 161332766 blocks in bitmap, 32254 in gd
Dec 29 14:05:48 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 161432766 blocks in bitmap, 32254 in gd
Dec 29 14:05:58 shambhala kernel: EXT4-fs error (device dm-1): ext4_mb_generate_buddy:726: group 161532766 blocks in bitmap, 32254 in gd
Dec 29 21:51:57 shambhala kernel: EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 21:51:57 shambhala kernel: EXT4-fs (dm-0): re-mounted. Opts: (null)
Dec 29 21:51:57 shambhala kernel: EXT4-fs (dm-0): re-mounted. Opts: (null)
Dec 29 21:51:57 shambhala kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 21:51:57 shambhala kernel: EXT4-fs (dm-1): warning: mounting fs with errors, running e2fsck is recommended
Dec 29 21:51:57 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 21:58:37 shambhala kernel: EXT4-fs (dm-1): ext4_check_descriptors: Checksum for group 0 failed (16277!=0)
Dec 29 21:58:37 shambhala kernel: EXT4-fs (dm-1): group descriptors corrupted!
Dec 29 21:59:50 shambhala kernel: EXT4-fs (dm-1): warning: mounting unchecked fs, running e2fsck is recommended
Dec 29 21:59:50 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 22:03:46 shambhala kernel: EXT4-fs (dm-2): warning: maximal mount count reached, running e2fsck is recommended
Dec 29 22:03:46 shambhala kernel: EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 22:04:50 shambhala kernel: EXT4-fs (dm-1): error count: 16
Dec 29 22:04:50 shambhala kernel: EXT4-fs (dm-1): initial error at 1293446727: ext4_mb_generate_buddy:726
Dec 29 22:04:50 shambhala kernel: EXT4-fs (dm-1): last error at 1293627958: ext4_mb_generate_buddy:726
Dec 29 22:06:20 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413041: comm rsync: bogus i_mode (0)
Dec 29 22:06:20 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413441: comm rsync: bogus i_mode (0)
Dec 29 22:06:20 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413457: comm rsync: bogus i_mode (0)
Dec 29 22:06:20 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8414273: comm rsync: bogus i_mode (0)
Dec 29 22:06:21 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415857: comm rsync: bogus i_mode (0)
Dec 29 22:06:21 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415873: comm rsync: bogus i_mode (0)
Dec 29 22:06:21 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415969: comm rsync: bogus i_mode (0)
Dec 29 22:06:21 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8416609: comm rsync: bogus i_mode (0)
Dec 29 22:06:21 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8416817: comm rsync: bogus i_mode (0)
Dec 29 22:06:21 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8417601: comm rsync: bogus i_mode (0)
Dec 29 22:06:22 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8419809: comm rsync: bogus i_mode (0)
Dec 29 22:06:22 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8420097: comm rsync: bogus i_mode (0)
Dec 29 22:06:22 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8421009: comm rsync: bogus i_mode (0)
Dec 29 22:07:14 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413041: comm rsync: bogus i_mode (0)
Dec 29 22:07:14 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413441: comm rsync: bogus i_mode (0)
Dec 29 22:07:14 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413457: comm rsync: bogus i_mode (0)
Dec 29 22:07:14 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8414273: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415857: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415873: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415969: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8416609: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8416817: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8417601: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8419809: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8420097: comm rsync: bogus i_mode (0)
Dec 29 22:07:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8421009: comm rsync: bogus i_mode (0)
Dec 29 22:07:42 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #1049089: comm rsync: bogus i_mode (0)
Dec 29 22:08:09 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8408817: comm rsync: bogus i_mode (0)
Dec 29 22:08:09 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8408881: comm rsync: bogus i_mode (0)
Dec 29 22:08:10 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8409441: comm rsync: bogus i_mode (0)
Dec 29 22:08:10 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8409505: comm rsync: bogus i_mode (0)
Dec 29 22:08:10 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8411265: comm rsync: bogus i_mode (0)
Dec 29 22:08:46 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #2633361: comm rsync: bogus i_mode (0)
Dec 29 22:09:27 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8433233: comm rsync: bogus i_mode (0)
Dec 29 22:23:12 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413041: comm rsync: bogus i_mode (0)
Dec 29 22:23:12 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413441: comm rsync: bogus i_mode (0)
Dec 29 22:23:12 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8413457: comm rsync: bogus i_mode (0)
Dec 29 22:23:13 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8414273: comm rsync: bogus i_mode (0)
Dec 29 22:23:13 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415857: comm rsync: bogus i_mode (0)
Dec 29 22:23:13 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415873: comm rsync: bogus i_mode (0)
Dec 29 22:23:13 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8415969: comm rsync: bogus i_mode (0)
Dec 29 22:23:13 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8416609: comm rsync: bogus i_mode (0)
Dec 29 22:23:13 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8416817: comm rsync: bogus i_mode (0)
Dec 29 22:23:14 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8417601: comm rsync: bogus i_mode (0)
Dec 29 22:23:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8419809: comm rsync: bogus i_mode (0)
Dec 29 22:23:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8420097: comm rsync: bogus i_mode (0)
Dec 29 22:23:15 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8421009: comm rsync: bogus i_mode (0)
Dec 29 22:29:52 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #1049089: comm rsync: bogus i_mode (0)
Dec 29 22:32:37 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8408817: comm rsync: bogus i_mode (0)
Dec 29 22:32:40 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8408881: comm rsync: bogus i_mode (0)
Dec 29 22:33:29 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8409441: comm rsync: bogus i_mode (0)
Dec 29 22:33:32 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8409505: comm rsync: bogus i_mode (0)
Dec 29 22:33:43 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8411265: comm rsync: bogus i_mode (0)
Dec 29 22:38:10 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #2633361: comm rsync: bogus i_mode (0)
Dec 29 22:41:30 shambhala kernel: EXT4-fs error (device dm-1): ext4_iget:5011: inode #8433233: comm rsync: bogus i_mode (0)
Dec 29 22:49:38 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 23:00:22 shambhala kernel: EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 23:00:22 shambhala 50mounted-tests: debug: mounted as ext4 filesystem
Dec 29 23:00:22 shambhala kernel: EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
Dec 29 23:00:22 shambhala 50mounted-tests: debug: mounted as ext4 filesystem
Dec 30 10:20:14 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 10:22:23 shambhala kernel: EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 10:22:23 shambhala kernel: EXT4-fs (dm-0): re-mounted. Opts: (null)
Dec 30 10:22:23 shambhala kernel: EXT4-fs (dm-0): re-mounted. Opts: (null)
Dec 30 10:22:23 shambhala kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 10:22:23 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 12:11:57 shambhala kernel: EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 12:11:57 shambhala kernel: EXT4-fs (dm-0): re-mounted. Opts: (null)
Dec 30 12:11:57 shambhala kernel: EXT4-fs (dm-0): re-mounted. Opts: (null)
Dec 30 12:11:57 shambhala kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 12:11:57 shambhala kernel: EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Dec 30 12:16:05 shambhala kernel: EXT4-fs (dm-2): warning: maximal mount count reached, running e2fsck is recommended
Dec 30 12:16:05 shambhala kernel: EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)

Yup, the 27th of December. That's exactly when I enlarged /home by 20 GB.
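
The "initial error at" and "last error at" values in the syslog excerpt above are Unix timestamps. As a quick cross-check of the dates, this Python sketch converts them to UTC. That the syslog itself is in CET (UTC+1) is an assumption, inferred from the "CET" in the /proc/version output.

```python
from datetime import datetime, timezone


def to_utc(ts: int) -> str:
    """Format a Unix timestamp as a UTC date and time."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


print("initial error:", to_utc(1293446727))  # 2010-12-27 10:45:27
print("last error:   ", to_utc(1293627958))  # 2010-12-29 13:05:58
```

With a one-hour offset these line up with the first and last ext4_mb_generate_buddy entries in the log (Dec 27 11:45:27 and Dec 29 14:05:58).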

I found some other issues on the way: recovery mode of my Debian installation did not work. On each keypress to enter my password it showed the password prompt again.

All in all this didn't raise my confidence in the stability of Ext4, especially as fsck.ext4 did not even complete due to a failed memory allocation. Thus it seems 2 GB of RAM and 4 GB of RAM is not enough to fsck my 220 GB /home, on some occasions at least.

I used Ext4 again nonetheless, as I do not yet want to use btrfs for /home on my main notebook.
Comment 13 Martin Steigerwald 2010-12-30 14:12:08 UTC
Hmmm, the test script produces different fsck.ext4 output. But then my Ext4 filesystem had about two days to grow the initial corruption. And the syslog shows first problems on the 27th of December while I did the BIOS update yesterday evening.
Comment 14 Florian Mickler 2011-03-30 21:59:55 UTC
Should this be reopened or a new bug filed?
Comment 15 Theodore Tso 2011-03-31 16:26:28 UTC
Martin Steigerwald's problem should be filed as a separate bug if he can reproduce it.  It would also be nice if he gave us e2fsck transcripts in *English*, and was very clear which e2fsck output came from which machine, and exactly what was done on each machine that might have been related to problems showing up.

If someone wants to translate this to English, and separate out his different complaints into separate bug reports, that would be great.  But I don't think the ext4 developers have time to try to decrypt his bug reports....

I'm going to close the original problem as fixed since the original problem was fixed.
Comment 16 Florian Mickler 2011-03-31 16:43:11 UTC
Thanks for the feedback.
Comment 17 Martin Steigerwald 2011-03-31 19:24:08 UTC
Well I pretty much think that it was the same resize bug, but then it might as well have been the BIOS thing (although I doubt it).

FWIW it was one machine, that ThinkPad T42, but different fsck.ext4 runs. When fsck.ext4 ran until the out-of-memory error, the filesystem was unmountable afterwards. When I quit it earlier, I was able to mount it.

Sorry for not using LANG=C with fsck.ext4, but at that time all I wanted was to have my data back; nothing else mattered.

Anyway, should I face an issue like that again, I will open a separate bug report and only hint that it might be the same problem as some other bug. I will also try to remember to use LANG=C.
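
For anyone who wants untranslated e2fsck transcripts for a bug report, one possible approach is to force the C locale for just that one command. The Python sketch below shows the idea; the helper names are hypothetical, and it uses -n so that e2fsck opens the filesystem read-only and answers "no" to every question. It is meant for collecting a transcript, not for repairing anything.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Copy an environment and force untranslated (C locale) messages."""
    env = dict(os.environ if base is None else base)
    env["LANG"] = "C"
    env["LC_ALL"] = "C"  # LC_ALL takes precedence over LANG and other LC_* settings
    return env


def fsck_command(device: str) -> list:
    """Build a read-only check: -f forces the check, -n answers no to all."""
    return ["e2fsck", "-f", "-n", device]


def run_fsck(device: str) -> subprocess.CompletedProcess:
    """Run the check and capture its output for attaching to a report."""
    return subprocess.run(fsck_command(device), env=c_locale_env(),
                          capture_output=True, text=True)


# Example (needs root and an unmounted filesystem, so it is not run here):
# result = run_fsck("/dev/shambhala/home")
# print(result.stdout)
```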

So leave closed - I believe I hit that resize bug and all is well again.
