ERROR
=====
EXT4-fs (dm-46): first meta block group too large: 1152 (group descriptor block count 1096)

Environment
===========
I have an ext4 filesystem on a machine running kernel 4.17.2-1.el7.elrepo.x86_64. The logical volume is 8.56 TiB. This filesystem has been resized many times since it was first created.

e2fsck
======
e2fsck 1.42.9 (28-Dec-2013)
/dev/vg_areca/lv_MYLVNAME: clean, 59060/574488576 files, 2294844696/2297954304 blocks

DESC
====
After the last "resize2fs", I started receiving this error via "dmesg". I was running kernel 3.10.0-514.26.2.el7.x86_64 and, after some googling, found that the current version of Linux corrected a bug that produced this error. However, after updating the kernel, I am still getting the error. My thought is that my multiple resizes have surfaced an issue, but I'm not sure how to address it.

LINKS
=====
Previous fix to the kernel for a similar error, but it appears to be symptomatic:
https://www.novell.com/support/kb/doc.php?id=7018898

Results of "debugfs" output (minus 70k entries like "Group 70127: (Blocks 2297921536-2297954303) [INODE_UNINIT, ITABLE_ZEROED]"):
https://pastebin.com/EN1xyBAM
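For reference, the "group descriptor block count 1096" in the kernel message can be reproduced from the block count in the e2fsck summary line. This is only a sketch of the arithmetic, not what the kernel literally executes. It assumes 4096-byte blocks, 32768 blocks per group (consistent with the "Group 70127" range quoted from debugfs), and 64-byte group descriptors (the 64bit feature); the descriptor size is not stated in the report and is inferred from the reported count.

```shell
#!/bin/sh
# Sketch: reproduce the "group descriptor block count" from the kernel message.
# Assumptions (not stated in the report): 4096-byte blocks, 32768 blocks per
# group, 64-byte group descriptors (64bit feature).

# ceil_div A B -> prints ceil(A / B) for positive integers
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

blocks=2297954304        # block count from the e2fsck summary line
block_size=4096          # assumption
blocks_per_group=32768   # assumption, matches the debugfs group range
desc_size=64             # assumption, inferred from the reported count

groups=$(ceil_div "$blocks" "$blocks_per_group")
desc_per_block=$(( block_size / desc_size ))
gdt_blocks=$(ceil_div "$groups" "$desc_per_block")

echo "block groups:      $groups"
echo "descriptor blocks: $gdt_blocks"

first_meta_bg=1152       # value from the kernel message
if [ "$first_meta_bg" -gt "$gdt_blocks" ]; then
    echo "first_meta_bg $first_meta_bg is larger than $gdt_blocks"
fi
```

Under these assumptions the superblock's first_meta_bg value (1152) points past the last descriptor block the filesystem can have at its current size, which is the condition the message reports.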
So I'm really interested in how the file system got to that state. If you have the history of how the file system was resized up until now, that would be really useful.

In any case, the file system really is corrupted, although the good news is that it should be a relatively simple thing to fix; you just need to upgrade to a non-prehistoric version of e2fsprogs.

It looks like you are using a RHEL 7 kernel and e2fsprogs. As such, you should really be getting support from Red Hat --- and they may very well tell you that using a file system this big isn't something Red Hat supports. Given that they are using super-ancient versions of the kernel and e2fsprogs (possibly with some bug fixes and features backported), that might be quite fair. But because they do backport code, it's really not something that upstream developers can support. This is why Red Hat customers pay the Big Bucks to Red Hat. :-)

In any case, e2fsck from e2fsprogs 1.44.2 should be able to repair it. Using it may void your Red Hat support contract, though --- in which case the right answer is to file a bug with Red Hat and ask them to fix it.

# (MKE2FS_FIRST_META_BG=1152 mke2fs -t ext4 -O meta_bg,^resize_inode -b 4k /tmp/foo.img 2297954304)
mke2fs 1.44.2 (14-May-2018)
Creating regular file /tmp/foo.img
Creating filesystem with 2297954304 4k blocks and 287244288 inodes
Filesystem UUID: 02abae05-96a0-4cbe-85fe-3c2d0c97cf4e
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
	102400000, 214990848, 512000000, 550731776, 644972544, 1934917632

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

# mount /tmp/foo.img /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop1, missing codepage or helper program, or other error.

# dmesg | tail -2
[110950.298537] EXT4-fs (loop1): mounted filesystem with ordered data mode. Opts: (null)
[111476.144952] EXT4-fs (loop1): first meta block group too large: 1152 (group descriptor block count 1096)

# e2fsck -f /tmp/foo.img
e2fsck 1.44.2 (14-May-2018)
First_meta_bg is too big. (1152, max value 1096). Clear<y>? yes
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(1097--1152)
Fix<y>? yes
Free blocks count wrong for group #0 (27482, counted=27538). Fix<y>? yes
Free blocks count wrong for group #1 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #3 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #5 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #7 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #9 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #25 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #27 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #49 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #81 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #125 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #243 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #343 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #625 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #729 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #2187 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #2401 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #3125 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #6561 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #15625 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #16807 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #19683 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong for group #59049 (31615, counted=31671). Fix<y>? yes
Free blocks count wrong (2279572611, counted=2279573899). Fix<y>? yes

/tmp/foo.img: ***** FILE SYSTEM WAS MODIFIED *****
/tmp/foo.img: 11/287244288 files (0.0% non-contiguous), 18380405/2297954304 blocks

# mount /tmp/foo.img /mnt
# df /mnt
Filesystem      1K-blocks  Used  Available Use% Mounted on
/dev/loop1     9118295620    24 8658688352   1% /mnt
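As a sanity check on the e2fsck output above, the numbers are self-consistent. Each listed group gains 56 free blocks, which is the length of the range 1097--1152, and the listed groups look like exactly the ones that carry superblock backups under the sparse_super rule (groups 0 and 1, plus powers of 3, 5 and 7). The sketch below assumes that rule and a group count of 70128 (2297954304 blocks at 32768 blocks per group); it only checks the arithmetic and does not touch any file system.

```shell
#!/bin/sh
# Sketch: cross-check the free-block deltas reported by e2fsck.
# Assumption: backup groups follow the sparse_super rule
# (groups 0 and 1, plus powers of 3, 5 and 7).

groups=70128                      # assumed: 2297954304 / 32768
per_group=$(( 1152 - 1097 + 1 ))  # length of the freed range 1097--1152

# backup_groups N -> prints the backup group numbers below N, one per line
backup_groups() {
    echo 0
    echo 1
    for base in 3 5 7; do
        g=$base
        while [ "$g" -lt "$1" ]; do
            echo "$g"
            g=$(( g * base ))
        done
    done
}

count=$(backup_groups "$groups" | wc -l | tr -d ' ')
expected=$(( count * per_group ))
reported=$(( 2279573899 - 2279572611 ))  # counted minus recorded total

echo "backup groups:  $count"
echo "expected delta: $expected blocks"
echo "reported delta: $reported blocks"
```

The per-group delta times the number of backup groups matches the change in the file-system-wide free block count, so the repair appears to release only the over-reserved descriptor area in each backup group.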
By the way, e2fsprogs 1.42.9 has a huge number of resize2fs bugs when doing off-line resizes. I don't know how many of the fixes Red Hat may or may not have backported into that ancient version of e2fsprogs, but if you insist on using Red Hat's e2fsprogs, I'd strongly recommend that you stick with on-line resizes (that is, with the file system mounted). The 3.10 kernel also has an untold number of bugs that have since been fixed upstream; again, I can't speak to how many of those fixes have been backported to Red Hat's RHEL kernel. So if you're going to use RHEL 7 software, I strongly suggest you get Red Hat support.
@theodore

RE: "So I'm really interested in how the file system got to that state. If you have the history of how the file system was resized up until now, that would be really useful."

I went through my "~/.bash_history" to pull out the commands that led to the error. The commands listed under "HISTORY" were performed throughout 2017, and the LV was grown incrementally over time with relative sizing.

HISTORY
=======
lvextend --size +200G /dev/vg_areca/lv_mylvname
resize2fs -f /dev/vg_areca/lv_mylvname
lvextend --size +100G /dev/vg_areca/lv_mylvname
resize2fs /dev/vg_areca/lv_mylvname
lvextend --size +100G /dev/vg_areca/lv_mylvname
resize2fs -f /dev/vg_areca/lv_mylvname
lvextend --size +500G /dev/vg_areca/lv_mylvname
lvextend --size +500G /dev/vg_areca/lv_mylvname
resize2fs -f /dev/vg_areca/lv_mylvname

These "HISTORY" commands are many months old. The commands that led to the error are listed below; with them, this LV was locked down to a specific size, as no other files were going to be added to it.

NOTE: the "tune2fs" call set the reserved block percentage to zero, and I ran "blockdev --getbsz" to get the block size so that I could work out the absolute number of blocks needed for the subsequent "lvreduce --size 8766G".

OP THAT LED TO ERROR
====================
e2fsck -f /dev/vg_areca/lv_mylvname
tune2fs -m 0.0 /dev/vg_areca/lv_mylvname
blockdev --getbsz /dev/vg_areca/lv_mylvname
resize2fs /dev/vg_areca/lv_mylvname 8766G
lvreduce --size 8766G /dev/vg_areca/lv_mylvname
mount /dev/vg_areca/lv_mylvname
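The size used in the shrink can be checked against the block count that e2fsck reports. The sketch below assumes 4096-byte file system blocks and that both resize2fs and lvreduce read the "G" suffix as GiB (2^30 bytes). Note that "blockdev --getbsz" prints the block size the kernel uses for the device, which is not necessarily the file system block size; "tune2fs -l" or "dumpe2fs -h" would show the latter.

```shell
#!/bin/sh
# Sketch: confirm that "resize2fs ... 8766G" and "lvreduce --size 8766G"
# describe the same size as the block count in the e2fsck summary.
# Assumptions: 4096-byte file system blocks; "G" means GiB for both tools.

size_gib=8766
fs_block_size=4096       # assumption; check with tune2fs -l on the real device
reported_blocks=2297954304   # block count from the e2fsck summary line

bytes=$(( size_gib * 1024 * 1024 * 1024 ))
fs_blocks=$(( bytes / fs_block_size ))

echo "target size:        $bytes bytes"
echo "file system blocks: $fs_blocks"

if [ "$fs_blocks" -eq "$reported_blocks" ]; then
    echo "matches the block count reported by e2fsck"
else
    echo "does NOT match the block count reported by e2fsck"
fi
```

Under these assumptions the file system and the logical volume ended up the same size, so the shrink itself did not cut the file system short.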