Hardware Environment: qemu x86
Software Environment: Minimal Debian sid (unstable)
Problem Description:

[I really thought I had already reported this, but since I can't find it
either via bugzilla or google, I assume I haven't.]

Hi,

Unfortunately this is one of those bugs that I can't find a way to reproduce
except by randomly breaking one fs after another. This happens with ext3 and
ext4, but so far I haven't seen it happen with ext2.

On doing rm -rf on an intentionally corrupted ext3/ext4 filesystem, I
occasionally hit bugs like this (ext3 backtrace from -rc3, two ext4 traces
from -rc5). If you want me to try to reproduce the ext3 crash on latest -rc,
just mention.

----------
*** seed 270, ext3, 2.6.27-rc3 ***
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 1479317508, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4718764, count = 1
attempt to access beyond end of device
hdb: rw=0, want=1048578, limit=20480
EXT3-fs error (device hdb): ext3_free_branches: Read failure, inode=1428, block=524288
EXT3-fs warning (device hdb): empty_dir: bad directory (dir #1360) - no `.' or `..'
EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1332: directory entry across blocks - offset=0, inode=1332, rec_len=
BUG: unable to handle kernel paging request at c7c3240c
IP: [<c02e4be6>] empty_dir+0xe1/0x305
*pde = 00007067 *pte = 07c32160
Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 1306.100454]
Pid: 24302, comm: rm Not tainted (2.6.27-rc3 #2)
EIP: 0060:[<c02e4be6>] EFLAGS: 00000246 CPU: 0
EIP is at empty_dir+0xe1/0x305
EAX: c7c3240c EBX: c3fa7cc4 ECX: 00000534 EDX: 00000534
ESI: c7c2a400 EDI: c74d4888 EBP: c1e6cef4 ESP: c1e6cec0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 24302, ti=c1e6c000 task=c5664d00 task.ti=c1e6c000)
Stack: 00000000 c1e6cee4 c7aab400 00000058 38583e14 72b9e783 00000002 c7c3240c
       c7aaa800 00000000 c7440000 c744471c fffffffb c1e6cf28 c02e7910 00000246
       c0620de0 c3c67690 c0620de0 c3c67688 c3fa7cc4 c3f6e230 c7cab9a0 00000000
Call Trace:
 [<c02e7910>] ? ext3_rmdir+0xb7/0x18f
 [<c026ba2d>] ? vfs_rmdir+0x7e/0xb3
 [<c026d2b7>] ? do_rmdir+0xb7/0xc3
 [<c026d2f4>] ? sys_unlinkat+0x31/0x36
 [<c0202f3e>] ? syscall_call+0x7/0xb
 =======================
Code: 08 5c b4 5d c0 c7 44 24 04 a4 26 55 c0 8b 45 ec 89 04 24 e8 47 45 00 00 b8 01 00 00 00 83 c4 28 5b 5e 5f 5d c3 8d 04 06 89 45 e8 <8b> 00 85 c0 74 86 8d 56 08 b8 6c cb 5f c0 e8 a8 9d 17 00 85 c0
EIP: [<c02e4be6>] empty_dir+0xe1/0x305 SS:ESP 0068:c1e6cec0
---[ end trace 3a33b21de407e362 ]---
----------
*** seed 451, ext4, 2.6.27-rc5 ***
attempt to access beyond end of device
hdb: rw=0, want=268435458, limit=20480
EXT4-fs error (device hdb): ext4_xattr_delete_inode: inode 507: block 134217728 read error
EXT4-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #653: directory entry across blocks - offset=0, inode=653, rec_len=16
BUG: unable to handle kernel paging request at c7d2540c
IP: [<c02fb496>] empty_dir+0xe1/0x305
*pde = 00007067 *pte = 07d25160
Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 2151.877484]
Pid: 20705, comm: rm Not tainted (2.6.27-rc5 #2)
EIP: 0060:[<c02fb496>] EFLAGS: 00000246 CPU: 0
EIP is at empty_dir+0xe1/0x305
EAX: c7d2540c EBX: c48440e0 ECX: 0000028d EDX: 0000028d
ESI: c7d21400 EDI: c1b99428 EBP: c1bd7ef4 ESP: c1bd7ec0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 20705, ti=c1bd7000 task=c1a38000 task.ti=c1bd7000)
Stack: 00000000 c1bd7ee4 c6169800 0000007e e18fea3c 54ed2757 00000001 c7d2540c
       c6169400 00000000 c4a35020 c4982138 fffffffb c1bd7f28 c02fe5ef 00000246
       c0620de0 c485bbe8 c0620de0 c485bbe0 c48440e0 c4a15dc8 c2b7a5c8 00000000
Call Trace:
 [<c02fe5ef>] ? ext4_rmdir+0xd5/0x1e8
 [<c026bd5d>] ? vfs_rmdir+0x7e/0xb3
 [<c026d5e7>] ? do_rmdir+0xb7/0xc3
 [<c026d624>] ? sys_unlinkat+0x31/0x36
 [<c0202f3e>] ? syscall_call+0x7/0xb
 =======================
Code: 08 54 b4 5d c0 c7 44 24 04 a4 34 55 c0 8b 45 ec 89 04 24 e8 73 4b 00 00 b8 01 00 00 00 83 c4 28 5b 5e 5f 5d c3 8d 04 06 89 45 e8 <8b> 00 8
EIP: [<c02fb496>] empty_dir+0xe1/0x305 SS:ESP 0068:c1bd7ec0
---[ end trace 79e4e3dfd3fb9e7d ]---
umount: /mnt: device is busy
----------
*** seed 10000193, ext4, 2.6.27-rc5 ***
EXT4-fs warning (device hdb): empty_dir: bad directory (dir #733) - no `.' or `..'
EXT4-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #461: directory entry across blocks - offset=0, inode=461, rec_len=82
BUG: unable to handle kernel paging request at c769940c
IP: [<c02fb496>] empty_dir+0xe1/0x305
*pde = 079e7163 *pte = 07699160
Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 961.774442]
Pid: 4518, comm: rm Not tainted (2.6.27-rc5 #2)
EIP: 0060:[<c02fb496>] EFLAGS: 00000246 CPU: 0
EIP is at empty_dir+0xe1/0x305
EAX: c769940c EBX: c3fc36c8 ECX: 000001cd EDX: 000001cd
ESI: c7697400 EDI: c3fc8380 EBP: c7a6cef4 ESP: c7a6cec0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 4518, ti=c7a6c000 task=c78bc360 task.ti=c7a6c000)
Stack: 00000000 c7a6cee4 c532ec00 0000007e 1da9562e eb3f2f99 00000001 c769940c
       c532e000 00000000 c3ee0020 c3eada08 fffffffb c7a6cf28 c02fe5ef 00000246
       c0620de0 c747c560 c0620de0 c747c558 c3fc36c8 c3fc8d90 c76965f0 00000000
Call Trace:
 [<c02fe5ef>] ? ext4_rmdir+0xd5/0x1e8
 [<c026bd5d>] ? vfs_rmdir+0x7e/0xb3
 [<c026d5e7>] ? do_rmdir+0xb7/0xc3
 [<c026d624>] ? sys_unlinkat+0x31/0x36
 [<c0202f3e>] ? syscall_call+0x7/0xb
 =======================
Code: 08 54 b4 5d c0 c7 44 24 04 a4 34 55 c0 8b 45 ec 89 04 24 e8 73 4b 00 00 b8 01 00 00 00 83 c4 28 5b 5e 5f 5d c3 8d 04 06 89 45 e8 <8b> 00 8
EIP: [<c02fb496>] empty_dir+0xe1/0x305 SS:ESP 0068:c7a6cec0
---[ end trace 7aaee6ca8f8adc20 ]---
----------
Reply-To: akpm@linux-foundation.org

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 9 Sep 2008 11:27:52 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11525
>
>            Summary: Unable to handle paging request at ext3_rmdir() and
>                     ext4_rmdir() on intentionally corrupted fs
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.27-rc5 (ext4), 2.6.27-rc3 (ext3)
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: ext3
>         AssignedTo: akpm@osdl.org
>         ReportedBy: sliedes@cc.hut.fi
> > Unfortunately this is one of those bugs that I can't find a way to
> > reproduce except by randomly breaking one fs after another. This
> > happens with ext3 and ext4, but so far I haven't seen it happen
> > with ext2.
> >
> > *** seed 270, ext3, 2.6.27-rc3 ***
> > *** seed 451, ext4, 2.6.27-rc5 ***

Given these seed numbers, I assume this was generated using some tool
like fsfuzzer? Would it be possible to generate a filesystem image
*before* that triggers the problem case, before trying to execute the
rm -rf?

That would be the fastest way to try to track the problem down.

- Ted
On Tue, Sep 09, 2008 at 05:55:31PM -0400, Theodore Tso wrote:
> > > Unfortunately this is one of those bugs that I can't find a way to
> > > reproduce except by randomly breaking one fs after another. This
> > > happens with ext3 and ext4, but so far I haven't seen it happen
> > > with ext2.
> > >
> > > *** seed 270, ext3, 2.6.27-rc3 ***
> > > *** seed 451, ext4, 2.6.27-rc5 ***
>
> Given these seed numbers, I assume this was generated using some tool
> like fsfuzzer? Would it be possible to generate a filesystem image
> *before* that triggers the problem case, before trying to execute the
> rm -rf?
>
> That would be the fastest way to try to track the problem down.

Yes, I can generate those filesystems. However the problem seems to be
elusive in that I haven't yet been able to reproduce it twice with the
same filesystem (and even with random filesystems, it only occurs
once in a while). I'll do some more testing and try to figure out if
it can be reproduced more easily. Still I can give you some
filesystems that crashed once, if you wish. They are typically
something like 600 KiB compressed, and I guess that could be made less
by zeroing all regular files in the pristine fs before doing the
fuzzing.

Here's a script I use to do the testing ($1 is the initial seed). The
filesystem is a 10 MiB pristine ext[34] image with a copy of my
workstation's /dev and a partial copy of /usr/share/doc (I tried to be
diverse in what I put there).

------------------------------------------------------------
#!/bin/bash
# bash rather than plain sh: the loop below uses (( )) and >& redirections

if [ "`hostname`" != "fstest" ]; then
	echo "This is a dangerous script."
	echo "Set your hostname to \`fstest\' if you want to use it."
	exit 1
fi

umount /dev/hdb
umount /dev/hdc

/etc/init.d/sysklogd stop
/etc/init.d/klogd stop
/etc/init.d/cron stop

mount /dev/hda / -t ext3 -o remount,ro || exit 1

#ulimit -t 20

for ((s=$1; s<1000000000; s++)); do
	umount /mnt
	echo '***** zzuffing *****' seed $s
	# /dev/hdc holds the pristine image; /dev/hdb receives the fuzzed copy
	zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
	mount /dev/hdb /mnt -t ext2 -o errors=continue || continue
	cd /mnt || continue
	timeout 30 cp -r doc doc2 >&/dev/null
	timeout 30 find -xdev >&/dev/null
	timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
	timeout 30 mkdir tmp >&/dev/null
	timeout 30 echo whoah >tmp/filu 2>/dev/null
	timeout 30 rm -rf /mnt/* >&/dev/null
	cd /
done
------------------------------------------------------------

	Sami
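The loop above depends on the fuzzing step being repeatable: as far as I understand zzuf's options, -s sets the random seed and -r the bit-flip ratio (here a range from 0 to 0.03), so a given seed can regenerate the same corrupted image from the pristine one. The following is only a minimal sketch of that idea in C, not zzuf's actual algorithm; the xorshift generator, the sketch_* names, and the per-million ratio parameter are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* xorshift32: a tiny deterministic PRNG (hypothetical stand-in for
 * whatever generator a real fuzzer uses). State must be nonzero. */
uint32_t sketch_rng_next(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Flip each bit of buf with probability of roughly per_million / 1e6,
 * driven only by seed, so the same seed always produces the same
 * corruption. Returns the number of bits flipped. */
size_t sketch_fuzz(unsigned char *buf, size_t len, uint32_t seed,
		   uint32_t per_million)
{
	uint32_t state = seed ? seed : 1u;	/* xorshift needs nonzero state */
	size_t flipped = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		for (bit = 0; bit < 8; bit++) {
			if (sketch_rng_next(&state) % 1000000u < per_million) {
				buf[i] ^= (unsigned char)(1u << bit);
				flipped++;
			}
		}
	}
	return flipped;
}
```

Calling sketch_fuzz() twice with the same seed on two copies of the same buffer should leave them byte-for-byte identical, which is the property that makes "seed 270" a usable way to refer to a particular broken filesystem.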
On Wed, Sep 10, 2008 at 06:26:34AM +0300, Sami Liedes wrote:
>
> Yes, I can generate those filesystems. However the problem seems to be
> elusive in that I haven't yet been able to reproduce it twice with the
> same filesystem (and even with random filesystems, it only occurs
> once in a while). I'll do some more testing and try to figure out if
> it can be reproduced more easily. Still I can give you some
> filesystems that crashed once, if you wish. They are typically
> something like 600 KiB compressed, and I guess that could be made less
> by zeroing all regular files in the pristine fs before doing the
> fuzzing.

One easy way of doing this is the following:

	e2image -r /dev/hdXX /var/tmp/hdXX.e2i
	dd if=/var/tmp/hdXX.e2i of=/dev/hdXX

Another thing you can do is change your script to add the following
line before the filesystem is mounted:

	e2image -r /dev/hdXX - | bzip2 > /var/tmp/hdXX.e2i.bz2

and then if the filesystem fails (i.e., the system oopses),
/var/tmp/hdXX.e2i.bz2 will have all of the filesystem metadata
(including directories). You can then decompress it and write it back
out to the device, or do what I do when given one of these to examine:

	bunzip2 < hdXX.e2i.bz2 | make-sparse hdXX.e2i

Said sparse file can now be checked via e2fsck, or mounted using a
loopback mount, etc.

Even if it's not reliably reproducible, if I can get a series of
filesystems which show the problem, using "e2fsck -nf" we can see a
pattern of how the filesystems are corrupted, and that can help narrow
down what might be going on that causes the kernel oops.

Thanks, regards,

						- Ted

/*
 * make-sparse.c --- make a sparse file from stdin
 *
 * Copyright 2004 by Theodore Ts'o.
 *
 * %Begin-Header%
 * This file may be redistributed under the terms of the GNU Public
 * License.
 * %End-Header%
 */

#define _LARGEFILE_SOURCE
#define _LARGEFILE64_SOURCE

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

/* Read up to count bytes, retrying on EINTR/EAGAIN and on short reads. */
int full_read(int fd, char *buf, size_t count)
{
	int got, total = 0;
	int pass = 0;

	while (count > 0) {
		got = read(fd, buf, count);
		if (got == -1) {
			if ((errno == EINTR) || (errno == EAGAIN))
				continue;
			return total ? total : -1;
		}
		if (got == 0) {
			/* Treat a few consecutive empty reads as EOF */
			if (pass++ >= 3)
				return total;
			continue;
		}
		pass = 0;
		buf += got;
		total += got;
		count -= got;
	}
	return total;
}

int main(int argc, char **argv)
{
	int fd, got;
	size_t i;
	char buf[1024];

	if (argc != 2) {
		fprintf(stderr, "Usage: make-sparse out-file\n");
		exit(1);
	}
	fd = open(argv[1], O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0777);
	if (fd < 0) {
		perror(argv[1]);
		exit(1);
	}
	while (1) {
		got = full_read(0, buf, sizeof(buf));
		if (got <= 0)	/* EOF, or a read error with nothing read */
			break;
		if ((size_t) got == sizeof(buf)) {
			for (i = 0; i < sizeof(buf); i++)
				if (buf[i])
					break;
			if (i == sizeof(buf)) {
				/* All-zero block: seek past it to leave a hole */
				lseek(fd, sizeof(buf), SEEK_CUR);
				continue;
			}
		}
		if (write(fd, buf, got) != got) {
			perror("write");
			exit(1);
		}
	}
	/* If the input ended in zero blocks, extend the file over the hole */
	if (ftruncate(fd, lseek(fd, 0, SEEK_CUR)) < 0) {
		perror("ftruncate");
		exit(1);
	}
	return 0;
}
It looks like another instance of our old friend, the unvalidated dir entry.

In this case empty_dir is calling ext3_next_entry (and using the results)
without validating the entries with ext3_check_dir_entry first. That could
cause these symptoms (or other nastiness such as an infinite loop).

Mostly untested patch coming shortly...
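As a rough illustration of the kind of check being described (a userspace sketch, not the attached patch and not the kernel's ext3_check_dir_entry), the walk below refuses to follow a rec_len until the entry has been checked against the block bounds. The struct layout, the sketch_* names, and the host-endian handling of the fields are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified directory entry header. On disk the fields
 * are little-endian; this sketch treats them as host-endian. */
struct sketch_dirent {
	uint32_t inode;
	uint16_t rec_len;	/* distance to the next entry, in bytes */
	uint8_t  name_len;
	uint8_t  file_type;
	/* name bytes follow the header */
};

#define SKETCH_HDR_LEN 8u

/* Return 1 if the entry at offset is safe to use, 0 otherwise. */
int sketch_entry_ok(const unsigned char *block, size_t block_size,
		    size_t offset)
{
	struct sketch_dirent de;

	/* The header itself must lie inside the block before we read it */
	if (offset > block_size || block_size - offset < SKETCH_HDR_LEN)
		return 0;
	memcpy(&de, block + offset, SKETCH_HDR_LEN);

	if (de.rec_len < SKETCH_HDR_LEN)	/* 0 would loop forever */
		return 0;
	if (de.rec_len % 4 != 0)		/* entries are 4-byte aligned */
		return 0;
	if ((size_t)de.name_len + SKETCH_HDR_LEN > de.rec_len)
		return 0;
	if (de.rec_len > block_size - offset)	/* must not cross the block */
		return 0;
	return 1;
}

/* Walk all entries in a block. Returns the number of entries, or -1 as
 * soon as an entry fails validation, without following its rec_len. */
int sketch_count_entries(const unsigned char *block, size_t block_size)
{
	size_t offset = 0;
	int n = 0;

	while (offset < block_size) {
		struct sketch_dirent de;

		if (!sketch_entry_ok(block, block_size, offset))
			return -1;
		memcpy(&de, block + offset, SKETCH_HDR_LEN);
		offset += de.rec_len;
		n++;
	}
	return n;
}
```

The ordering is the point: each entry is validated before its rec_len is used to compute the next pointer, so a corrupted length can neither send the walk outside the block nor stall it at the same offset.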
Created attachment 19146: validate-dir-entries.patch

Patch to validate the first couple of directory entries ('.' and '..') in
empty_dir before using them.
Let me know if that seems to fix things.
On Thu, Dec 04, 2008 at 05:50:34AM -0800, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11525
>
> ------- Comment #6 from duaneg@dghda.com 2008-12-04 05:50 -------
> Created an attachment (id=19146)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=19146&action=view)
> validate-dir-entries.patch
>
> Patch to validate the first couple of directory entries ('.' and '..') in
> empty_dir before using them.

Tested with 2.6.27.8 + the patch + a couple of other patches to fix fs
bugs. Apparently it didn't fix the problem; however, the backtrace looks
a bit different now:

------------------------------------------------------------
***** zzuffing ***** seed 10000014
[ 107.660000] kjournald starting. Commit interval 5 seconds
[ 107.660000] EXT3-fs warning: checktime reached, running e2fsck is recommended
[ 107.660000] EXT3 FS on hdb, internal journal
[ 107.660000] EXT3-fs: mounted filesystem with ordered data mode.
[ 107.760000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1573: inode out of bounds - offset=24, inode=34342, rec_len=20, name_len=9
[ 108.070000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1573: inode out of bounds - offset=24, inode=34342, rec_len=20, name_len=9
[ 108.090000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1281: directory entry across blocks - offset=3348, inode=67436560, rec_len=28004, name_len=45
[ 108.120000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #140: inode out of bounds - offset=9384, inode=4195644, rec_len=24, name_len=14
./runtest: line 31: 3524 Killed timeout 30 find -xdev >&/dev/null
[ 138.180000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1573: inode out of bounds - offset=24, inode=34342, rec_len=20, name_len=9
[ 138.190000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1281: directory entry across blocks - offset=3348, inode=67436560, rec_len=28004, name_len=45
[ 138.200000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #140: inode out of bounds - offset=9384, inode=4195644, rec_len=24, name_len=14
[ 168.480000] attempt to access beyond end of device
[ 168.480000] hdb: rw=0, want=262146, limit=20480
[ 168.480000] EXT3-fs error (device hdb): ext3_free_branches: Read failure, inode=98, block=131072
[ 168.600000] EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 33554432, count = 1
[ 168.610000] EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 131072, count = 1
[ 168.620000] EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 16384, count = 1
[ 168.630000] BUG: unable to handle kernel paging request at c72d5004
[ 168.630000] IP: [<c02dfc5f>] ext3_check_dir_entry+0xf/0x11a
[ 168.630000] *pde = 07a14163 *pte = 072d5160
[ 168.630000] Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 168.630000]
[ 168.630000] Pid: 3902, comm: rm Not tainted (2.6.27.8 #1)
[ 168.630000] EIP: 0060:[<c02dfc5f>] EFLAGS: 00000286 CPU: 0
[ 168.630000] EIP is at ext3_check_dir_entry+0xf/0x11a
[ 168.630000] EAX: c05ced45 EBX: c75b7064 ECX: c72d5000 EDX: c75b7064
[ 168.630000] ESI: c05dc7e4 EDI: c746b9d8 EBP: c5eafeb4 ESP: c5eafe78
[ 168.630000]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 168.630000] Process rm (pid: 3902, ti=c5eaf000 task=c78b8000 task.ti=c5eaf000)
[ 168.630000] Stack: c7a77080 00000005 c71f9bf8 c5eafeb8 2cd482e8 00000000 c5eafee4 fffffffb
[ 168.630000]        c5eafeb4 c75b7064 c05ced45 c7ac1400 c75b7064 c05dc7e4 c746b9d8 c5eafef4
[ 168.630000]        c02e5a98 c746b9d8 00000400 c03167ea c7ac0c00 00000058 2cd482e8 92a68f1d
[ 168.630000] Call Trace:
[ 168.630000]  [<c02e5a98>] ? empty_dir+0xa3/0x366
[ 168.630000]  [<c03167ea>] ? journal_start+0xb2/0x112
[ 168.630000]  [<c02e8860>] ? ext3_rmdir+0xb7/0x18f
[ 168.630000]  [<c026c61d>] ? vfs_rmdir+0x7e/0xb3
[ 168.630000]  [<c026de87>] ? do_rmdir+0xb7/0xc3
[ 168.630000]  [<c026dec4>] ? sys_unlinkat+0x31/0x36
[ 168.630000]  [<c0202f1e>] ? syscall_call+0x7/0xb
[ 168.630000]  =======================
[ 168.630000] Code: 8d 5d f8 89 1c 24 e8 d8 f9 ff ff 83 c4 0c 5b 5d c3 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 57 56 53 83 ec 30 89 45 ec 89 55 e8 <0f> b7 41 04 3d ff ff 00 00 74 66 89 c6 83 f8 0b 0f 8f b0 00 00
[ 168.630000] EIP: [<c02dfc5f>] ext3_check_dir_entry+0xf/0x11a SS:ESP 0068:c5eafe78
[ 168.630000] ---[ end trace 7f26b2bf1a74f033 ]---
umount: /mnt: device is busy
------------------------------------------------------------

$ addr2line -e /srv/chroot/ia32_sid/usr/src/linux-2.6.27.8/vmlinux -i 0xc02dfc5f
/usr/src/linux-2.6.27.8/include/linux/ext3_fs.h:663
/usr/src/linux-2.6.27.8/fs/ext3/dir.c:70

Those are:

/usr/src/linux-2.6.27.8/include/linux/ext3_fs.h:663:
------------------------------------------------------------
    661 static inline unsigned ext3_rec_len_from_disk(__le16 dlen)
    662 {
--> 663         unsigned len = le16_to_cpu(dlen);
    664
    665         if (len == EXT3_MAX_REC_LEN)
    666                 return 1 << 16;
    667         return len;
    668 }
------------------------------------------------------------

/usr/src/linux-2.6.27.8/fs/ext3/dir.c:70
------------------------------------------------------------
     64 int ext3_check_dir_entry (const char * function, struct inode * dir,
     65                           struct ext3_dir_entry_2 * de,
     66                           struct buffer_head * bh,
     67                           unsigned long offset)
     68 {
     69         const char * error_msg = NULL;
-->  70         const int rlen = ext3_rec_len_from_disk(de->rec_len);
     71
     72         if (rlen < EXT3_DIR_REC_LEN(1))
     73                 error_msg = "rec_len is smaller than minimal";
------------------------------------------------------------

I'll also attach the image with which this happened (in the state
before mounting), even if it doesn't seem always reproducible.

	Sami
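One way to read the register dump above, offered as an interpretation rather than a confirmed diagnosis: the faulting instruction bytes <0f> b7 41 04 decode to a 16-bit load from ECX+4, ECX is c72d5000, and the reported fault address is c72d5004. If ECX holds the de argument, that load is the de->rec_len read at dir.c:70, and de itself already points at the start of a page that is not mapped. The sketch below only checks that arithmetic against a stand-in struct; the struct and function names are hypothetical, with the field order assumed to mirror ext3_dir_entry_2.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the on-disk entry layout (assumed field order:
 * 32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file_type). */
struct sketch_dir_entry_2 {
	uint32_t inode;
	uint16_t rec_len;
	uint8_t  name_len;
	uint8_t  file_type;
	char     name[255];
};

/* Address that a read of de->rec_len touches, given the address of de. */
uintptr_t sketch_rec_len_addr(uintptr_t de)
{
	return de + offsetof(struct sketch_dir_entry_2, rec_len);
}

/* Nonzero if addr sits exactly on a 4 KiB page boundary. */
int sketch_page_aligned(uintptr_t addr)
{
	return (addr & 0xfffu) == 0;
}
```

With de = 0xc72d5000, sketch_rec_len_addr() gives 0xc72d5004, which matches the oops. Under that reading the validation routine faulted on its very first field access, so the pointer handed to it would need a bounds check of its own before being dereferenced.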
Created attachment 19188: (broken) ext3 image which caused the crash, in the state it was in before mounting
Created attachment 19189: Another fs image triggering a similar crash, bzip2 compressed