Bug 11525 - Unable to handle paging request at ext3_rmdir() and ext4_rmdir() on intentionally corrupted fs
Status: RESOLVED OBSOLETE
Alias: None
Product: File System
Classification: Unclassified
Component: ext3
Hardware: All Linux
Importance: P1 normal
Assignee: Duane Griffin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-09 11:27 UTC by Sami Liedes
Modified: 2013-12-19 14:48 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.27-rc5 (ext4), 2.6.27-rc3 (ext3)
Subsystem:
Regression: No
Bisected commit-id:


Attachments
validate-dir-entries.patch (1.70 KB, patch)
2008-12-04 05:50 UTC, Duane Griffin
(broken) ext3 image which caused the crash, in the state it was before mounting (60.21 KB, application/x-bzip2)
2008-12-06 14:44 UTC, Sami Liedes
Another fs image triggering a similar crash, bzip2 compressed (323.46 KB, application/x-bzip2)
2008-12-06 14:53 UTC, Sami Liedes

Description Sami Liedes 2008-09-09 11:27:52 UTC
Hardware Environment: qemu x86
Software Environment: Minimal Debian sid (unstable)
Problem Description:

[I really thought I had already reported this, but since I can't find it either via bugzilla or google, I assume I haven't.]

Hi,

Unfortunately this is one of those bugs that I can't find a way to reproduce except by randomly breaking one fs after another. This happens with ext3 and ext4, but so far I haven't seen it happen with ext2.

When doing rm -rf on an intentionally corrupted ext3/ext4 filesystem, I occasionally hit bugs like this (ext3 backtrace from -rc3, two ext4 traces from -rc5). If you want me to try to reproduce the ext3 crash on the latest -rc, just say so.

----------
*** seed 270, ext3, 2.6.27-rc3 ***
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 1479317508, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4718764, count = 1
attempt to access beyond end of device
hdb: rw=0, want=1048578, limit=20480
EXT3-fs error (device hdb): ext3_free_branches: Read failure, inode=1428, block=524288
EXT3-fs warning (device hdb): empty_dir: bad directory (dir #1360) - no `.' or `..'
EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1332: directory entry across blocks - offset=0, inode=1332, rec_len=
BUG: unable to handle kernel paging request at c7c3240c
IP: [<c02e4be6>] empty_dir+0xe1/0x305
*pde = 00007067 *pte = 07c32160
Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 1306.100454]
Pid: 24302, comm: rm Not tainted (2.6.27-rc3 #2)
EIP: 0060:[<c02e4be6>] EFLAGS: 00000246 CPU: 0
EIP is at empty_dir+0xe1/0x305
EAX: c7c3240c EBX: c3fa7cc4 ECX: 00000534 EDX: 00000534
ESI: c7c2a400 EDI: c74d4888 EBP: c1e6cef4 ESP: c1e6cec0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 24302, ti=c1e6c000 task=c5664d00 task.ti=c1e6c000)
Stack: 00000000 c1e6cee4 c7aab400 00000058 38583e14 72b9e783 00000002 c7c3240c
       c7aaa800 00000000 c7440000 c744471c fffffffb c1e6cf28 c02e7910 00000246
       c0620de0 c3c67690 c0620de0 c3c67688 c3fa7cc4 c3f6e230 c7cab9a0 00000000
Call Trace:
 [<c02e7910>] ? ext3_rmdir+0xb7/0x18f
 [<c026ba2d>] ? vfs_rmdir+0x7e/0xb3
 [<c026d2b7>] ? do_rmdir+0xb7/0xc3
 [<c026d2f4>] ? sys_unlinkat+0x31/0x36
 [<c0202f3e>] ? syscall_call+0x7/0xb
 =======================
Code: 08 5c b4 5d c0 c7 44 24 04 a4 26 55 c0 8b 45 ec 89 04 24 e8 47 45 00 00 b8 01 00 00 00 83 c4 28 5b 5e 5f 5d c3 8d 04 06 89 45 e8 <8b> 00 85 c0 74 86 8d 56 08 b8 6c cb 5f c0 e8 a8 9d 17 00 85 c0
EIP: [<c02e4be6>] empty_dir+0xe1/0x305 SS:ESP 0068:c1e6cec0
---[ end trace 3a33b21de407e362 ]---
----------
*** seed 451, ext4, 2.6.27-rc5 ***
attempt to access beyond end of device
hdb: rw=0, want=268435458, limit=20480
EXT4-fs error (device hdb): ext4_xattr_delete_inode: inode 507: block 134217728 read error
EXT4-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #653: directory entry across blocks - offset=0, inode=653, rec_len=16
BUG: unable to handle kernel paging request at c7d2540c
IP: [<c02fb496>] empty_dir+0xe1/0x305
*pde = 00007067 *pte = 07d25160
Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 2151.877484]
Pid: 20705, comm: rm Not tainted (2.6.27-rc5 #2)
EIP: 0060:[<c02fb496>] EFLAGS: 00000246 CPU: 0
EIP is at empty_dir+0xe1/0x305
EAX: c7d2540c EBX: c48440e0 ECX: 0000028d EDX: 0000028d
ESI: c7d21400 EDI: c1b99428 EBP: c1bd7ef4 ESP: c1bd7ec0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 20705, ti=c1bd7000 task=c1a38000 task.ti=c1bd7000)
Stack: 00000000 c1bd7ee4 c6169800 0000007e e18fea3c 54ed2757 00000001 c7d2540c
       c6169400 00000000 c4a35020 c4982138 fffffffb c1bd7f28 c02fe5ef 00000246
       c0620de0 c485bbe8 c0620de0 c485bbe0 c48440e0 c4a15dc8 c2b7a5c8 00000000
Call Trace:
 [<c02fe5ef>] ? ext4_rmdir+0xd5/0x1e8
 [<c026bd5d>] ? vfs_rmdir+0x7e/0xb3
 [<c026d5e7>] ? do_rmdir+0xb7/0xc3
 [<c026d624>] ? sys_unlinkat+0x31/0x36
 [<c0202f3e>] ? syscall_call+0x7/0xb
 =======================
Code: 08 54 b4 5d c0 c7 44 24 04 a4 34 55 c0 8b 45 ec 89 04 24 e8 73 4b 00 00 b8 01 00 00 00 83 c4 28 5b 5e 5f 5d c3 8d 04 06 89 45 e8 <8b> 00 8
EIP: [<c02fb496>] empty_dir+0xe1/0x305 SS:ESP 0068:c1bd7ec0
---[ end trace 79e4e3dfd3fb9e7d ]---
umount: /mnt: device is busy
----------
*** seed 10000193, ext4, 2.6.27-rc5 ***
EXT4-fs warning (device hdb): empty_dir: bad directory (dir #733) - no `.' or `..'
EXT4-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #461: directory entry across blocks - offset=0, inode=461, rec_len=82
BUG: unable to handle kernel paging request at c769940c
IP: [<c02fb496>] empty_dir+0xe1/0x305
*pde = 079e7163 *pte = 07699160
Oops: 0000 [#1] DEBUG_PAGEALLOC
[  961.774442]
Pid: 4518, comm: rm Not tainted (2.6.27-rc5 #2)
EIP: 0060:[<c02fb496>] EFLAGS: 00000246 CPU: 0
EIP is at empty_dir+0xe1/0x305
EAX: c769940c EBX: c3fc36c8 ECX: 000001cd EDX: 000001cd
ESI: c7697400 EDI: c3fc8380 EBP: c7a6cef4 ESP: c7a6cec0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 4518, ti=c7a6c000 task=c78bc360 task.ti=c7a6c000)
Stack: 00000000 c7a6cee4 c532ec00 0000007e 1da9562e eb3f2f99 00000001 c769940c
       c532e000 00000000 c3ee0020 c3eada08 fffffffb c7a6cf28 c02fe5ef 00000246
       c0620de0 c747c560 c0620de0 c747c558 c3fc36c8 c3fc8d90 c76965f0 00000000
Call Trace:
 [<c02fe5ef>] ? ext4_rmdir+0xd5/0x1e8
 [<c026bd5d>] ? vfs_rmdir+0x7e/0xb3
 [<c026d5e7>] ? do_rmdir+0xb7/0xc3
 [<c026d624>] ? sys_unlinkat+0x31/0x36
 [<c0202f3e>] ? syscall_call+0x7/0xb
 =======================
Code: 08 54 b4 5d c0 c7 44 24 04 a4 34 55 c0 8b 45 ec 89 04 24 e8 73 4b 00 00 b8 01 00 00 00 83 c4 28 5b 5e 5f 5d c3 8d 04 06 89 45 e8 <8b> 00 8
EIP: [<c02fb496>] empty_dir+0xe1/0x305 SS:ESP 0068:c7a6cec0
---[ end trace 7aaee6ca8f8adc20 ]---
----------
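The register dumps above can be used to estimate how far the faulting pointer ran past the directory block. The sketch below is a reading of the traces, not something stated in the report. It assumes, from the Code: bytes (`8d 04 06` is `lea (%esi,%eax),%eax`, followed by the faulting load `8b 00` from `(%eax)`), that ESI holds the first directory entry of the block and EAX the computed address of the second. It also assumes a 1 KiB block size, which the report does not give. All function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: block size of the 10 MiB test image; not stated in the report. */
#define ASSUMED_BLOCK_SIZE 1024u

/* Distance in bytes from the first directory entry (ESI) to the faulting
 * pointer (EAX), i.e. the rec_len the kernel appears to have stepped by. */
uint32_t implied_rec_len(uint32_t first_entry, uint32_t fault_addr)
{
	return fault_addr - first_entry;
}

/* Nonzero if stepping rec_len bytes from the start of a block leaves it. */
int leaves_block(uint32_t rec_len)
{
	return rec_len >= ASSUMED_BLOCK_SIZE;
}

/* Print the implied rec_len for the three oopses quoted in the report. */
void print_report_oopses(void)
{
	static const struct {
		const char *name;
		uint32_t esi, eax;
	} oops[] = {
		{ "seed 270 (ext3)",      0xc7c2a400u, 0xc7c3240cu },
		{ "seed 451 (ext4)",      0xc7d21400u, 0xc7d2540cu },
		{ "seed 10000193 (ext4)", 0xc7697400u, 0xc769940cu },
	};

	for (size_t i = 0; i < sizeof(oops) / sizeof(oops[0]); i++) {
		uint32_t len = implied_rec_len(oops[i].esi, oops[i].eax);

		printf("%s: rec_len=0x%x, leaves block: %s\n", oops[i].name,
		       (unsigned)len, leaves_block(len) ? "yes" : "no");
	}
}
```

Under these assumptions the three implied values are 0x800c, 0x400c and 0x200c. Each is 12, the usual rec_len of a "." entry, with one extra bit set, which would be consistent with a single flipped bit in the first entry's rec_len field.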
Comment 1 Anonymous Emailer 2008-09-09 13:47:01 UTC
Reply-To: akpm@linux-foundation.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Tue,  9 Sep 2008 11:27:52 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11525
> 
>            Summary: Unable to handle paging request at ext3_rmdir() and
>                     ext4_rmdir() on intentionally corrupted fs
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.27-rc5 (ext4), 2.6.27-rc3 (ext3)
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: ext3
>         AssignedTo: akpm@osdl.org
>         ReportedBy: sliedes@cc.hut.fi
Comment 2 Theodore Tso 2008-09-09 14:55:45 UTC
> > Unfortunately this is one of those bugs that I can't find a way to
> > reproduce except by randomly breaking one fs after another. This
> > happens with ext3 and ext4, but so far I haven't seen it happen
> > with ext2.
> > 
> >
> > *** seed 270, ext3, 2.6.27-rc3 ***
> > *** seed 451, ext4, 2.6.27-rc5 ***

Given these seed numbers, I assume this was generated using some tool
like fsfuzzer?  Would it be possible to save an image of a filesystem
that triggers the problem case *before* trying to execute the
rm -rf?

That would be the fastest way to try to track the problem down.

							- Ted
Comment 3 Sami Liedes 2008-09-09 20:27:27 UTC
On Tue, Sep 09, 2008 at 05:55:31PM -0400, Theodore Tso wrote:
> > > Unfortunately this is one of those bugs that I can't find a way to
> > > reproduce except by randomly breaking one fs after another. This
> > > happens with ext3 and ext4, but so far I haven't seen it happen
> > > with ext2.
> > > 
> > >
> > > *** seed 270, ext3, 2.6.27-rc3 ***
> > > *** seed 451, ext4, 2.6.27-rc5 ***
> 
> Given these seed numbers, I assume this was generated using some tool
> like fsfuzzer?  Would it be possible to save an image of a filesystem
> that triggers the problem case *before* trying to execute the
> rm -rf?
> 
> That would be the fastest way to try to track the problem down.

Yes, I can generate those filesystems. However, the problem seems to be
elusive in that I haven't yet been able to reproduce it twice with the
same filesystem (and even with random filesystems, it only occurs
once in a while). I'll do some more testing and try to figure out if
it can be reproduced more easily. Still, I can give you some
filesystems that crashed once, if you wish. They are typically
something like 600 KiB compressed, and I guess that could be made
smaller by zeroing all regular files in the pristine fs before doing
the fuzzing.

Here's a script I use to do the testing ($1 is the initial seed). The
filesystem is a 10 MiB pristine ext[34] image with a copy of my
workstation's /dev and a partial copy of /usr/share/doc (I tried to be
diverse in what I put there).

------------------------------------------------------------
#!/bin/bash
# Needs bash: the C-style for loop and the >& redirections are not POSIX sh.

if [ "`hostname`" != "fstest" ]; then
   echo "This is a dangerous script."
   echo "Set your hostname to \`fstest\' if you want to use it."
   exit 1
fi

umount /dev/hdb
umount /dev/hdc
/etc/init.d/sysklogd stop
/etc/init.d/klogd stop
/etc/init.d/cron stop
mount /dev/hda / -t ext3 -o remount,ro || exit 1

#ulimit -t 20

for ((s=$1; s<1000000000; s++)); do
  umount /mnt
  echo '***** zzuffing *****' seed $s
  zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
  mount /dev/hdb /mnt -t ext2 -o errors=continue || continue
  cd /mnt || continue
  timeout 30 cp -r doc doc2 >&/dev/null
  timeout 30 find -xdev >&/dev/null
  timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
  timeout 30 mkdir tmp >&/dev/null
  timeout 30 echo whoah >tmp/filu 2>/dev/null
  timeout 30 rm -rf /mnt/* >&/dev/null
  cd /
done
------------------------------------------------------------

	Sami
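The script depends on the fuzzing being repeatable: a given seed applied to the same pristine image should always yield the same corrupted image, so that a crashing seed can be regenerated later. The sketch below illustrates that idea only. It is not zzuf's actual algorithm; the xorshift32 generator and the function names are stand-ins chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* xorshift32 step; a stand-in PRNG, NOT the generator zzuf uses. */
static uint32_t next_random(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Flip each bit of buf with probability of roughly ratio (0.0 to 1.0),
 * driven only by seed, so the same seed and input always produce the same
 * corrupted output. Returns the number of bits flipped. */
size_t fuzz_buffer(unsigned char *buf, size_t len, uint32_t seed, double ratio)
{
	uint32_t state = seed ? seed : 1u; /* xorshift32 must not start at 0 */
	uint32_t threshold = (uint32_t)(ratio * 4294967295.0);
	size_t flipped = 0;

	for (size_t i = 0; i < len; i++) {
		for (int bit = 0; bit < 8; bit++) {
			if (next_random(&state) < threshold) {
				buf[i] ^= (unsigned char)(1u << bit);
				flipped++;
			}
		}
	}
	return flipped;
}
```

Calling `fuzz_buffer` twice with the same seed on identical copies of a buffer gives identical results, which is the property the seed numbers in the logs rely on.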
Comment 4 Theodore Tso 2008-09-10 05:58:16 UTC
On Wed, Sep 10, 2008 at 06:26:34AM +0300, Sami Liedes wrote:
> 
> Yes, I can generate those filesystems. However, the problem seems to be
> elusive in that I haven't yet been able to reproduce it twice with the
> same filesystem (and even with random filesystems, it only occurs
> once in a while). I'll do some more testing and try to figure out if
> it can be reproduced more easily. Still, I can give you some
> filesystems that crashed once, if you wish. They are typically
> something like 600 KiB compressed, and I guess that could be made
> smaller by zeroing all regular files in the pristine fs before doing
> the fuzzing.

One easy way of zeroing the regular files' data in the pristine
filesystem is the following:

    e2image -r /dev/hdXX /var/tmp/hdXX.e2i
    dd if=/var/tmp/hdXX.e2i of=/dev/hdXX

Another thing you can do is change your script to add the following
line before the filesystem is mounted:

     e2image -r /dev/hdXX - | bzip2 > /var/tmp/hdXX.e2i.bz2

Then if the filesystem fails (i.e., the system oopses),
/var/tmp/hdXX.e2i.bz2 will have all of the filesystem metadata
(including directories).  You can decompress it and write it back out
to a device, or do what I do when given one of these to examine:

   bunzip2 < hdXX.e2i.bz2 | make-sparse hdXX.e2i

The resulting sparse file can then be checked via e2fsck, or mounted
using a loopback mount, etc.

Even if it's not reliably reproducible, if I can get a series of
filesystems which show the problem, then using "e2fsck -nf" we can see
a pattern in how the filesystems are corrupted, and that can help
narrow down what might be going on that causes the kernel oops.

Thanks, regards,

     	  	   	    	 	    	   - Ted

/*
 * make-sparse.c --- make a sparse file from stdin
 * 
 * Copyright 2004 by Theodore Ts'o.
 *
 * %Begin-Header%
 * This file may be redistributed under the terms of the GNU Public
 * License.
 * %End-Header%
 */

#define _LARGEFILE_SOURCE
#define _LARGEFILE64_SOURCE

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

int full_read(int fd, char *buf, size_t count)
{
	int got, total = 0;
	int pass = 0;

	while (count > 0) {
		got = read(fd, buf, count);
		if (got == -1) {
			if ((errno == EINTR) || (errno == EAGAIN)) 
				continue;
			return total ? total : -1;
		}
		if (got == 0) {
			if (pass++ >= 3)
				return total;
			continue;
		}
		pass = 0;
		buf += got;
		total += got;
		count -= got;
	}
	return total;
}

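int main(int argc, char **argv)
{
	int fd, got;
	size_t i;
	char buf[1024];

	if (argc != 2) {
		fprintf(stderr, "Usage: make-sparse out-file\n");
		exit(1);
	}
	fd = open(argv[1], O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0777);
	if (fd < 0) {
		perror(argv[1]);
		exit(1);
	}
	while (1) {
		got = full_read(0, buf, sizeof(buf));
		if (got < 0) {
			perror("read");
			exit(1);
		}
		if (got == 0)
			break;
		if (got == sizeof(buf)) {
			/* Skip an all-zero block by seeking, leaving a hole. */
			for (i = 0; i < sizeof(buf); i++)
				if (buf[i])
					break;
			if (i == sizeof(buf)) {
				lseek(fd, sizeof(buf), SEEK_CUR);
				continue;
			}
		}
		if (write(fd, buf, got) != got) {
			perror("write");
			exit(1);
		}
	}
	/* Extend the file to the final offset so a trailing hole is kept. */
	if (ftruncate(fd, lseek(fd, 0, SEEK_CUR)) < 0) {
		perror("ftruncate");
		exit(1);
	}
	return 0;
}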
		
Comment 5 Duane Griffin 2008-12-04 05:47:35 UTC
It looks like another instance of our old friend, the unvalidated dir entry.

In this case empty_dir is calling ext3_next_entry (and using the results) without validating the entries with ext3_check_dir_entry first. That could cause these symptoms (or other nastiness such as an infinite loop).

Mostly untested patch coming shortly...
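To illustrate the kind of validation meant here, the following is a user-space sketch, not the attached patch and not the kernel's code. It applies checks corresponding to the error strings seen in the logs (rec_len too small, entry crossing the block) before stepping to the next entry. The byte offsets follow the ext3 on-disk directory entry layout (inode at 0, rec_len at 4, name_len at 6, name at 8); the function names and the convention of returning 0 on failure are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DIRENT_HEADER 8u /* inode (4) + rec_len (2) + name_len (1) + type (1) */

/* Read a little-endian 16-bit value byte by byte (host-endian independent). */
static unsigned rd16(const unsigned char *p)
{
	return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Smallest record able to hold name_len name bytes, rounded up to 4. */
static unsigned min_rec_len(unsigned name_len)
{
	return (name_len + DIRENT_HEADER + 3u) & ~3u;
}

/* Return the offset of the entry after the one at offset, or 0 if the entry
 * at offset is malformed and must not be followed. A valid result is never
 * 0 because every valid rec_len is at least 12. */
size_t next_entry_checked(const unsigned char *block, size_t block_size,
			  size_t offset)
{
	unsigned rec_len, name_len;

	/* The header itself must lie inside the block before it is read. */
	if (offset > block_size || block_size - offset < DIRENT_HEADER)
		return 0;
	rec_len = rd16(block + offset + 4);
	name_len = block[offset + 6];
	if (rec_len < min_rec_len(1) || rec_len % 4 != 0)
		return 0;
	if (rec_len < min_rec_len(name_len))
		return 0;
	if (rec_len > block_size - offset) /* entry would cross the block */
		return 0;
	return offset + rec_len;
}

/* Nonzero if the block starts with well-formed "." and ".." entries. */
int has_dot_entries(const unsigned char *block, size_t block_size)
{
	size_t second = next_entry_checked(block, block_size, 0);

	if (second == 0 || next_entry_checked(block, block_size, second) == 0)
		return 0;
	return block[6] == 1 && block[8] == '.' &&
	       block[second + 6] == 2 &&
	       memcmp(block + second + 8, "..", 2) == 0;
}
```

With a first entry whose rec_len is 12 the walk proceeds to offset 12; with a corrupted value such as 0x200c in a 1 KiB block, `next_entry_checked` returns 0 and the second entry is never dereferenced.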
Comment 6 Duane Griffin 2008-12-04 05:50:33 UTC
Created attachment 19146
validate-dir-entries.patch

Patch to validate the first couple of directory entries ('.' and '..') in empty_dir before using them.
Comment 7 Duane Griffin 2008-12-04 05:55:36 UTC
Let me know if that seems to fix things.
Comment 8 Sami Liedes 2008-12-06 14:40:09 UTC
On Thu, Dec 04, 2008 at 05:50:34AM -0800, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11525
> 
> ------- Comment #6 from duaneg@dghda.com  2008-12-04 05:50 -------
> Created an attachment (id=19146)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=19146&action=view)
> validate-dir-entries.patch
> 
> Patch to validate the first couple of directory entries ('.' and '..') in
> empty_dir before using them.

Tested with 2.6.27.8 + the patch + a couple of other patches to fix fs
bugs.

The patch apparently didn't fix the problem; however, the backtrace
looks a bit different now:

------------------------------------------------------------
***** zzuffing ***** seed 10000014
[  107.660000] kjournald starting.  Commit interval 5 seconds
[  107.660000] EXT3-fs warning: checktime reached, running e2fsck is recommended
[  107.660000] EXT3 FS on hdb, internal journal
[  107.660000] EXT3-fs: mounted filesystem with ordered data mode.
[  107.760000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1573: inode out of bounds - offset=24, inode=34342, rec_len=20, name_len=9
[  108.070000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1573: inode out of bounds - offset=24, inode=34342, rec_len=20, name_len=9
[  108.090000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1281: directory entry across blocks - offset=3348, inode=67436560, rec_len=28004, name_len=45
[  108.120000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #140: inode out of bounds - offset=9384, inode=4195644, rec_len=24, name_len=14
./runtest: line 31:  3524 Killed                  timeout 30 find -xdev >&/dev/null
[  138.180000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1573: inode out of bounds - offset=24, inode=34342, rec_len=20, name_len=9
[  138.190000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #1281: directory entry across blocks - offset=3348, inode=67436560, rec_len=28004, name_len=45
[  138.200000] EXT3-fs error (device hdb): htree_dirblock_to_tree: bad entry in directory #140: inode out of bounds - offset=9384, inode=4195644, rec_len=24, name_len=14
[  168.480000] attempt to access beyond end of device
[  168.480000] hdb: rw=0, want=262146, limit=20480
[  168.480000] EXT3-fs error (device hdb): ext3_free_branches: Read failure, inode=98, block=131072
[  168.600000] EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 33554432, count = 1
[  168.610000] EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 131072, count = 1
[  168.620000] EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 16384, count = 1
[  168.630000] BUG: unable to handle kernel paging request at c72d5004
[  168.630000] IP: [<c02dfc5f>] ext3_check_dir_entry+0xf/0x11a
[  168.630000] *pde = 07a14163 *pte = 072d5160 
[  168.630000] Oops: 0000 [#1] DEBUG_PAGEALLOC
[  168.630000] 
[  168.630000] Pid: 3902, comm: rm Not tainted (2.6.27.8 #1)
[  168.630000] EIP: 0060:[<c02dfc5f>] EFLAGS: 00000286 CPU: 0
[  168.630000] EIP is at ext3_check_dir_entry+0xf/0x11a
[  168.630000] EAX: c05ced45 EBX: c75b7064 ECX: c72d5000 EDX: c75b7064
[  168.630000] ESI: c05dc7e4 EDI: c746b9d8 EBP: c5eafeb4 ESP: c5eafe78
[  168.630000]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  168.630000] Process rm (pid: 3902, ti=c5eaf000 task=c78b8000 task.ti=c5eaf000)
[  168.630000] Stack: c7a77080 00000005 c71f9bf8 c5eafeb8 2cd482e8 00000000 c5eafee4 fffffffb 
[  168.630000]        c5eafeb4 c75b7064 c05ced45 c7ac1400 c75b7064 c05dc7e4 c746b9d8 c5eafef4 
[  168.630000]        c02e5a98 c746b9d8 00000400 c03167ea c7ac0c00 00000058 2cd482e8 92a68f1d 
[  168.630000] Call Trace:
[  168.630000]  [<c02e5a98>] ? empty_dir+0xa3/0x366
[  168.630000]  [<c03167ea>] ? journal_start+0xb2/0x112
[  168.630000]  [<c02e8860>] ? ext3_rmdir+0xb7/0x18f
[  168.630000]  [<c026c61d>] ? vfs_rmdir+0x7e/0xb3
[  168.630000]  [<c026de87>] ? do_rmdir+0xb7/0xc3
[  168.630000]  [<c026dec4>] ? sys_unlinkat+0x31/0x36
[  168.630000]  [<c0202f1e>] ? syscall_call+0x7/0xb
[  168.630000]  =======================
[  168.630000] Code: 8d 5d f8 89 1c 24 e8 d8 f9 ff ff 83 c4 0c 5b 5d c3 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 57 56 53 83 ec 30 89 45 ec 89 55 e8 <0f> b7 41 04 3d ff ff 00 00 74 66 89 c6 83 f8 0b 0f 8f b0 00 00 
[  168.630000] EIP: [<c02dfc5f>] ext3_check_dir_entry+0xf/0x11a SS:ESP 0068:c5eafe78
[  168.630000] ---[ end trace 7f26b2bf1a74f033 ]---
umount: /mnt: device is busy
------------------------------------------------------------

$ addr2line -e /srv/chroot/ia32_sid/usr/src/linux-2.6.27.8/vmlinux -i 0xc02dfc5f
/usr/src/linux-2.6.27.8/include/linux/ext3_fs.h:663
/usr/src/linux-2.6.27.8/fs/ext3/dir.c:70

Those are:

/usr/src/linux-2.6.27.8/include/linux/ext3_fs.h:663:
------------------------------------------------------------
    661 static inline unsigned ext3_rec_len_from_disk(__le16 dlen)
    662 {
--> 663         unsigned len = le16_to_cpu(dlen);
    664
    665         if (len == EXT3_MAX_REC_LEN)
    666                 return 1 << 16;
    667         return len;
    668 }
------------------------------------------------------------

/usr/src/linux-2.6.27.8/fs/ext3/dir.c:70
------------------------------------------------------------
     64 int ext3_check_dir_entry (const char * function, struct inode * dir,
     65                           struct ext3_dir_entry_2 * de,
     66                           struct buffer_head * bh,
     67                           unsigned long offset)
     68 {
     69         const char * error_msg = NULL;
-->  70         const int rlen = ext3_rec_len_from_disk(de->rec_len);
     71
     72         if (rlen < EXT3_DIR_REC_LEN(1))
     73                 error_msg = "rec_len is smaller than minimal";
------------------------------------------------------------

I'll also attach the image with which this happened (in the state
before mounting), even though the crash doesn't always seem
reproducible.

	Sami
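The addr2line result can be cross-checked against the register dump. In this oops the faulting instruction bytes `0f b7 41 04` decode to a 16-bit load from `0x4(%ecx)`, ECX is c72d5000 and the faulting address is c72d5004. The sketch below maps such an offset back to a field of the directory entry. The struct is a user-space mirror of the ext3 on-disk layout written for illustration, and `field_at` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the ext3 on-disk directory entry layout. */
struct dir_entry {
	uint32_t inode;     /* inode number, little-endian on disk */
	uint16_t rec_len;   /* distance to the next entry */
	uint8_t  name_len;  /* length of the name in bytes */
	uint8_t  file_type;
	char     name[255];
};

/* Name the field that fault_addr falls in, given the address of the entry.
 * Addresses are the 32-bit values printed in the oops. */
const char *field_at(uint32_t entry_addr, uint32_t fault_addr)
{
	uint32_t off = fault_addr - entry_addr; /* wraps if fault < entry */

	if (off < offsetof(struct dir_entry, rec_len))
		return "inode";
	if (off < offsetof(struct dir_entry, name_len))
		return "rec_len";
	if (off < offsetof(struct dir_entry, file_type))
		return "name_len";
	if (off < offsetof(struct dir_entry, name))
		return "file_type";
	if (off < offsetof(struct dir_entry, name) + 255u)
		return "name";
	return "outside entry";
}
```

`field_at(0xc72d5000, 0xc72d5004)` yields "rec_len", matching line 70 of dir.c in the listing. If that reading is right, the entry pointer passed to ext3_check_dir_entry already pointed at an unmapped page, so the check faulted while reading the very field it was meant to validate.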
Comment 9 Sami Liedes 2008-12-06 14:44:55 UTC
Created attachment 19188
(broken) ext3 image which caused the crash, in the state it was before mounting
Comment 10 Sami Liedes 2008-12-06 14:53:34 UTC
Created attachment 19189
Another fs image triggering a similar crash, bzip2 compressed
