Bug 11266 - unable to handle kernel paging request in ext2_free_blocks
Summary: unable to handle kernel paging request in ext2_free_blocks
Status: RESOLVED OBSOLETE
Alias: None
Product: File System
Classification: Unclassified
Component: ext2 (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-07 05:53 UTC by Sami Liedes
Modified: 2012-10-30 15:02 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.27.4, 2.6.27-rc2 + patch for #10976 (now in -mm)
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Test case, corrupted ext3 file system, bzip2 compressed (118.37 KB, application/x-bzip2)
2008-11-01 21:59 UTC, Sami Liedes
Details

Description Sami Liedes 2008-08-07 05:53:36 UTC
Latest working kernel version: (I think at least 2.6.25.4 works)
Earliest failing kernel version:
Distribution: Minimal Debian sid (unstable)
Hardware Environment: qemu x86
Software Environment:
Problem Description:

Mere rm -rf after mounting on an intentionally corrupted partition occasionally causes "BUG: unable to handle kernel paging request" in ext2_free_blocks. 

Unfortunately the issue seems to be timing sensitive (or something), doing it on the same filesystem only sometimes results in the crash :( But I have reproduced it something like 6 times now with brief testing.

If you wish, I can attach some filesystems with which I have been able to reproduce this at least once.

Another thing I could do is take a look at it with the new kernel debugger (which I haven't tried yet) if none of you are able to figure out this from the traces. Is there something you would specifically want me to take a look at? The local and referenced variables at ext2_free_blocks(), I guess?

I think I ran quite extensive tests on 2.6.25.4 & ext2, so I suspect (but am not sure, I've made some changes to the way I test) this bug is newer than 2.6.25.4. I could do some bisecting too, but I haven't managed to automate the thing yet.

Here's a script I run under qemu, google for zzuf (it's a fuzzer), and timeout is from the Debian package `timeout':

----------
#!/bin/sh

if [ "`hostname`" != "fstest" ]; then
   echo "This is a dangerous script."
   echo "Set your hostname to \`fstest\' if you want to use it."
   exit 1
fi

umount /dev/hdb
umount /dev/hdc
/etc/init.d/sysklogd stop
/etc/init.d/klogd stop
/etc/init.d/cron stop
mount /dev/hda / -t ext3 -o remount,ro || exit 1

ulimit -t 20

for ((s=$1; s<1000000000; s++)); do
  umount /mnt
  echo '***** zzuffing *****' seed $s
  zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
  mount /dev/hdb /mnt -o errors=continue || continue
  cd /mnt || continue
  cp -r doc doc2 >&/dev/null
  find -xdev >&/dev/null
  find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
  mkdir tmp >&/dev/null
  echo whoah >tmp/filu 2>/dev/null
  rm -rf /mnt/* >&/dev/null
  cd /
done
----------

The attached backtraces all start from the time of mounting the filesystem.
Comment 1 Sami Liedes 2008-08-07 05:54:44 UTC
Backtrace, filesystem hdb.849:

[ 3796.272178] attempt to access beyond end of device
[ 3796.272839] hdb: rw=0, want=134237336, limit=20480
[ 3796.273106] Buffer I/O error on device hdb, logical block 67118667
[ 3796.276009] attempt to access beyond end of device
[ 3796.277156] hdb: rw=0, want=134237336, limit=20480
[ 3796.278263] Buffer I/O error on device hdb, logical block 67118667
[ 3796.391961] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1281: directory entry across blocks - offset=1536, inode=50659344, rec_len=29812, name_len=121
[ 3796.395802] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[ 3796.458033] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[ 3796.596656] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[ 3796.601485] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 256
[ 3796.608617] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 67118667, count = 1
[ 3796.611922] BUG: unable to handle kernel paging request at c7e95ffc
[ 3796.613327] IP: [<c031f151>] read_block_bitmap+0xa7/0x14b
[ 3796.614573] *pde = 00007067 *pte = 07e95160 
[ 3796.615585] Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 3796.616533] 
[ 3796.616905] Pid: 5933, comm: rm Not tainted (2.6.27-rc2 #4)
[ 3796.618081] EIP: 0060:[<c031f151>] EFLAGS: 00000246 CPU: 0
[ 3796.619315] EIP is at read_block_bitmap+0xa7/0x14b
[ 3796.619417] EAX: ffffffff EBX: c7e96800 ECX: c7e96000 EDX: 00000001
[ 3796.619417] ESI: c2f40428 EDI: c7aba400 EBP: c7bd5d7c ESP: c7bd5d4c
[ 3796.619417]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 3796.619417] Process rm (pid: 5933, ti=c7bd5000 task=c58aa5e0 task.ti=c7bd5000)
[ 3796.619417] Stack: c620d024 c620d028 c620d024 c620d098 00000002 c78f8500 00000000 00000000 
[ 3796.619417]        c7e96000 c7e96400 014f1c6d 00000003 c7bd5de0 c031f2ef c061da3c c7bd5da0 
[ 3796.619417]        c024982b c10ff2d8 c10ff2c0 c7bd5dd8 c0948780 00000029 00000000 0000002a 
[ 3796.619417] Call Trace:
[ 3796.619417]  [<c031f2ef>] ? ext2_free_blocks+0xfa/0x2b7
[ 3796.619417]  [<c024982b>] ? activate_page+0x9a/0xa2
[ 3796.619417]  [<c0322c5f>] ? ext2_free_branches+0x12d/0x190
[ 3796.619417]  [<c02820ef>] ? __getblk+0x27/0x294
[ 3796.619417]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[ 3796.619417]  [<c032320c>] ? ext2_truncate+0x54a/0x5a7
[ 3796.619417]  [<c0322431>] ? ext2_update_inode+0x1a0/0x31f
[ 3796.619417]  [<c0543169>] ? _spin_unlock+0x1d/0x20
[ 3796.619417]  [<c0323ae8>] ? ext2_delete_inode+0x7e/0x86
[ 3796.619417]  [<c0323a6a>] ? ext2_delete_inode+0x0/0x86
[ 3796.619417]  [<c027558b>] ? generic_delete_inode+0x62/0xd5
[ 3796.619417]  [<c0275711>] ? generic_drop_inode+0x113/0x162
[ 3796.619417]  [<c027469c>] ? iput+0x47/0x4e
[ 3796.619417]  [<c026bb14>] ? do_unlinkat+0xc3/0x13d
[ 3796.619417]  [<c054225f>] ? mutex_unlock+0x8/0xa
[ 3796.619417]  [<c026dc08>] ? vfs_readdir+0x60/0x85
[ 3796.619417]  [<c026d94c>] ? filldir64+0x0/0xd7
[ 3796.619417]  [<c026dcc4>] ? sys_getdents64+0x97/0xa1
[ 3796.619417]  [<c026bc86>] ? sys_unlinkat+0x23/0x36
[ 3796.619417]  [<c0202dfe>] ? syscall_call+0x7/0xb
[ 3796.619417]  =======================
[ 3796.619417] Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10 8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 01 19 c0 85 c0 74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19 
[ 3796.619417] EIP: [<c031f151>] read_block_bitmap+0xa7/0x14b SS:ESP 0068:c7bd5d4c
[ 3796.619417] ---[ end trace 827753bbb10816b5 ]---
Comment 2 Sami Liedes 2008-08-07 05:56:48 UTC
Backtrace, filesystem hdb.10000057, with some of the messages removed to fit in a 64k comment:

***** zzuffing ***** seed 10000057
./runtest: line 31: 10409 Killed                  timeout 30 cp -r doc doc2 >&/dev/null
[  289.403150] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #11: inode out of bounds - offset=10240, inode=8388608, rec_len=1024, name_len=0
[  289.404505] EXT2-fs error (device hdb): ext2_readdir: bad page in #11
[  289.407140] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1281: unaligned directory entry - offset=176, inode=33751052, rec_len=25966, name_len=116
... ... ... ... ... ... ...
[  291.368239] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 6816452, count = 1
[  291.370703] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1213178368, count = 1
[  291.373268] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 7078488, count = 1
[  291.375763] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1420214750, count = 1
[  291.378346] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 6554412, count = 1
[  291.380783] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1479317508, count = 1
[  291.383262] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4718764, count = 1
[  291.385706] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 1024
[  291.387722] attempt to access beyond end of device
[  291.388782] hdb: rw=0, want=2425549960, limit=20480
[  291.390063] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=1519, block=1212774979
[  291.392175] attempt to access beyond end of device
[  291.393250] hdb: rw=0, want=2425704338, limit=20480
[  291.394536] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=1519, block=1212852168
[  291.396873] attempt to access beyond end of device
[  291.398062] hdb: rw=0, want=2425051470, limit=20480
[  291.399468] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=1519, block=1212525734
[  291.402177] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 16
[  291.404377] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 131072, count = 1
[  291.407194] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 1
[  291.409154] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 2048
[  291.411169] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 16384, count = 1
[  291.413593] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 4096
[  291.415624] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 33816576, count = 1
[  291.418238] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks in system zones - Block = 128, count = 1
[  291.420690] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 67108864, count = 1
[  291.423224] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 1024
[  291.425259] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 8208
[  291.427317] attempt to access beyond end of device
[  291.428366] hdb: rw=0, want=264146, limit=20480
[  291.429548] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=1519, block=132072
[  291.431780] BUG: unable to handle kernel paging request at c11e3ffc
[  291.433237] IP: [<c031f151>] read_block_bitmap+0xa7/0x14b
[  291.434486] *pde = 078b8163 *pte = 011e3160 
[  291.435512] Oops: 0000 [#1] DEBUG_PAGEALLOC
[  291.435763] 
[  291.435763] Pid: 10422, comm: rm Not tainted (2.6.27-rc2 #4)
[  291.435763] EIP: 0060:[<c031f151>] EFLAGS: 00000246 CPU: 0
[  291.435763] EIP is at read_block_bitmap+0xa7/0x14b
[  291.435763] EAX: ffffffff EBX: c11e4800 ECX: c11e4000 EDX: 00000001
[  291.435763] ESI: c74e35e8 EDI: c7ac8400 EBP: c7b38cfc ESP: c7b38ccc
[  291.435763]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  291.435763] Process rm (pid: 10422, ti=c7b38000 task=c78acbc0 task.ti=c7b38000)
[  291.435763] Stack: c740aa64 c740aa68 c740aa64 c740aad8 00000002 c7a05500 00000000 00000000 
[  291.435763]        c11e4000 c11e4400 01b01e00 00000003 c7b38d60 c031f2ef c7b38d74 00000296 
[  291.435763]        c7b38d90 c0878546 c7b38d84 c740a954 c1025c80 00000029 00000000 0000002a 
[  291.435763] Call Trace:
[  291.435763]  [<c031f2ef>] ? ext2_free_blocks+0xfa/0x2b7
[  291.435763]  [<c0322c5f>] ? ext2_free_branches+0x12d/0x190
[  291.435763]  [<c02820ef>] ? __getblk+0x27/0x294
[  291.435763]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  291.435763]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  291.435763]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  291.435763]  [<c03231b4>] ? ext2_truncate+0x4f2/0x5a7
[  291.435763]  [<c02441f2>] ? mempool_free+0x6b/0x73
[  291.435763]  [<c0281ad0>] ? submit_bh+0xd2/0xf1
[  291.435763]  [<c0282f82>] ? sync_dirty_buffer+0x60/0xd5
[  291.435763]  [<c0322431>] ? ext2_update_inode+0x1a0/0x31f
[  291.435763]  [<c0543169>] ? _spin_unlock+0x1d/0x20
[  291.435763]  [<c0323ae8>] ? ext2_delete_inode+0x7e/0x86
[  291.435763]  [<c0323a6a>] ? ext2_delete_inode+0x0/0x86
[  291.435763]  [<c027558b>] ? generic_delete_inode+0x62/0xd5
[  291.435763]  [<c0275711>] ? generic_drop_inode+0x113/0x162
[  291.435763]  [<c027469c>] ? iput+0x47/0x4e
[  291.435763]  [<c026bb14>] ? do_unlinkat+0xc3/0x13d
[  291.435763]  [<c054225f>] ? mutex_unlock+0x8/0xa
[  291.435763]  [<c026dc08>] ? vfs_readdir+0x60/0x85
[  291.435763]  [<c026d94c>] ? filldir64+0x0/0xd7
[  291.435763]  [<c026dcc4>] ? sys_getdents64+0x97/0xa1
[  291.435763]  [<c026bc86>] ? sys_unlinkat+0x23/0x36
[  291.435763]  [<c0202dfe>] ? syscall_call+0x7/0xb
[  291.435763]  =======================
[  291.435763] Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10 8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 01 19 c0 85 c0 74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19 
[  291.435763] EIP: [<c031f151>] read_block_bitmap+0xa7/0x14b SS:ESP 0068:c7b38ccc
[  291.435763] ---[ end trace 23466f100f17dfb7 ]---
Comment 3 Sami Liedes 2008-08-07 05:57:51 UTC
Backtrace for filesystem hdb.20000113.txt:

***** zzuffing ***** seed 20000113
[  481.954567] attempt to access beyond end of device
[  481.955149] hdb: rw=0, want=4213772, limit=20480
[  481.955453] attempt to access beyond end of device
[  481.955686] hdb: rw=0, want=4213772, limit=20480
[  482.008464] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1573: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
[  482.009420] EXT2-fs error (device hdb): ext2_readdir: bad page in #1573
[  482.016734] attempt to access beyond end of device
[  482.016734] hdb: rw=0, want=8408544, limit=20480
[  482.017081] Buffer I/O error on device hdb, logical block 4204271
[  482.018544] attempt to access beyond end of device
[  482.018808] hdb: rw=0, want=8408544, limit=20480
[  482.019047] Buffer I/O error on device hdb, logical block 4204271
[  482.035546] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1601: directory entry across blocks - offset=96, inode=1605, rec_len=32792, name_len=16
[  482.036456] EXT2-fs error (device hdb): ext2_readdir: bad page in #1601
[  482.085122] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1281: rec_len is smaller than minimal - offset=1116, inode=1658, rec_len=4, name_len=3
[  482.086043] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[  482.091082] EXT2-fs error (device hdb): ext2_readdir: bad page in #1573
[  482.093049] EXT2-fs error (device hdb): ext2_readdir: bad page in #1601
[  482.128294] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[  482.132962] EXT2-fs error (device hdb): ext2_readdir: bad page in #1573
[  482.134867] EXT2-fs error (device hdb): ext2_readdir: bad page in #1601
[  482.263682] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[  482.268684] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 2106885, count = 1
[  482.271479] EXT2-fs error (device hdb): ext2_xattr_delete_inode: inode 1527: bad block -2147483648
[  482.274609] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 4096
[  482.275275] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 524288, count = 1
[  482.276551] attempt to access beyond end of device
[  482.276994] hdb: rw=0, want=131074, limit=20480
[  482.277508] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=1540, block=65536
[  482.278579] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 131072, count = 1
[  482.279277] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1073741824, count = 1
[  482.279998] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.280691] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.281382] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.282068] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.282751] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.283433] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.284101] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 134217727, count = 1
[  482.284769] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294959103, count = 1
[  482.285465] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.286147] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.286977] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.287682] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.288343] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
[  482.289033] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 2449473535, count = 1
[  482.289731] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 3565830418, count = 1
[  482.290419] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4294965384, count = 1
[  482.291290] BUG: unable to handle kernel paging request at c4b97ffc
[  482.292742] IP: [<c031f151>] read_block_bitmap+0xa7/0x14b
[  482.294030] *pde = 06d6a163 *pte = 04b97160 
[  482.295075] Oops: 0000 [#1] DEBUG_PAGEALLOC
[  482.296058] 
[  482.296458] Pid: 21153, comm: rm Not tainted (2.6.27-rc2 #4)
[  482.296734] EIP: 0060:[<c031f151>] EFLAGS: 00000246 CPU: 0
[  482.296734] EIP is at read_block_bitmap+0xa7/0x14b
[  482.296734] EAX: ffffffff EBX: c4b98800 ECX: c4b98000 EDX: 00000001
[  482.296734] ESI: c2c24700 EDI: c7ac7800 EBP: c1ac0d3c ESP: c1ac0d0c
[  482.296734]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  482.296734] Process rm (pid: 21153, ti=c1ac0000 task=c7890000 task.ti=c1ac0000)
[  482.296734] Stack: c1ac0d24 c0543211 c2c23dc8 c1ac0d24 c0543169 c79e7500 00000000 00000000 
[  482.296734]        c4b98000 c4b98400 ffffffff 00000001 c1ac0da0 c031f2ef c7ac7800 c05cc70f 
[  482.296734]        c05d8d94 fffff888 00000001 c740aad8 00000002 000003fe 00000000 000003ff 
[  482.296734] Call Trace:
[  482.296734]  [<c0543211>] ? _spin_lock+0x32/0x38
[  482.296734]  [<c0543169>] ? _spin_unlock+0x1d/0x20
[  482.296734]  [<c031f2ef>] ? ext2_free_blocks+0xfa/0x2b7
[  482.296734]  [<c0543122>] ? _spin_unlock_irq+0x1d/0x21
[  482.296734]  [<c024982b>] ? activate_page+0x9a/0xa2
[  482.296734]  [<c0322c5f>] ? ext2_free_branches+0x12d/0x190
[  482.296734]  [<c02820ef>] ? __getblk+0x27/0x294
[  482.296734]  [<c0281ffa>] ? __find_get_block+0x6d/0x13b
[  482.296734]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  482.296734]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  482.296734]  [<c03231e0>] ? ext2_truncate+0x51e/0x5a7
[  482.296734]  [<c0322431>] ? ext2_update_inode+0x1a0/0x31f
[  482.296734]  [<c0543169>] ? _spin_unlock+0x1d/0x20
[  482.296734]  [<c0323ae8>] ? ext2_delete_inode+0x7e/0x86
[  482.296734]  [<c0323a6a>] ? ext2_delete_inode+0x0/0x86
[  482.296734]  [<c027558b>] ? generic_delete_inode+0x62/0xd5
[  482.296734]  [<c0275711>] ? generic_drop_inode+0x113/0x162
[  482.296734]  [<c027469c>] ? iput+0x47/0x4e
[  482.296734]  [<c026bb14>] ? do_unlinkat+0xc3/0x13d
[  482.296734]  [<c054225f>] ? mutex_unlock+0x8/0xa
[  482.296734]  [<c026dc08>] ? vfs_readdir+0x60/0x85
[  482.296734]  [<c026d94c>] ? filldir64+0x0/0xd7
[  482.296734]  [<c026dcc4>] ? sys_getdents64+0x97/0xa1
[  482.296734]  [<c026bc86>] ? sys_unlinkat+0x23/0x36
[  482.296734]  [<c0202dfe>] ? syscall_call+0x7/0xb
[  482.296734]  =======================
[  482.296734] Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10 8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 01 19 c0 85 c0 74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19 
[  482.296734] EIP: [<c031f151>] read_block_bitmap+0xa7/0x14b SS:ESP 0068:c1ac0d0c
[  482.296734] ---[ end trace b455507981ff20f4 ]---
Comment 4 Sami Liedes 2008-08-07 05:59:09 UTC
Backtrace for filesystem hdb.30000019, with some error messages removed:

***** zzuffing ***** seed 30000019
[  107.103882] attempt to access beyond end of device
[  107.105397] hdb: rw=0, want=536890588, limit=20480
[  107.106539] Buffer I/O error on device hdb, logical block 268445293
[  107.109260] attempt to access beyond end of device
[  107.110498] hdb: rw=0, want=536890588, limit=20480
[  107.111538] Buffer I/O error on device hdb, logical block 268445293
[  107.185925] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1609: directory entry across blocks - offset=44, inode=1611, rec_len=1044, name_len=9
[  107.189303] EXT2-fs error (device hdb): ext2_readdir: bad page in #1609
[  107.505070] EXT2-fs error (device hdb): ext2_readdir: bad page in #1609
[  107.699813] EXT2-fs error (device hdb): ext2_readdir: bad page in #1609
[  108.273863] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 4096
[  108.293188] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 33619980, count = 1
[  108.296045] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks in system zones - Block = 46, count = 1
... ... ... ... ... ...
[  109.129738] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 485613776, count = 1
[  109.130288] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 7602660, count = 1
[  109.130834] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 930139778, count = 1
[  109.131388] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 8126708, count = 1
[  109.131936] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1042418564, count = 1
[  109.132494] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 6816452, count = 1
[  109.133039] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1213178368, count = 1
[  109.133593] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 7078488, count = 1
[  109.134137] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1420214750, count = 1
[  109.134693] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 6554412, count = 1
[  109.135237] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 1479317508, count = 1
[  109.135791] EXT2-fs error (device hdb): ext2_free_blocks: Freeing blocks not in datazone - block = 4718764, count = 1
[  109.136341] EXT2-fs error (device hdb): ext2_free_blocks: bit already cleared for block 1024
[  109.136743] attempt to access beyond end of device
[  109.136935] hdb: rw=0, want=2425549960, limit=20480
[  109.137288] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=126, block=1212774979
[  109.137744] attempt to access beyond end of device
[  109.137935] hdb: rw=0, want=2425704338, limit=20480
[  109.138282] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=126, block=1212852168
[  109.138691] attempt to access beyond end of device
[  109.138884] hdb: rw=0, want=2425051470, limit=20480
[  109.139235] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=126, block=1212525734
[  109.139712] attempt to access beyond end of device
[  109.139901] hdb: rw=0, want=264146, limit=20480
[  109.140235] EXT2-fs error (device hdb): ext2_free_branches: Read failure, inode=126, block=132072
[  109.140845] BUG: unable to handle kernel paging request at c71a3ffc
[  109.141143] IP: [<c031f151>] read_block_bitmap+0xa7/0x14b
[  109.141433] *pde = 078f3163 *pte = 071a3160 
[  109.141685] Oops: 0000 [#1] DEBUG_PAGEALLOC
[  109.141899] 
[  109.142009] Pid: 3972, comm: rm Not tainted (2.6.27-rc2 #4)
[  109.142217] EIP: 0060:[<c031f151>] EFLAGS: 00000246 CPU: 0
[  109.142458] EIP is at read_block_bitmap+0xa7/0x14b
[  109.142643] EAX: ffffffff EBX: c71a4800 ECX: c71a4000 EDX: 00000001
[  109.142875] ESI: c7595a10 EDI: c7ac3400 EBP: c6bd9cc8 ESP: c6bd9c98
[  109.143105]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  109.143312] Process rm (pid: 3972, ti=c6bd9000 task=c78f9940 task.ti=c6bd9000)
[  109.143568] Stack: c740aa64 c740aa68 c740aa64 c740aad8 00000002 c79e1500 00000000 00000000 
[  109.143995]        c71a4000 c71a4400 01281ca7 00000003 c6bd9d2c c031f2ef c061da3c c6bd9cec 
[  109.144407]        c024982b c10e5498 c10e5480 c6bd9d24 c0948780 00000029 00000000 0000002a 
[  109.144822] Call Trace:
[  109.144966]  [<c031f2ef>] ? ext2_free_blocks+0xfa/0x2b7
[  109.148071]  [<c024982b>] ? activate_page+0x9a/0xa2
[  109.149303]  [<c0322c5f>] ? ext2_free_branches+0x12d/0x190
[  109.149303]  [<c02820ef>] ? __getblk+0x27/0x294
[  109.149303]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  109.149303]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  109.149303]  [<c0322c20>] ? ext2_free_branches+0xee/0x190
[  109.149303]  [<c03231b4>] ? ext2_truncate+0x4f2/0x5a7
[  109.149303]  [<c0322431>] ? ext2_update_inode+0x1a0/0x31f
[  109.149303]  [<c0323ae8>] ? ext2_delete_inode+0x7e/0x86
[  109.149303]  [<c0323a6a>] ? ext2_delete_inode+0x0/0x86
[  109.149303]  [<c027558b>] ? generic_delete_inode+0x62/0xd5
[  109.149303]  [<c0275711>] ? generic_drop_inode+0x113/0x162
[  109.149303]  [<c027469c>] ? iput+0x47/0x4e
[  109.149303]  [<c0272117>] ? dentry_iput+0x6b/0xb1
[  109.149303]  [<c02721c9>] ? d_kill+0x1d/0x37
[  109.149303]  [<c0273afb>] ? dput+0x58/0x10a
[  109.149303]  [<c026bc44>] ? do_rmdir+0xa4/0xc3
[  109.149303]  [<c026bc94>] ? sys_unlinkat+0x31/0x36
[  109.149303]  [<c0202dfe>] ? syscall_call+0x7/0xb
[  109.149303]  =======================
[  109.149303] Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10 8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 01 19 c0 85 c0 74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19 
[  109.149303] EIP: [<c031f151>] read_block_bitmap+0xa7/0x14b SS:ESP 0068:c6bd9c98
[  109.149303] ---[ end trace a0739d57ca47c4e4 ]---
Comment 5 Anonymous Emailer 2008-08-07 10:54:01 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu,  7 Aug 2008 05:53:37 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11266
> 
>            Summary: unable to handle kernel paging request in
>                     ext2_free_blocks
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.27-rc2 + patch for #10976 (now in -mm)
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: ext2
>         AssignedTo: akpm@osdl.org
>         ReportedBy: sliedes@cc.hut.fi
> 
> 
> Latest working kernel version: (I think at least 2.6.25.4 works)
> Earliest failing kernel version:
> Distribution: Minimal Debian sid (unstable)
> Hardware Environment: qemu x86
> Software Environment:
> Problem Description:
> 
> Mere rm -rf after mounting on an intentionally corrupted partition
> occasionally
> causes "BUG: unable to handle kernel paging request" in ext2_free_blocks. 
> 
> Unfortunately the issue seems to be timing sensitive (or something), doing it
> on the same filesystem only sometimes results in the crash :( But I have
> reproduced it something like 6 times now with brief testing.
> 
> If you wish, I can attach some filesystems with which I have been able to
> reproduce this at least once.
> 
> Another thing I could do is take a look at it with the new kernel debugger
> (which I haven't tried yet) if none of you are able to figure out this from
> the
> traces. Is there something you would specifically want me to take a look at?
> The local and referenced variables at ext2_free_blocks(), I guess?
> 
> I think I ran quite extensive tests on 2.6.25.4 & ext2, so I suspect (but am
> not sure, I've made some changes to the way I test) this bug is newer than
> 2.6.25.4. I could do some bisecting too, but I haven't managed to automate
> the
> thing yet.
> 
> Here's a script I run under qemu, google for zzuf (it's a fuzzer), and
> timeout
> is from the Debian package `timeout':
> 
> ----------
> #!/bin/sh
> 
> if [ "`hostname`" != "fstest" ]; then
>    echo "This is a dangerous script."
>    echo "Set your hostname to \`fstest\' if you want to use it."
>    exit 1
> fi
> 
> umount /dev/hdb
> umount /dev/hdc
> /etc/init.d/sysklogd stop
> /etc/init.d/klogd stop
> /etc/init.d/cron stop
> mount /dev/hda / -t ext3 -o remount,ro || exit 1
> 
> ulimit -t 20
> 
> for ((s=$1; s<1000000000; s++)); do
>   umount /mnt
>   echo '***** zzuffing *****' seed $s
>   zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
>   mount /dev/hdb /mnt -o errors=continue || continue
>   cd /mnt || continue
>   cp -r doc doc2 >&/dev/null
>   find -xdev >&/dev/null
>   find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
>   mkdir tmp >&/dev/null
>   echo whoah >tmp/filu 2>/dev/null
>   rm -rf /mnt/* >&/dev/null
>   cd /
> done
> ----------
> 
> The attached backtraces all start from the time of mounting the filesystem.
> 

Yes, please do test 2.6.26.

Aneesh, your recent changes to the ext2 block allocator would have to
be prime suspects here.
Comment 6 Sami Liedes 2008-08-07 13:07:45 UTC
On Thu, Aug 07, 2008 at 10:52:51AM -0700, Andrew Morton wrote:
> Yes, please do test 2.6.26.

Did that. I can reproduce the same crash on 2.6.26 and 2.6.26.2.

	Sami
Comment 7 Sami Liedes 2008-08-07 13:28:58 UTC
On Thu, Aug 07, 2008 at 11:07:17PM +0300, Sami Liedes wrote:
> On Thu, Aug 07, 2008 at 10:52:51AM -0700, Andrew Morton wrote:
> > Yes, please do test 2.6.26.
> 
> Did that. I can reproduce the same crash on 2.6.26 and 2.6.26.2.

2.6.25.15 crashes too, so I might have been wrong about 2.6.25.4
working (unless something changed between those two versions).

	Sami
Comment 8 Jan Kara 2008-08-18 07:58:44 UTC
> On Thu, Aug 07, 2008 at 11:07:17PM +0300, Sami Liedes wrote:
> > On Thu, Aug 07, 2008 at 10:52:51AM -0700, Andrew Morton wrote:
> > > Yes, please do test 2.6.26.
> > 
> > Did that. I can reproduce the same crash on 2.6.26 and 2.6.26.2.
> 
> 2.6.25.15 crashes too, so I might have been wrong about 2.6.25.4
> working (unless something changed between those two versions).
  I think this is the same problem Vegard reported in
http://marc.info/?l=linux-ext4&m=121637999611618&w=2.
  The problem seems to be in ext2_valid_block_bitmap() which does

  bitmap_blk = le32_to_cpu(desc->bg_block_bitmap);
  offset = bitmap_blk - group_first_block;
  if (!ext2_test_bit(offset, bh->b_data))

  (and similarly for inode bitmap). Now when the group descriptor is
corrupted, this simply accesses beyond the bh->b_data...
  The patch below should hopefully fix the issue. Can you test it
please?

								Honza
-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs
---

From 06953717138efe3ad535e78343beb7204ac0d274 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 18 Aug 2008 16:45:11 +0200
Subject: [PATCH] ext2: Check for corrupted group descriptor before using data in it

We have to check whether a group descriptor isn't corrupted in
read_block_bitmap(). Otherwise ext2_valid_block_bitmap() will try
to access bits outside of bitmap and Oops happens.

CC: Vegard Nossum <vegard.nossum@gmail.com>
CC: Sami Liedes <sliedes@cc.hut.fi>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext2/balloc.c |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
index 10bb02c..9104712 100644
--- a/fs/ext2/balloc.c
+++ b/fs/ext2/balloc.c
@@ -113,6 +113,17 @@ err_out:
 	return 0;
 }
 
+static int ext2_block_in_group(struct super_block *sb,
+			unsigned int block_group, ext2_fsblk_t block)
+{
+	if (block < ext2_group_first_block_no(sb, block_group))
+		return 0;
+	if (block >= ext2_group_first_block_no(sb, block_group) +
+	    EXT2_BLOCKS_PER_GROUP(sb))
+		return 0;
+	return 1;
+}
+
 /*
  * Read the bitmap for a given block_group,and validate the
  * bits for block/inode/inode tables are set in the bitmaps
@@ -129,6 +140,24 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group)
 	desc = ext2_get_group_desc(sb, block_group, NULL);
 	if (!desc)
 		return NULL;
+	if (!ext2_block_in_group(sb, block_group,
+				le32_to_cpu(desc->bg_block_bitmap)) ||
+	    !ext2_block_in_group(sb, block_group,
+				le32_to_cpu(desc->bg_inode_bitmap)) ||
+	    !ext2_block_in_group(sb, block_group,
+				le32_to_cpu(desc->bg_inode_table)) ||
+	    !ext2_block_in_group(sb, block_group,
+				le32_to_cpu(desc->bg_inode_table) +
+				EXT2_SB(sb)->s_itb_per_group - 1)) {
+		ext2_error(sb, __func__, "Corrupted group descriptor - "
+				"block_group = %u, block_bitmap = %u, "
+				"inode_bitmap = %u, inode_table = %u",
+				block_group,
+				le32_to_cpu(desc->bg_block_bitmap),
+				le32_to_cpu(desc->bg_inode_bitmap),
+				le32_to_cpu(desc->bg_inode_table));
+		return NULL;
+	}
 	bitmap_blk = le32_to_cpu(desc->bg_block_bitmap);
 	bh = sb_getblk(sb, bitmap_blk);
 	if (unlikely(!bh)) {
Comment 9 Aneesh Kumar K.V 2008-08-18 09:51:55 UTC
On Mon, Aug 18, 2008 at 04:58:41PM +0200, Jan Kara wrote:
> 
> From 06953717138efe3ad535e78343beb7204ac0d274 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Mon, 18 Aug 2008 16:45:11 +0200
> Subject: [PATCH] ext2: Check for corrupted group descriptor before using data
> in it
> 
> We have to check whether a group descriptor isn't corrupted in
> read_block_bitmap(). Otherwise ext2_valid_block_bitmap() will try
> to access bits outside of bitmap and Oops happens.
> 
> CC: Vegard Nossum <vegard.nossum@gmail.com>
> CC: Sami Liedes <sliedes@cc.hut.fi>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext2/balloc.c |   29 +++++++++++++++++++++++++++++
>  1 files changed, 29 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
> index 10bb02c..9104712 100644
> --- a/fs/ext2/balloc.c
> +++ b/fs/ext2/balloc.c
> @@ -113,6 +113,17 @@ err_out:
>       return 0;
>  }
> 
> +static int ext2_block_in_group(struct super_block *sb,
> +                     unsigned int block_group, ext2_fsblk_t block)
> +{
> +     if (block < ext2_group_first_block_no(sb, block_group))
> +             return 0;
> +     if (block >= ext2_group_first_block_no(sb, block_group) +
> +         EXT2_BLOCKS_PER_GROUP(sb))
> +             return 0;
> +     return 1;
> +}
> +
>  /*
>   * Read the bitmap for a given block_group,and validate the
>   * bits for block/inode/inode tables are set in the bitmaps
> @@ -129,6 +140,24 @@ read_block_bitmap(struct super_block *sb, unsigned int
> block_group)
>       desc = ext2_get_group_desc(sb, block_group, NULL);
>       if (!desc)
>               return NULL;
> +     if (!ext2_block_in_group(sb, block_group,
> +                             le32_to_cpu(desc->bg_block_bitmap)) ||
> +         !ext2_block_in_group(sb, block_group,
> +                             le32_to_cpu(desc->bg_inode_bitmap)) ||
> +         !ext2_block_in_group(sb, block_group,
> +                             le32_to_cpu(desc->bg_inode_table)) ||
> +         !ext2_block_in_group(sb, block_group,
> +                             le32_to_cpu(desc->bg_inode_table) +
> +                             EXT2_SB(sb)->s_itb_per_group - 1)) {
> +             ext2_error(sb, __func__, "Corrupted group descriptor - "
> +                             "block_group = %u, block_bitmap = %u, "
> +                             "inode_bitmap = %u, inode_table = %u",
> +                             block_group,
> +                             le32_to_cpu(desc->bg_block_bitmap),
> +                             le32_to_cpu(desc->bg_inode_bitmap),
> +                             le32_to_cpu(desc->bg_inode_table));
> +             return NULL;
> +     }
>       bitmap_blk = le32_to_cpu(desc->bg_block_bitmap);
>       bh = sb_getblk(sb, bitmap_blk);
>       if (unlikely(!bh)) {

Do we need to do this validation every time we do a read_block_bitmap ?
I guess we need to move the validation where we read the desc blocks
from the disk.

-aneesh
Comment 10 Sami Liedes 2008-08-18 16:30:32 UTC
On Mon, Aug 18, 2008 at 07:58:46AM -0700, bugme-daemon@bugzilla.kernel.org wrote:
>   I think this is the same problem Vegard reported in
> http://marc.info/?l=linux-ext4&m=121637999611618&w=2.
>   The problem seems to be in ext2_valid_block_bitmap() which does
> 
>   bitmap_blk = le32_to_cpu(desc->bg_block_bitmap);
>   offset = bitmap_blk - group_first_block;
>   if (!ext2_test_bit(offset, bh->b_data))
> 
>   (and similarly for inode bitmap). Now when the group descriptor is
> corrupted, this simply accesses beyond the bh->b_data...
>   The patch below should hopefully fix the issue. Can you test it
> please?

Yes, I've ran it for a few hours * 4 qemus now and have not seen the
bug again, so it's probably fixed by this patch.

	Sami
Comment 11 Anonymous Emailer 2008-08-18 20:24:28 UTC
Reply-To: adilger@sun.com

On Aug 18, 2008  22:21 +0530, Aneesh Kumar wrote:
> > +static int ext2_block_in_group(struct super_block *sb,
> > +                   unsigned int block_group, ext2_fsblk_t block)
> > +{
> > +   if (block < ext2_group_first_block_no(sb, block_group))
> > +           return 0;
> > +   if (block >= ext2_group_first_block_no(sb, block_group) +
> > +       EXT2_BLOCKS_PER_GROUP(sb))
> > +           return 0;
> > +   return 1;
> > +}
> > +
> >  /*
> >   * Read the bitmap for a given block_group,and validate the
> >   * bits for block/inode/inode tables are set in the bitmaps
> > @@ -129,6 +140,24 @@ read_block_bitmap(struct super_block *sb, unsigned int
> block_group)
> >     desc = ext2_get_group_desc(sb, block_group, NULL);
> >     if (!desc)
> >             return NULL;
> > +   if (!ext2_block_in_group(sb, block_group,
> > +                           le32_to_cpu(desc->bg_block_bitmap)) ||
> > +       !ext2_block_in_group(sb, block_group,
> > +                           le32_to_cpu(desc->bg_inode_bitmap)) ||
> > +       !ext2_block_in_group(sb, block_group,
> > +                           le32_to_cpu(desc->bg_inode_table)) ||
> > +       !ext2_block_in_group(sb, block_group,
> > +                           le32_to_cpu(desc->bg_inode_table) +
> > +                           EXT2_SB(sb)->s_itb_per_group - 1)) {

Isn't equivalent checking done in ext2_check_descriptors()?  It would make
sense to abstract out the "check one group and return error" code and use
it in both places.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Comment 12 Jan Kara 2008-08-19 02:13:46 UTC
On Mon 18-08-08 21:24:10, Andreas Dilger wrote:
> On Aug 18, 2008  22:21 +0530, Aneesh Kumar wrote:
> > > +static int ext2_block_in_group(struct super_block *sb,
> > > +                 unsigned int block_group, ext2_fsblk_t block)
> > > +{
> > > + if (block < ext2_group_first_block_no(sb, block_group))
> > > +         return 0;
> > > + if (block >= ext2_group_first_block_no(sb, block_group) +
> > > +     EXT2_BLOCKS_PER_GROUP(sb))
> > > +         return 0;
> > > + return 1;
> > > +}
> > > +
> > >  /*
> > >   * Read the bitmap for a given block_group,and validate the
> > >   * bits for block/inode/inode tables are set in the bitmaps
> > > @@ -129,6 +140,24 @@ read_block_bitmap(struct super_block *sb, unsigned
> int block_group)
> > >   desc = ext2_get_group_desc(sb, block_group, NULL);
> > >   if (!desc)
> > >           return NULL;
> > > + if (!ext2_block_in_group(sb, block_group,
> > > +                         le32_to_cpu(desc->bg_block_bitmap)) ||
> > > +     !ext2_block_in_group(sb, block_group,
> > > +                         le32_to_cpu(desc->bg_inode_bitmap)) ||
> > > +     !ext2_block_in_group(sb, block_group,
> > > +                         le32_to_cpu(desc->bg_inode_table)) ||
> > > +     !ext2_block_in_group(sb, block_group,
> > > +                         le32_to_cpu(desc->bg_inode_table) +
> > > +                         EXT2_SB(sb)->s_itb_per_group - 1)) {
> 
> Isn't equivalent checking done in ext2_check_descriptors()?  It would make
> sense to abstract out the "check one group and return error" code and use
> it in both places.
  Actually yes, it is. Good point. Sami, is it the case that you have
mounted the filesystem, then intentionally corrupted it and after that
the kernel oopsed (as opposed to first corrupting the filesystem image and
mounting it after that)? That would explain how corrupted values could get
to read_block_bitmap() even though ext2_check_descriptors() checked them.

								Honza
Comment 13 Sami Liedes 2008-08-19 03:56:51 UTC
On Tue, Aug 19, 2008 at 11:13:39AM +0200, Jan Kara wrote:
> > Isn't equivalent checking done in ext2_check_descriptors()?  It would make
> > sense to abstract out the "check one group and return error" code and use
> > it in both places.
>   Actually yes, it is. Good point. Sami, is it the case that you have
> mounted the filesystem, then intentionally corrupted it and after that
> the kernel oopsed (as opposed to first corrupting the filesystem image and
> mounting it after that)? That would explain how corrupted values could get
> to read_block_bitmap() even though ext2_check_descriptors() checked them.

No, that's not what I do. I corrupt the fs before mounting it, then
mount it, perform normal filesystem operations on it and unmount it.

Here's the most current script I use (zzuf is the fuzzer):

------------------------------------------------------------
#!/bin/sh

if [ "`hostname`" != "fstest" ]; then
   echo "This is a dangerous script."
   echo "Set your hostname to \`fstest\' if you want to use it."
   exit 1
fi

umount /dev/hdb
umount /dev/hdc
/etc/init.d/sysklogd stop
/etc/init.d/klogd stop
/etc/init.d/cron stop
mount /dev/hda / -t ext3 -o remount,ro || exit 1

#ulimit -t 20

for ((s=$1; s<1000000000; s++)); do
  umount /mnt
  echo '***** zzuffing *****' seed $s
  zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
  mount /dev/hdb /mnt -t ext2 -o errors=continue || continue
  cd /mnt || continue
  timeout 30 cp -r doc doc2 >&/dev/null
  timeout 30 find -xdev >&/dev/null
  timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
  timeout 30 mkdir tmp >&/dev/null
  timeout 30 echo whoah >tmp/filu 2>/dev/null
  timeout 30 rm -rf /mnt/* >&/dev/null
  cd /
done
------------------------------------------------------------

	Sami
Comment 14 Sami Liedes 2008-08-19 14:47:23 UTC
On Mon, Aug 18, 2008 at 04:58:41PM +0200, Jan Kara wrote:
> From 06953717138efe3ad535e78343beb7204ac0d274 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Mon, 18 Aug 2008 16:45:11 +0200
> Subject: [PATCH] ext2: Check for corrupted group descriptor before using data
> in it
> 
> We have to check whether a group descriptor isn't corrupted in
> read_block_bitmap(). Otherwise ext2_valid_block_bitmap() will try
> to access bits outside of bitmap and Oops happens.

I think something similar is needed for ext3, or at least the
backtrace looks similar to me (tell me if you want me to file a
separate bug for it):

------------------------------------------------------------
[ 1303.485714] EXT3-fs unexpected failure: !jh->b_committed_data;
[ 1303.485714] inconsistent data on disk
[ 1303.485714] BUG: unable to handle kernel paging request at c7edfffc
[ 1303.485714] IP: [<c02ddca9>] read_block_bitmap+0xa3/0x147
[ 1303.485714] *pde = 00007067 *pte = 07edf160
[ 1303.485714] Oops: 0000 [#1] DEBUG_PAGEALLOC
[ 1303.485714]
[ 1303.485714] Pid: 17001, comm: rm Not tainted (2.6.27-rc3 #2)
[ 1303.485714] EIP: 0060:[<c02ddca9>] EFLAGS: 00000246 CPU: 0
[ 1303.485714] EIP is at read_block_bitmap+0xa3/0x147
[ 1303.485714] EAX: ffffffff EBX: c7ee0800 ECX: c7ee0000 EDX: 00000001
[ 1303.485714] ESI: c3c40690 EDI: c7abd000 EBP: c79c4c9c ESP: c79c4c6c
[ 1303.485714]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 1303.485714] Process rm (pid: 17001, ti=c79c4000 task=c79189a0 task.ti=c79c4000)
[ 1303.485714] Stack: 00000246 00000001 00000246 c7abda3c c7413aa0 c5d7f800 00000000 00000000
[ 1303.485714]        c7ee0000 00000000 00000000 c3c25064 c79c4cf4 c02dde1f c3c405b0 c79c4ccc
[ 1303.485714]        c0317987 00000001 c0314a9b 00000029 0000002a c7abd000 c7440000 c5d7f8ac
[ 1303.485714] Call Trace:
[ 1303.485714]  [<c02dde1f>] ? ext3_free_blocks_sb+0x93/0x3d6
[ 1303.485714]  [<c0317987>] ? journal_revoke+0x81/0xe3
[ 1303.485714]  [<c0314a9b>] ? do_get_write_access+0x381/0x49c
[ 1303.485714]  [<c02ed428>] ? __ext3_journal_revoke+0x1e/0x44
[ 1303.485714]  [<c02de18d>] ? ext3_free_blocks+0x2b/0x7f
[ 1303.485714]  [<c02e3694>] ? ext3_clear_blocks+0x11f/0x141
[ 1303.485714]  [<c02e377a>] ? ext3_free_data+0xc4/0x133
[ 1303.485714]  [<c02e3a0e>] ? ext3_free_branches+0x225/0x22d
[ 1303.485714]  [<c02e3891>] ? ext3_free_branches+0xa8/0x22d
[ 1303.485714]  [<c02e3891>] ? ext3_free_branches+0xa8/0x22d
[ 1303.485714]  [<c02e407d>] ? ext3_truncate+0x667/0x8af
[ 1303.485714]  [<c03153e2>] ? journal_start+0xb2/0x112
[ 1303.485714]  [<c031540d>] ? journal_start+0xdd/0x112
[ 1303.485714]  [<c03153e2>] ? journal_start+0xb2/0x112
[ 1303.485714]  [<c02eb243>] ? ext3_journal_start_sb+0x29/0x4a
[ 1303.485714]  [<c02e4389>] ? ext3_delete_inode+0xc4/0xdb
[ 1303.485714]  [<c02e42c5>] ? ext3_delete_inode+0x0/0xdb
[ 1303.485714]  [<c0276c2b>] ? generic_delete_inode+0x62/0xd5
[ 1303.485714]  [<c0276db1>] ? generic_drop_inode+0x113/0x162
[ 1303.485714]  [<c0275d3c>] ? iput+0x47/0x4e
[ 1303.485714]  [<c02737a7>] ? dentry_iput+0x6b/0xb1
[ 1303.485714]  [<c0273859>] ? d_kill+0x1d/0x37
[ 1303.485714]  [<c027519b>] ? dput+0x58/0x10a
[ 1303.485714]  [<c026d2a4>] ? do_rmdir+0xa4/0xc3
[ 1303.485714]  [<c026d2f4>] ? sys_unlinkat+0x31/0x36
[ 1303.485714]  [<c0202f3e>] ? syscall_call+0x7/0xb
[ 1303.485714]  =======================
[ 1303.485714] Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10 8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 0
1 19 c0 85 c0 74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19
[ 1303.485714] EIP: [<c02ddca9>] read_block_bitmap+0xa3/0x147 SS:ESP 0068:c79c4c6c
[ 1303.485714] ---[ end trace ba199677255b7e73 ]---
------------------------------------------------------------
$ addr2line -e vmlinux -i 0xc02ddca9
include/asm/bitops.h:305
fs/ext3/balloc.c:98
fs/ext3/balloc.c:167

     98         if (!ext3_test_bit(offset, bh->b_data))
     99                 /* bad block bitmap */
    100                 goto err_out;
------------------------------------------------------------

	Sami
Comment 15 Jan Kara 2008-08-20 03:25:39 UTC
> On Tue, Aug 19, 2008 at 11:13:39AM +0200, Jan Kara wrote:
> > > Isn't equivalent checking done in ext2_check_descriptors()?  It would
> make
> > > sense to abstract out the "check one group and return error" code and use
> > > it in both places.
> >   Actually yes, it is. Good point. Sami, is it the case that you have
> > mounted the filesystem, then intentionally corrupted it and after that
> > the kernel oopsed (as opposed to first corrupting the filesystem image and
> > mounting it after that)? That would explain how corrupted values could get
> > to read_block_bitmap() even though ext2_check_descriptors() checked them.
> 
> No, that's not what I do. I corrupt the fs before mounting it, then
> mount it, perform normal filesystem operations on it and unmount it.
  OK, thanks. Then we must somehow corrupt group descriptor block during
the operation. Because I'm pretty sure it *is* corrupted - the oops
is: unable to handle kernel paging request at c7e95ffc. If we look into
registers, we see ECX has c7e96000 (which is probably bh->b_data). In
the second oops it's exactly the same - ECX has c11e4000, the oops is at
address c11e3ffc. So in both cases it is ECX-4. So somehow we managed to
pass negative offset into ext2_test_bit(). But as Andreas pointed out,
when we load descriptors into memory, we check that both bitmaps and
inode table is in ext2_check_descriptors()... The other possibility
would be that we managed to corrupts s_first_data_block in the
superblock. Anyway, both possibilities don't look very likely. I'll try
to reproduce the problem and maybe get more insight... How large is your
filesystem BTW?

> Here's the most current script I use (zzuf is the fuzzer):
> 
> ------------------------------------------------------------
> #!/bin/sh
> 
> if [ "`hostname`" != "fstest" ]; then
>    echo "This is a dangerous script."
>    echo "Set your hostname to \`fstest\' if you want to use it."
>    exit 1
> fi
> 
> umount /dev/hdb
> umount /dev/hdc
> /etc/init.d/sysklogd stop
> /etc/init.d/klogd stop
> /etc/init.d/cron stop
> mount /dev/hda / -t ext3 -o remount,ro || exit 1
> 
> #ulimit -t 20
> 
> for ((s=$1; s<1000000000; s++)); do
>   umount /mnt
>   echo '***** zzuffing *****' seed $s
>   zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
>   mount /dev/hdb /mnt -t ext2 -o errors=continue || continue
>   cd /mnt || continue
>   timeout 30 cp -r doc doc2 >&/dev/null
>   timeout 30 find -xdev >&/dev/null
>   timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
>   timeout 30 mkdir tmp >&/dev/null
>   timeout 30 echo whoah >tmp/filu 2>/dev/null
>   timeout 30 rm -rf /mnt/* >&/dev/null
>   cd /
> done
> ------------------------------------------------------------

								Honza
Comment 16 Sami Liedes 2008-08-20 06:36:59 UTC
On Wed, Aug 20, 2008 at 12:25:33PM +0200, Jan Kara wrote:
>   OK, thanks. Then we must somehow corrupt group descriptor block during
> the operation. Because I'm pretty sure it *is* corrupted - the oops
> is: unable to handle kernel paging request at c7e95ffc. If we look into
> registers, we see ECX has c7e96000 (which is probably bh->b_data). In
> the second oops it's exactly the same - ECX has c11e4000, the oops is at
> address c11e3ffc. So in both cases it is ECX-4. So somehow we managed to
> pass negative offset into ext2_test_bit(). But as Andreas pointed out,
> when we load descriptors into memory, we check that both bitmaps and
> inode table is in ext2_check_descriptors()... The other possibility
> would be that we managed to corrupts s_first_data_block in the
> superblock. Anyway, both possibilities don't look very likely. I'll try
> to reproduce the problem and maybe get more insight... How large is your
> filesystem BTW?

My FS is 10 MiB and tries to be diverse in its contents. It has a copy
of my /dev and a small partial copy of /usr/share/doc.

I put the pristine (non-corrupted) filesystem at

   http://www.hut.fi/~sliedes/fsdebug-hdc-ext2.bz2

(520k compressed).

I've been thinking I should write a script to prepare the root
filesystem for the tests, but haven't got that far yet. Basically
(unless I forget some step) I use debootstrap to bootstrap a minimal
Debian system, create some needed devices in it (hd[abc], ttyS0 at
least), set the hostname to fstest, configure getty to listen to
ttyS0, copy the script to /root/runtest (the script's first parameter
is the seed) and install some Debian packages (zzuf and timeout at
least).

Then I make four copies of the images and run four qemus in parallel
since I have four cpus, modifying the first parameter (initial seed)
of the runtest script, e.g. 0, 10M, 20M, 30M.

I guess the approach might be useful for those who write the code too
(or people closer to them than me), since I've already found a fair
number of bugs with it in a fairly short period of time (#10871,
#10882, #10976, #11250, #11253, #11266 for ext[23] bugs, also one ext4
bug I hit when an ext3 fs was detected as ext4; search bugzilla for my
email to see the rest of the bugs).

The current root filesystem is 144M compressed (yeah, there's a lot of
stuff irrelevant to the tests there), I could upload it somewhere if
that helps. After that running the tests is a matter of running
something like

   qemu -kernel bzImage -append 'root=/dev/hda console=ttyS0,115200n8' \
       -hda hda -hdb hdb -hdc hdc -nographic -serial pty

, attaching a screen session to the allocated pty, logging in as root
and running ./runtest $seed.

Also the tests are not as comprehensive as I'd like. As an example,
some years ago I stress tested reiser4 (it was already "ready") with
pretty mundane operations (without corrupting the fs) and it worked,
but I've got it to break badly at three separate times in separate
ways just by normally using Debian's aptitude - the breakage was in
flock(), and the current tests don't test flock()). Other things to
test would be at least hard links and fifos...

The level of automation isn't quite what I'd like either, optimally
there would just be a single script that takes the kernel image,
filesystem type and number of parallel instances as arguments and runs
the tests.

	Sami
Comment 17 Anonymous Emailer 2008-08-20 12:08:09 UTC
Reply-To: adilger@sun.com

On Aug 20, 2008  12:25 +0200, Jan Kara wrote:
> > On Tue, Aug 19, 2008 at 11:13:39AM +0200, Jan Kara wrote:
> > > > Isn't equivalent checking done in ext2_check_descriptors()?  It would
> make
> > > > sense to abstract out the "check one group and return error" code and
> use
> > > > it in both places.
> > >   Actually yes, it is. Good point. Sami, is it the case that you have
> > > mounted the filesystem, then intentionally corrupted it and after that
> > > the kernel oopsed (as opposed to first corrupting the filesystem image
> and
> > > mounting it after that)? That would explain how corrupted values could
> get
> > > to read_block_bitmap() even though ext2_check_descriptors() checked them.
> > 
> > No, that's not what I do. I corrupt the fs before mounting it, then
> > mount it, perform normal filesystem operations on it and unmount it.

>   OK, thanks. Then we must somehow corrupt group descriptor block during
> the operation.

Oh, interesting...  The data in the journal is probably corrupt, but all
of the superblock/gdt sanity checks are done BEFORE the journal is replayed.

It would seem that the ext*_fill_super() code should do the sanity checks,
and then recheck the superblock and group descriptors after the journal
is replayed.  The superblock checking code can be moved out of
ext*_fill_super() into a helper function like ext*_check_super()) and then
calling ext*_check_super() and ext*_check_descriptors() again after journal
replay.

Having journal checksums enabled (ext4) would also detect this problem
before the journal replay corrupts the filesystem metadata.

It doesn't look possible that we can do journal recovery before loading
the GDT because ext*_load_journal()->ext*_get_journal() is doing iget()
and this needs the GDT to read the journal inode.

It might also make sense to just clean up the superblock and group descriptor
table and goto the beginning of fill_super() because in some cases the
superblock contents may have changed in important ways (e.g. crash after
resize of the filesystem which is only in the journal).


> Because I'm pretty sure it *is* corrupted - the oops
> is: unable to handle kernel paging request at c7e95ffc. If we look into
> registers, we see ECX has c7e96000 (which is probably bh->b_data). In
> the second oops it's exactly the same - ECX has c11e4000, the oops is at
> address c11e3ffc. So in both cases it is ECX-4. So somehow we managed to
> pass negative offset into ext2_test_bit(). But as Andreas pointed out,
> when we load descriptors into memory, we check that both bitmaps and
> inode table is in ext2_check_descriptors()... The other possibility
> would be that we managed to corrupts s_first_data_block in the
> superblock. Anyway, both possibilities don't look very likely. I'll try
> to reproduce the problem and maybe get more insight... How large is your
> filesystem BTW?
> 
> > Here's the most current script I use (zzuf is the fuzzer):
> > 
> > ------------------------------------------------------------
> > #!/bin/sh
> > 
> > if [ "`hostname`" != "fstest" ]; then
> >    echo "This is a dangerous script."
> >    echo "Set your hostname to \`fstest\' if you want to use it."
> >    exit 1
> > fi
> > 
> > umount /dev/hdb
> > umount /dev/hdc
> > /etc/init.d/sysklogd stop
> > /etc/init.d/klogd stop
> > /etc/init.d/cron stop
> > mount /dev/hda / -t ext3 -o remount,ro || exit 1
> > 
> > #ulimit -t 20
> > 
> > for ((s=$1; s<1000000000; s++)); do
> >   umount /mnt
> >   echo '***** zzuffing *****' seed $s
> >   zzuf -r 0:0.03 -s $s </dev/hdc >/dev/hdb || exit
> >   mount /dev/hdb /mnt -t ext2 -o errors=continue || continue
> >   cd /mnt || continue
> >   timeout 30 cp -r doc doc2 >&/dev/null
> >   timeout 30 find -xdev >&/dev/null
> >   timeout 30 find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
> >   timeout 30 mkdir tmp >&/dev/null
> >   timeout 30 echo whoah >tmp/filu 2>/dev/null
> >   timeout 30 rm -rf /mnt/* >&/dev/null
> >   cd /
> > done
> > ------------------------------------------------------------

Oh, hmm, this is ext2 and not ext3, so no journal...  I guess my bug is
still valid, but just not this one?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Comment 18 Sami Liedes 2008-11-01 21:58:34 UTC
I now have found an ext3 filesystem for which this bug happens pretty reproducibly on 2.6.27.4. Increasing commit interval seems to help it happen, otherwise the journal can be aborted and then the bug no longer happens. I do realize that this report is for the ext2 bug, but I hope finding a similar bug on ext3 might help (and even if this is a separate bug, this information should help resolve it).

Here's how to do it:

1. bunzip2 the attached filesystem image hdb.10000097.bz2

(I did the following inside qemu, hence /dev/hdb)

2. mount /dev/hdb /mnt -t ext3 -o errors=continue,commit=300
3. cd /mnt
4. timeout 30 cp -r doc doc2 >&/dev/null (or manually break cp after 30 seconds, it's jammed anyway)
6. find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
7. mkdir tmp >&/dev/null
8. echo whoah >tmp/filu 2>/dev/null
9. rm -rf /mnt/* >&/dev/null
10. while completing rm -rf, the following oops occurs:

------------------------------------------------------------
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
EXT3-fs unexpected failure: !jh->b_committed_data;
inconsistent data on disk
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks in system zones - Block = 8234, count = 1
EXT3-fs unexpected failure: !jh->b_committed_data;
inconsistent data on disk
ext3_forget: aborting transaction: IO failure in __ext3_journal_forget
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks in system zones - Block = 42, count = 3
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone - block = 25630524, count = 1
EXT3-fs error (device hdb) in ext3_free_blocks_sb: Readonly filesystem
EXT3-fs unexpected failure: !jh->b_committed_data;
inconsistent data on disk
BUG: unable to handle kernel paging request at c13fbbfc
IP: [<c02de4f9>] read_block_bitmap+0xa3/0x147
*pde = 07886163 *pte = 013fb160
Oops: 0000 [#1] DEBUG_PAGEALLOC

Pid: 817, comm: rm Not tainted (2.6.27.4 #1)
EIP: 0060:[<c02de4f9>] EFLAGS: 00000206 CPU: 0
EIP is at read_block_bitmap+0xa3/0x147
EAX: ffffdfff EBX: c13fc820 ECX: c13fc000 EDX: 00002001
ESI: c74b15b0 EDI: c7aae400 EBP: c7b7acd0 ESP: c7b7aca0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 817, ti=c7b7a000 task=c78a1ce0 task.ti=c7b7a000)
Stack: 00000001 00000000 00000000 c7aaf1c0 00000246 c79cdc00 00000001 00000000
       c13fc000 00000000 00000001 c163b37c c7b7ad28 c02de66f c0315003 c740aadc
       c7b7ad10 c7440000 c7aaf1c0 00000029 0000202a c7aae400 c7440000 c79cdcac
Call Trace:
 [<c02de66f>] ? ext3_free_blocks_sb+0x93/0x3d6
 [<c0315003>] ? journal_forget+0xff/0x1aa
 [<c02edd83>] ? __ext3_journal_forget+0x19/0x3f
 [<c02de9dd>] ? ext3_free_blocks+0x2b/0x7f
 [<c02e3f8c>] ? ext3_clear_blocks+0x137/0x159
 [<c02e4072>] ? ext3_free_data+0xc4/0x133
 [<c02e4320>] ? ext3_free_branches+0x23f/0x247
 [<c02e4189>] ? ext3_free_branches+0xa8/0x247
 [<c02e4189>] ? ext3_free_branches+0xa8/0x247
 [<c02e498d>] ? ext3_truncate+0x665/0x8ad
 [<c0316062>] ? journal_start+0xb2/0x112
 [<c031608d>] ? journal_start+0xdd/0x112
 [<c0316062>] ? journal_start+0xb2/0x112
 [<c02ebb53>] ? ext3_journal_start_sb+0x29/0x4a
 [<c02e4ca4>] ? ext3_delete_inode+0xcf/0xdb
 [<c02e4bd5>] ? ext3_delete_inode+0x0/0xdb
 [<c02774b3>] ? generic_delete_inode+0x62/0xd5
 [<c0277639>] ? generic_drop_inode+0x113/0x16a
 [<c02765ac>] ? iput+0x47/0x4e
 [<c026d9f4>] ? do_unlinkat+0xc3/0x13d
 [<c054484f>] ? mutex_unlock+0x8/0xa
 [<c026fb0b>] ? vfs_readdir+0x60/0x85
 [<c026f84c>] ? filldir64+0x0/0xd7
 [<c026fbc7>] ? sys_getdents64+0x97/0xa1
 [<c026db66>] ? sys_unlinkat+0x23/0x36
 [<c0202f1e>] ? syscall_call+0x7/0xb
 =======================
Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10 8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 01 19 c0 85 c0 74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19
EIP: [<c02de4f9>] read_block_bitmap+0xa3/0x147 SS:ESP 0068:c7b7aca0
---[ end trace 780108b88e07a03e ]---
------------------------------------------------------------
Comment 19 Sami Liedes 2008-11-01 21:59:08 UTC
Created attachment 18592 [details]
Test case, corrupted ext3 file system, bzip2 compressed
Comment 20 Andrew Morton 2008-11-01 22:18:05 UTC
Same, please don't update this bug via the bugzilla UI - nobody is looking
at it.

Please stick exclusively to email, vie emailed reply-to-all in the
same thread.
Comment 21 Sami Liedes 2008-11-01 22:27:52 UTC
[Sorry for duplicates, forgot to use email instead of bugzilla web
interface.]

I now have found an ext3 filesystem for which this bug happens pretty
reproducibly on 2.6.27.4. Increasing commit interval seems to help it happen,
otherwise the journal can be aborted and then the bug no longer happens. I do
realize that this report is for the ext2 bug, but I hope finding a similar bug
on ext3 might help (and even if this is a separate bug, this information should
help resolve it).

Here's how to do it:

1. bunzip2 the attached filesystem image hdb.10000097.bz2

(I did the following inside qemu, hence /dev/hdb)

2. mount /dev/hdb /mnt -t ext3 -o errors=continue,commit=300
3. cd /mnt
4. timeout 30 cp -r doc doc2 >&/dev/null (or manually break cp after 30
seconds, it's jammed anyway)
6. find -xdev -print0 2>/dev/null |xargs -0 touch -- 2>/dev/null
7. mkdir tmp >&/dev/null
8. echo whoah >tmp/filu 2>/dev/null
9. rm -rf /mnt/* >&/dev/null
10. while completing rm -rf, the following oops occurs:

------------------------------------------------------------
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone -
block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone -
block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone -
block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone -
block = 4294967295, count = 1
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone -
block = 4294967295, count = 1
EXT3-fs unexpected failure: !jh->b_committed_data;
inconsistent data on disk
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks in system zones -
Block = 8234, count = 1
EXT3-fs unexpected failure: !jh->b_committed_data;
inconsistent data on disk
ext3_forget: aborting transaction: IO failure in __ext3_journal_forget
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks in system zones -
Block = 42, count = 3
EXT3-fs error (device hdb): ext3_free_blocks: Freeing blocks not in datazone -
block = 25630524, count = 1
EXT3-fs error (device hdb) in ext3_free_blocks_sb: Readonly filesystem
EXT3-fs unexpected failure: !jh->b_committed_data;
inconsistent data on disk
BUG: unable to handle kernel paging request at c13fbbfc
IP: [<c02de4f9>] read_block_bitmap+0xa3/0x147
*pde = 07886163 *pte = 013fb160
Oops: 0000 [#1] DEBUG_PAGEALLOC

Pid: 817, comm: rm Not tainted (2.6.27.4 #1)
EIP: 0060:[<c02de4f9>] EFLAGS: 00000206 CPU: 0
EIP is at read_block_bitmap+0xa3/0x147
EAX: ffffdfff EBX: c13fc820 ECX: c13fc000 EDX: 00002001
ESI: c74b15b0 EDI: c7aae400 EBP: c7b7acd0 ESP: c7b7aca0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 817, ti=c7b7a000 task=c78a1ce0 task.ti=c7b7a000)
Stack: 00000001 00000000 00000000 c7aaf1c0 00000246 c79cdc00 00000001 00000000
       c13fc000 00000000 00000001 c163b37c c7b7ad28 c02de66f c0315003 c740aadc
       c7b7ad10 c7440000 c7aaf1c0 00000029 0000202a c7aae400 c7440000 c79cdcac
Call Trace:
 [<c02de66f>] ? ext3_free_blocks_sb+0x93/0x3d6
 [<c0315003>] ? journal_forget+0xff/0x1aa
 [<c02edd83>] ? __ext3_journal_forget+0x19/0x3f
 [<c02de9dd>] ? ext3_free_blocks+0x2b/0x7f
 [<c02e3f8c>] ? ext3_clear_blocks+0x137/0x159
 [<c02e4072>] ? ext3_free_data+0xc4/0x133
 [<c02e4320>] ? ext3_free_branches+0x23f/0x247
 [<c02e4189>] ? ext3_free_branches+0xa8/0x247
 [<c02e4189>] ? ext3_free_branches+0xa8/0x247
 [<c02e498d>] ? ext3_truncate+0x665/0x8ad
 [<c0316062>] ? journal_start+0xb2/0x112
 [<c031608d>] ? journal_start+0xdd/0x112
 [<c0316062>] ? journal_start+0xb2/0x112
 [<c02ebb53>] ? ext3_journal_start_sb+0x29/0x4a
 [<c02e4ca4>] ? ext3_delete_inode+0xcf/0xdb
 [<c02e4bd5>] ? ext3_delete_inode+0x0/0xdb
 [<c02774b3>] ? generic_delete_inode+0x62/0xd5
 [<c0277639>] ? generic_drop_inode+0x113/0x16a
 [<c02765ac>] ? iput+0x47/0x4e
 [<c026d9f4>] ? do_unlinkat+0xc3/0x13d
 [<c054484f>] ? mutex_unlock+0x8/0xa
 [<c026fb0b>] ? vfs_readdir+0x60/0x85
 [<c026f84c>] ? filldir64+0x0/0xd7
 [<c026fbc7>] ? sys_getdents64+0x97/0xa1
 [<c026db66>] ? sys_unlinkat+0x23/0x36
 [<c0202f1e>] ? syscall_call+0x7/0xb
 =======================
Code: 26 00 0f 88 94 00 00 00 8b 87 8c 02 00 00 89 45 e4 8b 55 e8 0f af 50 10
8b 40 34 03 50 14 8b 03 89 45 ec 8b 4e 14 89 4d f0 29 d0 <0f> a3 01 19 c0 85 c0
74 11 8b 43 04 89 45 ec 29 d0 0f a3 01 19
EIP: [<c02de4f9>] read_block_bitmap+0xa3/0x147 SS:ESP 0068:c7b7aca0
---[ end trace 780108b88e07a03e ]---
------------------------------------------------------------

	Sami
Comment 22 Alan 2012-10-30 15:02:19 UTC
If this is still seen on modern kernels then please re-open/update

Note You need to log in before you can comment on or make changes to this bug.