The hfs_find_init() function does not validate its tree pointer, resulting in the following Oops when corrupted on-disk data structures are read. The following filesystem image can be used to reproduce the bug: http://www.risesecurity.org/ramon/temp/hfs.45.img

Printing some debug information shows that the tree pointer passed to hfs_find_init() is NULL:

--
linux-pipj:~ # uname -a
Linux linux-pipj 2.6.27.19-5.1-ppc64 #1 SMP 2009-02-28 04:40:21 +0100 ppc64 ppc64 ppc64 GNU/Linux
linux-pipj:~ # mount -o loop hfs.45.img /mnt
hfs_ext_read_extent: inode = c0000001deba2f20
hfs_ext_read_extent: block = 73
hfs_ext_read_extent: HFS_SB(inode->i_sb)->ext_tree = 0000000000000000
hfs_ext_read_extent: fd = c0000000e4e62d60
hfs_find_init: tree = 0000000000000000
hfs_find_init: fd = c0000000e4e62d60
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: hfs nfsd exportfs auth_rpcgss lockd nfs_acl sunrpc ipv6 fuse loop dm_mod ses ehea(X) enclosure sg sd_mod crc_t10dif ipr(X) libata scsi_mod
Supported: Yes, External
NIP: d0000000006e0af4 LR: d0000000006e0af0 CTR: c0000000000aba48
REGS: c0000000e4e629e0 TRAP: 0300   Tainted: G  (2.6.27.19-5.1-ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24042484  XER: 00000006
DAR: 0000000000000040, DSISR: 0000000040000000
TASK = c0000000e46971e0[4106] 'mount' THREAD: c0000000e4e60000 CPU: 7
GPR00: d0000000006e0af0 c0000000e4e62c60 d0000000006f8b68 0000000000000028
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 c00000000091b9a8 c000000000d44400 c00000000091b998
GPR12: 0000000024042484 c000000000a93a80 0000000000010000 0000000000000001
GPR16: 0000000000000004 00000000000000bf c0000001dd1b6a58 d0000000006f06d0
GPR20: 0000000000000000 c000000000000000 0000000000000049 c0000001deba2f20
GPR24: 0000000000000000 c0000001dea99680 c0000001dd1b2680 0000000000000049
GPR28: c0000001deba2f20 c0000000e4e62d60 d0000000006f7d60 0000000000000000
NIP [d0000000006e0af4] .hfs_find_init+0x44/0xc4 [hfs]
LR [d0000000006e0af0] .hfs_find_init+0x40/0xc4 [hfs]
Call Trace:
[c0000000e4e62c60] [d0000000006e0af0] .hfs_find_init+0x40/0xc4 [hfs] (unreliable)
[c0000000e4e62cf0] [d0000000006e5760] .hfs_ext_read_extent+0xb0/0x1ec [hfs]
[c0000000e4e62dd0] [d0000000006e5cd0] .hfs_get_block+0x10c/0x27c [hfs]
[c0000000e4e62e90] [c00000000016c364] .block_read_full_page+0x160/0x3ec
[c0000000e4e63390] [d0000000006e61b8] .hfs_readpage+0x20/0x38 [hfs]
[c0000000e4e63410] [c0000000000f0b98] .__read_cache_page+0xc0/0x10c
[c0000000e4e634c0] [c0000000000f0c38] .read_cache_page_async+0x54/0x160
[c0000000e4e63570] [c0000000000f0d58] .read_cache_page+0x14/0x70
[c0000000e4e635f0] [d0000000006e3a18] .hfs_btree_open+0x150/0x324 [hfs]
[c0000000e4e63690] [d0000000006e7ee8] .hfs_mdb_get+0x510/0x6dc [hfs]
[c0000000e4e63770] [d0000000006e8d08] .hfs_fill_super+0xb8/0x1f4 [hfs]
[c0000000e4e638b0] [c00000000013962c] .get_sb_bdev+0x150/0x1dc
[c0000000e4e63990] [d0000000006e8504] .hfs_get_sb+0x20/0x38 [hfs]
[c0000000e4e63a10] [c0000000001390ac] .vfs_kern_mount+0xd4/0x1ac
[c0000000e4e63ac0] [c0000000001391f4] .do_kern_mount+0x60/0x138
[c0000000e4e63b70] [c000000000158294] .do_new_mount+0x88/0x100
[c0000000e4e63c20] [c0000000001593b8] .do_mount+0x210/0x260
[c0000000e4e63d70] [c0000000001594bc] .SyS_mount+0xb4/0x124
[c0000000e4e63e30] [c0000000000086b4] syscall_exit+0x0/0x40
Instruction dump:
fba1ffe8 7c9d2378 7fe4fb78 f8010010 e87e8008 f821ff71 48009515 e8410028
7fa4eb78 e87e8010 48009505 e8410028 <809f0040> e87e8018 480094f5 e8410028
--

It appears that the kernel generated an Oops when trying to access the max_key_len member of the hfs_btree structure: the faulting address (DAR = 0x0000000000000040) is a small offset from NULL, consistent with dereferencing a NULL tree pointer.
From fs/hfs/bfind.c:

--
int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd)
{
	void *ptr;

	fd->tree = tree;
	fd->bnode = NULL;
	ptr = kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL);
	if (!ptr)
		return -ENOMEM;
	fd->search_key = ptr;
	fd->key = ptr + tree->max_key_len + 2;
	dprint(DBG_BNODE_REFS, "find_init: %d (%p)\n",
	       tree->cnid, __builtin_return_address(0));
	down(&tree->tree_lock);
	return 0;
}
--
From the trace, it looks like HFS_SB(sb)->extent_tree is NULL. This makes sense because the trace is from when the extent tree itself is being loaded. After the inode is initialized, the regular read routines are used. This bit is important: hfs_readpage -> block_read_full_page -> hfs_get_block, which gets here:

From hfs_get_block():

	if (ablock < HFS_I(inode)->first_blocks) {
		dblock = hfs_ext_find_block(HFS_I(inode)->first_extents, ablock);
		goto done;
	}
	mutex_lock(&HFS_I(inode)->extents_lock);
	res = hfs_ext_read_extent(inode, ablock);

From hfs_inode_read_fork():

	memcpy(HFS_I(inode)->first_extents, ext, sizeof(hfs_extent_rec));
	for (count = 0, i = 0; i < 3; i++)
		count += be16_to_cpu(ext[i].count);
	HFS_I(inode)->first_blocks = count;

So if the initial extent records are corrupted, ->first_blocks will be as well, and we fall through to the hfs_ext_read_extent() call, which fails because the state it depends on has not been set up yet.
Created attachment 23371 [details] [PATCH] hfs: fix oops on mount with corrupted btree extent records

A particular fsfuzzer run caused an hfs file system to crash on mount. This is due to a corrupted MDB extent record causing a miscalculation of HFS_I(inode)->first_blocks for the extent tree. If the extent records are zeroed out, they won't trigger the first_blocks special case; instead, control falls through to the extent code, which is still in the middle of being initialized. This patch catches the 0-size extent records, reports the corruption, and fails the mount.

Submitted to LKML today.
Patch submitted again on Oct 20, 2010 by Christoph Hellwig: http://lkml.org/lkml/2010/10/20/563

Cannot test because the URL to the filesystem image in comment #1 is dead. If anyone has a test case, please report whether the patch fixes the problem.
I've applied the fix to hfsplus as well and pushed it to mainline.