Bug 216541 - FUZZ: general protection fault, KASAN: null-ptr-deref at fs/ext4/ialloc.c:ext4_read_inode_bitmap() when mount a corrupted image
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-28 22:50 UTC by Wenqing Liu
Modified: 2022-11-28 22:20 UTC

See Also:
Kernel Version: 5.15.71, 6.0-rc7
Subsystem:
Regression: No
Bisected commit-id:


Attachments
corrupted image and .config (57.95 KB, application/zip)
2022-09-28 22:50 UTC, Wenqing Liu

Description Wenqing Liu 2022-09-28 22:50:40 UTC
Created attachment 301892
corrupted image and .config

- Overview 
FUZZ: general protection fault, KASAN: null-ptr-deref at fs/ext4/ialloc.c:ext4_read_inode_bitmap() when mounting a corrupted image

- Reproduce 
Tested on kernel 5.15.71, 6.0-rc7

# mkdir mnt
# mount tmp303.img mnt

- Kernel dump
[  468.970349] loop5: detected capacity change from 0 to 32768
[  469.031258] EXT4-fs (loop5): warning: mounting unchecked fs, running e2fsck is recommended
[  469.034841] EXT4-fs error (device loop5): ext4_clear_blocks:866: inode #32: comm mount: attempt to clear invalid blocks 16777450 len 1
[  469.034935] EXT4-fs error (device loop5): ext4_free_branches:1012: inode #32: comm mount: invalid indirect mapped block 1258291200 (level 1)
[  469.034991] EXT4-fs error (device loop5): ext4_free_branches:1012: inode #32: comm mount: invalid indirect mapped block 7379847 (level 2)
[  469.035539] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
[  469.035603] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[  469.035636] CPU: 1 PID: 1088 Comm: mount Not tainted 6.0.0-rc7 #1
[  469.035665] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[  469.035699] RIP: 0010:ext4_read_inode_bitmap+0x682/0x1240
[  469.035730] Code: 80 3c 02 00 0f 85 ac 0b 00 00 49 8b 87 b8 02 00 00 8b 54 24 08 4c 8d 3c d0 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 75 0b 00 00 4d 8b 3f e8 6c fa 78 ff 48 b8 00 00
[  469.035803] RSP: 0018:ffffc900007ff730 EFLAGS: 00010246
[  469.035830] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff11024f21407
[  469.035861] RDX: 0000000000000000 RSI: ffff88812ea2e0a8 RDI: ffff88812790a2b8
[  469.035891] RBP: 0000000000000000 R08: 0000000000000002 R09: ffffed1025d45c16
[  469.035921] R10: ffff88812ea2e0af R11: ffffed1025d45c15 R12: ffff888142dd7800
[  469.035951] R13: ffff888125c07000 R14: ffff88812ea2e0a8 R15: 0000000000000000
[  469.035981] FS:  00007f99fcf24840(0000) GS:ffff888293680000(0000) knlGS:0000000000000000
[  469.036015] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  469.036040] CR2: 000055c988e952f8 CR3: 0000000121fb8003 CR4: 0000000000370ee0
[  469.036072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  469.036101] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  469.036131] Call Trace:
[  469.036144]  <TASK>
[  469.036156]  ext4_free_inode+0x451/0xfb0
[  469.036179]  ? ext4_mark_bitmap_end+0x20/0x20
[  469.036200]  ? __ext4_journal_start_sb+0x23f/0x2d0
[  469.036223]  ext4_evict_inode+0xaf4/0x14e0
[  469.036243]  ? complete_all+0xc0/0xc0
[  469.036262]  ? ext4_da_write_begin+0x6b0/0x6b0
[  469.036283]  ? _raw_spin_lock_irqsave+0xf0/0xf0
[  469.036305]  ? _raw_spin_lock_irqsave+0xf0/0xf0
[  469.036326]  ? kmem_cache_alloc+0x13b/0x4e0
[  469.036347]  evict+0x284/0x4e0
[  469.036364]  ext4_setup_system_zone+0x66c/0x840
[  469.036386]  ? preempt_schedule_common+0x5e/0xd0
[  469.036408]  ? ext4_exit_system_zone+0x20/0x20
[  469.036429]  ? ext4_setup_super+0x3b7/0x8e0
[  469.036449]  ? _raw_spin_unlock+0x15/0x30
[  469.036468]  ext4_fill_super+0x999c/0xea10
[  469.036490]  ? ext4_reconfigure+0x2250/0x2250
[  469.036511]  ? down_write+0xad/0x120
[  469.037418]  ? snprintf+0x9e/0xd0
[  469.038319]  ? vsprintf+0x10/0x10
[  469.039200]  ? mutex_unlock+0x80/0xd0
[  469.040054]  ? __mutex_unlock_slowpath.isra.0+0x2d0/0x2d0
[  469.040909]  ? sget_fc+0x4e9/0x6b0
[  469.041726]  ? get_tree_bdev+0x388/0x660
[  469.042506]  get_tree_bdev+0x388/0x660
[  469.043256]  ? ext4_reconfigure+0x2250/0x2250
[  469.043983]  vfs_get_tree+0x81/0x2b0
[  469.044681]  ? ns_capable_common+0x57/0xe0
[  469.045363]  path_mount+0x47e/0x19d0
[  469.046027]  ? kasan_quarantine_put+0x55/0x180
[  469.046677]  ? finish_automount+0x5f0/0x5f0
[  469.047312]  ? user_path_at_empty+0x45/0x60
[  469.047934]  ? kmem_cache_free+0x1c2/0x4e0
[  469.048533]  do_mount+0xce/0xf0
[  469.049122]  ? path_mount+0x19d0/0x19d0
[  469.049694]  ? _copy_from_user+0x50/0x80
[  469.050253]  ? memdup_user+0x4e/0xa0
[  469.050804]  __x64_sys_mount+0x12c/0x1a0
[  469.051353]  do_syscall_64+0x38/0x90
[  469.051895]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  469.052442] RIP: 0033:0x7f99fd184c7e
[  469.052990] Code: 48 8b 0d 15 c2 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e2 c1 0c 00 f7 d8 64 89 01 48
[  469.054144] RSP: 002b:00007fff57286818 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[  469.054732] RAX: ffffffffffffffda RBX: 00007f99fd2b6204 RCX: 00007f99fd184c7e
[  469.055332] RDX: 000055c988e8de90 RSI: 000055c988e87370 RDI: 000055c988e8de30
[  469.055928] RBP: 000055c988e85460 R08: 0000000000000000 R09: 00007f99fd251d60
[  469.056531] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  469.057145] R13: 000055c988e8de30 R14: 000055c988e8de90 R15: 000055c988e85460
[  469.057764]  </TASK>
[  469.058366] Modules linked in: joydev input_leds serio_raw qemu_fw_cfg xfs autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear qxl drm_ttm_helper ttm drm_kms_helper hid_generic syscopyarea usbhid sysfillrect sysimgblt fb_sys_fops hid crct10dif_pclmul drm crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd psmouse
[  469.061085] ---[ end trace 0000000000000000 ]---
[  469.062245] RIP: 0010:ext4_read_inode_bitmap+0x682/0x1240
[  469.063133] Code: 80 3c 02 00 0f 85 ac 0b 00 00 49 8b 87 b8 02 00 00 8b 54 24 08 4c 8d 3c d0 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 75 0b 00 00 4d 8b 3f e8 6c fa 78 ff 48 b8 00 00
[  469.065120] RSP: 0018:ffffc900007ff730 EFLAGS: 00010246
[  469.066087] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff11024f21407
[  469.067293] RDX: 0000000000000000 RSI: ffff88812ea2e0a8 RDI: ffff88812790a2b8
[  469.068371] RBP: 0000000000000000 R08: 0000000000000002 R09: ffffed1025d45c16
[  469.069678] R10: ffff88812ea2e0af R11: ffffed1025d45c15 R12: ffff888142dd7800
[  469.070731] R13: ffff888125c07000 R14: ffff88812ea2e0a8 R15: 0000000000000000
[  469.072092] FS:  00007f99fcf24840(0000) GS:ffff888293680000(0000) knlGS:0000000000000000
[  469.073541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  469.074966] CR2: 000055c988e952f8 CR3: 0000000121fb8003 CR4: 0000000000370ee0
[  469.075781] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  469.076595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
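A note on reading the fault address in the dump above: the "non-canonical address 0xdffffc0000000000" is consistent with KASAN checking the shadow byte of a NULL pointer. The sketch below assumes the usual x86-64 generic KASAN parameters (shadow offset 0xdffffc0000000000, one shadow byte per 8 bytes) and 48-bit virtual addresses; the helper names are made up for illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN parameters; they match the address in the dump. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Shadow byte that the instrumentation reads before touching addr. */
static inline uint64_t kasan_shadow_addr(uint64_t addr)
{
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* With 48-bit virtual addresses, bits 63..47 must all be equal. */
static inline bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}
```

Under these assumptions, kasan_shadow_addr(0) is exactly 0xdffffc0000000000, which is not canonical, so the shadow check itself faults and the kernel reports a general protection fault rather than an ordinary page fault at address 0.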
Comment 1 Theodore Tso 2022-11-28 22:20:31 UTC
I've done some analysis on this failure and what is going on is the following.

1)  The journal inode in the fuzzed image is the normal journal inode, #8.  HOWEVER, when the journal is replayed, it overwrites the superblock with a new one in which the journal inode is different; it is now #32.

2)  Next, the code in fs/ext4/block_validity.c sets up the set of static metadata blocks ("the system zone") that should never be used by any data blocks.  This includes the blocks used by the journal inode (which never change while the file system is mounted).  In order to reserve those blocks in the system zone, ext4_protect_reserved_inode() fetches the journal inode using ext4_iget(), and then later releases it using iput().

3)  This would be fine for a valid file system journal, but after the journal replay, s_journal_inum is now 32, and inode 32 has an i_links_count of 0.  That's a problem: when we call iput(), the VFS layer sees that the links count is zero and calls evict() so that the inode can be deallocated.  At this point in the file system mount operation, we're not set up to deallocate any blocks or inodes, and this is what triggers the NULL pointer dereference.
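
The sequence in steps 1-3 can be modeled in userspace roughly as follows. This is only a toy sketch, not kernel code: the toy_* types, the alloc_state field, and the result enum are hypothetical stand-ins, and the real iput()/evict() paths do far more than this.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct toy_sb {
    void *alloc_state;     /* allocator state; still NULL this early in mount */
    unsigned journal_inum; /* plays the role of s_journal_inum */
};

struct toy_inode {
    struct toy_sb *sb;
    unsigned ino;
    unsigned nlink;        /* plays the role of i_links_count */
    int refcount;
};

enum toy_put_result { TOY_KEPT, TOY_CACHED, TOY_EVICT_OK, TOY_EVICT_CRASH };

/* Models dropping a reference: on the last put, a zero link count
 * means the inode gets evicted and freed. */
static inline enum toy_put_result toy_iput(struct toy_inode *inode)
{
    if (--inode->refcount > 0)
        return TOY_KEPT;
    if (inode->nlink > 0)
        return TOY_CACHED; /* normal case for a healthy journal inode */
    /* Freeing the inode needs allocator state that is not set up yet. */
    return inode->sb->alloc_state ? TOY_EVICT_OK : TOY_EVICT_CRASH;
}
```

In this model, a journal inode #32 with nlink == 0 and a superblock whose alloc_state is still NULL ends in TOY_EVICT_CRASH on the final toy_iput(), whereas a journal inode with nlink > 0 is simply kept around.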


Fixes:

FIX A)  In ext4_iget(), if we are getting a special inode, the links count must be > 0.  If not, when that special inode (whether it is the root directory, the journal inode, or the quota inode) is finally released using iput(), the system will attempt to deallocate the special inode, and hilarity ensues.  So if i_links_count is 0, we should set the returned inode to be the bad inode, and return -EFSCORRUPTED.
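
The check proposed in FIX A might look something like the sketch below. It is a userspace illustration under stated assumptions, not the actual patch: toy_check_special_inode() and TOY_EFSCORRUPTED are hypothetical names, and the real ext4_iget() would also have to mark the inode bad before returning.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value, used if errno.h lacks it */
#endif
/* ext4 defines EFSCORRUPTED as EUCLEAN; mirrored here under a toy name. */
#define TOY_EFSCORRUPTED EUCLEAN

/* Reject special inodes (root, journal, quota) with a zero link count,
 * so that a later iput() can never try to deallocate them. */
static inline int toy_check_special_inode(bool is_special, unsigned links_count)
{
    if (is_special && links_count == 0)
        return -TOY_EFSCORRUPTED;
    return 0; /* ordinary inodes may legitimately have a zero link count */
}
```

Rejecting the inode at lookup time means the mount fails cleanly with an error instead of reaching evict() before the file system is able to free anything.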

FIX B)  In ext4_check_blockref(), we skip all of the checks if the inode in question is the journal inode.  We shouldn't check to see if the journal's blocks overlap with the metadata blocks (which include the journal inode, so it will always overlap with itself) --- but we should check to make sure the block number is valid and does not exceed the file system limits.  This is not critical for fixing the bug shown here, but it does add a missing check which was unnecessarily exempted by commit 170417c8c7bb2 ("ext4: fix block validity checks for journal inodes using indirect blocks").
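
The split described in FIX B (range check for every inode, overlap check skipped only for the journal) could be sketched as below. Everything here is a simplified assumption for illustration: struct toy_fs, its single reserved zone, and the toy_* functions are hypothetical and do not mirror the real system zone data structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical file system geometry with one reserved metadata extent. */
struct toy_fs {
    uint64_t first_data_block;
    uint64_t blocks_count;
    uint64_t zone_start;
    uint64_t zone_len;
};

/* Range check, written so that blk + count cannot overflow. */
static inline bool toy_block_in_range(const struct toy_fs *fs,
                                      uint64_t blk, uint64_t count)
{
    return blk >= fs->first_data_block &&
           count <= fs->blocks_count &&
           blk <= fs->blocks_count - count;
}

static inline bool toy_blocks_valid(const struct toy_fs *fs, bool is_journal_inode,
                                    uint64_t blk, uint64_t count)
{
    if (!toy_block_in_range(fs, blk, count))
        return false; /* applies to every inode, the journal included */
    if (is_journal_inode)
        return true;  /* skip only the overlap test */
    /* Other inodes must not overlap the reserved zone. */
    return !(blk < fs->zone_start + fs->zone_len &&
             fs->zone_start < blk + count);
}
```

With a made-up geometry of 16384 blocks, a reference to block 16777450 (the number from the ext4_clear_blocks error in the dump) is rejected even for the journal inode, while a journal block inside the reserved zone is still accepted.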
