Created attachment 274875 [details]
The crafted image which causes kernel panic

- Overview
An invalid pointer dereference occurs when mounting the crafted image.

- Reproduce
Needs kernel 4.15 (also reproduced on 4.10)
$ mkdir mnt
$ sudo mount -t ext4 83.img mnt

- Reason
https://elixir.bootlin.com/linux/v4.15/source/fs/ext4/mballoc.c#L2874
entry can be NULL, which means a list node points to NULL. Perhaps a use-after-free bug?

- Kernel dump
[ 329.292108] EXT4-fs (loop0): barriers disabled
[ 329.292247] JBD2: Clearing recovery information on journal
[ 329.293613] EXT4-fs (loop0): corrupt root inode, run e2fsck
[ 329.293727] EXT4-fs error (device loop0): ext4_free_inode:308: comm mount: reserved or nonexistent inode 2
[ 329.294052] EXT4-fs (loop0): mount failed
[ 329.295260] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
[ 329.295290] IP: ext4_process_freed_data+0x68/0x4d0
[ 329.295304] PGD 0 P4D 0
[ 329.295314] Oops: 0000 [#1] SMP PTI
[ 329.295325] Modules linked in: vmw_balloon coretemp intel_rapl_perf input_leds joydev serio_raw snd_ens1371 btusb snd_ac97_codec uvcvideo btrtl videobuf2_vmalloc btbcm btintel gameport snd_rawmidi videobuf2_memops bluetooth videobuf2_v4l2 videobuf2_core snd_seq_device ac97_bus snd_pcm videodev media ecdh_generic snd_timer snd soundcore shpchp mac_hid vmw_vsock_vmci_transport vsock vmw_vmci sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmwgfx
[ 329.295661] psmouse ttm drm_kms_helper mptspi mptscsih ahci libahci e1000 mptbase scsi_transport_spi syscopyarea sysfillrect sysimgblt fb_sys_fops drm i2c_piix4 pata_acpi
[ 329.295712] CPU: 0 PID: 1112 Comm: mount Not tainted 4.15.0-12-generic #13-Ubuntu
[ 329.295732] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 329.295763] RIP: 0010:ext4_process_freed_data+0x68/0x4d0
[ 329.295778] RSP: 0018:ffffb907416e39d0 EFLAGS: 00010207
[ 329.295794] RAX: 0000000000000000 RBX: ffff8b4bb91a1800 RCX: ffff8b4bb91a4aa0
[ 329.295813] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8b4bb91a4a80
[ 329.295832] RBP: ffffb907416e3a58 R08: 0000000000000000 R09: ffff8b4bb8dc9d88
[ 329.295852] R10: 0000000000000228 R11: 00000000000001b0 R12: 0000000000000006
[ 329.295872] R13: ffff8b4bb91a4800 R14: ffffb907416e39e0 R15: ffff8b4bb91a4a80
[ 329.295891] FS: 00007fbd17943080(0000) GS:ffff8b4bbc600000(0000) knlGS:0000000000000000
[ 329.295913] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 329.295929] CR2: 0000000000000034 CR3: 000000003686e004 CR4: 00000000001606f0
[ 329.295991] Call Trace:
[ 329.296005] ? __set_page_dirty+0x9b/0xc0
[ 329.296624] ext4_journal_commit_callback+0x4d/0xd0
[ 329.297198] jbd2_journal_commit_transaction+0x1531/0x1720
[ 329.297772] jbd2_journal_destroy+0xdd/0x2a0
[ 329.298362] ? jbd2_journal_destroy+0xdd/0x2a0
[ 329.298910] ? wait_woken+0x80/0x80
[ 329.299408] ext4_fill_super+0x1cb9/0x2fb0
[ 329.299890] ? snprintf+0x45/0x70
[ 329.300360] mount_bdev+0x248/0x290
[ 329.300833] ? ext4_calculate_overhead+0x490/0x490
[ 329.301260] ? mount_bdev+0x248/0x290
[ 329.301684] ? ext4_calculate_overhead+0x490/0x490
[ 329.302100] ext4_mount+0x15/0x20
[ 329.302515] mount_fs+0x37/0x150
[ 329.302907] ? alloc_vfsmnt+0x1b3/0x230
[ 329.303302] vfs_kern_mount.part.23+0x5d/0x110
[ 329.303685] do_mount+0x5ed/0xcf0
[ 329.304057] ? memdup_user+0x4f/0x80
[ 329.304418] SyS_mount+0x98/0xe0
[ 329.304768] do_syscall_64+0x73/0x130
[ 329.305110] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 329.305438] RIP: 0033:0x7fbd172073ca
[ 329.305752] RSP: 002b:00007ffde39c6a08 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 329.306154] RAX: ffffffffffffffda RBX: 000055d22d08ca40 RCX: 00007fbd172073ca
[ 329.306556] RDX: 000055d22d08cc20 RSI: 000055d22d08e940 RDI: 000055d22d095fe0
[ 329.306951] RBP: 0000000000000000 R08: 0000000000000000 R09: 000055d22d08cc40
[ 329.307408] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000055d22d095fe0
[ 329.307717] R13: 000055d22d08cc20 R14: 0000000000000000 R15: 00007fbd177288a4
[ 329.307987] Code: c0 4c 89 75 90 4c 89 75 88 4d 8d bd 80 02 00 00 4c 89 ff e8 cb 8e 66 00 49 8b b5 a0 02 00 00 49 8d 8d a0 02 00 00 48 39 ce 74 7b <44> 3b 66 34 75 75 48 89 f7 48 89 f2 eb 09 44 39 62 34 75 0b 48
[ 329.308923] RIP: ext4_process_freed_data+0x68/0x4d0 RSP: ffffb907416e39d0
[ 329.309365] CR2: 0000000000000034
[ 329.309812] ---[ end trace 1ecf08f3cdf242f0 ]---

- Download
Reported by Wen Xu at sslab, gatech.
Created attachment 274933 [details]
Proposed patch to fix the reported bug.

Thanks for reporting this bug. The attached should address the problem.

If the root directory has an i_links_count of zero, then when the file system is mounted, ext4_fill_super() notices the problem and calls iput() on the root directory in the error return path. ext4_evict_inode() then tries to free the inode on disk before all of the file system structures are set up, which results in an OOPS caused by a NULL pointer dereference.
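The ordering problem can be modelled in user space. The sketch below is not kernel code: model_sb, model_alloc_init() and model_count_freed() are hypothetical names, and it assumes the list head sits in zero-filled memory until the allocator setup runs late in mount. It only illustrates why walking such a list before setup ends up following a NULL next pointer, and how a guard avoids the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical model of per-filesystem state; not the real ext4_sb_info. */
struct model_sb {
    struct list_head freed_list; /* zero-filled until model_alloc_init() runs */
    int alloc_ready;             /* set once the list head is initialised */
};

/* Stand-in for the allocator setup that runs late in mount (assumption). */
void model_alloc_init(struct model_sb *sb)
{
    assert(sb != NULL);
    sb->freed_list.next = &sb->freed_list;
    sb->freed_list.prev = &sb->freed_list;
    sb->alloc_ready = 1;
}

/*
 * Count the entries on the freed list.
 * Returns -1 without walking when the head was never initialised: in that
 * state head->next is NULL, and an unguarded walk would dereference it on
 * the first iteration.
 */
int model_count_freed(const struct model_sb *sb)
{
    const struct list_head *pos;
    int n = 0;

    assert(sb != NULL);
    if (!sb->alloc_ready || sb->freed_list.next == NULL)
        return -1;

    for (pos = sb->freed_list.next; pos != &sb->freed_list; pos = pos->next)
        n++;
    return n;
}
```

Calling model_count_freed() on a zero-initialised model_sb returns -1, and after model_alloc_init() it returns 0 for the empty list. The guard is only there to make the model safe to run; it is not a description of the attached patch.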
(In reply to Theodore Tso from comment #2)
> Created attachment 274933 [details]
> Proposed patch to fix the reported bug.
>
> Thanks for reporting this bug. The attached should address the problem.
>
> If the root directory has an i_links_count of zero, then when the file
> system is mounted, then when ext4_fill_super() notices the problem and
> tries to call iput() the root directory in the error return path,
> ext4_evict_inode() will try to free the inode on disk, before all of
> the file system structures are set up, and this will result in an OOPS
> caused by a NULL pointer dereference.

Thank you for the quick response! By the way, I wonder whether I can get CVE numbers assigned for issues like the ones I reported recently?
Ted, I'm not quite getting it: ext4_iget() should fail getting the root inode in case i_links_count is 0. It should return -ESTALE and mark the inode as bad. So iput() will call ext4_evict_inode() but because the inode is marked as bad, we skip any attempts to delete the inode and just call ext4_clear_inode(). But apparently that didn't happen and ext4_iget() succeeded despite the inode having i_links_count == 0. So the question is why the check:

        if (inode->i_nlink == 0) {
                if ((inode->i_mode == 0 ||
                     !(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_ORPHAN_FS)) &&
                    ino != EXT4_BOOT_LOADER_INO) {
                        /* this inode is deleted */
                        ret = -ESTALE;
                        goto bad_inode;
                }

in ext4_iget() didn't trigger...
Well, the root directory i_mode is not 0, and EXT4_ORPHAN_FS is not set. We could add an "ino == EXT4_ROOT_INO ||" to the conditional; that would work too. But if the root inode has i_links_count set to zero, we know for sure it's not a stale NFS handle referencing a deleted inode; we know the file system must be corrupted, so calling ext4_error() is appropriate.
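To make the cases easy to check by hand, here is a small user-space model of the quoted conditional next to a variant that treats a zero-link root inode as corruption before anything else. It is a sketch, not the attached patch: model_inode, model_result, model_check_quoted() and model_check_root_first() are hypothetical names, and the result codes stand in for the errno values the kernel would return. The three EXT4_* constants use the values from the kernel's fs/ext4/ext4.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the kernel's fs/ext4/ext4.h. */
#define EXT4_ROOT_INO        2
#define EXT4_BOOT_LOADER_INO 5
#define EXT4_ORPHAN_FS       0x0004

/* Hypothetical result codes for this model only. */
enum model_result { MODEL_OK, MODEL_STALE, MODEL_CORRUPT };

/* Only the fields the quoted check reads; not the real struct inode. */
struct model_inode {
    unsigned long ino;
    uint16_t mode;
    unsigned int nlink;
};

/* The conditional as quoted from ext4_iget(), evaluated literally. */
enum model_result model_check_quoted(const struct model_inode *inode,
                                     unsigned int mount_state)
{
    assert(inode != NULL);
    if (inode->nlink == 0) {
        if ((inode->mode == 0 || !(mount_state & EXT4_ORPHAN_FS)) &&
            inode->ino != EXT4_BOOT_LOADER_INO)
            return MODEL_STALE; /* "this inode is deleted" */
    }
    return MODEL_OK;
}

/*
 * Variant: a root inode with a zero link count is reported as corruption
 * up front (the point where the kernel would call ext4_error()), then the
 * quoted check runs for everything else.
 */
enum model_result model_check_root_first(const struct model_inode *inode,
                                         unsigned int mount_state)
{
    assert(inode != NULL);
    if (inode->ino == EXT4_ROOT_INO && inode->nlink == 0)
        return MODEL_CORRUPT;
    return model_check_quoted(inode, mount_state);
}
```

Read literally, the quoted condition lets a zero-link inode with a nonzero mode through only when EXT4_ORPHAN_FS is set in the mount state that is passed in. The root-first variant does not depend on that bit, so a zero-link root inode is always rejected.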