Bug 199179

Summary: Invalid pointer dereference when mounting crafted ext4 image in ext4_process_freed_data
Product: File System
Reporter: Wen Xu (wen.xu)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED CODE_FIX
Severity: normal
CC: jack, tytso, wen.xu
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.15.x
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  The crafted image which causes kernel panic
  Proposed patch to fix the reported bug.

Description Wen Xu 2018-03-22 19:55:32 UTC
Created attachment 274875 [details]
The crafted image which causes kernel panic

- Overview
An invalid pointer dereference happens when mounting the crafted image.

- Reproduce
Requires kernel 4.15 (also reproduced on 4.10)
$ mkdir mnt
$ sudo mount -t ext4 83.img mnt

- Reason
https://elixir.bootlin.com/linux/v4.15/source/fs/ext4/mballoc.c#L2874
entry can be NULL, which means a list node points to NULL. 
Perhaps a use-after-free bug?
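
One way a list node can end up pointing to NULL without any use-after-free is a list head that was never initialized. The sketch below assumes that scenario (zero-filled memory, no initialization call yet). `struct list_head` here is a simplified stand-in for the kernel type, and `first_entry_is_null()` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialized, empty circular list points back at its own head. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Emptiness test for a circular list: the head's next pointer is the head. */
static int list_is_empty(const struct list_head *head)
{
    return head->next == head;
}

/*
 * Hypothetical helper: returns 1 when a loop guarded by list_is_empty()
 * would start by dereferencing a NULL first entry.
 */
static int first_entry_is_null(const struct list_head *head)
{
    return !list_is_empty(head) && head->next == NULL;
}
```

With a zero-filled head, `list_is_empty()` returns 0 because `next` is NULL rather than the head's own address, so a loop guarded by that test would go on to dereference NULL. A head set up with `init_list_head()` reports empty and the loop body never runs.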

- Kernel dump
[  329.292108] EXT4-fs (loop0): barriers disabled
[  329.292247] JBD2: Clearing recovery information on journal
[  329.293613] EXT4-fs (loop0): corrupt root inode, run e2fsck
[  329.293727] EXT4-fs error (device loop0): ext4_free_inode:308: comm mount: reserved or nonexistent inode 2
[  329.294052] EXT4-fs (loop0): mount failed
[  329.295260] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
[  329.295290] IP: ext4_process_freed_data+0x68/0x4d0
[  329.295304] PGD 0 P4D 0
[  329.295314] Oops: 0000 [#1] SMP PTI
[  329.295325] Modules linked in: vmw_balloon coretemp intel_rapl_perf input_leds joydev serio_raw snd_ens1371 btusb snd_ac97_codec uvcvideo btrtl videobuf2_vmalloc btbcm btintel gameport snd_rawmidi videobuf2_memops bluetooth videobuf2_v4l2 videobuf2_core snd_seq_device ac97_bus snd_pcm videodev media ecdh_generic snd_timer snd soundcore shpchp mac_hid vmw_vsock_vmci_transport vsock vmw_vmci sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmwgfx
[  329.295661]  psmouse ttm drm_kms_helper mptspi mptscsih ahci libahci e1000 mptbase scsi_transport_spi syscopyarea sysfillrect sysimgblt fb_sys_fops drm i2c_piix4 pata_acpi
[  329.295712] CPU: 0 PID: 1112 Comm: mount Not tainted 4.15.0-12-generic #13-Ubuntu
[  329.295732] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[  329.295763] RIP: 0010:ext4_process_freed_data+0x68/0x4d0
[  329.295778] RSP: 0018:ffffb907416e39d0 EFLAGS: 00010207
[  329.295794] RAX: 0000000000000000 RBX: ffff8b4bb91a1800 RCX: ffff8b4bb91a4aa0
[  329.295813] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8b4bb91a4a80
[  329.295832] RBP: ffffb907416e3a58 R08: 0000000000000000 R09: ffff8b4bb8dc9d88
[  329.295852] R10: 0000000000000228 R11: 00000000000001b0 R12: 0000000000000006
[  329.295872] R13: ffff8b4bb91a4800 R14: ffffb907416e39e0 R15: ffff8b4bb91a4a80
[  329.295891] FS:  00007fbd17943080(0000) GS:ffff8b4bbc600000(0000) knlGS:0000000000000000
[  329.295913] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  329.295929] CR2: 0000000000000034 CR3: 000000003686e004 CR4: 00000000001606f0
[  329.295991] Call Trace:
[  329.296005]  ? __set_page_dirty+0x9b/0xc0
[  329.296624]  ext4_journal_commit_callback+0x4d/0xd0
[  329.297198]  jbd2_journal_commit_transaction+0x1531/0x1720
[  329.297772]  jbd2_journal_destroy+0xdd/0x2a0
[  329.298362]  ? jbd2_journal_destroy+0xdd/0x2a0
[  329.298910]  ? wait_woken+0x80/0x80
[  329.299408]  ext4_fill_super+0x1cb9/0x2fb0
[  329.299890]  ? snprintf+0x45/0x70
[  329.300360]  mount_bdev+0x248/0x290
[  329.300833]  ? ext4_calculate_overhead+0x490/0x490
[  329.301260]  ? mount_bdev+0x248/0x290
[  329.301684]  ? ext4_calculate_overhead+0x490/0x490
[  329.302100]  ext4_mount+0x15/0x20
[  329.302515]  mount_fs+0x37/0x150
[  329.302907]  ? alloc_vfsmnt+0x1b3/0x230
[  329.303302]  vfs_kern_mount.part.23+0x5d/0x110
[  329.303685]  do_mount+0x5ed/0xcf0
[  329.304057]  ? memdup_user+0x4f/0x80
[  329.304418]  SyS_mount+0x98/0xe0
[  329.304768]  do_syscall_64+0x73/0x130
[  329.305110]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  329.305438] RIP: 0033:0x7fbd172073ca
[  329.305752] RSP: 002b:00007ffde39c6a08 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  329.306154] RAX: ffffffffffffffda RBX: 000055d22d08ca40 RCX: 00007fbd172073ca
[  329.306556] RDX: 000055d22d08cc20 RSI: 000055d22d08e940 RDI: 000055d22d095fe0
[  329.306951] RBP: 0000000000000000 R08: 0000000000000000 R09: 000055d22d08cc40
[  329.307408] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000055d22d095fe0
[  329.307717] R13: 000055d22d08cc20 R14: 0000000000000000 R15: 00007fbd177288a4
[  329.307987] Code: c0 4c 89 75 90 4c 89 75 88 4d 8d bd 80 02 00 00 4c 89 ff e8 cb 8e 66 00 49 8b b5 a0 02 00 00 49 8d 8d a0 02 00 00 48 39 ce 74 7b <44> 3b 66 34 75 75 48 89 f7 48 89 f2 eb 09 44 39 62 34 75 0b 48
[  329.308923] RIP: ext4_process_freed_data+0x68/0x4d0 RSP: ffffb907416e39d0
[  329.309365] CR2: 0000000000000034
[  329.309812] ---[ end trace 1ecf08f3cdf242f0 ]---
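
The faulting address in the dump (CR2 = 0x34 with RSI = 0) has the shape of a field read through a NULL structure pointer: the address is simply the field's offset. The sketch below illustrates that arithmetic with a hypothetical record layout (a list link, a tree node, then four 32-bit fields). The struct and field names are assumptions for illustration, the real kernel definition may differ, and the offsets assume a 64-bit (LP64) target.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for kernel linkage types (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

struct rb_node_model {
    unsigned long parent_color;
    struct rb_node_model *right, *left;
};

/* Hypothetical record layout; not copied from the kernel headers. */
struct free_data_model {
    struct list_head list;      /* link in a per-superblock list */
    struct rb_node_model node;  /* link in a per-group tree */
    uint32_t group;
    uint32_t start_cluster;
    uint32_t count;
    uint32_t tid;               /* transaction that freed the extent */
};

/* Address the CPU touches when reading a field at field_offset of entry. */
static uintptr_t field_fault_address(uintptr_t entry, size_t field_offset)
{
    return entry + field_offset;
}
```

On an LP64 target, `field_fault_address(0, offsetof(struct free_data_model, tid))` evaluates to 0x34 for this assumed layout, which is the pattern a NULL entry pointer would produce.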

- Download
Comment 1 Wen Xu 2018-03-22 19:56:38 UTC
Reported by Wen Xu at sslab, gatech.
Comment 2 Theodore Tso 2018-03-26 01:48:05 UTC
Created attachment 274933 [details]
Proposed patch to fix the reported bug.

Thanks for reporting this bug.  The attached should address the problem.

    If the root directory has an i_links_count of zero, then when the file
    system is mounted, ext4_fill_super() notices the problem and calls
    iput() on the root directory in the error return path.
    ext4_evict_inode() then tries to free the inode on disk before all of
    the file system structures are set up, which results in an OOPS
    caused by a NULL pointer dereference.
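
The error path described above hinges on what eviction does with an inode whose link count is zero. Below is a minimal sketch of that decision, assuming only two inputs matter (the link count and whether the inode was marked bad). `evict_action_for()` and the enum are hypothetical names, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of evicting an in-memory inode. */
enum evict_action {
    EVICT_CLEAR_ONLY,   /* drop the in-memory inode, leave the disk alone */
    EVICT_FREE_ON_DISK  /* treat the inode as deleted and free it on disk */
};

/*
 * Sketch: an inode that still has links, or that was marked bad when it
 * was read, is only cleared. An unlinked inode that was read successfully
 * is freed on disk, which needs the allocator and journal to be set up.
 */
static enum evict_action evict_action_for(uint32_t nlink, bool marked_bad)
{
    if (nlink != 0 || marked_bad)
        return EVICT_CLEAR_ONLY;
    return EVICT_FREE_ON_DISK;
}
```

In this model, a root inode read successfully with a zero link count takes the `EVICT_FREE_ON_DISK` branch, which is the situation that is unsafe while the mount is still being torn down on error.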
Comment 3 Wen Xu 2018-03-26 15:59:54 UTC
(In reply to Theodore Tso from comment #2)
> Created attachment 274933 [details]
> Proposed patch to fix the reported bug.
> 
> Thanks for reporting this bug.  The attached should address the problem.
> 
>     If the root directory has an i_links_count of zero, then when the file
>     system is mounted, ext4_fill_super() notices the problem and calls
>     iput() on the root directory in the error return path.
>     ext4_evict_inode() then tries to free the inode on disk before all of
>     the file system structures are set up, which results in an OOPS
>     caused by a NULL pointer dereference.

Thank you for the quick response! By the way, can I get CVE numbers assigned for issues like this that I reported recently?
Comment 4 Jan Kara 2018-03-29 14:00:24 UTC
Ted, I'm not quite getting it: ext4_iget() should fail getting the root inode in case i_links_count is 0. It should return -ESTALE and mark the inode as bad. So iput() will call ext4_evict_inode() but because the inode is marked as bad, we skip any attempts to delete the inode and just call ext4_clear_inode(). But apparently that didn't happen and ext4_iget() succeeded despite inode having i_links_count == 0. So the question is why the check:

        if (inode->i_nlink == 0) {
                if ((inode->i_mode == 0 ||
                     !(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_ORPHAN_FS)) &&
                    ino != EXT4_BOOT_LOADER_INO) {
                        /* this inode is deleted */
                        ret = -ESTALE;
                        goto bad_inode;
                }

in ext4_iget() didn't trigger...
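
To see which inputs let an unlinked inode past this check, the condition can be modeled as a plain predicate. This is a sketch, not kernel code: `rejected_as_deleted()` is a hypothetical name, and the `MODEL_*` constants are stand-ins for `EXT4_ORPHAN_FS` and `EXT4_BOOT_LOADER_INO` with values assumed for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ORPHAN_FS       0x0004u /* stand-in for EXT4_ORPHAN_FS (assumed value) */
#define MODEL_BOOT_LOADER_INO 5u      /* stand-in for EXT4_BOOT_LOADER_INO (assumed value) */

/*
 * Mirrors the quoted condition: an inode with no links is rejected as
 * deleted unless it has a nonzero mode while the orphan flag is set,
 * or it is the boot loader inode.
 */
static bool rejected_as_deleted(uint32_t nlink, uint16_t mode,
                                uint16_t mount_state, uint32_t ino)
{
    if (nlink != 0)
        return false; /* the check only applies to unlinked inodes */

    return (mode == 0 || !(mount_state & MODEL_ORPHAN_FS)) &&
           ino != MODEL_BOOT_LOADER_INO;
}
```

For example, `rejected_as_deleted(0, 040755, MODEL_ORPHAN_FS, 2)` is false: an unlinked inode with a nonzero mode passes the check when the orphan flag is set in the mount state.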
Comment 5 Theodore Tso 2018-03-30 00:28:35 UTC
Well, the root directory i_mode is not 0, and EXT4_ORPHAN_FS is not set.

We could add an "ino == EXT4_ROOT_INO ||" to the conditional; that would work too. But if the root inode has i_links_count set to zero, we know for sure it's not a stale NFS handle referencing a deleted inode --- we know the file system must be corrupted, so calling ext4_error() is appropriate.
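
The second option, treating an unlinked root inode as corruption rather than as a stale handle, could look roughly like the sketch below. It is not the attached patch: `check_root_links()` and the `iget_verdict` enum are hypothetical, `MODEL_ROOT_INO` is a stand-in for `EXT4_ROOT_INO`, and the sketch assumes the check would run before the generic deleted-inode test.

```c
#include <stdint.h>

#define MODEL_ROOT_INO 2u /* stand-in for EXT4_ROOT_INO (assumed value) */

/* Hypothetical result of the root-specific sanity check. */
enum iget_verdict {
    IGET_CONTINUE, /* nothing wrong found here, keep loading the inode */
    IGET_CORRUPT   /* report file system corruption and fail the lookup */
};

/*
 * A root directory can never be legitimately deleted, so a zero link
 * count on it is reported as corruption instead of a stale handle.
 */
static enum iget_verdict check_root_links(uint32_t ino, uint16_t links_count)
{
    if (ino == MODEL_ROOT_INO && links_count == 0)
        return IGET_CORRUPT;
    return IGET_CONTINUE;
}
```

Keeping this separate from the stale-handle case lets the caller log an error for the corrupted image while ordinary deleted inodes still produce the quieter stale result.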