Bug 202897
| Summary: | BUG: unable to handle kernel paging request at __memmove | | |
|---|---|---|---|
| Product: | File System | Reporter: | Jungyeon (jungyeon) |
| Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | 389387252, tytso |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 5.0-rc8 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | The (compressed) crafted image which causes crash; min_01.c; run script | | |
Created attachment 281789 [details]
min_01.c
I cannot reproduce this bug by following these steps.

Created attachment 281849 [details]
run script
Oops. Could you run this shell script to reproduce it? It is quite simple, and for me it reproduces the bug reliably.
The following patch can fix this bug, but I'm not sure it is the best way to fix it.

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 86ed9c6..fd2ebba 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1695,7 +1695,7 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 
     /* No failures allowed past this point. */
 
-    if (!s->not_found && here->e_value_size && here->e_value_offs) {
+    if (!s->not_found && here->e_value_size && here->e_value_offs && !here->e_value_inum) {
         /* Remove the old value. */
         void *first_val = s->base + min_offs;
         size_t offs = le16_to_cpu(here->e_value_offs);

It is likely that this patch is more suitable than the last one:

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 86ed9c6..d7fe353 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1695,7 +1695,7 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 
     /* No failures allowed past this point. */
 
-    if (!s->not_found && here->e_value_size && here->e_value_offs) {
+    if (old_size && here->e_value_size && here->e_value_offs) {
         /* Remove the old value. */
         void *first_val = s->base + min_offs;
         size_t offs = le16_to_cpu(here->e_value_offs);

The patch in #4 looks closer to the right thing. In fact, this should do:

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index dc82e7757f67..491f9ee4040e 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1696,7 +1696,7 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
 
     /* No failures allowed past this point. */
 
-    if (!s->not_found && here->e_value_size && here->e_value_offs) {
+    if (!s->not_found && here->e_value_size && !here->e_value_inum) {
         /* Remove the old value. */
         void *first_val = s->base + min_offs;
         size_t offs = le16_to_cpu(here->e_value_offs);

That's because if e_value_inum==0, then here->e_value_offs is guaranteed to be non-zero --- otherwise it would have failed a check earlier in ext4_xattr_check_entries(). In the case where e_value_inum != 0, here->e_value_offs must be zero.
Currently, however, we're not checking this in either the kernel or e2fsck. We're just ignoring in all other cases when !e_value_inum. Why we're ignoring e_value_offs and not doing a check, I'm not sure. I want to dig back through some older e-mail discussions and see if I can get the original developer who did the ea_inode feature to tell me whether he knows of some reason why we're not enforcing that check. My preference would be to enforce that check and fail the inode as corrupt, but there may be something I'm missing.

There are also a large number of other tests which we're not enforcing in the kernel that I'm strongly considering adding. The root inode is being used as an ea_inode value --- and that should have been rejected, except the EXT4_EA_INODE_FL flag was set on inode #2. But in that case, we probably should reject all files that are reachable from the name space (e.g., all directories, regular files, etc.) that have EXT4_EA_INODE_FL; that should never happen. If we did that, the file system would never have mounted successfully, so we wouldn't have tripped this particular memmove case. That is good in production, but it does make it harder for fuzzers to find legitimate real bugs, since we block them much earlier in the process.

In this particular case, the fact that we have e_value_inum pointing at a root directory is not the reason why we BUG'ed on the memmove. It's because e_value_offs was non-zero when e_value_inum was also non-zero, and that's not supposed to ever happen.

I should probably also make the kernel more strict about having both a journal UUID and a journal inum set at the same time. That's again one of those "should never happen" situations. Thanks again for the patch.

The above suggested patch passes the aforementioned testcase.

Thanks for the confirmation! I was about to ping you to ask if you could test it, since it's not something I can test for myself at the moment. This does underline that releasing instructions on how to build the userspace test library automatically from a particular kernel version is going to be critically important, since if a bug is only reproducible using the LKL userspace program, we need to be able to build a modified binary if we are to confirm that the issue has been addressed.

Sorry for keeping you waiting. We will release the code-base in around two weeks, since people are asking for it. We hope it can help you to reproduce and patch the bugs.
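The invariant discussed in the comments above (a value stored in an EA inode must not also carry an in-block offset) can be sketched in user space. This is only an illustration under stated assumptions: `struct xattr_entry_sketch`, `has_inline_value()` and `entry_is_inconsistent()` are hypothetical names, the struct is a simplified host-endian stand-in for the on-disk entry, and none of this is the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for the on-disk xattr entry.
 * Only the three fields discussed in the thread are modeled, and they
 * are host-endian here (the real fields are little-endian on disk).
 */
struct xattr_entry_sketch {
    uint16_t e_value_offs; /* offset of the value inside the block */
    uint32_t e_value_inum; /* inode holding the value, or 0 */
    uint32_t e_value_size; /* size of the value in bytes */
};

/*
 * Mirrors the condition of the last patch in the thread: the old value
 * lives in the block (and must be removed from the value area) only if
 * it has a size and is not stored in an EA inode.
 */
static bool has_inline_value(const struct xattr_entry_sketch *e)
{
    return e->e_value_size != 0 && e->e_value_inum == 0;
}

/*
 * The stricter validation considered in the thread: an entry that
 * points at an EA inode but also has a non-zero offset is corrupt.
 */
static bool entry_is_inconsistent(const struct xattr_entry_sketch *e)
{
    return e->e_value_inum != 0 && e->e_value_offs != 0;
}
```

With an entry such as `{ .e_value_offs = 100, .e_value_inum = 2, .e_value_size = 28 }` (illustrative values, not taken from the crafted image), `entry_is_inconsistent()` returns true and `has_inline_value()` returns false, so the value-removal path would be skipped.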
Created attachment 281787 [details]
The (compressed) crafted image which causes crash

- Overview
After mounting the crafted image, I got this page fault while running the attached program.

- Steps to reproduce
mkdir test
mount -t ext4 tmp.img test
gcc min_01.c
cp a.out test
cd test
./a.out

- Kernel messages
[ 74.327744] BUG: unable to handle kernel paging request at ffff95f12b296000
[ 74.329597] #PF error: [PROT] [WRITE]
[ 74.330547] PGD 23601067 P4D 23601067 PUD 2366b2063 PMD 23541d063 PTE 800000022b296061
[ 74.332538] Oops: 0003 [#1] SMP PTI
[ 74.333429] CPU: 0 PID: 1158 Comm: a.out Not tainted 5.0.0-rc8+ #9
[ 74.335059] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 74.337313] RIP: 0010:__memmove+0x81/0x1a0
[ 74.338359] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
[ 74.343035] RSP: 0018:ffffb09a011ef938 EFLAGS: 00010207
[ 74.344361] RAX: ffff95f12666a000 RBX: ffffb09a011efb40 RCX: 1fffffffff67a7fc
[ 74.346163] RDX: ffffffffffffffe4 RSI: ffff95f12b296000 RDI: ffff95f12b296000
[ 74.347980] RBP: ffffb09a011efa38 R08: 0000000000000001 R09: ffff95f1324acf00
[ 74.349763] R10: ffff95f126669fdc R11: 0000000000000000 R12: ffffb09a011efab8
[ 74.351560] R13: ffff95f12666a000 R14: 00000000000003e4 R15: 0000000000000000
[ 74.353343] FS: 00007fa3b7981700(0000) GS:ffff95f137a00000(0000) knlGS:0000000000000000
[ 74.355374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 74.356815] CR2: ffff95f12b296000 CR3: 000000022b2bc006 CR4: 00000000000206f0
[ 74.358622] Call Trace:
[ 74.359263] ? ext4_xattr_set_entry+0xa55/0x1090
[ 74.360447] ? jbd2_journal_cancel_revoke+0xbf/0xf0
[ 74.361696] ? kmem_cache_alloc+0xb0/0x170
[ 74.362761] ? jbd2_journal_get_write_access+0x5b/0x70
[ 74.364062] ext4_xattr_block_set+0x37a/0xf80
[ 74.365173] ? __getblk_gfp+0x2f/0x300
[ 74.366129] ? xattr_find_entry+0x8c/0x110
[ 74.367183] ext4_xattr_set_handle+0x544/0x5f0
[ 74.368315] __ext4_set_acl+0x1aa/0x290
[ 74.369293] ext4_set_acl+0xbf/0x1f0
[ 74.370210] ? posix_acl_from_xattr+0x180/0x180
[ 74.371373] set_posix_acl+0x79/0xb0
[ 74.372282] posix_acl_xattr_set+0x84/0x90
[ 74.373321] __vfs_removexattr+0x52/0x70
[ 74.374310] vfs_removexattr+0x84/0x100
[ 74.375293] removexattr+0x55/0x80
[ 74.376157] ? __check_object_size+0x17c/0x1b0
[ 74.377272] ? strncpy_from_user+0x50/0x1b0
[ 74.378323] ? _cond_resched+0x1a/0x50
[ 74.379292] ? __sb_start_write+0x3f/0x70
[ 74.380310] ? mnt_want_write+0x2c/0x50
[ 74.381284] path_removexattr+0x9a/0xb0
[ 74.382252] __x64_sys_removexattr+0x1b/0x20
[ 74.383357] do_syscall_64+0x5a/0x110
[ 74.384293] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 74.385568] RIP: 0033:0x7fa3b749c4d9
[ 74.386491] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8f 29 2c 00 f7 d8 64 89 01 48
[ 74.391133] RSP: 002b:00007ffffd7aeb08 EFLAGS: 00000202 ORIG_RAX: 00000000000000c5
[ 74.393021] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa3b749c4d9
[ 74.394822] RDX: 0000000000000000 RSI: 00007ffffd7aeb30 RDI: 00007ffffd7aeb20
[ 74.396608] RBP: 00007ffffd7aeb50 R08: 00007fa3b7775ab0 R09: 00007ffffd7aec38
[ 74.398392] R10: 00000000004006a0 R11: 0000000000000202 R12: 00000000004004a0
[ 74.400175] R13: 00007ffffd7aec30 R14: 0000000000000000 R15: 0000000000000000
[ 74.401951] Modules linked in:
[ 74.402744] CR2: ffff95f12b296000
[ 74.403596] ---[ end trace e7fe34a5ca4f4421 ]---
[ 74.404771] RIP: 0010:__memmove+0x81/0x1a0
[ 74.405815] Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
[ 74.410512] RSP: 0018:ffffb09a011ef938 EFLAGS: 00010207
[ 74.411833] RAX: ffff95f12666a000 RBX: ffffb09a011efb40 RCX: 1fffffffff67a7fc
[ 74.413618] RDX: ffffffffffffffe4 RSI: ffff95f12b296000 RDI: ffff95f12b296000
[ 74.415419] RBP: ffffb09a011efa38 R08: 0000000000000001 R09: ffff95f1324acf00
[ 74.417211] R10: ffff95f126669fdc R11: 0000000000000000 R12: ffffb09a011efab8
[ 74.419022] R13: ffff95f12666a000 R14: 00000000000003e4 R15: 0000000000000000
[ 74.420821] FS: 00007fa3b7981700(0000) GS:ffff95f137a00000(0000) knlGS:0000000000000000
[ 74.422857] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 74.424306] CR2: ffff95f12b296000 CR3: 000000022b2bc006 CR4: 00000000000206f0

- Root cause
When memmove is called at line 1704, it is given an extreme value as its count (the 3rd parameter). This is because val is smaller than first_val in this case, so the count val - first_val is negative. The value -28 becomes 0xffffffffffffffe4 when converted to the unsigned size_t count because of two's complement, which matches RDX in the register dump. As a result, memmove faults while copying with a huge count.

1696     /* No failures allowed past this point. */
1697
1698     if (!s->not_found && here->e_value_size && here->e_value_offs) {
1699         /* Remove the old value. */
1700         void *first_val = s->base + min_offs;
1701         size_t offs = le16_to_cpu(here->e_value_offs);
1702         void *val = s->base + offs;
1703
1704         memmove(first_val + old_size, first_val, val - first_val);
1705         memset(first_val, 0, old_size);
1706         min_offs += old_size;
1707
1708         /* Adjust all value offsets. */
1709         last = s->first;
1710         while (!IS_LAST_ENTRY(last)) {
1711             size_t o = le16_to_cpu(last->e_value_offs);
1712
1713             if (!last->e_value_inum &&
1714                 last->e_value_size && o < offs)
1715                 last->e_value_offs = cpu_to_le16(o + old_size);
1716             last = EXT4_XATTR_NEXT(last);
1717         }
1718     }
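The count computation described above can be reproduced in user space. The following is a minimal sketch, not kernel code: `memmove_count()` and `checked_count()` are hypothetical helpers, and the offsets used in the usage note are illustrative values chosen only so that the difference is -28, as in the report.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: the length that memmove() would receive for
 * "val - first_val", where val = base + offs and
 * first_val = base + min_offs. The signed difference is converted to
 * size_t, exactly as happens when it is passed as the third argument.
 */
static size_t memmove_count(size_t offs, size_t min_offs)
{
    ptrdiff_t diff = (ptrdiff_t)offs - (ptrdiff_t)min_offs;

    return (size_t)diff; /* a negative diff wraps to a huge value */
}

/*
 * Hypothetical guard, shown only to illustrate the failure mode:
 * report an error instead of producing a wrapped count when the value
 * offset lies below the start of the value area.
 */
static bool checked_count(size_t offs, size_t min_offs, size_t *out)
{
    if (offs < min_offs)
        return false;
    *out = offs - min_offs;
    return true;
}
```

For example, `memmove_count(968, 996)` yields `(size_t)-28`, which is 0xffffffffffffffe4 on a 64-bit system, while `checked_count(968, 996, &n)` returns false and leaves `n` untouched.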