CONFIG_QUOTA=y CONFIG_FAULT_INJECTION=y Find an environment without ext4 filesystem. qemu args(Make buffer head is easily released): -smp 2 -m 2048 1. Apply diff 2. ./test.sh [ 49.477106] v2_read_file_info: <3> free_blk 0 free_entry 5 dqi_blks 6 isize 6144 [ 49.491722] v2_read_file_info: <4> free_blk 0 free_entry 5 dqi_blks 6 isize 6144 [ 49.493171] ext4 filesystem being mounted at /root/temp supports timestamps until 2038 (0x7fffffff) [ 50.335459] EXT4-fs (sda): Invalid checksum for backup superblock 8193 [ 50.335459] [ 50.337433] EXT4-fs error (device sda) in ext4_update_backup_sb:174: Filesystem failed CRC [ 50.339314] EXT4-fs (sda): Invalid checksum for backup superblock 24577 [ 50.339314] [ 50.340444] EXT4-fs error (device sda) in ext4_update_backup_sb:174: Filesystem failed CRC [ 52.690568] ext4_quota_write[chown]: write blk 6 phy 4097 [ 58.127839] wait offline [ 58.630388] wait done [ 58.630790] Buffer I/O error on dev sda, logical block 4097, lost async page write [ 58.631947] end_buffer_async_write: bh ffff88801b5132d8(blk 4097, ref 2) uptodate 0 [ 58.634121] EXT4-fs error (device sda): ext4_check_bdev_write_error:217: comm chown: Error while async write back metadata [ 59.165117] put_free_dqblk[chown]: info->dqi_free_blk is 6 [ 65.272115] wait offline [ 65.788928] wait done [ 65.789977] Buffer I/O error on dev sda, logical block 4097, lost async page write [ 65.790713] end_buffer_async_write: bh ffff88801b5132d8(blk 4097, ref 2) uptodate 0 [ 66.998850] EXT4-fs error (device sda): ext4_check_bdev_write_error:217: comm ls: Error while async write back metadata [ 104.069217] free_buffer_head: free g_bh ffff88801b5132d8 [ 104.155486] get_free_dqblk[chown]: use info->dqi_free_blk 6 [ 104.156539] get_free_dqblk[chown]: set info->dqi_free_blk 4294959296 [ 104.400851] get_free_dqblk[chown]: use info->dqi_free_blk 4294959296 [ 104.402565] get_free_dqblk[chown]: set info->dqi_free_blk 0 [ 104.403585] __quota_error: 10 callbacks suppressed [ 104.403593] Quota error (device sda): qtree_write_dquot: Error -8000 occurred while creating quota [ 104.405205] BUG: unable to handle page fault for address: ffffffffffffe120 [ 104.405943] #PF: supervisor write access in kernel mode [ 104.406505] #PF: error_code(0x0002) - not-present page [ 104.407070] PGD 2e0b067 P4D 2e0b067 PUD 2e0d067 PMD 0 [ 104.407683] Oops: 0002 [#1] PREEMPT SMP [ 104.408101] CPU: 0 PID: 5974 Comm: chown Not tainted 6.0.0-rc1-00004-gaa59d8788f74-dirty #780 [ 104.408970] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc314 [ 104.410414] RIP: 0010:_raw_spin_lock+0x3a/0x90 [ 104.411202] Code: 05 9a 36 c7 08 01 e8 c5 a3 ff fe 48 83 05 95 36 c7 08 01 48 83 05 95 36 c7 08 01 31 c0 ba 01 00 00 00 48 83 05 36 36 c73 [ 104.413329] RSP: 0018:ffff88801c4dfb68 EFLAGS: 00010206 [ 104.413953] RAX: 0000000000000000 RBX: ffffffffffffe0c0 RCX: 0000000000000000 [ 104.414692] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000001 [ 104.415408] RBP: ffffffffffffe120 R08: 0000000000000000 R09: ffffffff8a0e97a0 [ 104.416243] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88801c4dfbe8 [ 104.417188] R13: ffffffffffffe120 R14: 0000000000000000 R15: ffff88801c4dfd08 [ 104.418120] FS: 00007fbe0bcfb4c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 [ 104.419215] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 104.419928] CR2: ffffffffffffe120 CR3: 000000001389c000 CR4: 00000000000006f0 [ 104.420803] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 104.421871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 104.423104] Call Trace: [ 104.423506] <TASK> [ 104.423878] dquot_add_inodes+0x28/0x270 [ 104.424509] __dquot_transfer+0x377/0x840 [ 104.425177] ? jbd2_journal_stop+0x20b/0x540 [ 104.425944] ? percpu_counter_add_batch+0x88/0x120 [ 104.426665] ? remove_waiter+0x400/0x490 [ 104.427327] ? dqput+0x234/0x350 [ 104.427826] ? dqget+0x661/0x6a0 [ 104.428347] dquot_transfer+0xde/0x540 [ 104.428975] ext4_setattr+0x405/0x14d0 [ 104.429411] ? ext4_xattr_security_get+0x23/0x30 [ 104.429896] ? __vfs_getxattr+0x60/0xa0 [ 104.430298] notify_change+0x68e/0x9f0 [ 104.430698] ? chown_common+0x300/0x430 [ 104.431125] chown_common+0x300/0x430 [ 104.431492] ? kmemleak_free+0x43/0xa0 [ 104.431906] ? preempt_count_add+0x83/0x130 [ 104.432327] do_fchownat+0x107/0x180 [ 104.432711] __x64_sys_fchownat+0x29/0x40 [ 104.433144] do_syscall_64+0x35/0x80 [ 104.433543] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 104.434069] RIP: 0033:0x7fbe0b7013ca [ 104.434444] Code: 48 8b 0d c1 da 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 04 01 00 008 [ 104.436400] RSP: 002b:00007fff12476b78 EFLAGS: 00000246 ORIG_RAX: 0000000000000104 [ 104.437228] RAX: ffffffffffffffda RBX: 00007fff12476dc0 RCX: 00007fbe0b7013ca [ 104.438169] RDX: 000000000000fffe RSI: 000055cfd716d070 RDI: 00000000ffffff9c [ 104.439148] RBP: 000055cfd716bd10 R08: 0000000000000000 R09: 00000000ffffffff [ 104.440000] R10: 00000000ffffffff R11: 0000000000000246 R12: 000055cfd716bd88 [ 104.440863] R13: 00000000ffffffff R14: 00000000ffffff9c R15: 0000000000000001 [ 104.441690] </TASK> [ 104.441980] Modules linked in: [ 104.442355] CR2: ffffffffffffe120 [ 104.442743] ---[ end trace 0000000000000000 ]--- [ 104.443296] RIP: 0010:_raw_spin_lock+0x3a/0x90 [ 104.443852] Code: 05 9a 36 c7 08 01 e8 c5 a3 ff fe 48 83 05 95 36 c7 08 01 48 83 05 95 36 c7 08 01 31 c0 ba 01 00 00 00 48 83 05 36 36 c73 [ 104.445966] RSP: 0018:ffff88801c4dfb68 EFLAGS: 00010206 [ 104.446610] RAX: 0000000000000000 RBX: ffffffffffffe0c0 RCX: 0000000000000000 [ 104.447489] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000001 [ 104.448120] RBP: ffffffffffffe120 R08: 0000000000000000 R09: ffffffff8a0e97a0 [ 104.448756] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88801c4dfbe8 [ 104.449380] R13: ffffffffffffe120 R14: 0000000000000000 R15: ffff88801c4dfd08 [ 104.450036] FS: 00007fbe0bcfb4c0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 [ 104.450874] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 104.451530] CR2: ffffffffffffe120 CR3: 000000001389c000 CR4: 00000000000006f0 [ 104.452234] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 104.453081] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Created attachment 301588 [details] test.sh
Created attachment 301589 [details] diff
Created attachment 301590 [details] b.c
(In reply to Zhihao Cheng from comment #0) 0. gcc -o bb b.c > > 1. Apply diff > 2. ./test.sh
Is this reproducible with v5.19?
在 2022/8/17 17:17, bugzilla-daemon@kernel.org 写道: > https://bugzilla.kernel.org/show_bug.cgi?id=216372 > > --- Comment #5 from Christian Brauner (brauner@kernel.org) --- > Is this reproducible with v5.19? > I guess yes. I found this problem firstly in 5.10-lts. PS: I'm trying to fix it. Init: v2_read_file_info: <3> dqi_free_blk 0 dqi_free_entry 5 dqi_blks 6 Step 1. chown bin f_a -> dquot_acquire -> v2_write_dquot: qtree_write_dquot do_insert_tree find_free_dqentry get_free_dqblk write_blk(info->dqi_blocks) // info->dqi_blocks = 6, failure. The content in physical block (corresponding to blk 6) is random. Step 2. chown root f_a -> dquot_transfer -> dqput_all -> dqput -> ext4_release_dquot -> v2_release_dquot -> qtree_delete_dquot: dquot_release remove_tree free_dqentry put_free_dqblk(6) info->dqi_free_blk = blk // info->dqi_free_blk = 6 Step 3. drop cache (buffer head for block 6 is released) Step 4. chown bin f_b -> dquot_acquire -> commit_dqblk -> v2_write_dquot: qtree_write_dquot do_insert_tree find_free_dqentry get_free_dqblk dh = (struct qt_disk_dqdbheader *)buf blk = info->dqi_free_blk // 6 ret = read_blk(info, blk, buf) // The content of buf is random info->dqi_free_blk = le32_to_cpu(dh->dqdh_next_free) // random blk Step 5. chown bin f_c -> notify_change -> ext4_setattr -> dquot_transfer: dquot = dqget -> acquire_dquot -> ext4_acquire_dquot -> dquot_acquire -> commit_dqblk -> v2_write_dquot -> dq_insert_tree: do_insert_tree find_free_dqentry get_free_dqblk blk = info->dqi_free_blk // If blk < 0 and blk is not an error code, it will be returned as dquot transfer_to[USRQUOTA] = dquot // A random negative value __dquot_transfer(transfer_to) dquot_add_inodes(transfer_to[cnt]) spin_lock(&dquot->dq_dqb_lock) // page fault
Thanks for the report. As far as I'm reading the reproducer, it basically corrupts the quota file (by cleverly inserting some IO failures). We have added some more sanity checking of data loaded from quota file after 5.10 (e.g. commit 9bf3d2033129 ("quota: check block number when reading the block in quota file") in 5.16). Now AFAICS we do not check validity of the free blocks list (we should check dqdh_next_free before using it) so your reproducer will still cause more damage than necessary. Will you send a patch to add such checks please?
(In reply to Jan Kara from comment #7) > Thanks for the report. As far as I'm reading the reproducer, it basically > corrupts the quota file (by cleverly inserting some IO failures). We have > added some more sanity checking of data loaded from quota file after 5.10 > (e.g. commit 9bf3d2033129 ("quota: check block number when reading the block > in quota file") in 5.16). Now AFAICS we do not check validity of the free > blocks list (we should check dqdh_next_free before using it) so your > reproducer will still cause more damage than necessary. Will you send a > patch to add such checks please? Yes, sure. I will send the fix patch as soon as possible, before that, I need do more verifications to make sure I find all situations.
Patches were merged as: 6c8ea8b8cd47 ("quota: Check next/prev free block number after reading from quota file") 3fc61e0e96a3 ("quota: Replace all block number checking with helper function") 191249f70889 ("quota: Add more checking after reading from quota file") Closing the bug.