Bug 217078 - [Syzkaller & bisect] There is "xfs_dquot_alloc" related BUG in v6.2 in guest
Summary: [Syzkaller & bisect] There is "xfs_dquot_alloc" related BUG in v6.2 in guest
Status: NEW
Alias: None
Product: Linux
Classification: Unclassified
Component: Kernel (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Virtual assignee for kernel bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-24 04:47 UTC by xupengfe
Modified: 2023-03-28 16:00 UTC (History)
0 users

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
xfs_fixed_BUG_issue_dmesg_and_left_call_trace (2.66 MB, application/octet-stream)
2023-02-27 07:21 UTC, xupengfe
Details

Description xupengfe 2023-02-24 04:47:19 UTC
Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/repro.c
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/kconfig_origin
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/v6.2_c9c3395d5e3dcc6daee66c6908354d47bf98cb0c_dmesg.log
Bisect info(Might not be correct this time, but just gave some clues to the problem) https://github.com/xupengfe/syzkaller_logs/blob/main/230222_152458_xfs_dquot_alloc_bug/bisect_info.log

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230222_152458_xfs_dquot_alloc_bug

Platform: ADL-S, and it could be reproduced on x86 platform in guest.
There is "xfs_dquot_alloc" related BUG in v6.2:

[   71.149963] xfs filesystem being mounted at /root/syzkaller.6TPmw0/0/file0 supports timestamps until 2038 (0x7fffffff)
[   71.150653] 00000000: 58 41 47 49 00 00 00 01 00 00 00 00 00 00 80 00  XAGI............
[   71.151006] 00000010: 00 00 00 40 00 00 00 06 00 00 00 01 00 00 00 37  ...@...........7
[   71.151321] 00000020: 00 00 00 20 ff ff ff ff ff ff ff ff ff ff ff ff  ... ............
[   71.151633] 00000030: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.151946] 00000040: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.152259] 00000050: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.152570] 00000060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.152881] 00000070: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.153193] 00000080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.153607] 00000090: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.153921] 000000a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.154237] 000000b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.154549] 000000c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.154865] 000000d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.155180] 000000e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.155494] 000000f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.155807] 00000100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.156119] 00000110: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
[   71.156433] 00000120: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00  ................
[   71.156747] 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   71.157059] 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   71.157382] 00000150: 00 00 00 00 00 00 00 00                          ........
[   71.157671] XFS (loop3): Internal error xfs_iunlink_remove_inode at line 2013 of file fs/xfs/xfs_inode.c.  Caller xfs_ifree+0xed/0x9e0
[   71.158154] CPU: 1 PID: 137 Comm: kworker/1:3 Not tainted 6.2.0-c9c3395d5e3d #1
[   71.158447] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   71.158890] Workqueue: xfs-inodegc/loop3 xfs_inodegc_worker
[   71.159117] Call Trace:
[   71.159220]  <TASK>
[   71.159313]  dump_stack_lvl+0xa7/0xdb
[   71.159478]  dump_stack+0x19/0x1f
[   71.159621]  xfs_corruption_error+0xd7/0xe0
[   71.159805]  ? xfs_ifree+0xed/0x9e0
[   71.159957]  xfs_iunlink_remove+0x32e/0x590
[   71.160136]  ? xfs_ifree+0xed/0x9e0
[   71.160290]  xfs_ifree+0xed/0x9e0
[   71.160432]  ? write_comp_data+0x2f/0x90
[   71.160600]  ? xfs_trans_ijoin+0x47/0x70
[   71.160768]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   71.160964]  ? xfs_trans_add_item+0x79/0x1c0
[   71.161151]  xfs_inactive_ifree+0xf8/0x2a0
[   71.161324]  xfs_inactive+0x226/0x340
[   71.161482]  xfs_inodegc_worker+0xd3/0x430
[   71.161657]  process_one_work+0x3b1/0x960
[   71.161837]  worker_thread+0x52/0x660
[   71.161999]  ? __pfx_worker_thread+0x10/0x10
[   71.162184]  kthread+0x161/0x1a0
[   71.162331]  ? __pfx_kthread+0x10/0x10
[   71.162498]  ret_from_fork+0x29/0x50
[   71.162670]  </TASK>
[   71.162773] XFS (loop3): Corruption detected. Unmount and run xfs_repair
[   71.163039] XFS (loop3): xfs_inactive_ifree: xfs_ifree returned error -117
[   71.163869] repro: attempt to access beyond end of device
[   71.163869] loop3: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[   71.164410] XFS (loop3): log I/O error -5
[   71.166002] XFS (loop3): Metadata I/O Error (0x1) detected at xfs_inactive_ifree+0x232/0x2a0 (fs/xfs/xfs_inode.c:1612).  Shutting down filesystem.
[   71.166541] XFS (loop3): Please unmount the filesystem and rectify the problem(s)
[   71.167188] XFS (loop1): DAX unsupported by block device. Turning off DAX.
[   71.167530] XFS (loop7): DAX unsupported by block device. Turning off DAX.
[   71.167859] XFS (loop6): DAX unsupported by block device. Turning off DAX.
[   71.168188] XFS (loop4): DAX unsupported by block device. Turning off DAX.
[   71.168554] XFS (loop1): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.168995] XFS (loop7): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.169595] XFS (loop2): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.169614] XFS (loop6): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.170444] XFS (loop4): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.172161] XFS (loop3): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[   71.172510] xfs filesystem being mounted at /root/syzkaller.mc3H24/0/file0 supports timestamps until 2038 (0x7fffffff)
[   71.174633] XFS (loop3): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.175936] XFS (loop5): DAX unsupported by block device. Turning off DAX.
[   71.176303] XFS (loop5): Mounting V4 Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.178679] XFS (loop4): totally zeroed log
[   71.179069] XFS (loop4): Ending clean mount
[   71.179442] XFS (loop4): Quotacheck needed: Please wait.
[   71.183738] XFS (loop6): totally zeroed log
[   71.184537] repro: attempt to access beyond end of device
[   71.184537] loop4: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[   71.184637] XFS (loop6): Ending clean mount
[   71.185052] XFS (loop4): log I/O error -5
[   71.185339] XFS (loop6): Quotacheck needed: Please wait.
[   71.185404] XFS (loop4): Filesystem has been shut down due to log error (0x2).
[   71.185883] XFS (loop4): Please unmount the filesystem and rectify the problem(s).
[   71.186372] XFS (loop4): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[   71.186705] xfs filesystem being mounted at /root/syzkaller.uq7iOt/0/file0 supports timestamps until 2038 (0x7fffffff)
[   71.187608] repro: attempt to access beyond end of device
[   71.187608] loop6: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[   71.188127] XFS (loop6): log I/O error -5
[   71.188317] XFS (loop6): Filesystem has been shut down due to log error (0x2).
[   71.188599] XFS (loop6): Please unmount the filesystem and rectify the problem(s).
[   71.188998] XFS (loop6): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[   71.189324] xfs filesystem being mounted at /root/syzkaller.OjSrkA/0/file0 supports timestamps until 2038 (0x7fffffff)
[   71.191782] XFS (loop4): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.193911] XFS (loop6): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.207573] XFS (loop1): totally zeroed log
[   71.207813] XFS (loop7): totally zeroed log
[   71.208396] XFS (loop7): Ending clean mount
[   71.208664] XFS (loop1): Ending clean mount
[   71.209093] XFS (loop1): Quotacheck needed: Please wait.
[   71.210841] XFS (loop7): Quotacheck needed: Please wait.
[   71.216659] repro: attempt to access beyond end of device
[   71.216659] loop7: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[   71.217165] XFS (loop7): log I/O error -5
[   71.217375] XFS (loop7): Filesystem has been shut down due to log error (0x2).
[   71.217674] XFS (loop7): Please unmount the filesystem and rectify the problem(s).
[   71.218071] XFS (loop7): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[   71.218402] xfs filesystem being mounted at /root/syzkaller.Q5dMMG/0/file0 supports timestamps until 2038 (0x7fffffff)
[   71.219906] XFS (loop7): Unmounting Filesystem 86ecfda0-089a-461f-b078-1b43afedebc1
[   71.224443] repro: attempt to access beyond end of device
[   71.224443] loop1: rw=432129, sector=65535, nr_sectors = 16 limit=65536
[   71.224960] XFS (loop1): log I/O error -5
[   71.225151] XFS (loop1): Filesystem has been shut down due to log error (0x2).
[   71.225543] XFS (loop1): Please unmount the filesystem and rectify the problem(s).
[   71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
[   71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
[   71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
[   71.227873] #PF: supervisor read access in kernel mode
[   71.228077] #PF: error_code(0x0000) - not-present page
[   71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0 
[   71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
[   71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
[   71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
[   71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
[   71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
[   71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
[   71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
[   71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
[   71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
[   71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
[   71.232129] FS:  0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[   71.232441] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
[   71.232949] PKRU: 55555554
[   71.233061] Call Trace:
[   71.233162]  <TASK>
[   71.233254]  xfs_qm_dqread+0x46/0x440
[   71.233410]  ? xfs_qm_dqget_inode+0x13e/0x500
[   71.233596]  xfs_qm_dqget_inode+0x154/0x500
[   71.233774]  xfs_qm_dqattach_one+0x142/0x3c0
[   71.233961]  xfs_qm_dqattach_locked+0x14a/0x170
[   71.234149]  xfs_qm_dqattach+0x52/0x80
[   71.234307]  xfs_inactive+0x186/0x340
[   71.234461]  xfs_inodegc_worker+0xd3/0x430
[   71.234630]  process_one_work+0x3b1/0x960
[   71.234802]  worker_thread+0x52/0x660
[   71.234957]  ? __pfx_worker_thread+0x10/0x10
[   71.235136]  kthread+0x161/0x1a0
[   71.235279]  ? __pfx_kthread+0x10/0x10
[   71.235442]  ret_from_fork+0x29/0x50
[   71.235602]  </TASK>
[   71.235696] Modules linked in:
[   71.235826] CR2: 00000000000002a8
[   71.235964] ---[ end trace 0000000000000000 ]---

Reporting the above issue and providing a reproduced way seems valuable.

But not sure report to who in kernel community.
So used RIP: "xfs_dquot_alloc" key word to bisect between v6.2 and v5.11.
I know there is BUG also but with some other RIP info.

Related commit:
29837019d5ebb80a5f180af3107a0645c731a770
Merge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block

This might not be the right point of suspicion.
Could you help to take a look or add the correct developer for this issue?
Thanks a lot!

---

If you don't need an environment to reproduce the problem or if you already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
   // You could change the bzImage_xxx as you want In vm and login with root,  there is no password for root.

After login vm successfully, you could transfer reproduced binary to the VM by below way, and reproduce the problem:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config make olddefconfig
make -jx bzImage           //x should equal or less than cpu num your pc has

Fill the bzImage file into above start3.sh to load the target kernel vm.


Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git 
cd qemu git checkout -f v7.1.0 
mkdir build 
cd build 
yum install -y ninja-build.x86_64 
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl 
make 
make install

Thanks!
Comment 2 xupengfe 2023-02-27 07:20:37 UTC
Hi Dave Chinner,

 Thanks Helge Deller to add the xfs mailing list!
 Thanks Dave Chinner's  fixed patch for xfs BUG issue.

 After installed Dave Chinner's fixed patch on top of v6.2.

 There was no any BUG in dmesg anyway.


 And there was still some "Internal error xfs_iunlink_remove_inode" related Call Trace issue.

Related dmesg is in attached.
[  198.614621] XFS (loop6): Internal error xfs_iunlink_remove_inode at line 2013 of file fs/xfs/xfs_inode.c.  Caller xfs_ifree+0xed/0x9e0
[  198.615112] CPU: 0 PID: 291 Comm: kworker/0:6 Not tainted 6.2.0-kvmxfs+ #19
[  198.615394] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[  198.615844] Workqueue: xfs-inodegc/loop6 xfs_inodegc_worker
[  198.616074] Call Trace:
[  198.616179]  <TASK>
[  198.616274]  dump_stack_lvl+0xa7/0xdb
[  198.616441]  dump_stack+0x19/0x1f
[  198.616587]  xfs_corruption_error+0xd7/0xe0
[  198.616771]  ? xfs_ifree+0xed/0x9e0
[  198.616925]  xfs_iunlink_remove+0x32e/0x590
[  198.617108]  ? xfs_ifree+0xed/0x9e0
[  198.617265]  xfs_ifree+0xed/0x9e0
[  198.617410]  ? write_comp_data+0x2f/0x90
[  198.617580]  ? xfs_trans_ijoin+0x47/0x70
[  198.617750]  ? __sanitizer_cov_trace_pc+0x25/0x60
[  198.617950]  ? xfs_trans_add_item+0x79/0x1c0
[  198.618139]  xfs_inactive_ifree+0xf8/0x2a0
[  198.618316]  xfs_inactive+0x226/0x340
[  198.618475]  xfs_inodegc_worker+0xd3/0x430
[  198.618651]  process_one_work+0x3b1/0x960
[  198.618832]  worker_thread+0x52/0x660
[  198.618993]  ? __pfx_worker_thread+0x10/0x10
[  198.619179]  kthread+0x161/0x1a0
[  198.619326]  ? __pfx_kthread+0x10/0x10
[  198.619495]  ret_from_fork+0x29/0x50
[  198.619666]  </TASK>

  I'm new to xfs, could you help to check is it a real problem?


Thanks!
BR,
Comment 3 xupengfe 2023-02-27 07:21:35 UTC
Created attachment 303793 [details]
xfs_fixed_BUG_issue_dmesg_and_left_call_trace

Note You need to log in before you can comment on or make changes to this bug.