Bug 209243
Summary: | [regression] fsx IO_URING reading get BAD DATA | ||
---|---|---|---|
Product: | File System | Reporter: | Zorro Lang (zlang) |
Component: | XFS | Assignee: | FileSystem/XFS Default Virtual Assignee (filesystem_xfs) |
Status: | NEW --- | ||
Severity: | normal | CC: | axboe, jmoyer |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.9.0-rc4 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: | fsx log files |
Description
Zorro Lang
2020-09-12 05:46:32 UTC
Created attachment 292481
fsx log files
This failure can't be reproduced on older kernels such as 5.8-rc4 (I had a ready-made one, so I just tested on it). It looks like a regression, but it needs more testing.

FYI, it looks like LVM is needed; at least I haven't reproduced it on a plain disk partition or a loop device.

FYI, mainline Linux v5.8 can't reproduce this bug, but v5.9-rc1 can...

v5.9-rc1 contains lots of xfs and io_uring changes, so I'm still not sure whether it's an xfs issue or an io_uring issue for now.

Finally, I found the first commit which can reproduce this failure:

commit c3cf992c25c7ff04bfc4dec5c916705a5332320e
Author: Jens Axboe <axboe@kernel.dk>
Date:   Fri May 22 09:24:42 2020 -0600

    io_uring: support true async buffered reads, if file provides it

    If the file is flagged with FMODE_BUF_RASYNC, then we don't have to punt
    the buffered read to an io-wq worker. Instead we can rely on page
    unlocking callbacks to support retry based async IO. This is a lot more
    efficient than doing async thread offload.

    The retry is done similarly to how we handle poll based retry. From
    the unlock callback, we simply queue the retry to a task_work based
    handler.

    Signed-off-by: Jens Axboe <axboe@kernel.dk>

So CC the author of this patch to get more review.

I'll take a look. Do you have a repo of fsstress with your io_uring changes included?

(In reply to Jens Axboe from comment #6)
> I'll take a look. Do you have a repo of fsstress with your io_uring changes
> included?

Thanks for your reply. Sorry, I haven't prepared a public xfstests git repo. But you can:
1) clone git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
2) merge the 5 patches below into the above xfstests tree:
   https://patchwork.kernel.org/patch/11769857/
   https://patchwork.kernel.org/patch/11769849/
   https://patchwork.kernel.org/patch/11769851/
   https://patchwork.kernel.org/patch/11769853/
   https://patchwork.kernel.org/patch/11769855/
3) build xfstests with liburing installed
4) make an XFS filesystem on an LVM device
5) mount the XFS filesystem
6) run xfstests/ltp/fsx -U -R -W -o 128000 $mnt/foo

Thanks,
Zorro

No problem, that's what I ended up doing. Can you try the patch I replied on the list with?

https://lore.kernel.org/io-uring/9d3c38bc-302c-5eb6-c772-7072a75eaf74@kernel.dk/T/#m669bfdc265c53e165bc2bfd4c3484243fa0ac33d

(In reply to Jens Axboe from comment #8)
> No problem, that's what I ended up doing. Can you try the patch I replied on
> the list with?
>
> https://lore.kernel.org/io-uring/9d3c38bc-302c-5eb6-c772-7072a75eaf74@kernel.dk/T/#m669bfdc265c53e165bc2bfd4c3484243fa0ac33d

With a simple test, I can't reproduce this bug after merging the above patch.

Great, thanks for testing! I'll add your reported-by and tested-by.
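
For readers unfamiliar with what a "BAD DATA" report from fsx means, here is a rough sketch of the kind of check involved: random writes and reads are issued against a file, and every read is compared with an in-memory copy of what the file should contain. This is not fsx itself, and it uses plain `pread`/`pwrite` rather than io_uring, so it will not reproduce this bug. The names `check_file`, `pread_all`, and `pwrite_all` are made up for illustration; `MAX_OP_LEN` only mirrors the `-o 128000` value from the reproducer command.

```python
import os
import random
import tempfile

MAX_OP_LEN = 128000  # assumption: mirrors "-o 128000" from the reproducer


def pwrite_all(fd, data, offset):
    """Write all of data at offset, looping over short writes."""
    view = memoryview(data)
    while len(view) > 0:
        n = os.pwrite(fd, view, offset)
        view = view[n:]
        offset += n


def pread_all(fd, length, offset):
    """Read up to length bytes at offset, looping over short reads."""
    chunks = []
    while length > 0:
        chunk = os.pread(fd, length, offset)
        if not chunk:
            break  # unexpected EOF; the caller sees this as a mismatch
        chunks.append(chunk)
        offset += len(chunk)
        length -= len(chunk)
    return b"".join(chunks)


def check_file(path, file_size=1 << 20, ops=200, seed=0):
    """Run random writes/reads on path; return (offset, length) of bad reads."""
    rng = random.Random(seed)
    expected = bytearray(file_size)  # what the file should contain
    mismatches = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, file_size)  # file starts as all zeroes
        for _ in range(ops):
            offset = rng.randrange(file_size)
            length = min(rng.randint(1, MAX_OP_LEN), file_size - offset)
            if rng.random() < 0.5:
                # Write random bytes and record them in the expected copy.
                data = rng.getrandbits(8 * length).to_bytes(length, "little")
                pwrite_all(fd, data, offset)
                expected[offset:offset + length] = data
            else:
                # Read back and compare with the expected copy.
                got = pread_all(fd, length, offset)
                if got != bytes(expected[offset:offset + length]):
                    mismatches.append((offset, length))
    finally:
        os.close(fd)
    return mismatches


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bad = check_file(os.path.join(tmp, "foo"))
        print("mismatching reads:", bad)
```

On a healthy filesystem the returned list should be empty; any entry corresponds to a read that returned data different from what was last written to that range.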