Bug 218158
| Summary: | fdatasync to a block device seems to block writes on unrelated devices | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Matthew Stapleton (matthew4196) |
| Component: | Block Layer | Assignee: | Jens Axboe (axboe) |
| Status: | NEW | | |
| Severity: | normal | CC: | bagasdotme, sam, sirius |
| Priority: | P3 | | |
| Hardware: | AMD | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | config-6.1.53-capsicum | | |
Description
Matthew Stapleton 2023-11-18 23:08:29 UTC

Created attachment 305422 [details]
config-6.1.53-capsicum

Here is my kernel config.
Also, I was running badblocks -b 4096 -w -s -v on the failing hard drive for a few days before trying nwipe and it didn't seem to cause slowdowns on the server, and the badblocks man page says it uses Direct I/O by default. I decided to try nwipe because it provides an option to disable read verification. I could probably try removing fdatasync from nwipe, or modifying it to use Direct I/O, but I haven't done that yet.

(In reply to Matthew Stapleton from comment #0)
> I was running nwipe on a failing hard drive that was running very slow and
> while nwipe was running fdatasync it seemed to cause delays for the
> filesystems on the other drives. The other drives are attached to the same
> onboard ahci sata adapter if that is important. After stopping nwipe,
> performance returned to normal.
>
> The system is using ext4 filesystems on top of LVM on top of Linux RAID6 and
> the kernel is 6.1.53.

Can you check latest mainline (currently v6.7-rc2)?

I might be able to, but it might be a while before I can try that as I will need to set up a new config for the newer kernels. I might set up a throwaway VM to see if I can easily replicate the problem there, as it looks like qemu has options to throttle virtual drive speeds.

I've found this: https://lore.kernel.org/linux-btrfs/CAJCQCtSh2WT3fijK4sYEdfYpp09ehA+SA75rLyiJ6guUtyWjyw@mail.gmail.com/ which looks similar to the problem I'm having (see https://bugzilla.kernel.org/show_bug.cgi?id=218161 as well). I am using the bfq io scheduler.

I have set up a VM with kernel 6.1.53, but so far haven't been able to trigger the problem.

I've set up kernel 6.6.2 on the host server now, as I wasn't able to get the VM to deadlock, and so far it has been running for nearly 3 days without problems. I have also changed preempt from full to voluntary to see if that helps.
I should probably mention my custom capsicum kernel is based on the gentoo-sources base and extra patches, plus a few additional custom patches such as the it87 driver from here: https://github.com/frankcrawford/it87 , enabling Intel user copy and PPro checksums for generic CPUs, and an old custom patch to enable HPET on some additional nForce chipsets, but most of those patches were also enabled in my 5.10 kernel build.