Bug 218049

Summary: tmpfs: O_DIRECT | O_CREATE open reports open failure but actually creates a file.
Product: File System Reporter: Colin Ian King (colin.i.king)
Component: OtherAssignee: fs_other
Status: RESOLVED ANSWERED    
Severity: normal CC: brauner, christian.brauner, hughd, wangjianjian0
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:
Attachments: C reproducer of the problem. Run it on a tmpfs file system mounted on /mnt/tmpfs

Description Colin Ian King 2023-10-27 13:53:48 UTC
Created attachment 305300 [details]
C reproducer of the problem. Run it on a tmpfs file system mounted on /mnt/tmpfs

creating a file on tmpfs with open(filename, O_RDWR | O_DIRECT | O_CREAT, 0666) reports an open failure error EINVAL, but still creates the file. The file should not be created if we hit such an error.

Tested this on 6.6.0-rc7 and 6.1.0-13 mainline kernels, x86-64.

mkdir /mnt/tmpfs
sudo mount -t tmpfs -o size=1G,nr_inodes=10k,mode=777 tmpfs /mnt/tmpfs
sudo chmod 666 /mnt/tmpfs

Run the attached program. It reports an open failure (errno 22, EINVAL) but still manages to create the file.

Note this was original discovered by running stress-ng on tmpfs with the open stressor: stress-ng --open 1
Comment 1 Colin Ian King 2023-10-27 14:30:36 UTC
See also: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2041670
Comment 2 Colin Ian King 2023-10-27 14:36:29 UTC
Also occurs on ramfs too. 

Seems to occur on kernels from 4.4.x to current 6.6-rc6 (not tested anything newer).
Comment 3 Artem S. Tashkinov 2023-10-27 14:48:49 UTC
Confirming and CC'ing Christian Brauner.
Comment 4 Artem S. Tashkinov 2023-10-27 14:50:39 UTC
Actually Hugh Dickins added O_DIRECT support, CC'ing him as well.
Comment 5 christian.brauner 2023-10-27 15:30:57 UTC
I think that's intended. You're creating a file and you're requesting it to be opened with O_DIRECT but if the underlying filesystem doesn't support O_DIRECT then you fail the open. So that the file is created is fine.
Comment 6 Christian Brauner 2023-10-27 15:35:36 UTC
I see, tmpfs has FMODE_CAN_ODIRECT
Comment 7 Colin Ian King 2023-10-27 15:36:36 UTC
The file is created but the fd is -1. Surely the semantics of a O_CREAT failure are a file that's not created.
Comment 8 Christian Brauner 2023-10-27 15:44:45 UTC
Ah, ok sorry now I see:

brauner@wittgenstein|~/src/git/linux/vfs/vfs.super|vfs.super $%>
> git describe --contains e88e0d366f9cfbb810b0c8509dc5d130d5a53e02
v6.6-vfs.tmpfs~2

which means support for O_DIRECT was merged in v6.6.
Comment 9 Christian Brauner 2023-10-27 16:09:10 UTC
(In reply to Colin Ian King from comment #7)
> The file is created but the fd is -1. Surely the semantics of a O_CREAT
> failure are a file that's not created.

Not necessarily. You're requesting a file to be created and to be opened afterwards with O_DIRECT support and what we do in this case is fail the open even though the create was successful.

Stuff like that can also happen in other cases. In containers you can end up with scenarios where you fail to open a file you created - probably hard to create but possible. Other scenarios would be were your LSM allows you to create files but not to open them.
Comment 10 Colin Ian King 2023-10-27 16:17:30 UTC
We've been here before with the minix file system:

https://lore.kernel.org/all/20220107133626.413379-1-qhjin.dev@gmail.com/T/#u

in that scenario the file was not created on a file error when doing O_CREAT | O_DIRECT
Comment 11 Hugh Dickins 2023-10-27 22:12:35 UTC
Artem: thanks for the Cc.

Colin: I think you misread the reproducer output for 6.6-rc, that's
precisely the release in which this is (unwittingly) fixed on tmpfs:
not by failing the creat part, but by allowing O_DIRECT open and I/O -
it gives me "open succeeded", "stat succeeded", "unlink succeeded".

(That's if I "sudo ./reproducer", or remove "sudo chmod 666 /mnt/tmpfs"
from its preliminary instructions - maybe that line was a braino?  It
gives me "open failed, errno=13 (Permission denied)", "stat" likewise.)

So, nothing to do for tmpfs at all (this is not worth a stable backport).

As for ramfs: yes, I agree with you that it's nicer not to creat anything
when the open is going to fail; and your minix link does indicate that
Christian and Jan have previously accepted that pragmatic solution, of
allowing the O_DIRECT open but then failing at the I/O stage.

So if you want ramfs changed, how about sending a noop_direct_IO patch
to ram_aops in fs/libfs.c?  But please do Cc hch@infradead.org: he's on
his way to eliminating the direct_IO method entirely.  There appears to
be plenty of noop_direct_IO usage elsewhere, so I don't suppose he will
mind one more, but he'd better be kept in the loop.
Comment 12 Colin Ian King 2023-10-28 16:34:22 UTC
Yes, sorry, I pasted the text into the report and somehow forgot to put sudo ./reproducer

I just wanted to clarify that the open with O_DIRECT and O_CREAT failure should not leave the file around if the open call gets an error return since this is not consistent with other file systems. Also it's not consistent with other open failure modes. Otherwise we need to fix every open call to remove any left over files if open fails and a file may be left over from the failed open.
Comment 13 Jianjian Wang 2024-12-05 02:50:43 UTC
Any updates on this ? do we need fix this problem ?