Bug 5994

Summary: [PATCH] O_SYNC should be modifiable using fcntl(F_SETFL)
Product: File System
Reporter: Michael Kerrisk (michael.kerrisk)
Component: VFS
Assignee: fs_vfs
Status: NEW
Severity: normal
CC: alan, decui, mtk.manpages, peter.volkov, protasnb, sar, stuart
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.15
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Fix to allow O_SYNC to be set via fcntl

Description Michael Kerrisk 2006-02-01 20:22:27 UTC
Most recent kernel where this bug did not occur: applies to all kernels
Distribution:
Hardware Environment: x86 and others
Software Environment:
Problem Description:
On Linux, it is not possible to modify the setting of the
O_SYNC status flag using the fcntl(F_SETFL) operation
(i.e., this flag can only be set during open(2)).

However, all other Unix implementations that I have tested do allow
this status flag to be modified using fcntl(F_SETFL).  I have tested
FreeBSD 6.0, Tru64 5.1, Solaris 8, and HP-UX 11.  A test program is 
provided below.

My reading of the SUSv3 specification of fcntl(F_SETFL) also confirms
that an application should be able to modify the O_SYNC setting
using fcntl(F_SETFL), and thus Linux is non-conformant.

Cheers,

Michael


/* fcntl_O_SYNC.c */

#include <sys/types.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define usageErr(msg, progName) \
                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

int
main(int argc, char *argv[])
{
    int flags, fd;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s path\n", argv[0]);

    fd = open(argv[1], O_RDWR | O_SYNC);
    if (fd == -1) errExit("open");

    flags = fcntl(fd, F_GETFL);
    if (flags == -1) errExit("fcntl");

    assert(flags & O_SYNC);

    if (fcntl(fd, F_SETFL, flags & ~O_SYNC) == -1) errExit("fcntl");

    flags = fcntl(fd, F_GETFL);
    if (flags == -1) errExit("fcntl");

    if (flags & O_SYNC) {
        printf("O_SYNC was left unchanged (non-conformant)\n");
        exit(EXIT_FAILURE);
    } else {
        printf("O_SYNC was changed (conformant)\n");
        exit(EXIT_SUCCESS);
    } 
} /* main */
Comment 1 Alan 2007-09-27 05:49:26 UTC
What should happen here appears to be unspecified. There is also the rather tricky semantic question of what happens to queued I/O at the point you set the flag.
Comment 2 Michael Kerrisk 2007-09-27 09:53:20 UTC
My advice from Geoff Clare at the Open Group is that the behavior is specified, and Linux doesn't conform.  And as noted, every other system that I tested does support setting O_SYNC with fcntl().
Comment 3 Alan 2007-09-27 10:34:17 UTC
Could you share his reasoning on this? It's not obvious to me from the spec. Also, what do other systems do if you do

write(lots)
lseek(back a bit)
f_setfl(O_SYNC)
write(overlapped)

Which bits are synchronous, and what ordering is guaranteed? I can't find any clear view on this at all.
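
For concreteness, the sequence above could be written out in C roughly as follows. This is only a sketch: the function name and the buffer sizes are hypothetical and chosen for illustration, and nothing here settles which of the writes end up synchronous, which is exactly the open question.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sizes, chosen only for illustration. */
#define BIG_LEN   4096   /* "write(lots)" */
#define BACK_LEN  1024   /* "lseek(back a bit)" */
#define SMALL_LEN 2048   /* "write(overlapped)" */

/*
 * Hypothetical helper: perform the write / lseek / F_SETFL / write
 * sequence on an open, writable descriptor.
 * Returns 0 if every call succeeded, -1 otherwise.
 */
int
overlapped_write_sequence(int fd)
{
    char buf[BIG_LEN];
    int flags;

    assert(fd >= 0);
    memset(buf, 'x', sizeof(buf));

    /* write(lots): may be buffered, since O_SYNC is not yet requested */
    if (write(fd, buf, BIG_LEN) != BIG_LEN)
        return -1;

    /* lseek(back a bit): the next write overlaps the previous one */
    if (lseek(fd, -BACK_LEN, SEEK_CUR) == (off_t) -1)
        return -1;

    /* f_setfl(O_SYNC): request synchronous writes from here on */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_SYNC) == -1)
        return -1;

    /* write(overlapped): covers old and new file offsets */
    if (write(fd, buf, SMALL_LEN) != SMALL_LEN)
        return -1;

    return 0;
}
```

Note that a successful return says nothing about whether the F_SETFL request took effect; that has to be checked separately with fcntl(F_GETFL).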
Comment 4 Michael Kerrisk 2007-10-01 14:02:43 UTC
I think the reasoning went like this.  All "file status flags" that are specified in the standard must be settable via F_SETFL, unless otherwise specified.  The standard says: "Bits corresponding to the file access mode and the file creation flags, as defined in <fcntl.h>, that are set in arg shall be ignored." 

Under the specification of <fcntl.h>, we find the following file status flags specified:

7851 File status flags used for open( ) and fcntl( ) are as follows:
7852 O_APPEND Set append mode.
7853 SIO O_DSYNC Write according to synchronized I/O data integrity completion.
7854 O_NONBLOCK Non-blocking mode.
7855 SIO O_RSYNC Synchronized read I/O operations.
7856 O_SYNC Write according to synchronized I/O file integrity completion.

(The SIO marking indicates a feature that is part of an Option (Synchronized Input and Output) for POSIX -- thus O_DSYNC and O_RSYNC are not mandatory.)

As I remarked in the initial report, Linux seems to be alone in not allowing O_SYNC to be settable using F_SETFL.  Furthermore, Linux does allow O_APPEND and O_NONBLOCK (the other flags required by POSIX) to be modified using F_SETFL.
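
The contrast with O_APPEND and O_NONBLOCK can be checked with a small helper along these lines. This is a minimal sketch, and change_status_flags() is a hypothetical name, not something from the kernel or from the attached patch.

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper: set the bits in `set` and clear the bits in
 * `clear` using fcntl(F_SETFL), then return the status flags as
 * reported by a fresh fcntl(F_GETFL), or -1 on error.
 */
int
change_status_flags(int fd, int set, int clear)
{
    int flags;

    assert(fd >= 0);

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    if (fcntl(fd, F_SETFL, (flags | set) & ~clear) == -1)
        return -1;

    /* Re-read the flags so the caller sees what actually took effect. */
    return fcntl(fd, F_GETFL);
}
```

Calling change_status_flags(fd, O_APPEND | O_NONBLOCK, 0) should return a value with both bits set. On a kernel with the behavior reported here, passing O_SYNC as `set` is expected to return flags in which O_SYNC is still clear, without any error being reported.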
Comment 5 Natalie Protasevich 2008-03-17 01:13:07 UTC
Again, maybe it would be wise to drop a line about this on LKML, so this topic won't get lost in the obscurity of Bugzilla.
Comment 6 Steve Rago 2011-03-22 14:17:44 UTC
Created attachment 51602
Fix to allow O_SYNC to be set via fcntl

Here is a patch that fixes the problem.  The question about outstanding I/O is interesting, but not a tough problem to solve.  An application gets what it asks for.  If it does a bunch of delayed writes and then wants to switch to synchronous writes, it had better call fsync(2) first to ensure that its file is consistent.  Otherwise the results are undefined.
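
From the application side, the fsync(2)-then-switch pattern might look like the sketch below. The helper name switch_to_sync_writes() is hypothetical and is not part of the attached patch; the final F_GETFL check is there because a kernel without the fix can accept the F_SETFL call and still leave O_SYNC clear.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush everything written so far, then request
 * synchronous writes from this point on.
 * Returns 1 if O_SYNC is reported as set afterwards, 0 if the request
 * was accepted but O_SYNC is not reported as set, and -1 on error.
 */
int
switch_to_sync_writes(int fd)
{
    int flags;

    assert(fd >= 0);

    /* Make earlier delayed writes durable before changing modes. */
    if (fsync(fd) == -1)
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    if (fcntl(fd, F_SETFL, flags | O_SYNC) == -1)
        return -1;

    /* Verify: O_SYNC may span more than one bit, so compare all of it. */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    return (flags & O_SYNC) == O_SYNC ? 1 : 0;
}
```

A caller that gets 0 back would need to fall back to calling fsync(2) after each write, or to reopening the file with O_SYNC.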
Comment 7 Stuart P. Bentley 2013-03-08 08:32:53 UTC
https://patchwork.kernel.org/patch/591481/

It looks like this patch still hasn't been merged into the mainline. What's its status? It's causing real-world problems (see the note on http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/Bitcask/), and this seems like a simple enough fix (if it doesn't bring it to total compliance, it still brings it a lot closer than it was before).
Comment 8 Michael Kerrisk 2013-03-08 09:57:13 UTC
Stuart,

I think your best bet may be to restart the thread at http://thread.gmane.org/gmane.linux.kernel/1105833 or start a new thread that points to that thread and this bug, and CC all of the contributors to the bug report and also the earlier thread.
Comment 9 Michael Kerrisk 2013-03-08 10:09:17 UTC
Stuart, I meant to add that if you do start an email thread, please CC me only at mtk.manpages@gmail.com. It was a mistake that I entered this bug under my other email address.