Created attachment 306300
reproduce.c

Hi,

I mounted an ext4 image, created a file, and created a hard link to it. I then wrote to the file through both names, and with a specific order of reads and writes the final write fails. I can reproduce this with the latest Linux kernel: https://git.kernel.org/torvalds/t/linux-6.9-rc7.tar.gz

The following script triggers it:

```
dd if=/dev/zero of=ext4-0.img bs=1M count=120
mkfs.ext4 ext4-0.img
g++ -static reproduce.c
losetup /dev/loop0 ext4-0.img
mkdir /root/mnt
./a.out
```

After running the script, you will see the error message:

```
write failure
```

The contents of `reproduce.c`:

```
#include <assert.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <stddef.h>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <dirent.h>
#include <string>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <sys/xattr.h>
#include <sys/mount.h>
#include <sys/statfs.h>
#include <fcntl.h>

#define ALIGN 4096

void* align_alloc(size_t size) {
    void *ptr = NULL;
    int ret = posix_memalign(&ptr, ALIGN, size);
    if (ret) {
        printf("align error\n");
        exit(1);
    }
    return ptr;
}

int main() {
    char *buf_15 = (char*)align_alloc(4096*20);
    memset(buf_15, 'a', 4096*20);
    char *buf_4 = (char*)align_alloc(4096*20);
    memset(buf_4, 'a', 4096*20);

    mount("/dev/loop0", "/root/mnt", "f2fs", 0, "");
    creat("/root/mnt/a", S_IRWXG);
    link("/root/mnt/a", "/root/mnt/b");
    int fd_a = open("/root/mnt/a", O_RDWR);
    int fd_b = open("/root/mnt/b", O_RDWR | O_DIRECT);

    lseek(fd_a, 100, SEEK_SET);
    write(fd_a, buf_15, 9900);
    read(fd_b, buf_4, 73728);
    int state = write(fd_b, buf_15, 65536);
    if (state == -1) {
        printf("write failure\n");
    }
    return 0;
}
```

If I move the statement `read(fd_b, buf_4, 73728);` before the first write operation, or change the size `73728` to a smaller value such as `63728`, the final write does not fail.

Did I do anything wrong?
Sorry, the mount call in the reproducer should be `mount("/dev/loop0", "/root/mnt", "ext4", 0, "");`
Could you please help me review this report? I am still able to reproduce the issue with the following test case:

```
#include <assert.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <stddef.h>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <dirent.h>
#include <string>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <sys/xattr.h>
#include <sys/mount.h>
#include <sys/statfs.h>
#include <fcntl.h>

#define ALIGN 4096

void* align_alloc(size_t size) {
    void *ptr = NULL;
    int ret = posix_memalign(&ptr, ALIGN, size);
    if (ret) {
        printf("align error\n");
        exit(1);
    }
    return ptr;
}

int main() {
    char *buf_15 = (char*)align_alloc(4096*20);
    memset(buf_15, 'a', 4096*20);
    char *buf_4 = (char*)align_alloc(4096*20);
    memset(buf_4, 'a', 4096*20);

    mount("/dev/loop0", "/root/mnt", "ext4", 0, "");
    creat("/root/mnt/a", S_IRWXG);
    link("/root/mnt/a", "/root/mnt/b");
    int fd_a = open("/root/mnt/a", O_RDWR);
    int fd_b = open("/root/mnt/b", O_RDWR | O_DIRECT);

    lseek(fd_a, 100, SEEK_SET);
    write(fd_a, buf_15, 9900);
    read(fd_b, buf_4, 73728);
    int state = write(fd_b, buf_15, 65536);
    if (state == -1) {
        printf("write failure\n");
    }
    return 0;
}
```
Hint: check the return value of every system call. In particular, check what read(fd_b, buf_4, 73728) returns. Check the size of the file after write(fd_a, buf_15, 9900), then reflect on what happens if the read hits end of file, and what the offset of fd_b is after the resulting short read.

Finally, read the documentation for the O_DIRECT flag in the NOTES section of the open man page[1], and understand the requirements for O_DIRECT writes, in particular the alignment requirements on the starting offset when performing an O_DIRECT write (or O_DIRECT read). Also check errno, for example by replacing the printf("write failure\n") with perror("write").

[1] https://man7.org/linux/man-pages/man2/open.2.html

In any case, this is not a bug, and this is not a good place to ask for instruction in basic Unix system call programming.