Created attachment 306289
reproduce.c

Hi,

I open a file, close() it, and then call lseek() on its file descriptor, but the lseek() does not trigger an EBADF error. I then open another file and write to it, and that write operation triggers an "Invalid argument" error.

I can reproduce this with the latest Linux kernel:
https://git.kernel.org/torvalds/t/linux-6.9-rc7.tar.gz

The following is the triggering script:

```
dd if=/dev/zero of=ext4-0.img bs=1M count=120
mkfs.ext4 ext4-0.img
g++ -static reproduce.c
losetup /dev/loop0 ext4-0.img
mkdir /root/mnt
./a.out
```

After running the script, you will see an error message:

```
write failure: (Invalid argument)
```

The contents of `reproduce.c`:

```
#include <assert.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <stddef.h>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <dirent.h>

#include <string>

#include <sys/mount.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <sys/xattr.h>
#include <sys/mount.h>
#include <sys/statfs.h>
#include <fcntl.h>

#define ALIGN 4096

void* align_alloc(size_t size) {
    void *ptr = NULL;
    int ret = posix_memalign(&ptr, ALIGN, size);
    if (ret) {
        printf("align error\n");
        exit(1);
    }
    return ptr;
}

int main()
{
    mount("/dev/loop0", "/root/mnt", "ext4", 0, "");

    creat("/root/mnt/a", S_IRWXU);
    creat("/root/mnt/b", S_IRWXU);

    int fd_a = open("/root/mnt/a", O_RDWR);
    close(fd_a);

    int fd_b = open("/root/mnt/b", O_RDWR | O_DIRECT);

    int state = lseek(fd_a, 7208, SEEK_SET);
    if (state == -1) {
        printf("lseek failure: (%s)\n", strerror(errno));
    }

    char *buf = (char*)align_alloc(4096);
    memset(buf, 'a', 4096);
    state = write(fd_b, buf, 4096);
    if (state == -1) {
        printf("write failure: (%s)\n", strerror(errno));
    }

    close(fd_b);

    return 0;
}
```

I also found that if I remove the `O_DIRECT` flag from file b, the write operation does not trigger an error, but the contents of b become garbled.
This is a bug in the test program, not in the kernel. If you change reproduce.c so that it prints fd_a and fd_b, you'll see that they have the same value. lseek didn't fail because fd_a has the same integer value as fd_b, so the call succeeded and changed the current file position of fd_b.

This is documented behavior of the Linux/Unix/POSIX interface. File descriptors are small integers, and closing a file descriptor releases that integer for reuse. Think of it as an index into an array, i.e., struct file *fd_array[MAX_FDS]. When you call open, it finds the first NULL pointer in fd_array, installs a pointer to the struct file there, and returns that index as the file descriptor.
Hi Theodore Tso,

Thank you for your quick response. I'm very sorry for filing this incorrect bug report. I will read the documentation carefully.