Bug 200043 - lseek returns negative positions for directories on ext4 fs
Status: RESOLVED INVALID
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
Reported: 2018-06-12 13:10 UTC by Anatoly Trosinenko
Modified: 2018-06-13 02:31 UTC
Kernel Version: v4.17
Regression: No

Description Anatoly Trosinenko 2018-06-12 13:10:15 UTC
When used on a directory FD residing on an ext4 FS, lseek with SEEK_END (and maybe other whence values) can return negative numbers and leave errno = 0. In the case of SEEK_END these numbers are (offset - 1), e.g. -11 for an offset of -10, and do not accumulate.

Reading the lseek(2) man page, I'm not absolutely sure it is a bug, but this seems to be quite strange behavior.

How to reproduce:
1. Compile the following code:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>

int main(int argc, char *argv[]) {
  int fd = open(argv[1], O_RDONLY);
  printf("fd = %d\n", fd);
  int res = lseek(fd, -10, SEEK_END);
  printf("lseek returned: %d (errno = %d)\n", res, errno);
  res = lseek(fd, -1, SEEK_END);
  printf("lseek returned: %d (errno = %d)\n", res, errno);
  res = lseek(fd, 0, SEEK_END);
  printf("lseek returned: %d (errno = %d)\n", res, errno);
  res = lseek(fd, -11, SEEK_SET);
  printf("lseek returned: %d (errno = %d)\n", res, errno);
  return 0;
}

2. Run it:

# ./lseek_negative /tmp # Suppose the tmpfs is mounted there
fd = 3
lseek returned: -1 (errno = 22)
lseek returned: -1 (errno = 22)
lseek returned: -1 (errno = 22)
lseek returned: -1 (errno = 22)
# ./lseek_negative / # Suppose the / is ext4
fd = 3
lseek returned: -11 (errno = 0)
lseek returned: -2 (errno = 0)
lseek returned: -1 (errno = 0)    <-- Look at the errno value
lseek returned: -1 (errno = 22)
Comment 1 Andreas Dilger 2018-06-12 18:50:05 UTC
It should be noted that seek on an ext4 *htree* directory is handled in terms of the hash of the filename (which is the value returned by telldir()), not in terms of the byte offset.  The valid hash values for htree directories are in the range [0,EXT4_HTREE_EOF_64BIT] ([0,2^63 - 1]) on 64-bit systems.

Seeking on a non-htree directory (any directory 4KB in size) is handled via ext4_llseek()->generic_file_llseek_size().  Seeking to a negative offset on a directory doesn't particularly make sense, so directories should probably have stricter limits imposed than regular files.
Comment 2 Anatoly Trosinenko 2018-06-12 18:56:56 UTC
For me the question is more "is lseek allowed to return negative numbers not equal to -1?" according to the standards. Frankly speaking, I don't know.

PS: Oops, SEEK_END-relative seeks should not accumulate; that is the correct behavior...
Comment 3 Theodore Tso 2018-06-13 02:30:57 UTC
The answer is that it depends on which version of the standard.  Quoting from SUSv3:

The POSIX.1-1990 standard did not specifically prohibit lseek() from returning a negative offset. Therefore, an application was required to clear errno prior to the call and check errno upon return to determine whether a return value of (off_t)-1 is a negative offset or an indication of an error condition. The standard developers did not wish to require this action on the part of a conforming application, and chose to require that errno be set to [EINVAL] when the resulting file offset would be negative for a regular file, block special file, or directory.
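
The errno-clearing idiom described in that quote can be sketched as below. This is only an illustration: checked_lseek() is a hypothetical helper name, not something from the report or from any library, and it assumes a platform where lseek() follows the convention of returning (off_t)-1 with errno set on failure.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original report): seek without
 * losing information. Returns 0 and stores the new offset in *pos on
 * success; returns -1 on failure, leaving errno as set by lseek().
 */
static int checked_lseek(int fd, off_t offset, int whence, off_t *pos)
{
    errno = 0;                              /* clear any stale error first */
    off_t res = lseek(fd, offset, whence);

    /* (off_t)-1 counts as an error only if errno was set as well. */
    if (res == (off_t)-1 && errno != 0)
        return -1;

    *pos = res;                             /* keep the full off_t width */
    return 0;
}
```

When printing the stored offset, casting it to long long and using %lld keeps the whole value visible instead of narrowing it to an int.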

The main thing is that the standards don't require read(2)/lseek(2) to work on directories; the only thing guaranteed to work is readdir()/telldir()/seekdir().   And the standards don't define fdopendir(), so there's no way to get access to the file descriptor associated with opendir(), and so there's no standard way to call lseek() on a directory stream.   In actual practice, if you use fdopendir() and then try to use lseek(2) on it, the results will be chaos and readdir(3) will malfunction in various wild and unpredictable ways.

In any case, the problem seems to be in glibc.  The lseek(2) system call returns 64-bit offsets, but for backwards compatibility the lseek() function visible to userspace uses a 32-bit off_t type.   And glibc is not properly returning EOVERFLOW and is instead truncating the value returned by the system call.  You can see this if you run strace on your test binary:

lseek(3, -10, SEEK_END)                 = 9223372036854775797
write(1, "lseek returned: -11 (errno = 0)\n", 32lseek returned: -11 (errno = 0)
) = 32
lseek(3, -1, SEEK_END)                  = 9223372036854775806
write(1, "lseek returned: -2 (errno = 0)\n", 31lseek returned: -2 (errno = 0)
) = 31
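
The relationship between the values in the strace output and the printed numbers can be checked by hand: keeping only the low 32 bits of the 64-bit result and reading them as a signed value gives exactly the numbers the program printed. The sketch below shows that arithmetic; low32_signed() is a hypothetical helper name, and the code makes no claim about where the narrowing happens (the same arithmetic applies to any 64-to-32-bit narrowing, including storing the result in an int as the test program does).

```c
#include <stdint.h>

/*
 * Hypothetical helper: keep the low 32 bits of a 64-bit offset and
 * interpret them as a signed two's-complement number.
 */
static int32_t low32_signed(int64_t offset)
{
    uint32_t low = (uint32_t)((uint64_t)offset & 0xFFFFFFFFu);

    /* Convert by hand to avoid implementation-defined narrowing. */
    if (low & 0x80000000u)
        return (int32_t)(low - 0x80000000u) - INT32_MAX - 1;
    return (int32_t)low;
}
```

For example, 9223372036854775797 is 0x7FFFFFFFFFFFFFF5, whose low 32 bits are 0xFFFFFFF5, which is -11 as a signed 32-bit value. Likewise 9223372036854775806 maps to -2.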

In practice, it probably doesn't matter, because no sane program will call lseek() on a directory.  It should be using opendir(3)/readdir(3)/telldir(3)/seekdir(3).   And that all works correctly.
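
For comparison, a minimal sketch of the directory-stream interface mentioned above: it saves a position with telldir(), reads one entry, returns to the saved position with seekdir(), and checks that the same entry comes back. seekdir_roundtrip() is a hypothetical helper name, and the sketch assumes the directory is not modified while it runs.

```c
#define _XOPEN_SOURCE 700   /* expose telldir()/seekdir() under strict modes */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if re-reading from a saved telldir()
 * position yields the same entry name, 0 if it does not, -1 on error.
 */
static int seekdir_roundtrip(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long pos = telldir(dir);            /* opaque cookie, not a byte offset */
    struct dirent *ent = readdir(dir);
    if (pos == -1 || ent == NULL) {
        closedir(dir);
        return -1;
    }

    char first[NAME_MAX + 1];
    snprintf(first, sizeof first, "%s", ent->d_name);

    seekdir(dir, pos);                  /* go back to the saved position */
    ent = readdir(dir);
    int same = (ent != NULL && strcmp(first, ent->d_name) == 0);

    closedir(dir);
    return same;
}
```

The value returned by telldir() is only meaningful when passed back to seekdir() on the same stream; it should not be interpreted as a byte offset or compared across directories.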
