Bug 15272 - epoll_ctl(2) fails on regular plain files
Product: IO/Storage
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assigned To: Davide Libenzi
Depends on:
Reported: 2010-02-11 14:00 UTC by Stephane Thiell
Modified: 2012-06-27 13:24 UTC

See Also:
Kernel Version:
Tree: Fedora
Regression: No


Description Stephane Thiell 2010-02-11 14:00:27 UTC
epoll_ctl(2) doesn't work for regular files, including stdin (fd 0) when it is redirected from one, and returns EPERM. On FreeBSD, for example, it seems that kqueue(2) is able to listen for plain file readiness, so why doesn't epoll allow this? Please note that poll(2) on the same kernel works just fine for plain files. A simple reproducer using fd 0 (stdin) is shown below. The same happens for plain files opened with open(2).

It's annoying when an application wants to listen on fd 0 in a uniform way, whether it is a pipe or a file.

>> reproducer.c:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>

int main(int argc, char **argv)
{
  struct epoll_event event;
  int epfd, rc;

  event.events = EPOLLIN;

  epfd = epoll_create(1023);
  if (epfd < 0) {
    printf("epoll_create failed %d\n", epfd);
    printf("errno %d %s\n", errno, strerror(errno));
    return 1;
  }

  /* Try to register stdin (fd 0) for read readiness. */
  errno = 0;
  rc = epoll_ctl(epfd, EPOLL_CTL_ADD, 0, &event);
  printf("epoll_ctl returned %d\n", rc);
  printf("errno %d %s\n", errno, strerror(errno));
  return -rc;
}

$ echo foobar | ./reproducer 
epoll_ctl returned 0
errno 0 Success

$ echo foobar >/tmp/foobar

$ ./reproducer < /tmp/foobar
epoll_ctl returned -1
errno 1 Operation not permitted
Comment 1 Davide Libenzi 2010-02-12 19:46:05 UTC
Regular files do not support the Linux ->poll() file operation.
If you want to wait for events on regular files, you need to use AIO+eventfd+epoll.
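
As a rough illustration of the AIO+eventfd+epoll combination, the sketch below submits one read through the Linux native AIO syscalls, asks the kernel to signal an eventfd on completion, and waits for that eventfd with epoll. The helper name aio_read_via_epoll is hypothetical, the raw syscall(2) wrappers stand in for a library such as libaio, and the read always starts at offset 0. Note that on ordinary buffered files the read may well complete inside io_submit itself, so this shows the plumbing rather than a guarantee of asynchronous behaviour.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from offset 0 of fd and learn
 * about completion through epoll. Returns the byte count reported by the
 * kernel (negative if the read itself failed), or -1 on a setup failure. */
static ssize_t aio_read_via_epoll(int fd, void *buf, size_t len)
{
  aio_context_t ctx = 0;
  struct iocb cb, *list[1] = { &cb };
  struct io_event done;
  struct epoll_event ev;
  uint64_t count;
  ssize_t result = -1;
  int efd = -1, epfd = -1;

  if (syscall(SYS_io_setup, 1, &ctx) < 0)
    return -1;
  efd = eventfd(0, 0);
  epfd = epoll_create1(0);
  if (efd < 0 || epfd < 0)
    goto out;

  /* epoll watches the eventfd, not the regular file. */
  memset(&ev, 0, sizeof(ev));
  ev.events = EPOLLIN;
  ev.data.fd = efd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
    goto out;

  /* Describe the read and ask for a notification on efd when it is done. */
  memset(&cb, 0, sizeof(cb));
  cb.aio_fildes = fd;
  cb.aio_lio_opcode = IOCB_CMD_PREAD;
  cb.aio_buf = (uint64_t)(uintptr_t)buf;
  cb.aio_nbytes = len;
  cb.aio_offset = 0;
  cb.aio_flags = IOCB_FLAG_RESFD;
  cb.aio_resfd = efd;
  if (syscall(SYS_io_submit, ctx, 1, list) != 1)
    goto out;

  /* Wait for the completion signal, drain the eventfd counter, then
   * collect the result of the read. */
  if (epoll_wait(epfd, &ev, 1, -1) != 1)
    goto out;
  if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
    goto out;
  if (syscall(SYS_io_getevents, ctx, 1, 1, &done, NULL) != 1)
    goto out;
  result = (ssize_t)done.res;

out:
  if (epfd >= 0)
    close(epfd);
  if (efd >= 0)
    close(efd);
  syscall(SYS_io_destroy, ctx);
  return result;
}
```

A real event loop would keep the AIO context, the eventfd and the epoll instance alive across many requests instead of creating them per read, and would retry epoll_wait on EINTR.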
Comment 2 Davide Libenzi 2010-02-12 19:50:16 UTC
BTW, this is not an epoll(4) problem. Not even poll(2) or select(2) will work with that example.
Comment 3 Stephane Thiell 2010-02-13 19:53:35 UTC
Thank you for your reply. I understand comment #1, but regarding comment #2, poll(2) does appear to report readable data on a regular file (in the simple case where the file is not being modified), for example:

>> reproducer-poll.c:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <poll.h>

int main(int argc, char **argv)
{
  /* Watch stdin (fd 0) for readable data. */
  struct pollfd onepollfd = { 0, POLLIN };
  int nfds;

  errno = 0;
  nfds = poll(&onepollfd, 1, -1);
  printf("nfds=%d\n", nfds);
  if (nfds < 0) {
    printf("poll failed %d\n", nfds);
    printf("errno %d %s\n", errno, strerror(errno));
    return 1;
  } else if (nfds > 0) {
    printf("revents 0x%x (POLLIN=0x%x, POLLERR=0x%x)\n",
        onepollfd.revents, POLLIN, POLLERR);
  }
  return 0;
}

$ ./reproducer-poll < /tmp/foobar 
revents 0x1 (POLLIN=0x1, POLLERR=0x8)

But while writing this post, I noticed that the read event is always reported, even when there is no data left to read in the file (/tmp/foobar). Still, this behaviour is useful for simple user applications that use a single handler for socket, pipe, and regular file descriptors. Would it be possible to make epoll(4) behave like poll(2) for regular files? Or maybe poll(2) should return an error for regular files, like POLLERR?
Comment 4 Davide Libenzi 2010-02-13 21:53:00 UTC
It seems it is working, but it isn't ;)
If you look at fs/select.c, lines 723 to 731, you will notice that when f_op->poll is not provided by the device, DEFAULT_POLLMASK is used as the returned mask, where DEFAULT_POLLMASK is defined as (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM).
Later on, this DEFAULT_POLLMASK is masked with your requested events, which yields POLLIN, even though no test has really been performed on the device, since a regular file does not provide an f_op->poll() function.
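
The effect can be observed from user space. The following is a minimal sketch, with hypothetical helper names, that asks poll(2) about a descriptor without blocking and then checks whether a reported POLLIN is contradicted by read(2) returning end of file:

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: return the revents poll() reports for fd when asked
 * about the given events, without blocking, or -1 if poll() fails. */
static int poll_revents(int fd, short events)
{
  struct pollfd pfd;

  pfd.fd = fd;
  pfd.events = events;
  pfd.revents = 0;
  if (poll(&pfd, 1, 0) < 0)
    return -1;
  return pfd.revents;
}

/* Hypothetical helper: return 1 if poll() reports fd as readable although
 * read() immediately hits end of file, 0 otherwise. read() is attempted
 * only when poll() has reported POLLIN, so an idle pipe is never read. */
static int poll_claims_readable_at_eof(int fd)
{
  char c;
  int revents = poll_revents(fd, POLLIN);

  return revents >= 0 && (revents & POLLIN) && read(fd, &c, 1) == 0;
}
```

On an empty regular file the second helper is expected to return 1, because POLLIN comes from the default mask rather than from any check of the file. On an empty pipe whose write end is still open it returns 0, because the pipe does implement a poll operation.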
Epoll fails explicitly, while poll/select do not, but nothing meaningful is returned from poll/select on file system files.
Maybe poll/select should be changed to return POLLERR or POLLNVAL when the device is not supported, but I don't know whether that is "POSIX legal".
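
For an application that wants one code path for sockets, pipes and plain files, one possible workaround is to treat EPERM from epoll_ctl(2) on a regular file as "always ready", which mirrors what poll/select effectively report. This is only a sketch: the function and enum names are hypothetical, and the caller is assumed to handle always-ready descriptors itself, for example by reading them directly in each loop iteration.

```c
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/stat.h>

/* Hypothetical result codes for the helper below. */
enum watch_result {
  WATCH_ERROR = -1,       /* registration failed for another reason */
  WATCH_EPOLL = 0,        /* fd is now monitored by epoll */
  WATCH_ALWAYS_READY = 1  /* regular file: caller should just read it */
};

/* Hypothetical helper: try to register fd for EPOLLIN. If epoll refuses
 * with EPERM and fd turns out to be a regular file, report it as always
 * ready instead of failing. */
static enum watch_result watch_for_read(int epfd, int fd)
{
  struct epoll_event ev;
  struct stat st;

  memset(&ev, 0, sizeof(ev));
  ev.events = EPOLLIN;
  ev.data.fd = fd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
    return WATCH_EPOLL;

  if (errno == EPERM && fstat(fd, &st) == 0 && S_ISREG(st.st_mode))
    return WATCH_ALWAYS_READY;

  return WATCH_ERROR;
}
```

With this helper, `echo foobar | ./app` would take the WATCH_EPOLL path and `./app < /tmp/foobar` would take the WATCH_ALWAYS_READY path. Other descriptor types that epoll rejects with EPERM, such as directories, still come back as WATCH_ERROR.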
