Bug 546

Summary: Close notification in poll/select never arrives
Product: IO/Storage Reporter: Sampo Kellom (sampo)
Component: OtherAssignee: Zwane Mwaikambo (zwane)
Status: REJECTED INVALID    
Severity: normal CC: akpm, mtk.manpages
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.5.65 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on:    
Bug Blocks: 1875    

Description Sampo Kellom 2003-04-07 05:38:46 UTC
Distribution: any, problem not sensitive or specific to any distribution
Hardware Environment:	reproduces at least on i386, others not tested
Software Environment:	glibc-2.2.5, gcc-2.95.3
		       (Many other combinations reproduce the bug, too. The
			bug is not sensitive to user land config. Its a kernel
			problem.)
History:

  This bug was originally reported by thierry.lelegard@canal-plus.fr
  to linux.kernel under "Subject: PROBLEM: I/O system call never returns
  if file desc is closed in the meantime". Report date was 2001-06-06.

  Thierry tells me that discussion was had back then, but was not constructive
  and died away without the bug being fixed. Not very positive experience
  about open software community's ability to get bugs fixed.

  Thierry resubmitted the bug to comp.os.linux.development.system under
  "Subject: Close notification in poll/select (works on all OS except Linux)"
  on 2003-01-13. He never got an answer. I guess there are a lot of cracks
  and this one is falling through one of them.

  I have hit the very same problems as Thierry and have verified his bug
  report to be factually correct. I used 2.6.65 kernel instead of the
  various 2.4.x kernels he used and got substantially same results.
  I find it incredible that such well researched bug report has not
  lead to resolution of the bug.

Problem Description:

Thierry's report from January:

  I have a problem which has not been resolved in the last few years:
  an interaction problem between close and poll/select.

  Consider a multithreaded program (pthreads). One thread reads from
  a file descriptor (typically a TCP or UDP socket). At some time,
  another thread decides to close(2) the file descriptor while the
  first thread is blocked, waiting for input data.

  If the reading thread uses read(2) or recv(2), the system call returns
  with an error status at the time the file descriptor is closed (expected
  behavior).

  But, since we also need a timeout on the read, we do not use read(2)
  or recv(2). We set the file descriptor in non-blocking mode and we
  use poll(2) with a timeout.

  The problem is that poll(2) does not return when the file descriptor
  is closed. On all other operating systems we have here (Solaris, HP-UX
  and even non-UNIX systems like OpenVMS and Windows), poll(2) returns and
  reports an error condition as soon as the file descriptor is closed.

  The same behavior is observed if select(2) is used instead of poll(2).
  This has been observed in all kernels (currently using 2.4.18).

  Two things are noticeable:
  - If recv(2)/read(2) returns when the fd is closed, why not poll(2)?
  - All other O/S report an error, while Linux poll(2) hangs. This is a
    serious portability issue.

  Does anyone have an idea or workaround? How to implement a recv(2)
  with a timeout in such a way that the reading thread is unblocked
  when the file descriptor is closed from some other thread on Linux?

Thierry's original report (I have trimmed away redundant software
version information bloat):

  [2.] Full description of the problem/report:

  This report describes a problem in the usage of file descriptors across
  multiple threads. When one thread closes a file descriptor, another
  thread which waits for an I/O on that file descriptor is not notified
  and blocks forever.

  I don't know exactly what should be considered as "correct behavior"
  but all other operating systems we use (Sun Solaris, HP-UX, OpenVMS
  and Windows NT) have the exact same behavior. Linux is the only one
  to hang.

  Context: We use multi-threaded applications. Some threads perform I/O on
  file descriptors (usually sockets, but not only sockets). Each thread
  manages one file descriptor (a "communication"). A manager thread detects
  application-defined conditions and may decide to interrupt a communication.
  To interrupt the communication, this manager thread closes the file
  descriptor.
  If the close operation occurs while the "communication" thread is waiting
  for data on this file descriptor (using poll(2) for instance), this thread
  never returns from the I/O system call and hangs forever.

  On all other operating systems we use, the I/O system call in the
  "communication" thread returns and reports an error.

  See example program below.

  [3.] Keywords

  file-descriptor thread async-close

  [4.] Kernel version (from /proc/version):

  Linux version 2.4.2-2 (root@porky.devel.redhat.com) (gcc version 2.96
20000731   (Red Hat Linux 7.1 2.96-79)) #1 Sun Apr 8 20:41:30
  EDT 2001

  [6.] A small shell script or example program which triggers the problem

  Here is an example program in C.

  From the main thread, the program creates an UDP socket and polls it
  using poll(2). Since no packet is received on this UDP port. poll blocks.
  From a second thread, after 2 seconds, the socket's file descriptor is
  closed (while the main thread is suspended on poll).

  On Solaris and HP-UX (as well as equivalent programs on OpenVMS and
  Windows NT), the poll returns in the main thread and reports an error
  on the file descriptor. On Linux, the poll never returns and the main
  thread is blocked forever.

  This example uses a socket but it works exactly the same way using
  a pipe or any device which may block long enough.

  --> Output on Solaris 5.8 and HP-UX B.11.00

  main: starting poll
  close_thread: starting thread
  close_thread: closing socket
  close_thread: close status = 0
  main: poll status = 1
  main: events   = 00000001
  main: revents  = 00000020

  ... and the program exits.

  --> Output on Linux RedHat 7.1 (kernel 2.4.2-2)

  main: starting poll
  close_thread: starting thread
  close_thread: closing socket
  close_thread: close status = 0

  ... and the program remains blocked.


Steps to reproduce:	compile and run program that follows. If kernel behaves
			correctly, the line marked `*** never reached' is
			reached. If the bug is present, it is not reached.
/*
 *  Test the asynchronous close of an UDP socket.
 *
 *  Linux   : gcc -o asclose_udp_c asclose_udp_c.c -lpthread
 *  Solaris : cc -o asclose_udp_c asclose_udp_c.c -lpthread -lsocket -lnsl
 *  HP-UX   : cc -o asclose_udp_c asclose_udp_c.c -lpthread
 */

#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <poll.h>
#include <pthread.h>

#define UDP_PORT 6666
#define CLOSE_DELAY 2 /* seconds */

/* UDP socket descriptor */

static int sock_fd = -1;

/* This procedure is a thread which closes the socket after a delay */

static void* close_thread (void *arg)
{
    int status;
    struct timeval now;
    struct timezone tz;
    struct timespec timeout;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    printf ("close_thread: starting thread\n");

    /* Wait for the specified delay */

    pthread_mutex_lock (&mutex);
    gettimeofday (&now, &tz);
    timeout.tv_sec = now.tv_sec + CLOSE_DELAY;
    timeout.tv_nsec = now.tv_usec * 1000;
    pthread_cond_timedwait (&cond, &mutex, &timeout);
    pthread_mutex_unlock (&mutex);
   
    /* Close the socket while the main task is blocked in a poll(2) */

    printf ("close_thread: closing socket\n");
    status = close (sock_fd);
    printf ("close_thread: close status = %d\n", status);
    
    return NULL;
}

/* Main program */

int main (int argc, char *argv[])
{
    int status;
    pthread_t th;
    struct pollfd fdset;
    struct sockaddr_in saddr;

    /* Create a UDP socket */

    if ((sock_fd = socket (AF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0) {
        perror ("socket");
        exit (EXIT_FAILURE);
    }

    saddr.sin_family = AF_INET;
    saddr.sin_port = htons (UDP_PORT);
    saddr.sin_addr.s_addr = 0;

    if (bind (sock_fd, (struct sockaddr*) &saddr, sizeof (saddr)) < 0) {
        perror ("bind");
        exit (EXIT_FAILURE);
    }

    /* Create a new thread to asynchronously close the socket */

    if ((status = pthread_create (&th, NULL, close_thread, NULL)) != 0) {
        fprintf (stderr, "pthread_create: %s\n", strerror (status));
        exit (EXIT_FAILURE);
    }

    /* Wait for input data using poll(2) and an infinite timeout */

    fdset.fd = sock_fd;
    fdset.events = POLLIN;
    fdset.revents = 0;

    printf ("main: starting poll\n");
    status = poll (&fdset, 1, -1);
    printf ("main: poll status = %d\n", status);   /* *** never reached */
    printf ("main: events   = %08X\n", fdset.events);
    printf ("main: revents  = %08X\n", fdset.revents);
}
Comment 1 Andrew Morton 2003-04-07 13:49:49 UTC
The consensus from the June 2001 discussion is that your application
is buggy.  I would agree with that.

You are relying on side-effects from certain operating systems
to save it.

The close doesn't destroy the file context - someone is still
running a select against it!

Suggest you redesign the application to not be dependent upon the
presence of this side-effect.
Comment 2 Zwane Mwaikambo 2003-05-04 04:45:16 UTC
I'd like to close this bug report as invalid
Comment 3 Michael Kerrisk 2020-05-09 06:21:50 UTC
See also Bug 53781
https://bugzilla.kernel.org/show_bug.cgi?id=53781