Bug 95331
Summary: fcntl.2 + sigaction.2 + signal.7 need further information about use of a SA_SIGINFO signal handler that uses si->si_fd
Product: Documentation
Component: man-pages
Reporter: Jason Vas Dias (jason.vas.dias)
Assignee: documentation_man-pages (documentation_man-pages)
Status: RESOLVED INSUFFICIENT_DATA
Severity: normal
CC: mtk.manpages
Priority: P1
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  File that demonstrates the problem
  clearer example of same illustration program
  better version
To test (under kernel 3.13, Linux x86_64 8-core):
$ gcc -g -pthread -o t_sigio_rdwr t_sigio_rdwr.c
$ mkfifo -m 0600 /tmp/f.in /tmp/f.out
$ strace -f bash -c './t_sigio_rdwr </tmp/f.in >/tmp/f.out' &
...
$ echo 'hello' >/tmp/f.in && read res </tmp/f.out && echo "RES: $res"
RES: hello
...
$ echo 'hel2o' >/tmp/f.in
...
$ read res </tmp/f.out && echo "RES: $res"
^C
^- now you have to press <CTRL+C>, at which point the t_sigio_rdwr process does get a SIGIO with si->si_fd == 1, but by then it is too late: the reader has disconnected.

Can anyone answer the question: for output (O_WRONLY|O_ASYNC) file descriptors, WHEN is the F_SETOWN signal or SIGIO meant to be sent? Is it upon connection of a new reader FD to the read end of the pipe, or when a reader FD is closed, or both? IMHO, it must be possible to trigger the first case (sent on connection of a new reader), but I can't see how to do it: the process only gets a SIGIO when a reader disconnects.

Incidentally, nowhere do the man pages state that unless you use an SA_ONSTACK handler, having used sigaltstack() to set the stack used to invoke the handler, you get garbage pointed to by the siginfo_t 'si' pointer (the 2nd parameter to the handler), which is definitely the case.

Also, unless you actually call fcntl(fd, F_SETSIG, signum), no si->_sifields._sigpoll information is set at all. The manual pages definitely suggest that SIGIO / SIGPOLL is the default, but do not mention that the default does not apply to SA_SIGINFO signal handlers.

Oh well, I was hoping to avoid having to do so, but I guess there is no alternative to making the program use poll / pselect. But the documentation is definitely in need of clarification on the above points.

Created attachment 171961 [details]
clearer example of same illustration program
Created attachment 171971 [details]
better version
Now, if you run:
$ ./t_sigio_rdwr_gsf </tmp/f.in >/tmp/f.out &
$ echo 'hello' >/tmp/f.in && read res </tmp/f.out && echo "RES: $res"
RES: hello
$ ps -ef | grep t_sigio
# ie. get the pid of the t_sigio_rdwr_gsf process in $TS_PID
$ echo hel2o >/tmp/f.in
$ kill -IO $TS_PID; read res </tmp/f.out && echo 'RES: '"$res"
RES: hel2o
But if you do not do 'kill -IO $TS_PID;' and just do ' read res </tmp/f.out ',
after sending the second 'hel2o' line, the reader will hang, because the
t_sigio_rdwr process gets NO SIGNALs at all after the SIGPIPE attempting to
write, until the reader disconnects. I think this might be a kernel bug
in the way it is generating siginfo events for output pipes - or is it
a documentation bug ? Because even the short one line description of SIGIO
states : 'I/O now possible' - yet for writable pipe FDs, linux is generating
SIGIO only when output becomes IMpossible on the FD - why is this ?
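
For comparison, here is a minimal sketch of the case that does work as documented: signal-driven I/O on the READ end of a pipe. It is not taken from the attached test programs. The helper name sigio_on_pipe_read_end() is made up for illustration, it uses an anonymous pipe rather than the FIFOs above, and it collects the queued signal with sigtimedwait() instead of a handler so that the siginfo_t can be inspected directly.

```c
#define _GNU_SOURCE            /* F_SETSIG is a Linux extension */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the attachments): arm signal-driven I/O
 * on the read end of an anonymous pipe, write one byte, and collect the
 * queued signal synchronously.
 * Returns 0 and fills *info on success, -1 on any failure or timeout.
 */
int sigio_on_pipe_read_end(siginfo_t *info, int *read_fd)
{
    int fds[2];
    sigset_t set;
    struct timespec timeout = { 2, 0 };   /* give up after 2 seconds */
    int flags, ret = -1;

    assert(info != NULL && read_fd != NULL);

    if (pipe(fds) == -1)
        return -1;
    *read_fd = fds[0];

    /* Block the signal so it stays queued until sigtimedwait() takes it. */
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1)
        goto out;

    /* Set the owner, then the signal number, then O_ASYNC itself. */
    if (fcntl(fds[0], F_SETOWN, getpid()) == -1 ||
        fcntl(fds[0], F_SETSIG, SIGRTMIN) == -1 ||
        (flags = fcntl(fds[0], F_GETFL)) == -1 ||
        fcntl(fds[0], F_SETFL, flags | O_ASYNC) == -1)
        goto out;

    /* Making the pipe readable should raise the signal for fds[0]. */
    if (write(fds[1], "x", 1) != 1)
        goto out;

    if (sigtimedwait(&set, info, &timeout) == SIGRTMIN)
        ret = 0;
out:
    close(fds[0]);
    close(fds[1]);
    return ret;
}
```

On a Linux kernel that behaves as described in this report, the call should return 0 with info->si_fd equal to the read descriptor and info->si_code equal to POLL_IN. The open question in this bug is the mirror image: which event, if any, produces the equivalent signal for the write end.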
One cannot use poll() or select() in this case either, because there seems to be no way of polling for POLLOUT without being flooded with poll() returns with revents == POLLOUT|POLLERR, regardless of a timeout parameter of -1 (another undocumented fact about poll(2)).

So is it true that Linux really provides no way of waiting for output to become possible on a pipe file descriptor after a SIGPIPE has been received on it (the last reader has closed its input pipe to our output pipe), when we want to wait for a new reader to open an input pipe? I thought that was what SIGIO was meant to be for? Or is there some other mechanism?

Yes, I know, I should be using UNIX sockets and accept(), but the real application is required to use pipes, and the documentation suggests it should be possible to wait for an event to be generated by the kernel when a reader connects a new input FD to our output pipe.

Should have said in previous comment: Of course, one CAN use poll() or select(), but then the whole point of signal-driven I/O is rather negated, IMHO. If one can wait for the number of writers to the input end of a pipe to increase before reading, and then receive a signal with the siginfo si_band and si_fd fields filled in, as evidently happens for input pipe file descriptors (FDs), why can't the same be made to happen when the number of readers of an output pipe FD increases?
Particularly as it seems both numbers are maintained in the pipe filesystem structure for each pipe FD:

    struct pipe_inode_info {
        struct mutex mutex;
        wait_queue_head_t wait;
        unsigned int nrbufs, curbuf, buffers;
        unsigned int readers;
        unsigned int writers;
        unsigned int files;
        unsigned int waiting_writers;
        unsigned int r_counter;
        unsigned int w_counter;
        struct page *tmp_page;
        struct fasync_struct *fasync_readers;
        struct fasync_struct *fasync_writers;
        struct pipe_buffer *bufs;
    };

I suppose this is because a signal might be sent whenever the output buffer has space available for writable FDs, unlike for readable FDs, which generate si_band events when input is available? Yet, as the test case shows, this does NOT occur for writable FIFO FDs: a signal with si_band and si_fd siginfo is only received (sent by the kernel) when a reader disconnects from a writable FIFO FD.

But a special fcntl or ioctl could be provided to say, "For this writable FIFO FD, send the I/O signal only when the 'readers' counter is incremented or decremented" (or perhaps only when it reaches 0 and when it transitions from 0 to 1).

I'm going to investigate producing a version of Linux that does support such an fcntl / ioctl and sends such si_band events for output file descriptors, as it seems there is no way to make current versions of Linux do this.

Note that poll() or select() return immediately for writable pipe file descriptors on which there are no readers with open file descriptors, with pollfd.revents & (POLLERR | POLLOUT) set. So one cannot set the POLLOUT bit in struct pollfd's events and then use poll() or select() without some form of busy-waiting, nanosleep() or user-space polling, thus making 100% SIGIO-driven I/O on pipes impossible IMHO.
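
The poll() behaviour described above can be reproduced in isolation. The following is only a sketch under stated assumptions: the helper name poll_writer_without_reader() is made up, it uses an anonymous pipe with its read end closed rather than the FIFO from the test program, and it takes a finite timeout (instead of the -1 mentioned above) so that it cannot hang on a system that behaves differently.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll the write end of a pipe whose read end has
 * been closed. Returns the revents mask, or -1 if pipe() or poll()
 * fails or poll() times out.
 */
int poll_writer_without_reader(int timeout_ms)
{
    int fds[2];
    struct pollfd pfd;
    int n;

    assert(timeout_ms >= 0);       /* finite timeout: never block forever */

    if (pipe(fds) == -1)
        return -1;
    close(fds[0]);                 /* no reader is left on this pipe */

    pfd.fd = fds[1];
    pfd.events = POLLOUT;
    pfd.revents = 0;
    n = poll(&pfd, 1, timeout_ms);
    close(fds[1]);

    return n == 1 ? pfd.revents : -1;
}
```

On Linux the returned mask is expected to include POLLERR immediately, which is why a loop around poll() with only POLLOUT requested spins instead of sleeping until a reader appears.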
I think linux should provide a way of by-passing this generation of error events for writable FDs with no writers - blocking should be possible for writable pipe FDs for which the "wait for readers on write" fcntl or ioctl has been issued, which would only return success for writable pipe FDs with no readers (readers == 0); for such pipes, blocking reads would succeed once a SINGLE SIGPIPE or write() == -1 with errno==EPIPE event occurred until a reader has connected, upon which a SIGIO / SIGPOLL signal would be sent if the FD had such a handler registered for it with the siginfo si_band and si_fd fields correctly filled in; such a signal would also be sent and write() would return -1 ONCE for such FDs when the last reader disconnects, as currently happens.

This would essentially fix the problem of having to know the name of the pipe in order to re-open it and be able to wait for readers to connect; one could handle a write returning -1 with EPIPE or a SIGPIPE by simply entering pause() with a SIGIO handler registered, which is NOT currently the case.

blocking reads would succeed

Sorry, version of above comment #8 with typos fixed:

I think linux should provide a way of by-passing this generation of error events for writable FDs with no readers - blocking should be possible for writable pipe FDs for which the "wait for readers on write" (WFROW??) fcntl or ioctl has been issued, which would only return success for writable pipe FDs with no readers (readers == 0); for such pipes, for which SIGPIPE or write has set errno to EPIPE only once, blocking writes would then succeed, until a new reader has connected, upon which a SIGIO / SIGPOLL or fcntl(fd, F_SETSIG, signum) specified signal with that signal being caught, would be sent if the FD had such a handler registered for it with the siginfo si_band and si_fd fields correctly filled in; such a signal would also be sent and write() would return -1 ONCE for such FDs when the last reader disconnects, as currently happens; but if O_NONBLOCK is NOT set in the FD's flags, then the next write would block, instead of returning -1 as currently happens, or at least a signal would be sent if registered and the O_ASYNC bit is set in the FD's flags.

This would essentially fix the problem of having to know the name of the pipe in order to re-open it and be able to wait for readers to connect; one could handle a write returning -1 with EPIPE or a SIGPIPE by simply entering pause() with a SIGIO handler registered, which is NOT currently the case.

simply by entering pause()

One cannot handle the case for currently open pipe FDs, where no reader has yet connected, as are inherited by shell re-directs, without using some form of user-space polling and sleeping or busy-waiting.
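
To illustrate the "re-open by name" workaround mentioned above, here is a minimal sketch. It assumes the writer knows the FIFO's path, which is exactly the limitation being complained about. The function names and the scratch path under /tmp are made up for illustration. It relies only on the POSIX rule that open(O_WRONLY|O_NONBLOCK) on a FIFO fails with ENXIO while no reader has it open; dropping O_NONBLOCK would instead make open() block until a reader connects.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: probe whether a FIFO currently has a reader.
 * Returns a writable descriptor if one is connected, or -1 with
 * errno == ENXIO if nobody has the FIFO open for reading.
 */
int open_fifo_writer_if_reader(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_NONBLOCK);
}

/*
 * Demo on a scratch FIFO. Returns 0 if the probe fails with ENXIO before
 * a reader exists and succeeds afterwards, -1 otherwise.
 */
int fifo_reader_probe_demo(void)
{
    char dir[] = "/tmp/fifo_demo_XXXXXX";
    char path[64];
    int rfd = -1, wfd = -1, ret = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/f.out", dir);
    if (mkfifo(path, 0600) == -1)
        goto out_dir;

    /* No reader yet: the non-blocking open must fail with ENXIO. */
    wfd = open_fifo_writer_if_reader(path);
    if (wfd != -1 || errno != ENXIO)
        goto out;

    /* A non-blocking read-only open succeeds even without a writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd == -1)
        goto out;

    /* A reader now exists, so the same probe succeeds. */
    wfd = open_fifo_writer_if_reader(path);
    if (wfd != -1)
        ret = 0;
out:
    if (wfd != -1)
        close(wfd);
    if (rfd != -1)
        close(rfd);
    unlink(path);
out_dir:
    rmdir(dir);
    return ret;
}
```

This does not help a process that only inherited the descriptor from a shell redirect and never learns the path, which is the case described above.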
I think it is because in fs/pipe.c:

    static int pipe_fasync(int fd, struct file *filp, int on)
    {
        struct pipe_inode_info *pipe = filp->private_data;
        int retval = 0;

        __pipe_lock(pipe);
        if (filp->f_mode & FMODE_READ)
            retval = fasync_helper(fd, filp, on, &pipe->fasync_readers);
        if ((filp->f_mode & FMODE_WRITE) && retval >= 0) {
            retval = fasync_helper(fd, filp, on, &pipe->fasync_writers);
            if (retval < 0 && (filp->f_mode & FMODE_READ))
                /* this can happen only if on == T */
                fasync_helper(-1, filp, 0, &pipe->fasync_readers);
        }
        __pipe_unlock(pipe);
        return retval;
    }

Perhaps a pipe FD ioctl could be provided that would set a pipe inode 'sigio_on_write_enabled' flag that could be handled? :

        ...
        else if (pipe->sigio_on_write_enabled
                 && (filp->f_mode & FMODE_WRITE)
                 && (retval == -1)
                 && (pipe->last_retval != -1)
                 && (pipe->last_readers != 0)) {
            pipe->last_retval = -1;
            if (pipe->readers == 0)
                kill_fasync(&pipe->fasync_readers, SIGIO, POLL_OUT);
            retval = fasync_helper(fd, filp, on, &pipe->fasync_writers);
        }
        pipe->last_retval = retval;
        pipe->last_readers = pipe->readers;
        __pipe_unlock(pipe);
        return retval;
    }

ie. so if the last error status was NOT -1, but is now -1, and O_ASYNC is enabled, and readers is now 0, then if this special new sigio_on_write_enabled flag was set, then a sigio might be sent for it?

Also in pipe poll:

    if (filp->f_mode & FMODE_WRITE) {
        mask |= (nrbufs < pipe->buffers) ? POLLOUT | POLLWRNORM : 0;
        /*
         * Most Unices do not set POLLERR for FIFOs but on Linux they
         * behave exactly like pipes for poll().
         */
        if ((!pipe->readers) &&
            ((!pipe->sigio_on_write_enabled) || (pipe->last_readers != 0)))
            mask |= POLLERR;
    }

Also in pipe_read(), perhaps it should make the call:

    if (do_wakeup) {
        wake_up_interruptible_sync_poll(&pipe->wait, POLLOUT | POLLWRNORM);
        kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
    }

BEFORE it does the read? Else how is the writer meant to know it has a new reader?
I'd also really like to know where precisely all this is meant to be specified / documented. The POSIX standards are rather vague on this, but do suggest that SIGIO is meant to be sent when I/O is possible on a file descriptor, regardless of whether it is readable or writable. Since currently ONLY open() already waits for readers to connect:

    case FMODE_WRITE:
        /*
         * O_WRONLY
         * POSIX.1 says that O_NONBLOCK means return -1 with
         * errno=ENXIO when there is no process reading the FIFO.
         */
        ret = -ENXIO;
        if (!is_pipe && (filp->f_flags & O_NONBLOCK) && !pipe->readers)
            goto err;

        pipe->w_counter++;
        if (!pipe->writers++)
            wake_up_partner(pipe);

        if (!is_pipe && !pipe->readers) {
            if (wait_for_partner(pipe, &pipe->r_counter))
                goto err_wr;
        }
        break;

I think if O_ASYNC is set and O_NONBLOCK is NOT set, once ONE SIGPIPE or EPIPE has been generated, then the next write should block for pipes marked with sigio_on_write_enabled, and also readers should generate the SIGIO events BEFORE they are about to read. I guess such changes might take a long time to become accepted ...

But which paragraphs of which standard are meant to apply to this situation? Where is it specified what SHOULD happen in these cases?

ie. only for writable FDs where O_NONBLOCK is NOT set, they should be capable of generating SIGIO on new readers connecting, and write() should also run the above code after the FIRST SIGPIPE has been sent to wait for readers.

    static ssize_t pipe_write(struct kiocb *iocb, struct iov_iter *from)
    {
        if (!pipe->readers) {
            if ((!(filp->f_flags & O_NONBLOCK)) &&
                ((!pipe->sigio_on_write_enabled) || !pipe->sigpipe_sent)) {
                pipe->sigpipe_sent = 1;
                send_sig(SIGPIPE, current, 0);
                ret = -EPIPE;
                goto out;
            } else if (pipe->sigio_on_write_enabled && pipe->sigpipe_sent) {
                if ((!(filp->f_flags & O_NONBLOCK)) &&
                    wait_for_partner(pipe, &pipe->r_counter))
                    goto err_wr;
            }
        } else {
            pipe->sigpipe_sent = 0;
            ...
        }
        ...
    }

I don't see any huge obstacles to making something like the above work.
Pipe I/O would then be a lot more robust and efficient, since user-space polling or busy-waiting could be eliminated for writers with use of pause().

Jason, can you summarize any man-pages changes that you think are required?

Closing this, as I can't decipher the bug report. If you can _succinctly_ summarize the changes you believe are required in any man page(s), please reopen.
Created attachment 171831 [details]
File that demonstrates the problem

Despite extensive reading of the latest versions of the $subject manual pages over the last few days, while trying to debug a program that needs to read and write pipe file descriptors, I am still none the wiser about the precise circumstances in which the $signum signal, used in a fcntl(fd, F_SETSIG, $signum) call, will be sent for an OUTPUT file descriptor.

The test program, which I will attach, and which is run as:
$ mkfifo -m 0600 /tmp/f.in /tmp/f.out
$ ./t_sigio_rdwr </tmp/f.in >/tmp/f.out &
$ echo 'hello' >/tmp/f.in
../
$ read </tmp/f.out
^C
only gets a SIGIO+SIGRTMIN signal for the output file descriptor (FD 1) when a reader DISCONNECTS from its output pipe, by pressing <CTRL+C>, which is pretty useless, IMHO. The program is expecting to receive a SIGIO+RTMIN signal when a reader CONNECTs to (opens) its output fifo, so that it can send the response.

All the documentation states is in the sigaction manual page:

* SIGIO/SIGPOLL (the two names are synonyms on Linux) fills in si_band and si_fd. The si_band event is a bit mask containing the same values as are filled in the revents field by poll(2). The si_fd field indicates the file descriptor for which the I/O event occurred.
...
si_code: ..
SI_SIGIO Queued SIGIO (only in kernels up to Linux 2.2; from Linux 2.4 onward SIGIO/SIGPOLL fills in si_code as described below).

The following values can be placed in si_code for a SIGIO/SIGPOLL signal:
POLL_IN Data input available.
POLL_OUT Output buffers available.

And in fcntl.2:

F_SETSIG (int)
Set the signal sent when input or output becomes possible to the value given in arg. A value of zero means to send the default SIGIO signal. Any other value (including SIGIO) is the signal to send instead, and in this case additional info is available to the signal handler if installed with SA_SIGINFO.
By using F_SETSIG with a nonzero value, and setting SA_SIGINFO for the signal handler (see sigaction(2)), extra information about I/O events is passed to the handler in a siginfo_t structure. If the si_code field indicates the source is SI_SIGIO, the si_fd field gives the file descriptor associated with the event. Otherwise, there is no indication which file descriptors are pending, and you should use the usual mechanisms (select(2), poll(2), read(2) with O_NONBLOCK set etc.) to determine which file descriptors are available for I/O.

By selecting a real time signal (value >= SIGRTMIN), multiple I/O events may be queued using the same signal numbers. (Queuing is dependent on available memory). Extra information is available if SA_SIGINFO is set for the signal handler, as above.

Nowhere does it say WHEN or on what conditions the SIGIO will be generated, nor is it discussed what happens when SIGIO for multiple FDs are queued.

The following strace output of the program demonstrates the problem:

[pid 11215] rt_sigprocmask(SIG_BLOCK, [PIPE RT_31], [], 8) = 0
[pid 11215] fcntl(1, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
[pid 11215] fcntl(1, F_SETFL, O_WRONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] fcntl(1, 0xf /* F_??? */, 0x7fff1c940090) = 0
[pid 11215] fcntl(1, F_SETSIG, 0x3f) = 0
[pid 11215] fcntl(0, F_GETFL) = 0x8000 (flags O_RDONLY|O_LARGEFILE)
[pid 11215] fcntl(0, F_SETFL, O_WRONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] fcntl(0, 0xf /* F_??? */, 0x7fff1c940090) = 0
[pid 11215] fcntl(0, F_SETSIG, 0x3f) = 0
[pid 11215] rt_sigaction(SIGRT_31, {0x400b4d, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fc6c47f0340}, {SIG_DFL, [], 0}, 8) = 0
[pid 11215] rt_sigaction(SIGPIPE, {0x400e49, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fc6c47f0340}, {SIG_DFL, [], 0}, 8) = 0
[pid 11215] fcntl(1, F_SETFL, O_WRONLY|O_LARGEFILE) = 0
[pid 11215] write(2, "reading\n", 8reading
) = 8
[pid 11215] read(0, "hello\n", 8192) = 6
[pid 11215] write(2, "read 6 bytes\n", 13read 6 bytes
) = 13
[pid 11215] write(2, "writing: 6 bytes\n", 17writing: 6 bytes
) = 17
[pid 11215] write(1, "hello\n", 6) = 6
[pid 11215] write(2, "wrote : 6\n", 10wrote : 6
) = 10
[pid 11215] rt_sigprocmask(SIG_BLOCK, [PIPE RT_31], [PIPE RT_31], 8) = 0
[pid 11215] fcntl(0, F_GETFL) = 0xa000 (flags O_RDONLY|O_ASYNC|O_LARGEFILE)
[pid 11215] fcntl(0, F_SETFL, O_RDONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] fcntl(0, 0xf /* F_??? */, 0x7fff1c940090) = 0
[pid 11215] fcntl(0, F_SETSIG, 0x3f) = 0
[pid 11215] fcntl(0, F_GETFL) = 0xa000 (flags O_RDONLY|O_ASYNC|O_LARGEFILE)
[pid 11215] fcntl(0, F_SETFL, O_RDONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] fcntl(0, 0xf /* F_??? */, 0x7fff1c940090) = 0
[pid 11215] fcntl(0, F_SETSIG, 0x3f) = 0
[pid 11215] rt_sigaction(SIGRT_31, {0x400b4d, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fc6c47f0340}, {0x400b4d, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fc6c47f0340}, 8) = 0
[pid 11215] rt_sigaction(SIGPIPE, {0x400e49, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fc6c47f0340}, {0x400e49, [], SA_RESTORER|SA_STACK|SA_NODEFER|SA_SIGINFO, 0x7fc6c47f0340}, 8) = 0
[pid 11215] fcntl(1, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
[pid 11215] write(2, "reading\n", 8reading
) = 8
[pid 11215] read(0, "", 8192) = 0
[pid 11215] write(2, "read returned 0 - waiting for SI"..., 36read returned 0 - waiting for SIGIO
) = 36
[pid 11215] fcntl(0, F_SETFL, O_RDONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] rt_sigprocmask(SIG_UNBLOCK, [PIPE RT_31], [PIPE RT_31], 8) = 0
[pid 11215] pause() = ? ERESTARTNOHAND (To be restarted if no handler)
[pid 11215] --- SIGRT_31 {si_signo=SIGRT_31, si_code=0x1, si_pid=65, si_uid=0} ---
[pid 11215] --- SIGRT_31 {si_signo=SIGRT_31, si_code=0x1, si_pid=65, si_uid=0} ---
[pid 11215] write(2, "IO: SI_FD: 0\n", 13IO: SI_FD: 0
) = 13
[pid 11215] rt_sigreturn() = 0
[pid 11215] write(2, "IO: SI_FD: 0\n", 13IO: SI_FD: 0
) = 13
[pid 11215] rt_sigreturn() = -1 EINTR (Interrupted system call)
[pid 11215] rt_sigprocmask(SIG_BLOCK, [PIPE RT_31], [], 8) = 0
[pid 11215] write(2, "after SIGIO - retrying read \n", 29after SIGIO - retrying read
) = 29
[pid 11215] fcntl(0, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
[pid 11215] write(2, "reading\n", 8reading
) = 8
[pid 11215] read(0, "hello\n", 8192) = 6
[pid 11215] write(2, "read 6 bytes\n", 13read 6 bytes
) = 13
[pid 11215] write(2, "writing: 6 bytes\n", 17writing: 6 bytes
) = 17
[pid 11215] write(1, "hello\n", 6) = -1 EPIPE (Broken pipe)
[pid 11215] rt_sigprocmask(SIG_UNBLOCK, [PIPE RT_31], [PIPE RT_31], 8) = 0
[pid 11215] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11215, si_uid=1001} ---
[pid 11215] write(2, "PIPE: SI_FD: 0\n", 15PIPE: SI_FD: 0
) = 15
[pid 11215] rt_sigreturn() = 0
[pid 11215] write(2, "PIPE - waiting for connection\n", 30PIPE - waiting for connection
) = 30
[pid 11215] fcntl(1, F_SETFL, O_RDONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] pause(

Now I have to press <CTRL+C>, and only AFTER the reader has disconnected does the t_sigio_rdwr process get a SIGIO:

) = ? ERESTARTNOHAND (To be restarted if no handler)
[pid 11215] --- SIGRT_31 {si_signo=SIGRT_31, si_code=0x2, si_pid=772, si_uid=0, si_value={int=1, ptr=0x1}} ---
[pid 11215] write(2, "IO: SI_FD: 1\n", 13IO: SI_FD: 1
) = 13
[pid 11215] rt_sigreturn() = -1 EINTR (Interrupted system call)
[pid 11215] fcntl(1, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
[pid 11215] rt_sigprocmask(SIG_BLOCK, [PIPE RT_31], [], 8) = 0
[pid 11215] write(2, "AFTER SIGIO : 1\n", 16AFTER SIGIO : 1
) = 16
[pid 11215] write(2, "writing: 6 bytes\n", 17writing: 6 bytes
) = 17
[pid 11215] write(1, "hello\n", 6) = -1 EPIPE (Broken pipe)
[pid 11215] rt_sigprocmask(SIG_UNBLOCK, [PIPE RT_31], [PIPE RT_31], 8) = 0
[pid 11215] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11215, si_uid=1001} ---
[pid 11215] write(2, "PIPE: SI_FD: 1\n", 15PIPE: SI_FD: 1
) = 15
[pid 11215] rt_sigreturn() = 0
[pid 11215] write(2, "PIPE - waiting for connection\n", 30PIPE - waiting for connection
) = 30
[pid 11215] fcntl(1, F_SETFL, O_RDONLY|O_ASYNC|O_LARGEFILE) = 0
[pid 11215] pause(

But since the reader had to disconnect, it gets SIGPIPE trying to write, so never leaves the state of trying to write back the second input line.
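
The EPIPE result seen in the trace can be reproduced without the FIFOs. The following is only a sketch: the helper name write_without_reader() is made up, it uses an anonymous pipe with the read end closed, and it ignores SIGPIPE (the test program instead installs a handler) so that the failure is visible through errno.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: write to a pipe that has no reader, with SIGPIPE
 * ignored so the failure is reported through errno instead of killing
 * the process. Returns the errno from write(), 0 if write() unexpectedly
 * succeeded, or -1 if pipe() failed.
 */
int write_without_reader(void)
{
    int fds[2];
    int err = 0;
    void (*old_handler)(int);

    if (pipe(fds) == -1)
        return -1;

    old_handler = signal(SIGPIPE, SIG_IGN);
    assert(old_handler != SIG_ERR);

    close(fds[0]);                     /* the last reader goes away */

    if (write(fds[1], "hello\n", 6) == -1)
        err = errno;                   /* EPIPE is expected here */

    close(fds[1]);
    signal(SIGPIPE, old_handler);      /* restore the previous disposition */
    return err;
}
```

The write fails as soon as it is attempted, and nothing in this sequence lets the writer sleep until a new reader opens the pipe, which is the gap this report is about.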