Write records to a SCSI tape until a write fails with a ENOSPC (you have reached early warning. Now perform a: struct mtget before, after; ioctl(fd, MTIOCGET, &before); struct mtop mtop = { MTWEOF, 1 }; ioctl(fd, MTIOCTOP, &mtop); ioctl(fd, MTIOCGET, &after); Check the value of mt_fileno in the before and after structures. Notice the after is 2 greater then the before. The problem appears to be in the block of code starting at line 2817 in st.c. This block is entered because the drive did return a CHECK CONDITION with NO SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But it has already been increased by the number of filemarks requested by the MTIOCTOP. I believe that the residue count in the sense data should be subtracted from fileno, not a increment as is done.
Another minor issue in the same section of code: The value of STps->drv_block also needs to be updated (set to 0)
On Mon, 22 Jan 2007 13:07:20 -0800 bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=7864 > > Summary: A MTIOCTOP/MTWEOF within the early warning will cause > the file number to be incorrect > Kernel Version: 2.6.19.2 > Status: NEW > Severity: low > Owner: io_scsi@kernel-bugs.osdl.org > Submitter: ce_reisinger@yahoo.com > > > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached > early warning. > Now perform a: > struct mtget before, after; > ioctl(fd, MTIOCGET, &before); > struct mtop mtop = { MTWEOF, 1 }; > ioctl(fd, MTIOCTOP, &mtop); > ioctl(fd, MTIOCGET, &after); > > Check the value of mt_fileno in the before and after structures. Notice the > after is 2 greater then the before. > > The problem appears to be in the block of code starting at line 2817 in st.c. > This block is entered because the drive did return a CHECK CONDITION with NO > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But > it has already been increased by the number of filemarks requested by the > MTIOCTOP. I believe that the residue count in the sense data should be > subtracted from fileno, not a increment as is done. > Thanks. Could you please send us a tested patch to fix these things, as per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ?
Reply-To: Kai.Makisara@kolumbus.fi On Wed, 24 Jan 2007, Andrew Morton wrote: > On Mon, 22 Jan 2007 13:07:20 -0800 > bugme-daemon@bugzilla.kernel.org wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=7864 > > > > Summary: A MTIOCTOP/MTWEOF within the early warning will cause > > the file number to be incorrect > > Kernel Version: 2.6.19.2 > > Status: NEW > > Severity: low > > Owner: io_scsi@kernel-bugs.osdl.org > > Submitter: ce_reisinger@yahoo.com > > > > > > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached > > early warning. > > Now perform a: > > struct mtget before, after; > > ioctl(fd, MTIOCGET, &before); > > struct mtop mtop = { MTWEOF, 1 }; > > ioctl(fd, MTIOCTOP, &mtop); > > ioctl(fd, MTIOCGET, &after); > > > > Check the value of mt_fileno in the before and after structures. Notice the > > after is 2 greater then the before. > > > > The problem appears to be in the block of code starting at line 2817 in st.c. > > This block is entered because the drive did return a CHECK CONDITION with NO > > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But > > it has already been increased by the number of filemarks requested by the > > MTIOCTOP. I believe that the residue count in the sense data should be > > subtracted from fileno, not a increment as is done. > > > > Thanks. Could you please send us a tested patch to fix these things, as > per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ? > The analysis is basically correct and explains the bug. According to the SCSI standards, the sense code is NO SENSE or RECOVERED ERROR in case writing filemark(s) succeeds. If it fails (partly or completely) the sense code is VOLUME OVERFLOW. The patch below is tested to fix the case when one filemark is successfully written after the EOM early warning. It should also fix the case at real EOM but this has not been tested. Carl, thanks for reporting the bug and providing the analysis for the fix. Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi> --- linux-2.6/drivers/scsi/st.c 2006-12-09 13:29:31.000000000 +0200 +++ linux-2.6.20-rc6-km/drivers/scsi/st.c 2007-01-25 22:51:35.000000000 +0200 @@ -2816,15 +2816,18 @@ static int st_int_ioctl(struct scsi_tape if (cmd_in == MTWEOF && cmdstatp->have_sense && - (cmdstatp->flags & SENSE_EOM) && - (cmdstatp->sense_hdr.sense_key == NO_SENSE || - cmdstatp->sense_hdr.sense_key == RECOVERED_ERROR) && - undone == 0) { - ioctl_result = 0; /* EOF written successfully at EOM */ - if (fileno >= 0) - fileno++; + (cmdstatp->flags & SENSE_EOM)) { + if (cmdstatp->sense_hdr.sense_key == NO_SENSE || + cmdstatp->sense_hdr.sense_key == RECOVERED_ERROR) { + ioctl_result = 0; /* EOF(s) written successfully at EOM */ + STps->eof = ST_NOEOF; + } else { /* Writing EOF(s) failed */ + if (fileno >= 0) + fileno -= undone; + if (undone < arg) + STps->eof = ST_NOEOF; + } STps->drv_file = fileno; - STps->eof = ST_NOEOF; } else if ((cmd_in == MTFSF) || (cmd_in == MTFSFM)) { if (fileno >= 0) STps->drv_file = fileno - undone;
The patch as supplied fixes the incorrect file number, but it seems that my additional comment on the block number was missed. The following change solves this issue: --- st.c 2007-01-26 11:00:19.552772022 -0800 +++ st.c.new 2007-01-26 12:53:07.935696743 -0800 @@ -2827,7 +2827,10 @@ if (undone < arg) STps->eof = ST_NOEOF; } - STps->drv_file = fileno; + if (STps->drv_file != fileno) { + STps->drv_file = fileno; + STps->drv_block = 0; + } } else if ((cmd_in == MTFSF) || (cmd_in == MTFSFM)) { if (fileno >= 0) STps->drv_file = fileno - undone;
The problem was fixed by commit 91614c054c9ffc26b47a5cb3135113aa0f6e6ff0. The bug can be closed now.