Bug 7864

Summary: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect
Product: IO/Storage Reporter: Carl Reisinger (ce_reisinger)
Component: SCSIAssignee: io_scsi
Status: CLOSED CODE_FIX    
Severity: low CC: protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.19.2 Subsystem:
Regression: --- Bisected commit-id:

Description Carl Reisinger 2007-01-22 12:59:30 UTC
Write records to a SCSI tape until a write fails with a ENOSPC (you have reached
early warning.
Now perform a:
   struct mtget before, after;
   ioctl(fd, MTIOCGET, &before);
   struct mtop mtop = { MTWEOF, 1 };
   ioctl(fd, MTIOCTOP, &mtop);
   ioctl(fd, MTIOCGET, &after);

Check the value of mt_fileno in the before and after structures. Notice the
after is 2 greater then the before.

The problem appears to be in the block of code starting at line 2817 in st.c.
This block is entered because the drive did return a CHECK CONDITION with NO
SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But
it has already been increased by the number of filemarks requested by the
MTIOCTOP. I believe that the residue count in the sense data should be
subtracted from fileno, not a increment as is done.
Comment 1 Carl Reisinger 2007-01-22 15:56:01 UTC
Another minor issue in the same section of code:

The value of STps->drv_block also needs to be updated (set to 0)
Comment 2 Andrew Morton 2007-01-24 20:40:06 UTC
On Mon, 22 Jan 2007 13:07:20 -0800
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7864
> 
>            Summary: A MTIOCTOP/MTWEOF within the early warning will cause
>                     the file number to be incorrect
>     Kernel Version: 2.6.19.2
>             Status: NEW
>           Severity: low
>              Owner: io_scsi@kernel-bugs.osdl.org
>          Submitter: ce_reisinger@yahoo.com
> 
> 
> Write records to a SCSI tape until a write fails with a ENOSPC (you have reached
> early warning.
> Now perform a:
>    struct mtget before, after;
>    ioctl(fd, MTIOCGET, &before);
>    struct mtop mtop = { MTWEOF, 1 };
>    ioctl(fd, MTIOCTOP, &mtop);
>    ioctl(fd, MTIOCGET, &after);
> 
> Check the value of mt_fileno in the before and after structures. Notice the
> after is 2 greater then the before.
> 
> The problem appears to be in the block of code starting at line 2817 in st.c.
> This block is entered because the drive did return a CHECK CONDITION with NO
> SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But
> it has already been increased by the number of filemarks requested by the
> MTIOCTOP. I believe that the residue count in the sense data should be
> subtracted from fileno, not a increment as is done.
> 

Thanks.  Could you please send us a tested patch to fix these things, as
per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ?

Comment 3 Anonymous Emailer 2007-01-25 14:31:51 UTC
Reply-To: Kai.Makisara@kolumbus.fi

On Wed, 24 Jan 2007, Andrew Morton wrote:

> On Mon, 22 Jan 2007 13:07:20 -0800
> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=7864
> > 
> >            Summary: A MTIOCTOP/MTWEOF within the early warning will cause
> >                     the file number to be incorrect
> >     Kernel Version: 2.6.19.2
> >             Status: NEW
> >           Severity: low
> >              Owner: io_scsi@kernel-bugs.osdl.org
> >          Submitter: ce_reisinger@yahoo.com
> > 
> > 
> > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached
> > early warning.
> > Now perform a:
> >    struct mtget before, after;
> >    ioctl(fd, MTIOCGET, &before);
> >    struct mtop mtop = { MTWEOF, 1 };
> >    ioctl(fd, MTIOCTOP, &mtop);
> >    ioctl(fd, MTIOCGET, &after);
> > 
> > Check the value of mt_fileno in the before and after structures. Notice the
> > after is 2 greater then the before.
> > 
> > The problem appears to be in the block of code starting at line 2817 in st.c.
> > This block is entered because the drive did return a CHECK CONDITION with NO
> > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But
> > it has already been increased by the number of filemarks requested by the
> > MTIOCTOP. I believe that the residue count in the sense data should be
> > subtracted from fileno, not a increment as is done.
> > 
> 
> Thanks.  Could you please send us a tested patch to fix these things, as
> per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ?
> 
The analysis is basically correct and explains the bug. According to the 
SCSI standards, the sense code is NO SENSE or RECOVERED ERROR in case 
writing filemark(s) succeeds. If it fails (partly or completely) the sense 
code is VOLUME OVERFLOW. The patch below is tested to fix the case when 
one filemark is successfully written after the EOM early warning. It 
should also fix the case at real EOM but this has not been tested.

Carl, thanks for reporting the bug and providing the analysis for the fix.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>

--- linux-2.6/drivers/scsi/st.c	2006-12-09 13:29:31.000000000 +0200
+++ linux-2.6.20-rc6-km/drivers/scsi/st.c	2007-01-25 22:51:35.000000000 +0200
@@ -2816,15 +2816,18 @@ static int st_int_ioctl(struct scsi_tape
 
 		if (cmd_in == MTWEOF &&
 		    cmdstatp->have_sense &&
-		    (cmdstatp->flags & SENSE_EOM) &&
-		    (cmdstatp->sense_hdr.sense_key == NO_SENSE ||
-		     cmdstatp->sense_hdr.sense_key == RECOVERED_ERROR) &&
-		    undone == 0) {
-			ioctl_result = 0;	/* EOF written successfully at EOM */
-			if (fileno >= 0)
-				fileno++;
+		    (cmdstatp->flags & SENSE_EOM)) {
+			if (cmdstatp->sense_hdr.sense_key == NO_SENSE ||
+			    cmdstatp->sense_hdr.sense_key == RECOVERED_ERROR) {
+				ioctl_result = 0;	/* EOF(s) written successfully at EOM */
+				STps->eof = ST_NOEOF;
+			} else {  /* Writing EOF(s) failed */
+				if (fileno >= 0)
+					fileno -= undone;
+				if (undone < arg)
+					STps->eof = ST_NOEOF;
+			}
 			STps->drv_file = fileno;
-			STps->eof = ST_NOEOF;
 		} else if ((cmd_in == MTFSF) || (cmd_in == MTFSFM)) {
 			if (fileno >= 0)
 				STps->drv_file = fileno - undone;

Comment 4 Carl Reisinger 2007-01-26 12:52:49 UTC
The patch as supplied fixes the incorrect file number, but it seems that my
additional comment on the block number was missed.

The following change solves this issue:

--- st.c        2007-01-26 11:00:19.552772022 -0800
+++ st.c.new    2007-01-26 12:53:07.935696743 -0800
@@ -2827,7 +2827,10 @@
                                if (undone < arg)
                                        STps->eof = ST_NOEOF;
                        }
-                       STps->drv_file = fileno;
+                        if (STps->drv_file != fileno) {
+                            STps->drv_file = fileno;
+                            STps->drv_block = 0;
+                        }
                } else if ((cmd_in == MTFSF) || (cmd_in == MTFSFM)) {
                        if (fileno >= 0)
                                STps->drv_file = fileno - undone;
Comment 5 Natalie Protasevich 2007-08-01 01:00:11 UTC
The problem was fixed by commit 91614c054c9ffc26b47a5cb3135113aa0f6e6ff0.
The bug can be closed now.