Bug 201257

Summary: SCSI write error not seen by Linux AIO?
Product: IO/Storage
Reporter: dchen
Component: AIO
Assignee: Badari Pulavarty (pbadari)
Status: RESOLVED PATCH_ALREADY_AVAILABLE    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.9.129
Subsystem:
Regression: No
Bisected commit-id:
Attachments: test program used to do Linux AIO
kludge to scsi_debug to return error on write
Test log for unexpected behavior on 4.9.129
Test log for expected behavior on 4.18.10

Description dchen 2018-09-27 22:11:23 UTC
Created attachment 278803
test program used to do Linux AIO

I'm using Linux AIO to write to a SCSI device. My test program (a slightly edited version of https://gist.github.com/larytet/87f90b08643ac3de934df2cadff4989c) uses Linux AIO to do a single write of 512B. The device I'm writing to is a fake SCSI device created by the scsi_debug kernel module. I've added a small kludge to the scsi_debug kernel module to return a WRITE ERROR - AUTO REALLOCATION FAILED error when the SCSI device is written to. I expect that the test program will receive some error, but everywhere I check result statuses, they report OK.

I get this unexpected (to me anyway) behavior on kernel versions 4.9.129 and 4.9.0. I do get an error as I expect on kernel versions 4.18.10, 4.14.7, and 4.1.8.

I'm checking that the result code is zero for each of: the io_event filled in by io_getevents(), the return code of io_setup(), io_destroy(), fsync(), and close(). I'm checking that the result code is 1 for each of io_submit() and io_getevents().
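
For reference, a minimal sketch of how a completion result could be classified, assuming the usual Linux AIO convention that the `res` field of `struct io_event` holds the number of bytes transferred on success or a negative errno on failure. The names `aio_check` and `enum aio_status` are hypothetical and are not taken from the attached test program.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical classification of one AIO completion. */
enum aio_status { AIO_OK, AIO_SHORT, AIO_FAILED };

/*
 * res      - value of io_event.res as filled in by io_getevents()
 *            (assumed: bytes transferred, or a negative errno)
 * expected - number of bytes that was submitted
 * err_out  - receives the positive errno on failure, 0 otherwise
 */
static enum aio_status aio_check(long long res, size_t expected, int *err_out)
{
    assert(err_out != NULL);
    *err_out = 0;

    if (res < 0) {
        /* The kernel reported a failure for this request. */
        *err_out = (int)-res;
        return AIO_FAILED;
    }
    if ((unsigned long long)res < expected) {
        /* Fewer bytes were written than requested. */
        return AIO_SHORT;
    }
    return AIO_OK;
}
```

With a check like this, a 512B write that fails at the device would be expected to come back as `AIO_FAILED` with `EIO`. A result of 0 would show up as `AIO_SHORT` rather than as success.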

This looks like a bug to me, but certainly I could have missed checking a return code somewhere, or maybe I'm not supposed to get an error back for a reason I don't know about yet, or...?
Comment 1 dchen 2018-09-27 22:12:48 UTC
Created attachment 278805
kludge to scsi_debug to return error on write
Comment 2 dchen 2018-09-27 22:14:15 UTC
Created attachment 278807
Test log for unexpected behavior on 4.9.129
Comment 3 dchen 2018-09-27 22:15:20 UTC
Created attachment 278809
Test log for expected behavior on 4.18.10
Comment 4 dchen 2018-09-27 22:30:38 UTC
It could be that I'm seeing this unexpected (to me) behavior because of some quirk with the scsi_debug fake SCSI device. However, I originally ran into this behavior when testing against my company's ClearSky Storage, using both iSCSI and Fibre Channel targets. Since I've seen it on these different targets, I think it's unlikely to be some quirk of scsi_debug.
Comment 5 dchen 2018-10-02 16:22:01 UTC
I should add that when I use dd to do a standard write() to the SCSI device, instead of using Linux AIO, then I get an I/O error as expected.

Also, when I use device mapper to create a flakey device (e.g. dmsetup create dchen-test --table="0 `blockdev --getsz /dev/sdg` flakey /dev/sdg 0 9 1") instead of using a SCSI device, I get an I/O error as expected.
Comment 6 dchen 2018-12-18 20:26:08 UTC
This bug is fixed by the change below:

commit 41e817bca3acd3980efe5dd7d28af0e6f4ab9247
Author: Maximilian Heyne <mheyne@amazon.de>
Date:   Fri Nov 30 08:35:14 2018 -0700

    fs: fix lost error code in dio_complete
    
    commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the
    generic_write_sync prototype") reworked callers of generic_write_sync(),
    and ended up dropping the error return for the directio path. Prior to
    that commit, in dio_complete(), an error would be bubbled up the stack,
    but after that commit, errors passed on to dio_complete were eaten up.
    
    This was reported on the list earlier, and a fix was proposed in
    https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but
    never followed up with.  We recently hit this bug in our testing where
    fencing io errors, which were previously erroring out with EIO, were
    being returned as success operations after this commit.
    
    The fix proposed on the list earlier was a little short -- it would have
    still called generic_write_sync() in case `ret` already contained an
    error. This fix ensures generic_write_sync() is only called when there's
    no pending error in the write. Additionally, transferred is replaced
    with ret to bring this code in line with other callers.
    
    Fixes: e259221763a4 ("fs: simplify the generic_write_sync prototype")
    Reported-by: Ravi Nankani <rnankani@amazon.com>
    Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    CC: Torsten Mehlan <tomeh@amazon.de>
    CC: Uwe Dannowski <uwed@amazon.de>
    CC: Amit Shah <aams@amazon.de>
    CC: David Woodhouse <dwmw@amazon.co.uk>
    CC: stable@vger.kernel.org
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
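
To illustrate what the commit message describes, here is a simplified user-space model of the async write completion path. It is a sketch based on the commit message, not the kernel source: `model_write_sync`, `complete_lossy`, `complete_fixed`, and the `sync_err` parameter are all hypothetical stand-ins. The model assumes that generic_write_sync() returns the byte count it was given on success and a negative errno on failure.

```c
#include <errno.h>

/*
 * Stand-in for generic_write_sync(): returns `count` when the sync
 * succeeds, or a negative errno when it fails. `sync_err` is a
 * hypothetical knob (0 = sync succeeds, otherwise a positive errno).
 */
static long model_write_sync(long count, int sync_err)
{
    return sync_err ? -(long)sync_err : count;
}

/*
 * Lossy behavior: the sync helper is always called with `transferred`,
 * and its return value overwrites `ret`, so a pending error in `ret`
 * is replaced by the byte count.
 */
static long complete_lossy(long ret, long transferred, int sync_err)
{
    ret = model_write_sync(transferred, sync_err);
    return ret; /* value handed to the completion callback */
}

/*
 * Fixed behavior: the sync helper is only called when `ret` holds a
 * positive byte count, so a pending error is passed through unchanged.
 */
static long complete_fixed(long ret, int sync_err)
{
    if (ret > 0)
        ret = model_write_sync(ret, sync_err);
    return ret; /* value handed to the completion callback */
}
```

In this model, a failed write with `ret == -EIO` and nothing transferred completes with 0 in the lossy variant, which would be consistent with the result codes reported above. The fixed variant completes with `-EIO`.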