Bug 211023 - Spurious Kernel errors while ejecting a USB-KEY/USB-drive
Summary: Spurious Kernel errors while ejecting a USB-KEY/USB-drive
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA (show other bugs)
Hardware: x86-64 Linux
: P1 high
Assignee: Tejun Heo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-04 09:57 UTC by Aurélien Oudelet
Modified: 2021-02-28 21:05 UTC (History)
2 users (show)

See Also:
Kernel Version: 5.10.4
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Log when ejecting USB drive with Kernel 5.10.x (5.64 KB, text/plain)
2021-01-04 09:57 UTC, Aurélien Oudelet
Details
Bisection log (2.52 KB, text/plain)
2021-02-12 05:53 UTC, Tom Seewald
Details

Description Aurélien Oudelet 2021-01-04 09:57:18 UTC
Created attachment 294479 [details]
Log when ejecting USB drive with Kernel 5.10.x

We, at Mageia.org, remark a strange behavior when ejecting a USB disk-based device. Kernel seems complain about missing device while udisks has been ejected it. See attached log.

STEPS TO REPRODUCE
1. Plug a USB-based disk (USB key, USB Harddrive), assuming here it is /dev/sdb
2. Mount partition: # mount /dev/sdb1 /media
3. Remove partition: # eject /dev/sdb1
6. See system logs.

OBSERVED RESULT
kernel: scsi_io_completion_action: 6 callbacks suppressed
kernel: sd 4:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
kernel: sd 4:0:0:0: [sdb] tag#0 Sense Key : Not Ready [current] 
kernel: sd 4:0:0:0: [sdb] tag#0 Add. Sense: Medium not present
kernel: sd 4:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
kernel: print_req_error: 6 callbacks suppressed
kernel: blk_update_request: I/O error, dev sdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
kernel: buffer_io_error: 54 callbacks suppressed

See attached log.

We can't reproduce with Kernel-5.9.12 (as we only compiled it before switching to 5.10.x LTS).
We can reproduce with vanilla Kernel 5.10.x (1 to 4).

Here is report in our Bugzilla:
https://bugs.mageia.org/show_bug.cgi?id=27877

Here is report for udisks2:
https://github.com/storaged-project/udisks/issues/827
Comment 1 Pascal Terjan 2021-01-04 13:26:41 UTC
Copying some details here, this happens when the reset command asks the kernel to reread the partition table just after "ejecting" the device, and the kernel can no longer read it. I am not sure there is a valid reason for doing that, but this started to fail with 5.10.

openat(AT_FDCWD, "/dev/sda", O_RDWR|O_EXCL|O_NONBLOCK) = 3
ioctl(3, CDROM_LOCKDOOR, 0)             = -1 EINVAL (Invalid argument)
ioctl(3, SG_GET_VERSION_NUM, [30527])   = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_NONE, cmd_len=6, cmdp="\x1e\x00\x00\x00\x00\x00", mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=10000, flags=0, status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_NONE, cmd_len=6, cmdp="\x1b\x00\x00\x00\x01\x00", mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=10000, flags=0, status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_NONE, cmd_len=6, cmdp="\x1b\x00\x00\x00\x02\x00", mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=10000, flags=0, status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
ioctl(3, BLKRRPART)                     = -1 EIO (Input/output error)

I started a bisect yesterday but this will take some time.
Comment 2 Pascal Terjan 2021-01-04 19:10:14 UTC
471bd0af544bcd80450d61659a1ffabdd1820d7d is the first bad commit

commit 471bd0af544bcd80450d61659a1ffabdd1820d7d
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Sep 8 16:53:44 2020 +0200

    sd: use bdev_check_media_change
    
    Switch to use bdev_check_media_change instead of check_disk_change and
    call sd_revalidate_disk manually.  As sd also calls sd_revalidate_disk
    manually during probe and open, the extra call into ->revalidate_disk
    from bdev_disk_changed is not required either, so stop wiring up the
    method.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

 drivers/scsi/sd.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Log on older kernel:

[   38.512120] sda: detected capacity change from 1992294400 to 0

Log on newer kernel:

[   32.094148] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[   32.094153] sd 2:0:0:0: [sda] tag#0 Sense Key : Unit Attention [current] 
[   32.094156] sd 2:0:0:0: [sda] tag#0 Add. Sense: Not ready to ready change, medium may have changed
[   32.094159] sd 2:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[   32.094163] blk_update_request: I/O error, dev sda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094171] Buffer I/O error on dev sda, logical block 0, async page read
[   32.094201] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094205] blk_update_request: I/O error, dev sda, sector 1 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[   32.094207] Buffer I/O error on dev sda, logical block 1, async page read
[   32.094208] Buffer I/O error on dev sda, logical block 2, async page read
[   32.094210] Buffer I/O error on dev sda, logical block 3, async page read
[   32.094211] Buffer I/O error on dev sda, logical block 4, async page read
[   32.094213] Buffer I/O error on dev sda, logical block 5, async page read
[   32.094215] Buffer I/O error on dev sda, logical block 6, async page read
[   32.094216] Buffer I/O error on dev sda, logical block 7, async page read
[   32.094258] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094265] blk_update_request: I/O error, dev sda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094268] Buffer I/O error on dev sda, logical block 0, async page read
[   32.094283] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094286] blk_update_request: I/O error, dev sda, sector 1 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094287] Buffer I/O error on dev sda, logical block 1, async page read
[   32.094300] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094303] blk_update_request: I/O error, dev sda, sector 2 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094315] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094317] blk_update_request: I/O error, dev sda, sector 3 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094330] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094332] blk_update_request: I/O error, dev sda, sector 4 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094345] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094347] blk_update_request: I/O error, dev sda, sector 5 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094359] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094361] blk_update_request: I/O error, dev sda, sector 6 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094377] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094379] blk_update_request: I/O error, dev sda, sector 7 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   32.094399] sd 2:0:0:0: [sda] tag#0 device offline or changed
[repeated many times]
[   32.094487] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.094497] ldm_validate_partition_table(): Disk read failed.
[   32.094507] sd 2:0:0:0: [sda] tag#0 device offline or changed
[repeated many times]
[   32.095037] sd 2:0:0:0: [sda] tag#0 device offline or changed
[   32.095048]  sda: unable to read partition table
Comment 3 Pascal Terjan 2021-01-05 17:44:25 UTC
Component should probably be SCSI but I don't have permissions to change that.
Comment 4 Tom Seewald 2021-02-12 05:52:48 UTC
I ran into this myself and have reported this issue to the linux-block mailing list.
https://lore.kernel.org/linux-block/CAARYdbiUBxFTY25VusuxgxqVzNRnoB61fFQeXcmsKyDP_d_ipQ@mail.gmail.com/T/#u
Comment 5 Tom Seewald 2021-02-12 05:53:41 UTC
Created attachment 295251 [details]
Bisection log
Comment 6 Tom Seewald 2021-02-28 21:05:26 UTC
This should be fixed with the following commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4601b4b130de2329fe06df80ed5d77265f2058e5

I have confirmed it solves the issue for me, so I think this issue can be marked as resolved.

Note You need to log in before you can comment on or make changes to this bug.