Bug 207877

Summary: ASMedia drive (174c:55aa) hangs in ioctl CDROM_DRIVE_STATUS when mounting a DVD
Product: IO/Storage Reporter: Zebediah Figura (zfigura)
Component: SCSIAssignee: linux-scsi (linux-scsi)
Status: NEW ---    
Severity: normal CC: bvanassche
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.6.14-arch1-1 Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg including backtrace of hang
dmesg including backtrace of hang, from 5.7-rc6
usbmon trace
dmesg (from journalctl -k) including backtrace of hang and usb-storage debugging

Description Zebediah Figura 2020-05-24 16:50:58 UTC
Created attachment 289267 [details]
dmesg including backtrace of hang

The device has USB vendor/product 174c:55aa, and is described by its USB product string as "ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge".

When attempting to mount via "mount /dev/sr0 /mnt", strace shows the following sequence (tail of log):

openat(AT_FDCWD, "/dev/sr0", O_RDONLY|O_NONBLOCK|O_CLOEXEC) = 3
fadvise64(3, 0, 0, POSIX_FADV_RANDOM)   = 0
fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(0xb, 0), ...}) = 0
ioctl(3, BLKGETSIZE64, [8351399936])    = 0
openat(AT_FDCWD, "/sys/dev/block/11:0", O_RDONLY|O_CLOEXEC) = 4
openat(4, "dm/uuid", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
close(4)                                = 0
openat(AT_FDCWD, "/sys/dev/block/11:0", O_RDONLY|O_CLOEXEC) = 4
faccessat(4, "partition", F_OK)         = -1 ENOENT (No such file or directory)
openat(4, "dm/uuid", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
close(4)                                = 0
ioctl(3, CDROM_GET_CAPABILITY, 0)       = 2227183
ioctl(3, CDROM_DRIVE_STATUS

The last ioctl hangs in the kernel forever. mount cannot be killed even with SIGKILL. dmesg shows a hang in multiple kernel thread which has been attached.

This problem occurs with at least one DVD I have tested, but I was able to successfully mount, access, and unmount another CD-ROM.
Comment 1 Bart Van Assche 2020-05-24 18:30:46 UTC
This may have been fixed by commit 51a858817dcd ("scsi: sr: get rid of sr global mutex") + commit 72655c0ebd1d ("scsi: sr: Fix sr_block_release()"). Both commits will be included in kernel v5.7 (not yet released). How about testing whether this has been fixed in kernel v5.7-rc6 (https://git.kernel.org/torvalds/t/linux-5.7-rc6.tar.gz)?
Comment 2 Zebediah Figura 2020-05-24 20:57:59 UTC
Created attachment 289273 [details]
dmesg including backtrace of hang, from 5.7-rc6

(In reply to Bart Van Assche from comment #1)
> This may have been fixed by commit 51a858817dcd ("scsi: sr: get rid of sr
> global mutex") + commit 72655c0ebd1d ("scsi: sr: Fix sr_block_release()").
> Both commits will be included in kernel v5.7 (not yet released). How about
> testing whether this has been fixed in kernel v5.7-rc6
> (https://git.kernel.org/torvalds/t/linux-5.7-rc6.tar.gz)?

Unfortunately, 5.7-rc6 doesn't fix the bug. It hangs on the same ioctl, with what looks like the same stack trace.
Comment 3 Bart Van Assche 2020-05-24 23:48:07 UTC
I think there are fewer processes involved in the v5.7-rc6 hang. Anyway, please report this on the appropriate USB mailing list. It seems to me that the following USB code is involved in the hang:

static int device_reset(struct scsi_cmnd *srb)
{
	struct us_data *us = host_to_us(srb->device->host);
	int result;

	usb_stor_dbg(us, "%s called\n", __func__);

	/* lock the device pointers and do the reset */
	mutex_lock(&(us->dev_mutex));
	result = us->transport_reset(us);
	mutex_unlock(&us->dev_mutex);

	return result < 0 ? FAILED : SUCCESS;
}

and also

static int usb_stor_control_thread(void * __us)
{
		[ ... ]
		mutex_lock(&(us->dev_mutex));
		[ ... ]
			fill_inquiry_response(us, data_ptr, 36);
		[ ... ]
		mutex_unlock(&us->dev_mutex);
		[ ... ]
}

The following mailing list may be appropriate:

usb-storage@lists.one-eyed-alien.net
Comment 4 Zebediah Figura 2020-05-28 16:28:32 UTC
Created attachment 289385 [details]
usbmon trace

Attaching a usbmon trace including the hang. The drive is bus 1, device 3. The last bulk input submission apparently never receives a response (I watched it for a while, and after several minutes it was still stuck on the last line.)

Is this a sign of a broken device?

Even if it is, is the kernel mishandling this by hanging forever? (I'd be inclined to say yes, but maybe there's only so much the kernel can do with broken devices...)
Comment 5 Zebediah Figura 2020-05-28 19:29:47 UTC
Created attachment 289389 [details]
dmesg (from journalctl -k) including backtrace of hang and usb-storage debugging

More testing, in response to questions/instructions from the mailing list.

Sometimes mount hangs forever in openat instead. I guess probably some kernel thread is getting stuck for reasons not directly related to mount, and mount is hanging as a result.

I built with 5.7-rc7 and CONFIG_USB_STORAGE_DEBUG enabled. I got a hang in openat; I've attached kernel logs. usbmon shows a similar bulk input transfer which never completes (even after waiting several minutes); I can attach that output if it's helpful.