I am repeatedly seeing hald, [scsi_eh_1], and [usb-storage] lock up in uninterruptible sleep (D) state while the activity light of my 6in1 card reader flashes nervously.

Sep 1 10:22:33 hostmaster kernel: hald D 000052deb4689beb 0 18321 1 21755 17171 (NOTLB)
Sep 1 10:22:33 hostmaster kernel: 000001001d787c28 0000000000000006 0000007300000000 0000010022888030
Sep 1 10:22:33 hostmaster kernel: 0000000000000189 000001003fec9430 0000010022888348 ffffffff80279860
Sep 1 10:22:33 hostmaster kernel: 000001003ffe3390 ffffffff802d7d6d
Sep 1 10:22:33 hostmaster kernel: Call Trace:<ffffffff80279860>{scsi_done+0} <ffffffff802d7d6d>{.text.lock.scsiglue+5}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8038f74a>{wait_for_completion+154} <ffffffff8012e370>{default_wake_function+0}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8012e370>{default_wake_function+0} <ffffffff8027d8e3>{scsi_wait_req+99}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8014e028>{find_get_pages+40} <ffffffff8028eca0>{sd_revalidate_disk+288}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8026d979>{ide_diag_taskfile+201} <ffffffff80170c50>{invalidate_bh_lru+0}
Sep 1 10:22:33 hostmaster kernel: <ffffffff80170c50>{invalidate_bh_lru+0} <ffffffff8017648d>{check_disk_change+93}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8028e527>{sd_open+215} <ffffffff80176839>{do_open+281}
Sep 1 10:22:33 hostmaster kernel: <ffffffff801761c7>{bdget+263} <ffffffff80176cdf>{blkdev_open+47}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8016dac6>{dentry_open+246} <ffffffff8016dc1e>{filp_open+62}
Sep 1 10:22:33 hostmaster kernel: <ffffffff80182c86>{sys_poll+806} <ffffffff8017bb55>{getname+149}
Sep 1 10:22:33 hostmaster kernel: <ffffffff80181f30>{__pollwait+0} <ffffffff8016de7c>{sys_open+76}
Sep 1 10:22:33 hostmaster kernel: <ffffffff8010e562>{system_call+126}
Sep 1 10:22:02 hostmaster kernel: scsi_eh_1 D 000001003ffe33b0 0 196 1 197 179 (L-TLB)
Sep 1 10:22:02 hostmaster kernel: 000001001fcd1df8 0000000000000046 0000008b3fe976c8 000001003fe973b0
Sep 1 10:22:02 hostmaster kernel: 000000000000334f 000001003df96d30 000001003fe976c8 000001003fe20970
Sep 1 10:22:02 hostmaster kernel: 0000000300000000 000001001fcd1ef8
Sep 1 10:22:02 hostmaster kernel: Call Trace:<ffffffff8038f74a>{wait_for_completion+154} <ffffffff8012e370>{default_wake_function+0}
Sep 1 10:22:02 hostmaster kernel: <ffffffff802c9746>{hcd_unlink_urb+390} <ffffffff8012e370>{default_wake_function+0}
Sep 1 10:22:02 hostmaster kernel: <ffffffff802d7877>{command_abort+135} <ffffffff8027cf3e>{scsi_error_handler+942}
Sep 1 10:22:02 hostmaster kernel: <ffffffff8010f08f>{child_rip+8} <ffffffff8027cb90>{scsi_error_handler+0}
Sep 1 10:22:02 hostmaster kernel: <ffffffff8010f087>{child_rip+0}
Sep 1 10:22:02 hostmaster kernel: usb-storage D 0000010020b91d98 0 197 1 424 196 (L-TLB)
Sep 1 10:22:02 hostmaster kernel: 0000010020b91cc8 0000000000000046 0000008b1fc1e468 000001003fec9430
Sep 1 10:22:02 hostmaster kernel: 000000000000039c 000001003df96d30 000001003fec9748 000001003ff80c00
Sep 1 10:22:02 hostmaster kernel: 000001003fec9430 0000010020b91d78
Sep 1 10:22:02 hostmaster kernel: Call Trace:<ffffffff8038f74a>{wait_for_completion+154} <ffffffff8012e370>{default_wake_function+0}
Sep 1 10:22:02 hostmaster kernel: <ffffffff8012e370>{default_wake_function+0} <ffffffff802d845f>{usb_stor_msg_common+351}
Sep 1 10:22:02 hostmaster kernel: <ffffffff802d88c2>{usb_stor_bulk_transfer_buf+98} <ffffffff802d9117>{usb_stor_Bulk_transport+167}
Sep 1 10:22:02 hostmaster kernel: <ffffffff802d8c40>{usb_stor_invoke_transport+448} <ffffffff802d81b9>{usb_stor_transparent_scsi_command+25}
Sep 1 10:22:02 hostmaster kernel: <ffffffff802d9863>{usb_stor_control_thread+483} <ffffffff8010f08f>{child_rip+8}
Sep 1 10:22:02 hostmaster kernel: <ffffffff802d9680>{usb_stor_control_thread+0} <ffffffff8010f087>{child_rip+0}
Sep 1 10:22:02 hostmaster kernel:
This isn't necessarily an SMP race. It looks more like a hardware problem with the USB host controller or a problem with interrupt delivery. A USB request was cancelled, and both usb-storage and scsi_eh are waiting for the host controller driver to acknowledge that the request has been unsuccessfully completed. Presumably the HCD in turn is waiting for an interrupt from the controller, which never arrives.
Is this reproducible in non-SMP mode?
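
One rough way to test the interrupt-delivery theory while the tasks are stuck is to sample /proc/interrupts twice and see whether the host controller's counter still advances. The following is only a sketch: the helper names (hcd_irq_counts, read_interrupts) are made up for illustration, and it assumes the controller's interrupt line is labelled with a driver name containing "hcd" (for example ehci_hcd, ohci_hcd or uhci_hcd), which may not match every kernel.

```python
import time


def hcd_irq_counts(text, needle="hcd"):
    """Return {irq_label: total_count} for /proc/interrupts rows mentioning needle.

    Assumption: the first row is the CPU header and each IRQ row has one
    counter column per CPU, followed by the controller type and driver names.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: one column per CPU
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or needle not in rest:
            continue
        fields = rest.split()
        # Sum the per-CPU counters; skip anything that is not a plain number.
        per_cpu = [int(f) for f in fields[:ncpu] if f.isdigit()]
        counts[label.strip()] = sum(per_cpu)
    return counts


def read_interrupts(path="/proc/interrupts"):
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    before = hcd_irq_counts(read_interrupts())
    time.sleep(5)  # arbitrary sampling interval
    after = hcd_irq_counts(read_interrupts())
    for irq in sorted(after):
        # A delta of 0 while the card reader's light is flashing would be
        # consistent with interrupts from the controller not arriving.
        print(irq, after[irq] - before.get(irq, 0))
```

A zero delta does not prove the diagnosis on its own, since a shared IRQ line can keep counting because of another device, but it narrows down where to look.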
More than 30 days have passed with no response. Closing.