Bug 83271
Summary: | slow speeds with usb (and mmc) card readers only since 3.17-rc1 kernel | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | abandoned account (abandonedaccountubdprczb8hs) |
Component: | Block Layer | Assignee: | Jens Axboe (axboe) |
Status: | RESOLVED CODE_FIX | ||
Severity: | high | CC: | liuj97 |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 3.17-rc1, 3.17-rc2 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | kernel 3.17-rc2 .config |
There are very few block layer changes from 3.16 to 3.17-rc2, so it's not likely that the problem is in there. Could be USB or MMC, I guess. The best way to figure it out would be to do a git bisect from a known good version and to 3.17-rc1. Thank you Jens Axboe. I will look into exactly how to do that and will report back with the bad commit. Hi. I need some guidance. I am not sure what to do next. I tried git bisect: git bisect start v3.17-rc1 v3.16 and I got the first bad commit as being this one: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b1bfd5ea451b60a4725907c282dec232c63f68bb But that doesn't make sense to me, so I must've done something wrong (like marked a bad commit as good? maybe because lsblk seemed fast to detect device when plugged but it only happens sometimes?) and so I started another bisect: git bisect start b1bfd5ea451b60a4725907c282dec232c63f68bb v3.16 And now I got this message: The merge base 401c58fcbbf570e7e4a8ee0e21ffd829deb4de0b is bad. This means the bug has been fixed between 401c58fcbbf570e7e4a8ee0e21ffd829deb4de0b and [19583ca584d6f574384e17fe7613dfaeadcdc4a6]. I don't know what to do next. I link those commits here for easy access: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=401c58fcbbf570e7e4a8ee0e21ffd829deb4de0b https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=19583ca584d6f574384e17fe7613dfaeadcdc4a6 I've only tested the amount of time it takes for the usb card reader to be seen by lsblk and I notice that lsblk gets stuck for a few seconds if I run it after I plug in the USB device: that means it shows nothing on screen and cannot be C-c until it unstucks and shows the list of devices. And it unstucks only after dmesg spews these(killed messages): ... [ 83.480413] Bulk Command S 0x43425355 T 0x178 L 0 F 0 Trg 0 LUN 2 CL 6 [ 83.480420] xfer 31 bytes [ 83.480588] Status code 0; transferred 31/31 [ 83.480600] -- transfer complete [ 83.480603] Bulk command transfer result=0 [ 83.480609] Attempting to get CSW... [ 83.480613] xfer 13 bytes [ 83.480679] Status code 0; transferred 13/13 [ 83.480682] -- transfer complete [ 83.480686] Bulk status result = 0 [ 83.480694] Bulk Status S 0x53425355 T 0x178 R 0 Stat 0x0 [ 83.480700] scsi cmd done, result=0x0 [ 83.480714] *** thread sleeping [ 83.761693] systemd-udevd[2084]: worker [6647] /devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/targ et4:0:0/4:0:0:2/block/sde timeout; kill it [ 83.762358] systemd-udevd[2084]: seq 2030 '/devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/target4: 0:0/4:0:0:2/block/sde' killed [ 83.762698] systemd-udevd[2084]: worker [6716] /devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/targ et4:0:0/4:0:0:3/block/sdf timeout; kill it [ 83.762995] systemd-udevd[2084]: seq 2026 '/devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/target4: 0:0/4:0:0:3/block/sdf' killed [ 83.763395] systemd-udevd[2084]: worker [6721] /devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/targ et4:0:0/4:0:0:1/block/sdd timeout; kill it [ 83.763719] systemd-udevd[2084]: seq 2027 '/devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/target4: 0:0/4:0:0:1/block/sdd' killed [ 83.764032] systemd-udevd[2084]: worker [6746] /devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/targ et4:0:0/4:0:0:0/block/sdc timeout; kill it [ 83.764367] systemd-udevd[2084]: seq 2007 '/devices/pci0000:00/0000:00:12.2/usb1/1-1/1-1:1.0/host4/target4: 0:0/4:0:0:0/block/sdc' killed [ 83.764701] systemd-udevd[2084]: worker [6746] terminated by signal 9 (Killed) [ 83.764989] systemd-udevd[2084]: worker [6647] terminated by signal 9 (Killed) [ 83.765322] systemd-udevd[2084]: worker [6716] terminated by signal 9 (Killed) [ 83.765636] systemd-udevd[2084]: worker [6721] terminated by signal 9 (Killed) [ 85.383518] *** thread awakened [ 85.383528] Command TEST_UNIT_READY (6 bytes) [ 85.383531] bytes: [ 85.383534] 00 00 00 00 00 00 [ 85.383545] Bulk Command S 0x43425355 T 0x179 L 0 F 0 Trg 0 LUN 1 CL 6 [ 85.383548] xfer 31 bytes [ 85.383750] Status code 0; transferred 31/31 [ 85.383758] -- transfer complete [ 85.383761] Bulk command transfer result=0 [ 85.383763] Attempting to get CSW... [ 85.383765] xfer 13 bytes [ 85.383990] Status code 0; transferred 13/13 ... Meanwhile(since I don't know what else to do) I will try another bisect in which I will not consider the time it takes until lsblk can see the device(was plugged in to the USB), but rather I will try to copy a file from the USB to /dev/null and see if the speed is around 6 MB/sec and mark commit as good, if it's like under 200 KB then I'll mark as bad. Ok this is it(unless I'm missing something): 6a38fa0e3c94dfd1394a71a2d47c9c4d47367374 is the first bad commit commit 6a38fa0e3c94dfd1394a71a2d47c9c4d47367374 Author: Jiang Liu <jiang.liu@linux.intel.com> Date: Tue Jun 10 14:16:27 2014 +0800 x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled Release IOAPIC pin associated with PCI device when the PCI device is disabled. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Link: http://lkml.kernel.org/r/1402380987-32577-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> :040000 040000 b05cd14bcdd2d0e5e2858b54ae5673f2b3c540bf 886f8fd82099bd1e0f6be1af3a5fc3e1591fd93f M arch :040000 040000 fbf08f292b3cda11c0f8bf4eaa0fed8672776b08 c35cdf1d6757a94af3b795412eadb45ad3569f04 M drivers Press any key to continue... But this is the bad commit only for the issue of speed transfer (a file read from USB card reader into /dev/null with 6MB/sec is good commit, but with 200 KB/sec is bad commit) There's a second issue with detection which happens when plugging in the USB card reader, and seems to be blocking lsblk for a few seconds. I'm not sure about this one yet, but is likely another (different) commit. https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6a38fa0e3c94dfd1394a71a2d47c9c4d47367374 Here's my git bisect log (could be useful) # bad: [7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9] Linux 3.17-rc1 # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16 git bisect start 'v3.17-rc1' 'v3.16' # good: [ae045e2455429c418a418a3376301a9e5753a0a8] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next git bisect good ae045e2455429c418a418a3376301a9e5753a0a8 # good: [44c916d58b9ef1f2c4aec2def57fa8289c716a60] Merge tag 'cleanup-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc git bisect good 44c916d58b9ef1f2c4aec2def57fa8289c716a60 # good: [023f78b02c729070116fa3a7ebd4107a032d3f5c] Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 git bisect good 023f78b02c729070116fa3a7ebd4107a032d3f5c # good: [d27c0d90184a13e9e9f28c38e84f889a259f6b5f] Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect good d27c0d90184a13e9e9f28c38e84f889a259f6b5f # bad: [ae36e95cf81c98b111b84317adeb358aaffa80e2] Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux git bisect bad ae36e95cf81c98b111b84317adeb358aaffa80e2 # bad: [4a319a490ca59a746b3d36768c0e29ee19832366] Merge branch 'for-3.17/core' of git://git.kernel.dk/linux-block git bisect bad 4a319a490ca59a746b3d36768c0e29ee19832366 # bad: [13b102bf48d39d84d87bee31544f4588502649b6] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc git bisect bad 13b102bf48d39d84d87bee31544f4588502649b6 # good: [d7b830013f59cf586c1cec3caa1ce7156da59a13] x86, irq, ACPI: Use common irqdomain map interface to program IOAPIC pins git bisect good d7b830013f59cf586c1cec3caa1ce7156da59a13 # bad: [6ab1b27c849106647c42b3ea0681a039552e24fa] x86, apic: Replace trampoline physical addresses with defaults git bisect bad 6ab1b27c849106647c42b3ea0681a039552e24fa # bad: [6a38fa0e3c94dfd1394a71a2d47c9c4d47367374] x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled git bisect bad 6a38fa0e3c94dfd1394a71a2d47c9c4d47367374 # good: [795aacf63fee1bfe8b68fd5db4882576c5c0c570] x86, irq, devicetree: Use common irqdomain map interface to program IOAPIC pins git bisect good 795aacf63fee1bfe8b68fd5db4882576c5c0c570 # good: [16ee7b3dcc56be14b9a813612cff2cc2339cdced] x86, irq: Simplify the way to handle ISA IRQ git bisect good 16ee7b3dcc56be14b9a813612cff2cc2339cdced # good: [df334bead7e94772c41745af9f329383067d44ae] x86, irq: Introduce helper functions to release IOAPIC pin git bisect good df334bead7e94772c41745af9f329383067d44ae # first bad commit: [6a38fa0e3c94dfd1394a71a2d47c9c4d47367374] x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled Some notes: this is good for speed: d7b830013f59cf586c1cec3caa1ce7156da59a13 but bad for mount!(did I mean detect here?) first one that's bad for mount I think prev one was good, see git bisect log this2 is good for speed: 795aacf63fee1bfe8b68fd5db4882576c5c0c570 but bad for detect(aka lsblk)! mount is fast tho! good for speed: 16ee7b3dcc56be14b9a813612cff2cc2339cdced bad for lsblk - locks up when about to show sde1 (but after showing sde) I can confirm that reverting that commit(on master branch) returns the speed to normal. 1. made sure I'm on branch master (currently: ff0c57ac70434bc936cb0110eaf033a0a1a62e52 ) 2. recompile kernel, reboot, test file transfer speed, confirmed to be less than 100 KB/sec 3. git revert -n 6a38fa0e3c94dfd1394a71a2d47c9c4d47367374 4. recompile kernel... reboot and test file transfer speed, confirmed to be normal (around 6 MB/sec) https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6a38fa0e3c94dfd1394a71a2d47c9c4d47367374 Thanks for doing this! Now all you need to do is email Thomas Gleixner and Jiang Liu with a link to this bug report. Done. Thanks for your support Jens. One of them couldn't receive the email(for some reason that has to do with my email address maybe): <www.linutronix.de[62.245.132.108]>: Client host rejected: Sender Quota Exceeded 550 Sender verify failed (Note: I sent from a different email address than the aliased one that I used to report this bug) We have found the root cause and the patch passes basic tests. Will send out patch for formal review after more tests. tl;dr: that patch(by Jiang Liu) fixes this bug entirely. long read: I'd like to add, in the spirit of consistency just in case there are any doubts, that this patch solves both issues - in my bisect it seemed that 1. the low read speed and 2. the long(er) time it took the device to get detected(with lsblk after plugging it in) were two different issues which seemingly must've been caused by two different commits(because I was experiencing one without the other, at some point during bisect). fixed as seen here: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=9eabc99a635a77cbf0948ce17d3cbc2b51680d4a Thanks so much! further tested with mainline's master branch which is currently at: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=fd5984d7c8e8e249aca0c515817ab1e7dee1502c and can confirm this to be fixed and working as expected. It's my understanding that this fix will be available in 3.17-rc3 Cheers. |
Created attachment 148321 [details] kernel 3.17-rc2 .config Hi. I have an external USB card reader, I plug it into USB and it takes a while to detect and transfers to/from it are slow: kernel 3.16 max read speed: 12.43 MB/s kernel 3.17-rc2 max read speed: 190 KB/s It seems to be the same for 3.17-rc1 also. I have an internal card reader for this (same system I've tested everything on) Lenovo Z575 laptop and I'm experiencing low speeds with 3.17-rc1 and rc2 kernel. With kernel 3.17-rc2 : It takes 10 seconds to detect sdc and sdc1 with lsblk after plugging in the USB cardreader(with a SD card already plugged in) It takes 21 seconds to mount. With the internal card, if I insert a MSPRO Duo card, it takes 13 seconds to detect the device in lsblk as mspblk0 mspblk0 252:0 0 60.8M 0 tape mounting this one takes over 7 minutes: $ time sudo mount /dev/mspblk0 here real 7m6.653s user 0m0.015s sys 0m0.027s unmounting is instant apparently. this internal card read is: Bus 002 Device 002: ID 0bda:0139 Realtek Semiconductor Corp. RTS5139 Card Reader Controller Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 255 Vendor Specific Class bDeviceSubClass 255 Vendor Specific Subclass bDeviceProtocol 255 Vendor Specific Protocol bMaxPacketSize0 64 idVendor 0x0bda Realtek Semiconductor Corp. idProduct 0x0139 RTS5139 Card Reader Controller bcdDevice 39.60 iManufacturer 1 Generic iProduct 2 USB2.0-CRW iSerial 3 20100201396000000 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 39 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 4 CARD READER bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 500mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 6 bInterfaceProtocol 80 iInterface 5 Bulk-In, Bulk-Out, Interface Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0003 1x 3 bytes bInterval 10 Device Qualifier (for other device speed): bLength 10 bDescriptorType 6 bcdUSB 2.00 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 bNumConfigurations 1 Device Status: 0x0000 (Bus Powered) Even lsusb takes about 2-3 seconds to run. When inserting the external card reader into USB this is the difference: +Bus 001 Device 005: ID 0781:c2c3 SanDisk Corp. +Device Descriptor: + bLength 18 + bDescriptorType 1 + bcdUSB 2.00 + bDeviceClass 0 (Defined at Interface level) + bDeviceSubClass 0 + bDeviceProtocol 0 + bMaxPacketSize0 64 + idVendor 0x0781 SanDisk Corp. + idProduct 0xc2c3 + bcdDevice 94.04 + iManufacturer 0 + iProduct 4 MobileMate MS+ + iSerial 5 00211 + bNumConfigurations 1 + Configuration Descriptor: + bLength 9 + bDescriptorType 2 + wTotalLength 32 + bNumInterfaces 1 + bConfigurationValue 1 + iConfiguration 0 + bmAttributes 0x80 + (Bus Powered) + MaxPower 500mA + Interface Descriptor: + bLength 9 + bDescriptorType 4 + bInterfaceNumber 0 + bAlternateSetting 0 + bNumEndpoints 2 + bInterfaceClass 8 Mass Storage + bInterfaceSubClass 6 SCSI + bInterfaceProtocol 80 Bulk-Only + iInterface 0 + Endpoint Descriptor: + bLength 7 + bDescriptorType 5 + bEndpointAddress 0x81 EP 1 IN + bmAttributes 2 + Transfer Type Bulk + Synch Type None + Usage Type Data + wMaxPacketSize 0x0200 1x 512 bytes + bInterval 0 + Endpoint Descriptor: + bLength 7 + bDescriptorType 5 + bEndpointAddress 0x02 EP 2 OUT + bmAttributes 2 + Transfer Type Bulk + Synch Type None + Usage Type Data + wMaxPacketSize 0x0200 1x 512 bytes + bInterval 0 +Device Qualifier (for other device speed): + bLength 10 + bDescriptorType 6 + bcdUSB 2.00 + bDeviceClass 0 (Defined at Interface level) + bDeviceSubClass 0 + bDeviceProtocol 0 + bMaxPacketSize0 64 + bNumConfigurations 1 +Device Status: 0x0000 + (Bus Powered) + ... Hub Port Status: - Port 1: 0000.0100 power + Port 1: 0000.0503 highspeed power enable connect I'm not sure what to attach, I'm attaching the config for the kernel 3.17-rc2 and pasting here the config differences between the tested 3.16 kernel and the tested 3.17-rc2 kernel: --- kernel-3.16.config 2014-08-11 08:54:05.000000000 +0200 +++ kernel-3.17-rc2.config 2014-08-26 11:38:58.000000000 +0200 +CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y +CONFIG_BUILD_BIN2C=y +CONFIG_LOG_CPU_MAX_BUF_SHIFT=21 +CONFIG_IOSCHED_CFQ=m +CONFIG_CFQ_GROUP_IOSCHED=y +CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y +CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y +CONFIG_HAVE_ACPI_APEI=y +CONFIG_HAVE_ACPI_APEI_NMI=y +CONFIG_PMC_ATOM=y -CONFIG_NETFILTER_ADVANCED=y -CONFIG_NETFILTER_NETLINK=y -CONFIG_NETFILTER_NETLINK_LOG=y +CONFIG_NETFILTER_NETLINK=m -CONFIG_NETFILTER_XTABLES=y -CONFIG_NETFILTER_XT_MARK=m -CONFIG_NETFILTER_XT_TARGET_LOG=m -CONFIG_NETFILTER_XT_TARGET_NFLOG=y -CONFIG_NETFILTER_XT_TARGET_TCPMSS=y +CONFIG_NF_TABLES=m +CONFIG_NFT_EXTHDR=m +CONFIG_NFT_META=m +CONFIG_NFT_RBTREE=m +CONFIG_NFT_HASH=m +CONFIG_NFT_COUNTER=m +CONFIG_NFT_LOG=m +CONFIG_NFT_LIMIT=m +CONFIG_NFT_REJECT=m -CONFIG_IP_NF_IPTABLES=m -CONFIG_IP_NF_FILTER=m -CONFIG_IP_NF_TARGET_REJECT=m -CONFIG_IP_NF_TARGET_ULOG=m -CONFIG_IP_NF_MANGLE=m +CONFIG_NF_TABLES_IPV4=m +CONFIG_NFT_CHAIN_ROUTE_IPV4=m +CONFIG_NFT_REJECT_IPV4=m +CONFIG_NF_TABLES_ARP=m -CONFIG_NET_ACT_IPT=m +CONFIG_ACPI_I2C_OPREGION=y -CONFIG_MEMSTICK_DEBUG=y +CONFIG_RAS=y -CONFIG_DEFAULT_MESSAGE_LOGLEVEL=7 +CONFIG_MESSAGE_LOGLEVEL_DEFAULT=7 -CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y +CONFIG_INTERVAL_TREE_TEST=m +CONFIG_HAVE_KVM_IRQFD=y +CONFIG_INTERVAL_TREE=y +CONFIG_GLOB=y +CONFIG_ARCH_HAS_SG_CHAIN=y If I can help with more info, please let me know. Thanks.