Bug 73241
Description
leonfauster
2014-03-30 20:12:44 UTC
I also have a problem with this device (14e4:16bc / Broadcom BCM57765/57785 SDXC/MMC Card Reader), although I get different error messages in dmesg. On modprobe sdhci-pci: [ 8138.575342] sdhci-pci 0000:02:00.1: SDHCI controller found [14e4:16bc] (rev 21) [ 8138.575618] sdhci-pci 0000:02:00.1: dummy supplies not allowed [ 8138.575621] mmc0: no vqmmc regulator found [ 8138.575625] sdhci-pci 0000:02:00.1: dummy supplies not allowed [ 8138.575626] mmc0: no vmmc regulator found [ 8138.575743] mmc0: SDHCI controller on PCI [0000:02:00.1] using ADMA On inserting an SD card: [ 8175.636139] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 8175.636165] mmc0: new ultra high speed SDR104 SDHC card at address 59b4 [ 8175.636521] mmcblk0: mmc0:59b4 SD32G 29.5 GiB [ 8175.646902] mmc0: Controller never released inhibit bit(s). [ 8185.683584] mmc0: Timeout waiting for hardware interrupt. [ 8185.683702] mmcblk0: error -110 sending status command, retrying [ 8185.685293] mmcblk0: unknown error -5 sending read/write command, card status 0x900 [ 8185.685310] end_request: I/O error, dev mmcblk0, sector 0 [ 8185.685313] Buffer I/O error on device mmcblk0, logical block 0 [ 8185.686148] ldm_validate_partition_table(): Disk read failed. [ 8185.686159] Dev mmcblk0: unable to read RDB block 0 [ 8185.686834] mmcblk0: unable to read partition table I have now tried applying the patch found at https://chromium.googlesource.com/chromiumos/third_party/kernel-next/+/fd1acc54a6b3db4e6503ccc4a9349f28b436031a The result is that my SD card reader now works! The output in dmesg when inserting the card is: [ 77.803116] mmc0: Skipping voltage switch [ 78.804834] mmc0: new high speed SDHC card at address 59b4 [ 78.808642] mmcblk0: mmc0:59b4 SD32G 29.5 GiB [ 78.809606] mmcblk0: p1 It looks like I get a steady read rate of 20 MB/s. I *think* the corresponding read rate I get in OS X is about 30 MB/s, which in any event is what the SD card is advertised as supporting. The underlying cause for this discrepancy might be some magic SD card voodoo or poor Linux drivers -- I have not been able to do a direct head-to-head/apples-to-apples speed test versus OS X. Ah, I was cheering prematurely, sorry. It seems that the mentioned patch did not effect the observed change in behavior, so presumably something has changed in the stock Ubuntu kernel to do this. The currently observed behavior with stock ubuntu (3.13.0-24-generic) as well as my patched kernel (3.13.0-27-generic with patch) is the following: 1. After a reboot, I can plug in the card and stuff is working (automatic mouting, etc) 2. The transfer rate (for reading) seems to be capped at 20 MB/s (I tested with the same rsync command in OS X, and got nearly 50 MB/s. The card is advertised as 30 MB/s) 3. After removing the card, subsequent inserts of the card do not work. I can only use the card one time after a reboot. 4. Similarly as 3., with system sleep. If I let the system sleep and then wake it up, the card reader does not work. ---- Observations from dmesg: When the reader works, I get the following messages in dmesg, which are absent when the reader does not work: "mmc0: Skipping voltage switch". This is followed by "mmc0: new high speed SDHC card at address 59b4". Notice that the card is incorrectly recognized as "high speed SDHC". When the reader fails, I get the following mesages in dmesg, which are absent when the reader does work: "sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock". This is followed by "mmc0: new ultra high speed SDR104 SDHC card at address 59b4". Notice that this time, the card is correctly identified as "ultra high speed SDHC". ---- It looks like it might be the case that the driver fails if and only if the card is "ultra high speed", unless the driver mistakenly believes that the card is merely "high speed". Is it possible to ask the driver to not attempt UHS mode? Well. I've now added "options sdhci debug_quirks2=4" to modprobe.d, and this seems to make it treat the card as "high speed" rather than "ultra high speed", which works, although at a poorer speed than theoretically possible. I have no idea what I am doing, though. (In reply to maghoff from comment #5) With which kernel version is this? (In reply to Nico Schlömer from comment #6) > (In reply to maghoff from comment #5) > > With which kernel version is this? I have reproduced this with both 3.15-rc6, considered as Ubuntu's upstream kernel, from here: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-rc6-utopic/ and the stock Ubuntu 14.04 kernel, which has version number 3.13.0-24-generic. The described workaround does not work for me. In fact, after booting the machine up (Macbook Pro 8,1) I don't see anything from SDHCI on syslog or dmesg. Do I understand correctly that you do _not_ need to apply the ChromeOS patch to get it to work? $ ls /dev/mm* ls: cannot access /dev/mm*: No such file or directory $ uname -a Linux fuji 3.13.0-27-generic #50-Ubuntu SMP Thu May 15 18:06:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux $ lspci | grep SD 02:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) $ dmesg -t | grep -i sdh sdhci: Secure Digital Host Controller Interface driver sdhci: Copyright(c) Pierre Ossman sdhci-pci 0000:02:00.1: SDHCI controller found [14e4:16bc] (rev 10) sdhci-pci 0000:02:00.1: dummy supplies not allowed sdhci-pci 0000:02:00.1: dummy supplies not allowed mmc0: SDHCI controller on PCI [0000:02:00.1] using ADMA $ sudo lspci -s 02:00.1 -vn 02:00.1 0805: 14e4:16bc (rev 10) (prog-if 01) Subsystem: 14e4:0000 Flags: bus master, fast devsel, latency 0, IRQ 17 Memory at a0420000 (64-bit, prefetchable) [size=64K] Capabilities: [48] Power Management version 3 Capabilities: [58] MSI: Enable- Count=1/1 Maskable- 64bit+ Capabilities: [ac] Express Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Capabilities: [150] Power Budgeting <?> Capabilities: [160] Virtual Channel Kernel driver in use: sdhci-pci $ cat /proc/cmdline BOOT_IMAGE=/vmlinuz-3.13.0-27-generic root=UUID=1f3ee539-70dd-4ff4-83aa-b84f0f29918c ro quiet splash You understand correctly that I do not need to apply the ChromeOS patch in order to get the behavior I see. No output in dmesg when you insert an SD card? Hm. Good luck :) By the way, I have made a separate report for the problem as I have experienced it: http://www.mail-archive.com/linux-mmc@vger.kernel.org/msg26725.html (In reply to Nico Schlömer from comment #8) > The described workaround does not work for me. The workaround is described a bit sloppily. Note that you need to rebuild initrd for updates to modprobe.d to stick after a reboot ("sudo update-initramfs -u -k all" in Ubuntu) Alternatively, you can reload the sdhci module with parameters explicitly: sudo rmmod sdhci-pci sdhci sudo modprobe sdhci debug_quirks2=4 sudo modprobe sdhci-pci Comment #1 suggests this might be a duplicate of bug #59311. In that bug, turning off ASPM L1 seems to be a workaround. Can somebody try that here? Figure out the upstream bridge leading to the card reader by doing "lspci -vvxxx" and looking for the "secondary=04" line (for leonfauster@) or the "secondary=02" line (for maghoff@ and Nico). Then find the PCIe capability offset, e.g., if that bridge has "Capabilities: [40] Express (v2) Root Port", the offset is 0x40. Now add 0x10 to the PCIe capability offset, which in this case would give you 0x50. Look at the hex dump of the byte at offset 0x50, e.g., 50: 42 00 11 70 00 b2 14 00 00 00 40 01 00 00 00 00 The low-order two bits of the 0x42 are 10 (binary), which means ASPM L1 is enabled. You can turn it off by doing something like this (use the device and offset for your bridge, and only change the low two bits of the byte): setpci -s 00:1c.2 0x50.B=0x41 Or, just attach the complete "lspci -vvxxx" output here, and I'll figure out the setpci command for you to try. Please attach the complete dmesg log and "lspci -vvxxx" output here for reference. (In reply to Bjorn Helgaas from comment #11) Thanks Bjorn for your comment. I checked `lspci -vvxxx`, and the output is ``` 02:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) (prog-if 01) Subsystem: Broadcom Corporation Device 0000 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 256 bytes Interrupt: pin B routed to IRQ 17 Region 0: Memory at a0420000 (64-bit, prefetchable) [size=64K] Capabilities: [48] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME- Capabilities: [58] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [ac] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+ MaxPayload 128 bytes, MaxReadReq 4096 bytes DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Capabilities: [150 v1] Power Budgeting <?> Capabilities: [160 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff Status: NegoPending- InProgress- Kernel driver in use: sdhci-pci 00: e4 14 bc 16 06 00 10 00 10 01 05 08 40 00 80 00 10: 0c 00 42 a0 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 e4 14 00 00 30: 00 00 00 00 48 00 00 00 00 00 00 00 00 02 00 00 40: 00 00 00 00 00 00 00 00 01 58 03 c8 08 20 00 00 50: 03 00 00 00 00 00 00 00 05 ac 80 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 11 00 05 00 02 00 00 00 22 01 00 00 10 00 02 00 b0: 80 8d 90 05 10 5c 19 00 11 5c 07 00 43 01 11 10 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 1f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 51 78 57 ``` so there is no such line as "secondary=02". Thanks Nico, can you just attach the entire output of "lspci -vvxxx" (for the whole system, not just the card reader)? Use the "Add an attachment" link instead of pasting it in, because it's going to be pretty big :) Created attachment 138221 [details]
lspci -vvxxx output of my MacBookPro8,1
lspci output as requested.
Note that, as opposed to other people with SD card reader problems in this thread, I'm not seeing any dmesg or syslog output when I insert an SD card into the reader.
Hi Nico, your system has this: 00:1c.0 PCI bridge to [bus 02] (PCIe Root Port) 02:00.1 SDXC/MMC card reader There's a PCIe link between these two devices, and ASPM affects the management of that link. You can turn off ASPM L1 on the link with this: setpci -s 00:1c.0 0x50.B=0x41 If that makes a difference, it's a pretty good clue that the problem is related to ASPM. It that doesn't work, you could also try it with "0x50.B=0x40", which would turn off ASPM completely on that link. Please attach a complete dmesg log. Created attachment 138321 [details]
dmesg output
Neither of the `setpci` suggestions changed anything in the behavior. Full dmesg log attached.
unfortunately no positive result Settings: _________ $ lspci -vvxxx -s 00:1c.2 > before $ setpci -s 00:1c.2 0x50.B=0x41 $ lspci -vvxxx -s 00:1c.2 > after $ diff before after 20c20 < LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+ --- > LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- > CommClk+ 52c52 < 50: 42 00 11 70 00 b2 14 00 00 00 40 01 00 00 00 00 --- > 50: 41 00 11 70 00 b2 14 00 00 00 40 01 00 00 00 00 Verify: _______ $ dd if=/dev/zero of=/dev/mmcblk0 bs=1024 count=2M $ tailf /var/log/messages Jun 6 15:32:12 ane kernel: end_request: I/O error, dev mmcblk0, sector 6208 Jun 6 15:32:12 ane kernel: Buffer I/O error on device mmcblk0, logical block 776 Jun 6 15:32:12 ane kernel: lost page write due to I/O error on mmcblk0 Jun 6 15:32:12 ane kernel: end_request: I/O error, dev mmcblk0, sector 6216 Jun 6 15:32:12 ane kernel: Buffer I/O error on device mmcblk0, logical block 777 Jun 6 15:32:12 ane kernel: lost page write due to I/O error on mmcblk0 Jun 6 15:32:12 ane kernel: mmcblk0: error -123 sending status command, retrying Jun 6 15:32:12 ane kernel: mmcblk0: error -123 sending status command, retrying Jun 6 15:32:12 ane kernel: mmcblk0: error -123 sending status command, aborting Jun 6 15:32:12 ane kernel: mmc0: card 59b4 removed Question: __________ How can i find the upstream link? I'm not sure if this is the chain: 00:1c.2 -> 04:00.1 ASPM OFF: __________ $ lspci -vvxxx > lspci-vvxxx_before_setpci40.txt $ setpci -s 00:1c.2 0x50.B=0x40 $ lspci -vvxxx > lspci-vvxxx_after_setpci40.txt Same result! Created attachment 138381 [details]
after booting
Created attachment 138391 [details]
lspci-vvxxx_before_setpci40
Created attachment 138401 [details]
lspci-vvxxx_after_setpci40
okay - i got it, 00:1c.2 (secondary=04) -> 04:00.1 is the right chain $ lspci -tv Both Nico and leonfauster tried disabling ASPM and it made no difference. So I think this problem is different from bug 59311. I don't have any more ideas to pursue from a PCI perspective. I think we need to get the driver maintainer to look at this. I'm not sure they pay attention to bugzilla. Here are some possible contacts from the MAINTAINERS file: BROADCOM BCM281XX/BCM11XXX ARM ARCHITECTURE (wrong chip, I know, but maybe the Broadcom email address goes somewhere useful) L: bcm-kernel-feedback-list@broadcom.com SECURE DIGITAL HOST CONTROLLER INTERFACE (SDHCI) DRIVER M: Chris Ball <chris@printf.net> L: linux-mmc@vger.kernel.org I haven't been able to test the ASPM angle yet, but I thought it worth mentioning that I have made a report of the problem as I see it to linux-mmc: http://thread.gmane.org/gmane.linux.kernel.mmc/27194 It has received no response for the two weeks it's been sitting there :( I don't know if this is to be expected or if I am supposed to nag :) Worth a ping, I'd say. The difference between upstream and Ubuntu is certainly interesting. I don't know how much work you want to go to, but it's possible you could learn something by comparing the upstream vs. Ubuntu drivers, trying the Ubuntu driver in upstream, etc. Obviously even the Ubuntu driver is broken compared to OS X, but maybe there's a hint there. Hi, I'm one of the MMC maintainers. I can't think of anything obvious that would cause this, which probably means it's not going to be fixed until someone with experience debugging kernel drivers gets hold of the hardware somehow. Chris, can you tell whether there's any PCI-related problem that would cause this? It seems like the driver claimed the device and seemed to mostly like what it saw, which means the BARs are probably programmed correctly. The only other PCI-related things I can think of would be DMA mapping issues or interrupt issues. I guess Magnus could compare lspci output (e.g., "lspci -vvxxx") from upstream/Ubuntu/OS X and see if there are any relevant differences. Okay, I tested disabling ASPM with both the upstream and downstream kernels, trying both `sudo setpci -s 00:1c.0 0x50.B=0x41` and `sudo setpci -s 00:1c.0 0x50.B=0x40`. Neither of these commands made any difference with either of the kernels. I will attach output of `dmesg` and `sudo lspci -vvxxx` for both kernels. Created attachment 139311 [details]
Output of lspci -vvxxx running the upstream kernel
Created attachment 139321 [details]
Output of lspci -vvxxx running the downstream kernel
Created attachment 139331 [details]
Output of dmesg running the upstream kernel
Created attachment 139341 [details]
Output of dmesg running the downstream kernel
Huh, in the upstream kernel, we don't turn on bus mastering: --- lspci-vvxxx-upstream.txt 2014-06-11 16:53:29.547799427 -0600 +++ lspci-vvxxx-downstream.txt 2014-06-11 16:53:43.560963858 -0600 02:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 21) (prog-if 01) - Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- + Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- That means the device wouldn't be able to perform DMA, which seems like a problem. Can you add some debug output in pci_set_master() to make sure it gets called and does turn on bus mastering? Uhh... sure. Is pci_set_master() a C function in the kernel? Can you give me the full path to the file? I take it that I will need to compile the kernel from source, right? Yep, pci_set_master() is a C function in drivers/pci/pci.c, and this would involve compiling the kernel from source. If you copy dev_info() examples, that output will show up in dmesg. Sorry, I thought you had already built the upstream kernel, but now I see that you're using Ubuntu's packaged upstream kernel. If you want to try it, maybe these instructions would be helpful: https://wiki.ubuntu.com/KernelTeam/GitKernelBuild Hey, i can reprocude this problem easily here: [ 13.768577] sdhci: Secure Digital Host Controller Interface driver [ 13.768585] sdhci: Copyright(c) Pierre Ossman [ 13.777050] Linux video capture interface: v2.00 [ 13.842686] sdhci-pci 0000:02:00.1: SDHCI controller found [14e4:16bc] (rev 10) [ 13.842960] sdhci-pci 0000:02:00.1: No vmmc regulator found [ 13.842964] sdhci-pci 0000:02:00.1: No vqmmc regulator found [ 13.843152] mmc0: SDHCI controller on PCI [0000:02:00.1] using ADMA this is with upstream Linux version 3.17.0-rc7: 02:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) (prog-if 01) Subsystem: Acer Incorporated [ALI] Device 0647 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- maybe we can start from here, if you need more information, let me know and i'll gladly provide it! Hi Tobias, can you please reproduce the problem and then attach the complete dmesg log and output of "lspci -vvxxx"? The original report was of errors like this: mmc0: Timeout waiting for hardware interrupt. mmcblk0: error -110 transferring data, sector 4096, nr 1024, cmd response 0x900, card status 0xc00 mmc0: Timeout waiting for hardware interrupt. end_request: I/O error, dev mmcblk0, sector 4096 Buffer I/O error on device mmcblk0, logical block 512 Is that what you're seeing? Does it work correctly with any OS on your hardware? Created attachment 152341 [details]
lspci -vvxxx (3.17-rc7)
Created attachment 152351 [details]
dmesg (3.17-rc7)
Yes that is exactly what i see here, see attached dmesg!
Last to note: tested this hardware working with windows now Thanks, Tobias. I don't see anything wrong with your device from a PCI perspective. Unlike maghoff's case mentioned in comment #32, bus mastering is enabled for your device: 02:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) (prog-if 01) Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- There have been cases where the PCIe ASPM feature causes trouble. Previous reporters have not found a connection with *this* bug (see comment #22), but I am still curious because there is an _OSC error in your dmesg log: \_SB_.PCI0:_OSC invalid UUID _OSC request data:1 1f 0 acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM This looks like it might be a BIOS bug (I'm pretty sure Linux does supply a valid UUID when it evaluates _OSC). But maybe there's some issue there that we don't understand. Can you attach an acpidump, too? The "disabling ASPM" message really means OS *control* of ASPM is disabled, which means Linux won't touch the existing configuration. 00:1c.0 and 02:00.1 both show "LnkCtl: ASPM L1 Enabled", so the BIOS must have enabled ASPM before Linux started. Before using the device, you could try this: setpci -s02:00.0 0xbc.l=0x10110140 setpci -s02:00.1 0xbc.l=0x10110140 setpci -s02:00.2 0xbc.l=0x10110140 setpci -s02:00.3 0xbc.l=0x10110140 setpci -s00:1c.0 0x50.l=0x70110040 Collect another "lspci -vvxxx" (it should show "ASPM disabled" for these devices) and then try the card reader again. Created attachment 152821 [details]
acpidump
Created attachment 152831 [details]
lscpi (3.17)
lspci -vvxxx with the following commands done:
setpci -s02:00.0 0xbc.l=0x10110140
setpci -s02:00.1 0xbc.l=0x10110140
setpci -s02:00.2 0xbc.l=0x10110140
setpci -s02:00.3 0xbc.l=0x10110140
setpci -s00:1c.0 0x50.l=0x70110040
After setting these commands:
...
29258.082291] mmc0: Timeout waiting for hardware interrupt.
[29258.082382] end_request: I/O error, dev mmcblk0, sector 2361344
[29258.082386] Buffer I/O error on device mmcblk0p1, logical block 294912
...
so unfortunately this hasn't helped.
Acer TravelMate B113 04:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) 3.18.8-201.fc21.x86_64 Kingston 8Gb Class 10 UHS-I SDHC/SDXC debug_quirks=SDHCI_QUIRK_BROKEN_ADMA debug_quirks2=SDHCI_QUIRK2_NO_1_8_V options sdhci debug_quirks=0x40 debug_quirks2=0x4 works for me. (In reply to vga from comment #43) > Acer TravelMate B113 > > 04:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC > Card Reader (rev 10) > > 3.18.8-201.fc21.x86_64 > > Kingston 8Gb Class 10 UHS-I SDHC/SDXC > > debug_quirks=SDHCI_QUIRK_BROKEN_ADMA debug_quirks2=SDHCI_QUIRK2_NO_1_8_V > > options sdhci debug_quirks=0x40 debug_quirks2=0x4 > > works for me. Hey, nice findings! Tested this and it is working for me as well. Kernel 4.0rc3 with Kingston 1GB microSD SDC and 32GB microSD Class 10 SDHC cards. To add one more data point: I see something possibly related on a MacBookPro10,2. I have a JetDrive Lite which is an ultra-high speed card, working fine when running the machine with OS X. On Linux (tested with the 4.0.0-rc6 kernel) I need to apply a combination of the tricks in this thread to get things working: I need to disable ASPM with setpci -s 00:1c.2 0x50.B=0x41 in order to get anything out of the card reader at all (except for mmc0: Timeout waiting for hardware interrupt in dmesg). However, if I do not add sdhci.debug_quirks=0x40 sdhci.debug_quirks2=0x4 then the card is correctly recognised as an ultra-high speed card but nothing ever gets read (mmc0 complains about not being able to read the partition table). With a combination of the two things work fine, but horribly slow. If there's anything else I can try, let me know. I have noticed in the past that messing with the wireless (which is a Broadcom BCM4331) sometimes enables the machine to mount the card even without the two settings above, but that might have been related to ASPM. I attach an lspci -vvxxx just in case. Created attachment 173251 [details]
lspci -vvxxx on MacBookPro10,2 after applying the sdhci options and setpci
Incidentally, while this problem occurs for me with normal and high speed cards as well (probably because ASPM is involved, not just ultra-high speed), I used to have no problems with those two types of cards on this machine with Linux at some point in the distant past (something like 2 years ago). I have no high hopes that this will help (and no time to run a git bisect going back that far), but just in case this rings a bell with anyone... (In reply to Kasper Peeters from comment #47) > Incidentally, while this problem occurs for me with normal and high speed > cards as well (probably because ASPM is involved, not just ultra-high > speed), I used to have no problems with those two types of cards on this > machine with Linux at some point in the distant past (something like 2 years > ago). I have no high hopes that this will help (and no time to run a git > bisect going back that far), but just in case this rings a bell with > anyone... Try to disable MSI interrupts with kernel boot option "pci=nomsi". Disabling MSI interrupts on the MacBookPro10,2 does not help, unfortunately. # dmidecode System Information Manufacturer: Acer Product Name: AO756 Version: V1.09 # lspci -s 04:00.1 -vn 04:00.1 0805: 14e4:16bc (rev 10) (prog-if 01) Subsystem: 1025:0742 Flags: bus master, fast devsel, latency 0, IRQ 19 Memory at c0400000 (64-bit, prefetchable) [size=64K] Capabilities: [48] Power Management version 3 Capabilities: [58] MSI: Enable- Count=1/1 Maskable- 64bit+ Capabilities: [ac] Express Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Capabilities: [150] Power Budgeting <?> Capabilities: [160] Virtual Channel Kernel driver in use: sdhci-pci Kernel modules: sdhci-pci # lspci | grep 04:00.1 04:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) # uname -a Linux x.localdomain 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux # cat /etc/redhat-release CentOS release 6.6 (Final) # cat /proc/cmdline ro <snip> elevator=noop sdhci.debug_quirks=0x40 intel_pstate=disable acpi_backlight=vendor # cat /etc/modprobe.d/mmc.conf options sdhci debug_quirks=0x40 this config works (slow writes, not setpci command) for me so far now. Confirm bug for kernels 3.13 to 4.2.0.32. Kernels earlier than 3.12 won't see network card and wi-fi, so don't tested. Acer E1-570G, internal broadcom card reader 14e4:16bc. Tried this: rmmod mmc_block rmmod sdhci_pci rmmod sdhci modprobe mmc_block modprobe sdhci debug_quirks=0x40 modprobe sdhci_pci Don't help. modprobe sdhci debug_quirks=SDHCI_QUIRK_BROKEN_ADMA debug_quirks2=SDHCI_QUIRK2_NO_1_8_V says "Unknown option for debug_quirks" and quirks 2 too. Now working on kernel 3.13, may be it matters. Syslog: Mar 7 00:03:51 term kernel: [ 840.514927] sdhci: Secure Digital Host Controller Interface driver Mar 7 00:03:51 term kernel: [ 840.514932] sdhci: Copyright(c) Pierre Ossman Mar 7 00:03:59 term kernel: [ 848.922240] sdhci-pci 0000:02:00.1: SDHCI controller found [14e4:16bc] (rev 1) Mar 7 00:03:59 term kernel: [ 848.922447] mmc0: no vqmmc regulator found Mar 7 00:03:59 term kernel: [ 848.922451] mmc0: no vmmc regulator found Mar 7 00:03:59 term kernel: [ 848.922840] mmc0: SDHCI controller on PCI [0000:02:00.1] using DMA Mar 7 00:04:11 term kernel: [ 860.685496] mmc0: Timeout waiting for hardware interrupt. Mar 7 00:04:11 term kernel: [ 860.687663] mmc0: error -110 whilst initialising SD card black@term:~$ cat /etc/modprobe.d/mmc.conf options sdhci debug_quirks=0x40 debug_quirks2=0x4 and reboot don't help. So, there is no live developer for Broadcom in linux community? ;-) This is bad news. Addon: Tested old live cd's. In kernel 3.2.* driver says: "Invalid iomem size. You may experience problems." and repeatedly "Timeout waiting for hardware interrupt". Card not detected. Very old live-cd's don't see sata dvd-rw, and cannot boot. This bug is still present on my Acer V5-171 with Debian's linux-image-4.5.0-2-amd64 kernel: This happens with different SD cards which I can of course all read just fine if I just reboot to Windows. [ 1.211091] tg3 0000:04:00.0 eth0: Tigon3 [partno(BCM57785) rev 57785100] (PCI Express) MAC address b8:88:e3:0f:3e:60 [ 1.212285] tg3 0000:04:00.0 eth0: attached PHY is 57765 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1]) [ 1.213285] tg3 0000:04:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1] [ 1.214282] tg3 0000:04:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit] [ 1.215650] sdhci-pci 0000:04:00.1: SDHCI controller found [14e4:16bc] (rev 10) [ 1.218393] mmc0: SDHCI controller on PCI [0000:04:00.1] using ADMA 64-bit ... [ 30.654383] mmc0: Timeout waiting for hardware interrupt. [ 40.669967] mmc0: Timeout waiting for hardware interrupt. ... 04:00.0 Ethernet controller [0200]: Broadcom Corporation NetLink BCM57785 Gigabit Ethernet PCIe [14e4:16b5] (rev 10) Subsystem: Acer Incorporated [ALI] NetLink BCM57785 Gigabit Ethernet PCIe [1025:0743] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 18 Region 0: Memory at c0410000 (64-bit, prefetchable) [size=64K] Region 2: Memory at c0420000 (64-bit, prefetchable) [size=64K] Expansion ROM at afb00000 [disabled] [size=2K] Capabilities: <access denied> Kernel driver in use: tg3 Kernel modules: tg3 04:00.1 SD Host controller [0805]: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader [14e4:16bc] (rev 10) (prog-if 01) Subsystem: Acer Incorporated [ALI] BCM57765/57785 SDXC/MMC Card Reader [1025:0743] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin B routed to IRQ 19 Region 0: Memory at c0400000 (64-bit, prefetchable) [size=64K] Capabilities: <access denied> Kernel driver in use: sdhci-pci Kernel modules: sdhci_pci I can confirm the bug is still present on the following HW and SW: Intel BayTrail platform tested with kernel-4.0.4 up till Kernel-4.6 However the target here is eMMC. in both (HS200 and SDR50 timing specs) Putting the eMMC under read/write stress results in the following errors in dmesg: [ 4724.616669] mmc1: Timeout waiting for hardware interrupt. [ 4724.619147] mmcblk0: error -110 transferring data, sector 4196480, nr 8, cmd response 0x900, card status 0xc00 [ 4724.619179] blk_update_request: I/O error, dev mmcblk0, sector 4196480 [ 4724.619208] Buffer I/O error on dev mmcblk0p1, logical block 524304, lost async page write [ 4734.646604] mmc1: Timeout waiting for hardware interrupt. [ 4734.648653] mmc1: Got data interrupt 0x02000000 even though no data operation was in progress. [ 4734.648675] mmc1: Got data interrupt 0x00000002 even though no data operation was in progress. [ 4734.650896] mmcblk0: error -110 sending stop command, original cmd response 0x900, card status 0x400900 [ 4734.650917] mmcblk0: error -110 transferring data, sector 9112, nr 8, cmd response 0x900, card status 0x0 [ 4734.650941] mmc1: tried to reset card [ 4734.650951] mmcblk0: retrying using single block read The errors continue to appear in case eMMC is accessed further, a system reboot gets the eMMC functional again. Following workarounds avoid the problem : 1. using maxcpus=3 as kernel parameter 2. disabling any additional SDHCI controller in BIOS (SD-CARD controller/support in my case) this helped only a particular type of HW. An additional hint which I can share is, with kernel-4.5.0-0.rc7.git0.2 the error appeared within 6 hours of operation, where as with kernel-4.5.0-0.rc7.git2.1 the error appears after 21 hours of operation. Though nothing changed in the sdhci/mmc driver between the two but there are lots of kernel configs which got added to the later. pci=nomsi did not help and disabling the ADMA engine with the help of quirks is a sub optimal solution causing severe performance degradation, thus no applicable in my case. An optimal solution is however expected. I have the same issue here on a Macbook Pro 13" Mid-2012 v9.2. Originally I get issues such as: [ 1197.661949] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 1197.661966] mmc0: new ultra high speed DDR50 SDXC card at address e624 [ 1197.662673] mmcblk0: mmc0:e624 SU64G 59.5 GiB [ 1197.672982] mmc0: Controller never released inhibit bit(s). [ 1207.689035] mmc0: Timeout waiting for hardware interrupt. [ 1207.689316] mmcblk0: error -110 sending status command, retrying [ 1207.739072] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock Workaround mentioned by Maghoff: sudo rmmod sdhci-pci sdhci sudo modprobe sdhci debug_quirks2=4 sudo modprobe sdhci-pci Results in a working card for me albeit at only high speed [ 1746.384709] sdhci: Secure Digital Host Controller Interface driver [ 1746.384714] sdhci: Copyright(c) Pierre Ossman [ 1752.738484] sdhci-pci 0000:01:00.1: SDHCI controller found [14e4:16bc] (rev 10) [ 1752.739905] mmc0: SDHCI controller on PCI [0000:01:00.1] using ADMA 64-bit [ 1758.999576] mmc0: new high speed SDXC card at address e624 [ 1759.000574] mmcblk0: mmc0:e624 SU64G 59.5 GiB [ 1759.001786] mmcblk0: p1 p2 Want to thank maghoff for the workaround as now at least I can transfer stuff reliably without having to use MTP (Horrible) or Samba (Death slow). Not sure if I want to take this further but if I do I will report back. New information. I have been messing around with the kernel source see if I can get further towards a fix for this issue. Definitely something earlier in the UHS init is breaking the tuning. I am unsure. Can someone else confirm that me thinking that the BCM57785 has an on board voltage switching regular and so there is no reason that it wouldn't be capable of v1.8? Some Hardware Tests: --TEST1-- Physically disconnected BCM4331 Wifi Card + UHS-I card + Cat5 not plugged into BCM57785 Ethernet + tg3 (ethernet) module loaded = BAD: Timeouts occur waiting for hardware interrupt. [ 42.784679] mmc0: Timeout waiting for hardware interrupt. [ 52.812088] mmc0: Timeout waiting for hardware interrupt. [ 60.889130] mmc0: Card removed during transfer! [ 60.889138] mmc0: Resetting controller. [ 60.899787] mmc0: error -123 whilst initialising SD card [ 72.120058] mmc0: Timeout waiting for hardware interrupt. [ 82.147469] mmc0: Timeout waiting for hardware interrupt. [ 101.135438] mmc0: Timeout waiting for hardware interrupt. [ 111.162834] mmc0: Timeout waiting for hardware interrupt. [ 121.190203] mmc0: Timeout waiting for hardware interrupt. [ 140.204860] mmc0: Timeout waiting for hardware interrupt. [ 140.213640] mmc0: error -123 whilst initialising SD card --TEST2-- Physically disconnected BCM4331 Wifi Card + UHS-I card + Cat5 plugged into BCM57785 Ethernet <--- Change here + tg3 (ethernet) module loaded = Good: voltage switch gets skipped and card loads as high speed [ 1755.779744] tg3 0000:01:00.0 enp1s0f0: Link is up at 1000 Mbps, full duplex [ 1755.779766] tg3 0000:01:00.0 enp1s0f0: Flow control is on for TX and on for RX [ 1755.779772] tg3 0000:01:00.0 enp1s0f0: EEE is disabled [ 1775.595860] mmc0: Skipping voltage switch [ 1776.610955] mmc0: new high speed SDXC card at address e624 [ 1776.614773] mmcblk0: mmc0:e624 SU64G 59.5 GiB (ro) [ 1776.616032] mmcblk0: p1 p2 --TEST3-- Physically disconnected BCM4331 Wifi Card + UHS-I card + Cat5 plugged into BCM57785 Ethernet + tg3 (ethernet) module unloaded <--- Change here = Bad: Same result as TEST1 --TEST4-- Physically disconnected BCM4331 Wifi Card + UHS-I card + Cat5 plugged into BCM57785 Ethernet + tg3 (ethernet) module unloaded + sdhci debug_quirks2=4 <--- Change here = Bad: Same result as TEST1 --TEST5-- Connected BCM4331 Wifi Card + UHS-I card + Cat5 not plugged into BCM57785 Ethernet + tg3 module loaded + no sdhci debug quriks = Bad: Main bug result [ 177.120002] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock [ 177.120023] mmc0: new ultra high speed DDR50 SDXC card at address e624 [ 177.129048] mmcblk0: mmc0:e624 SU64G 59.5 GiB (ro) [ 177.139349] mmc0: Controller never released inhibit bit(s). [ 187.163886] mmc0: Timeout waiting for hardware interrupt. [ 187.163944] mmcblk0: error -110 sending status command, retrying [ 187.213949] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock --TEST6-- Connected BCM4331 Wifi Card + UHS-I card + Cat5 not plugged into BCM57785 Ethernet + tg3 module loaded + sdhci debug_quirks2=4 = Good: Card gets deteched but with no voltage skips - Similar to TEST2 [ 310.925853] sdhci-pci 0000:01:00.1: SDHCI controller found [14e4:16bc] (rev 10) [ 310.927211] mmc0: SDHCI controller on PCI [0000:01:00.1] using ADMA 64-bit [ 313.079711] mmc0: new high speed SDXC card at address e624 [ 313.081903] mmcblk0: mmc0:e624 SU64G 59.5 GiB (ro) [ 313.082996] mmcblk0: p1 p2 I will also be attaching lspci -v and dmesg for two wifi plugged in/unplugged states that includes logs of me doing the tests above: wifi_unplugged.lspci wifi_unplugged.dmesg wifi_plugged_in.lspci wifi_plugged_in.dmesg Created attachment 218931 [details]
wifi_unplugged lspci -v
Created attachment 218941 [details]
wifi_unplugged dmesg
Created attachment 218951 [details]
wifi_plugged_in lspci -v
Created attachment 218961 [details]
wifi_plugged_in dmesg
Wow! This works for me too. My devices: Notebook Acer V5-171 lspci 04:00.0 Ethernet controller: Broadcom Corporation NetLink BCM57785 Gigabit Ethernet PCIe (rev 10) 04:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) SD Card 2GB Speed class 2 Plugging in an Ethernet cable 'fixes' this mmc0 error for me too! Same devices as Ivan Romanov: Notebook Acer V5-171 03:00.0 Network controller: Qualcomm Atheros AR9462 Wireless Network Adapter (rev 01) 04:00.0 Ethernet controller: Broadcom Corporation NetLink BCM57785 Gigabit Ethernet PCIe (rev 10) 04:00.1 SD Host controller: Broadcom Corporation BCM57765/57785 SDXC/MMC Card Reader (rev 10) But the SD card is a Transcend SDHC class 6 8GB. Note: I also dug up a Linux driver on Broadcom's site. I thought diff-ing it against the current Linux driver might give hints as to what's wrong. I have not had time to look into it though (and I don't know anything about kernel code in the first place so it would be a long shot). https://www.broadcom.com/support/ethernet-nic-netlink-57xx Seems the trick with debug_quirks and Ethernet cable was broken in kernel-4.4.13. I use Fedora 22 system. So Ethernet cable works with kernel-4.4.12-200.fc22.x86_64 package but doesn't work with kernel-4.4.13-200.fc22.x86_64. On Fedora 25 sd-card reader works again $ uname -a Linux localhost.localdomain 4.11.3-200.fc25.x86_64 #1 SMP Thu May 25 19:03:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux $ dmesg # when remove and insert mmc card [ 133.055810] mmc0: card 0007 removed [ 144.610306] nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based firewall rule not found. Use the iptables CT target to attach helpers instead. [ 146.422410] mmc0: Timeout waiting for hardware cmd interrupt. [ 146.422422] sdhci: =========== REGISTER DUMP (mmc0)=========== [ 146.422461] sdhci: Sys addr: 0x00000000 | Version: 0x00001502 [ 146.422469] sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000 [ 146.422476] sdhci: Argument: 0x00000c00 | Trn mode: 0x00000000 [ 146.422484] sdhci: Present: 0x1eff0001 | Host ctl: 0x00000001 [ 146.422491] sdhci: Power: 0x0000000f | Blk gap: 0x00000000 [ 146.422498] sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa07 [ 146.422505] sdhci: Timeout: 0x00000000 | Int stat: 0x00000000 [ 146.422512] sdhci: Int enab: 0x00ff0083 | Sig enab: 0x00ff0083 [ 146.422519] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000 [ 146.422527] sdhci: Caps: 0x176ec8b0 | Caps_1: 0x03002177 [ 146.422534] sdhci: Cmd: 0x0000341a | Max curr: 0x00000000 [ 146.422539] sdhci: Host ctl2: 0x00000000 [ 146.422541] sdhci: =========================================== I don't know it this helps, but after the system is woken up from hibernate, the MMC Card works. After a reboot or suspend, it doesn't. (In reply to Johannes Geiss from comment #65) > I don't know it this helps, but after the system is woken up from hibernate, > the MMC Card works. > > After a reboot or suspend, it doesn't. Johannes, it looks like Konstantin closed this as "obsolete", I assume because the original report is so old and there's no obvious progress. But it sounds like you're still seeing an issue. Can you open a new report from scratch, with current details (platform, kernel version, complete dmesg attached, complete "sudo lspci -vv" output from after reboot (MMC doesn't work) + after wakeup (MMC does work) attached)? (In reply to Bjorn Helgaas from comment #66) > Johannes, it looks like Konstantin closed this as "obsolete", I assume > because the original report is so old and there's no obvious progress. Thanks for the hint. I tried reading SD card on MacBook Pro 9,2, Mac mini 6,1 & 6,2 with 4 cards: 32 GB SanDisk, 256 MB SanDisk, 64 GB SanDisk and 8 GB Transcend. So far, everything seems to be OK after a cold-boot, reboot, suspend and hibernate. One of my previous SD cards was defective. For me this issue is closed. Sounds good, thanks Johannes! If you see problems in the future, please open a new report. |