Bug 204003

Summary: RTS5229 built in card reader not detected after Linux 5.0.x
Product: Drivers Reporter: josiahspore
Component: MMC/SD Assignee: drivers_mmc-sd
Status: NEW ---    
Severity: high CC: chris2553, david.lindstrom, dougmiles42, i.chryssochoos+kernbug, rogerheflin, scott
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 5.1.x, 5.2.x, 5.3.x Subsystem:
Regression: Yes Bisected commit-id:
Attachments: kernel dmesg
cat of /proc/modules when stuck on initramfs
cpu info
lsmod on Manjaro
lspci -vvv on Manjaro
MMC trace on SD card insertion, kernel 5.0.13
MMC trace on SD card insertion, kernel 5.2.2
attachment-15581-0.html

Description josiahspore 2019-06-27 04:32:51 UTC
Intel NUC NUC6CAYB's card reader is not detected after Linux 5.0.x. Everything works fine on 5.0.21, but I can't use anything newer because my system is installed on an SD card. Any kernel newer than 5.0.x makes my system hang while looking for the root filesystem.
Comment 1 josiahspore 2019-06-27 04:33:19 UTC
Created attachment 283451 [details]
kernel dmesg
Comment 2 josiahspore 2019-06-27 04:34:17 UTC
Created attachment 283453 [details]
cat of /proc/modules when stuck on initramfs
Comment 3 josiahspore 2019-06-27 04:34:45 UTC
Created attachment 283455 [details]
cpu info
Comment 4 josiahspore 2019-06-27 04:35:32 UTC
Created attachment 283457 [details]
lsmod on Manjaro
Comment 5 josiahspore 2019-06-27 04:36:03 UTC
Created attachment 283459 [details]
lspci -vvv on Manjaro
Comment 6 josiahspore 2019-06-27 17:20:33 UTC
Another thing to add is that 5.1-rc1 is where the bug started.
Comment 7 David Lindstrom 2019-07-23 13:41:06 UTC
This bug is also present on the Intel NUC7CJYH board, which uses the RTS5229 PCI Express Card reader.
Kernel >= 5.1 fails to detect any SD card. Nothing shows up when running lsblk.

Also tried 
# udevadm monitor

while inserting a card. Still no output.
Downgrading the kernel to < 5.1 solves the problem.

I traced kernel MMC events on kernels 5.0.13 and 5.2.2 while inserting a card; see the attachments.
According to the trace, all mmc requests return cmd_err=-110 on kernel 5.2.2.
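A quick way to check this against a captured trace is to tally the cmd_err values it contains. The following is only a sketch: it assumes the trace is a plain text file in which each completed request carries a cmd_err=<number> field, as quoted above, and the helper name tally_cmd_err is made up for illustration.

```shell
# Tally cmd_err values in an MMC trace file (illustrative helper).
# Assumes request-completion lines contain a "cmd_err=<number>" field.
# Prints one line per distinct value, most frequent first.
tally_cmd_err() {
    grep -oE 'cmd_err=-?[0-9]+' "$1" | sort | uniq -c | sort -rn
}
```

Called as "tally_cmd_err mmc_trace.txt", a trace in which every request failed the same way would list only cmd_err=-110.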
Comment 8 David Lindstrom 2019-07-23 13:45:53 UTC
Created attachment 283931 [details]
MMC trace on SD card insertion, kernel 5.0.13

SD card is detected correctly on kernel 5.0.13
Obtained using:

echo 1 >  /sys/kernel/debug/tracing/events/mmc/enable 
cat /sys/kernel/debug/tracing/trace_pipe > mmc_trace.txt
Comment 9 David Lindstrom 2019-07-23 13:48:16 UTC
Created attachment 283933 [details]
MMC trace on SD card insertion, kernel 5.2.2

SD card is not detected on kernel 5.2.2
Obtained using:

echo 1 >  /sys/kernel/debug/tracing/events/mmc/enable 
cat /sys/kernel/debug/tracing/trace_pipe > mmc_trace.txt

MMC requests return cmd_err=-110.
Comment 10 josiahspore 2019-10-07 06:25:47 UTC
Did a git bisect and found the culprit.

bede03a579b3b4a036003c4862cc1baa4ddc351f is the first bad commit
commit bede03a579b3b4a036003c4862cc1baa4ddc351f
Author: RickyWu <ricky_wu@realtek.com>
Date:   Tue Feb 19 20:49:58 2019 +0800

    misc: rtsx: Enable OCP for rts522a rts524a rts525a rts5260
    
    this enables and adds OCP function for Realtek A series cardreader chips
    and fixes some OCP flow in rts5260.c
    
    Signed-off-by: RickyWu <ricky_wu@realtek.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

:040000 040000 65bfdc473b7b85cb423ff528309fc92d73eae5b4 1292d8564f678027d0e5c77550e37d696b134b28 M	drivers

Just revert that and you'll be golden.

rts522a,rts524a,rts525a,rts5260
So somehow OCP got enabled for the rts5229, unless "a" means rts522x. I guess they need to make sure it's not enabled for the 5229.
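
For anyone wanting to repeat the bisect, the usual workflow looks roughly like the sketch below. The default endpoints are assumptions taken from this report (v5.0 as a known-good tag, v5.1-rc1 as a known-bad one) and the helper name start_bisect is made up. Each step still needs a kernel build, a boot and a card insertion before it can be marked.

```shell
# Start a bisect between two revisions (illustrative helper; run it from
# the top of a kernel git tree). The defaults are assumptions based on
# this report: v5.0 works, v5.1-rc1 does not.
start_bisect() {
    good_rev=${1:-v5.0}
    bad_rev=${2:-v5.1-rc1}
    git bisect start &&
    git bisect bad "$bad_rev" &&
    git bisect good "$good_rev"
}

# After building and booting each candidate kernel, mark the result:
#   git bisect good    # card reader works
#   git bisect bad     # card reader does not work
# When git reports the first bad commit, clean up with:
#   git bisect reset
```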
Comment 11 josiahspore 2019-10-07 06:32:35 UTC
I wonder if this has anything to do with the reader not working.
[    2.614469] mmc0: cannot verify signal voltage switch
Comment 12 Scott Brown 2020-05-17 18:51:25 UTC
Condition described in this bug is still present on NUC7PJYH with Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader. Now on kernel version 5.6.12.

The title is a bit misleading as the reader is detected in e.g. lspci, just card insertion doesn't cause anything to happen, nothing in dmesg, nothing in lsblk, as described above.

There was apparently commit bede03a579b3b4a036003c4862cc1baa4ddc351f singled out above. Is no one working on reverting this for the RTS5229? Can we get this assigned?
Comment 13 Chris Clayton 2020-06-30 20:05:38 UTC
This bug is a year old now, so here are some more diagnostics, which may also act as a ping for the developer who created the regression to fix it, please.

lspci on Ubuntu 20.04 running on Intel NUC6CAYH. Kernel is 5.4.0-39-generic.

01:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5229 PCI Express Card Reader (rev 01)
        Subsystem: Intel Corporation RTS5229 PCI Express Card Reader
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 255
        Region 0: Memory at 91300000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, NROPrPrP-, LTR-
                         10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS-, TPHComp-, ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [140 v1] Device Serial Number 00-00-00-01-00-4c-e0-00
        Kernel modules: rtsx_pci
Comment 14 Chris Clayton 2020-07-08 08:05:31 UTC
I've reverted* bede03a579b3b4a036003c4862cc1baa4ddc351f and built the 5.4.0-40 kernel**.
The RTS5229 card reader in my NUC6CAYH now works with both SD and MMC cards.

* The commit does not revert cleanly because of some later changes. The manual fix up is not too difficult, however.

** Actually, the build fails but after the modules have been built so the rtsx_pci.ko driver module is available in the build directory. I installed it by hand and then ran depmod manually. The build system is a maze of voodoo that I am still trying to untangle to get a full set of deb files produced. Nevertheless, my testing strongly suggests that commit bede03a579b3b4a036003c4862cc1baa4ddc351f is the culprit.

I've also reverted the commit and built the resultant kernel on my (non-Debian/Ubuntu powered) laptop, which also has a card reader (RTL8411B PCI Express Card Reader) that needs this driver. That continues to work after the reversion.
Comment 15 Chris Clayton 2020-08-02 06:39:13 UTC
I've done more investigating of this regression and I've narrowed it down to rtsx_pci_init_ocp() in drivers/misc/cardreader/rtsx_pcr.c. It is called by rtsx_pci_init_hw() (from the same source file). If I comment out that call, then build and install the resultant kernel, the rts5229 reader and the card inserted in it are detected and work fine. The machine is an Intel NUC Tall Arches Canyon NUC6CAYH Celeron J3455. I can't take this diagnosis any further because, as far as I can see, the only change that the patch identified as the culprit (see comment 10 above) makes to rtsx_pci_init_ocp() shouldn't be executed for the rts5229. So far, I've built and installed 5.7.12 and 5.4.54, the latter of which I am currently running on the machine. I think I tried an unpatched 5.7 series kernel earlier in my investigations and the card reader did not work. To be sure, I'll build an older 5.4 kernel that more closely matches the one currently included in Ubuntu 20.04, and check whether that works when rtsx_pci_init_ocp() is not called.
Comment 16 Chris Clayton 2020-08-03 05:56:01 UTC
Following on from comment 15, I've patched and tested 5.4.44 (which Ubuntu's latest kernel seems to be based on) and the card reader works fine.

Despite saying that I could take this no further, I did push on a little further and I've isolated the code that causes the rts5229 to no longer work since bede03a579b3 was applied. I've also reported the bug to LKML - see https://marc.info/?l=linux-kernel&m=159639774221968

Since reverting bede03a579b3 on recent kernels needs manual intervention, I've devised a simple patch (against linux-5.7.12). It applies correctly to 5.8.0 too but with offsets. If it doesn't apply properly on other kernels  between 5.1-rc1 and 5.7.12, it is simple enough to apply manually.
Comment 17 Yannis 2020-08-22 10:22:16 UTC
I have tested the patch proposed in https://marc.info/?l=linux-kernel&m=159665699504104&w=2 on the Intel NUC7CJYH (RTS5229 is the onboard card reader) and I can now read/write on SD cards again.

This is under Ubuntu 20.04 with a standard Ubuntu 5.4.0-42-generic kernel.

Thank you for the fix Chris.
Comment 18 Chris Clayton 2020-11-01 22:21:48 UTC
Kernels 5.9.3, 5.8.18 and 5.4.74 have been released today and they contain the patch that enables the card reader on the Intel NUC boxes. Hopefully the patch will make its way through to distro kernels in the not too distant future.
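
To compare a running kernel against those three releases, something like the sketch below could be used. It is an illustration only: the helper name has_rtsx_fix is made up, it needs a sort that supports -V (GNU sort does), and it assumes the release string starts with an upstream x.y.z version. Distribution kernels that keep an x.y.0 version and backport fixes separately cannot be judged this way, and series not named above are reported as unknown.

```shell
# Report whether a kernel release string is at or past the stable releases
# named in this comment (5.4.74, 5.8.18, 5.9.3). Illustrative helper.
# Prints "yes", "no", or "unknown" for any other series.
has_rtsx_fix() {
    ver=${1%%-*}    # drop everything from the first "-" (distro suffix)
    case $ver in
        5.4.*) min=5.4.74 ;;
        5.8.*) min=5.8.18 ;;
        5.9.*) min=5.9.3 ;;
        *) echo unknown; return 0 ;;
    esac
    # The fix is present if the minimum version sorts first (or is equal).
    lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
    if [ "$lowest" = "$min" ]; then echo yes; else echo no; fi
}
```

Typical use would be: has_rtsx_fix "$(uname -r)"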
Comment 19 Scott Brown 2020-11-06 22:42:56 UTC
My results in dmesg when inserting an SD card on kernel 5.8.18-200.fc32.x86_64 on a NUC7PJYH:

mmc1: cannot verify signal voltage switch
mmc1: error -110 whilst initialising SD card
mmc1: cannot verify signal voltage switch
mmc1: error -110 whilst initialising SD card
mmc1: cannot verify signal voltage switch
mmc1: error -110 whilst initialising SD card

That's more than what happened before (when dmesg would show nothing) but I guess maybe the complete fix wasn't pulled in to the Fedora build, or some other issue still exists. Not sure.
Comment 20 Scott Brown 2020-11-17 01:18:59 UTC
Same as above on kernel 5.9.8-100.fc32.x86_64.
Comment 21 Chris Clayton 2020-11-17 09:43:48 UTC
Sorry, I can't help to debug that because the card reader works fine on my NUC6CAYH. All I can say is that error 110 is a timeout.

It's interesting that at comment 17 above, Yannis reported that the patch worked on the same hardware platform. However, that was on an Ubuntu 5.4.x kernel. I'd be tempted to raise this issue with the Fedora support people.
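
For reference, the negative numbers in these kernel messages are negated errno values. The sketch below maps a few of them to their symbolic names. The numbers are the generic Linux values used on x86; the helper name errno_name is made up and the list is deliberately short.

```shell
# Map a few Linux errno values, as they appear (negated) in kernel logs,
# to their symbolic names. Illustrative helper, not a complete table.
errno_name() {
    case ${1#-} in    # accept both "110" and "-110"
        5)   echo EIO ;;
        84)  echo EILSEQ ;;
        110) echo ETIMEDOUT ;;
        *)   echo unknown ;;
    esac
}
```

Here "errno_name -110" prints ETIMEDOUT, which matches the timeout reading above.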
Comment 22 Yannis 2020-11-23 15:19:12 UTC
Hi,
Since my comment above, I've moved to Ubuntu 20.10 which uses kernel 5.8. It looks like it's going to be a couple of weeks before Ubuntu reaches 5.8.18 so I can't confirm 100% that the problem is fixed yet.
However, as the kernel code change is the same as Chris' patch I can't see why it wouldn't work.

Scott, Chris' patch just prevents the SD card reader from being powered down and this is working in your case. The error message seems to point to an incompatibility between your card and the reader. I haven't had this problem with the NUC but other, older, machines have trouble reading my high speed/capacity cards.
Comment 23 Scott Brown 2020-11-23 17:43:32 UTC
I see. The two cards I have tested with are a 64GB ADATA MicroSDXC rated UHS-1 Class 10 bought in 2014 and a 32GB ADATA MicroSDXC rated UHS-1 Class 10 V10 A1 bought in 2018, each in a micro-to-full SD adapter. I will see if I can scare up some other cards to test with.
Comment 24 Doug Miles 2023-07-01 13:22:31 UTC
Bumping this, because it seems to have died on the vine and is still an issue in kernel 6.2, as far as I can tell.
Comment 25 Chris Clayton 2023-07-01 15:29:12 UTC
Created attachment 304518 [details]
attachment-15581-0.html

I no longer have the hardware, so can't help any further.

Chris

Comment 26 i.chryssochoos 2023-07-01 20:31:13 UTC
Hi,
I don't have access to 6.2. I'm currently on Ubuntu 22.04 with kernel
5.15.0-76-generic (on the same box as back in 2020) and behavior is
mixed:

I tried a Samsung Evo 256GB (U3, V30 etc.)  and had an error:
[5189171.233955] mmc1: cannot verify signal voltage switch
[5189171.568328] mmc1: error -110 whilst initialising SD card

Switching to a 10+ year old 8GB SDHC got things working:
[5189234.395708] mmc1: new high speed SDHC card at address 0007
[5189234.412143] mmcblk1: mmc1:0007 SD08G 7.50 GiB 

Reinserting the 256GB card worked:
[5189290.073393] mmc1: cannot verify signal voltage switch
[5189290.191013] mmc1: new ultra high speed SDR104 SDXC card at address
59b4
[5189290.191462] mmcblk1: mmc1:59b4 EE4S5 239 GiB 

After a warm reboot (card left in slot during this), it's still
recognized without the voltage switch issue message being printed:
[2.053896] mmc1: new ultra high speed SDR104 SDXC card at address 59b4
[2.057427] mmcblk1: mmc1:59b4 EE4S5 239 GiB 

All in all, the problem of the reader not working at all is (still)
fixed but it looks like some hardware registers or similar are not
properly initialized at boot?  Maybe using an older/simpler/etc. card
brings them to a working state?

Happy to supply more details etc.

Yannis
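
To make mixed behavior like this easier to compare across boots and cards, the kernel log can be reduced to a per-host count of successful and failed card initialisations. The sketch below is only an illustration: it matches just the two message patterns quoted in this thread, and the helper name mmc_init_summary is made up.

```shell
# Summarise MMC card initialisation outcomes per host from kernel log
# text on stdin. Matches only the two message patterns quoted in this
# thread; illustrative helper.
mmc_init_summary() {
    awk '
        match($0, /mmc[0-9]+: /) {
            # Host name without the trailing ": " (e.g. "mmc1").
            host = substr($0, RSTART, RLENGTH - 2)
            seen[host] = 1
            if ($0 ~ /error -?[0-9]+ whilst initialising/)
                fail[host]++
            else if ($0 ~ /: new .* card at address/)
                ok[host]++
        }
        END {
            for (h in seen)
                printf "%s ok=%d failed=%d\n", h, ok[h], fail[h]
        }
    '
}
```

Typical use would be: dmesg | mmc_init_summary (reading dmesg may need root on some systems).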

Comment 27 Roger Heflin 2023-08-10 19:30:20 UTC
(In reply to i.chryssochoos from comment #26)

I can confirm that this is still a bug on 6.4.7 (close to current).

Mine fails with Lexar UHS-II cards. The two cards work fine in slow USB SD card readers. Older SD cards do work on both of these readers. Only the newer cards seem to be affected.

Errors are this:
[160866.008169] mmc0: cannot verify signal voltage switch
[160866.441037] mmc0: error -110 whilst initialising SD card


I have at least 2 different machines that have these cards in them, and this behavior is on both.

I can collect diagnostics for this issue if needed.