Bug 10925

Summary: Memory reader not working with kernel >2.6.24 but works with kernel 2.6.23 and 2.6.22!
Product: Drivers Reporter: Ali Abdallah (ali.slackware)
Component: MMC/SDAssignee: Pierre Ossman (pierre-bugzilla)
Status: CLOSED CODE_FIX    
Severity: normal CC: bunk, devzero
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: >= 2.6.24 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9243    
Attachments: scrdebug.patch
enedma.patch

Description Ali Abdallah 2008-06-16 01:58:42 UTC
Latest working kernel version:
------------------------------
My memory stick reader used to work with kernel version 2.6.22.x and 2.6.23.x, but not with 2.6.20.x ( not tested with 2.6.21.x) .

Earliest failing kernel version:
--------------------------------
>=2.6.24

Distribution:
-------------
Slackware 12.1, and Ubuntu 8.04

Hardware Environment:
----------------------
06:00.0 FLASH memory: ENE Technology Inc ENE PCI Memory Stick Card Reader Controller
06:00.1 SD Host controller: ENE Technology Inc ENE PCI SmartMedia / xD Card Reader Controller
06:00.2 FLASH memory: ENE Technology Inc Unknown device 0720
06:00.3 FLASH memory: ENE Technology Inc ENE PCI Secure Digital / MMC Card Reader Controller

Software Environment:

Problem Description:
when inserting a memory card in the reader, nothing comes from /var/log/messages but dmesg shows 
mmc1: unrecognised SCR structure version 1
mmc1: error -22 whilst initialising SD card

searching over the net i didn't find anything useful, but it seems this SCR checking is only implemented in sd.c source with kernel version >=2.6.24, the card works perfectly with 2.6.23.x and with 2.6.22.x
Comment 1 Adrian Bunk 2008-06-16 14:15:56 UTC
*** Bug 10929 has been marked as a duplicate of this bug. ***
Comment 2 Roland Kletzing 2008-06-22 10:34:47 UTC
really looks like a regression. 

http://lkml.org/lkml/2008/2/6/359
>My Toshiba SD-512-T card (Toshiba-specific specs are available as
>TOSHIBA_SD_Card_Specification.pdf) shows the same "SCR structure version 1"
>error message with 2.6.24 on a Motorola E680 (PXAMCI), whereas
>on 2.6.21 it did _NOT_ have it (and all SCR values there are a 1:1 match
>with the values listed in the spec .pdf!).

Ali, do you know about git-bisect ? with that you can find the change in the linux kernel which is causing this issue
Comment 3 Pierre Ossman 2008-07-02 15:06:27 UTC
Before we go bisecting, the first step is enabling MMC_DEBUG in the kernel config and checking if the problem is the same one that was previously reported.

Ali, have you tested this with different cards? Could you check what the model markings on the card(s) is? (it's normally on the back, and usually very faint)
Comment 4 Ali Abdallah 2008-07-03 01:38:36 UTC
(In reply to comment #3)
> Before we go bisecting, the first step is enabling MMC_DEBUG in the kernel
> config and checking if the problem is the same one that was previously
> reported.
> 
> Ali, have you tested this with different cards? Could you check what the
> model
> markings on the card(s) is? (it's normally on the back, and usually very
> faint)
> 

I'll enable the MMC_DEBUG in the kernel and check, actually i've tested this with different cards, but i use mainly kingston cards.
I'll report today about the debug option.
Comment 5 Pierre Ossman 2008-07-03 02:10:17 UTC
Unlucky choice of cards. :)

Kingston cards are relabeled Toshiba ones, and those are the only ones where we've seen this behaviour.
Comment 6 Ali Abdallah 2008-07-03 04:34:06 UTC
(In reply to comment #5)
> Unlucky choice of cards. :)
> 
> Kingston cards are relabeled Toshiba ones, and those are the only ones where
> we've seen this behaviour.
> 

but i don't understand why it was working with the same cards with the kernel 2.6.23 and 22!

anyway here is the output from dmesg

sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000040
mmc1: clock 0Hz busmode 1 powermode 1 cs 0 Vdd 21 width 0 timing 0
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 21 width 0 timing 0
mmc1: clock 128906Hz busmode 1 powermode 2 cs 1 Vdd 21 width 0 timing 0
mmc1: starting CMD0 arg 00000000 flags 000000c0
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD0): 0: 00000000 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 21 width 0 timing 0
mmc1: starting CMD8 arg 000001aa flags 000002f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
mmc1: starting CMD5 arg 00000000 flags 000002e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req failed (CMD5): -110, retrying...
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req failed (CMD5): -110, retrying...
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req failed (CMD5): -110, retrying...
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req done (CMD5): -110: 00000000 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00400120 00000000 00000000 00000000
mmc1: starting CMD41 arg 00000000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 00ff8000 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 20 width 0 timing 0
mmc1: clock 128906Hz busmode 1 powermode 2 cs 1 Vdd 20 width 0 timing 0
mmc1: starting CMD0 arg 00000000 flags 000000c0
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD0): 0: 00000000 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 20 width 0 timing 0
mmc1: starting CMD8 arg 000001aa flags 000002f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000120 00000000 00000000 00000000
mmc1: starting CMD41 arg 40300000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 00ff8000 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000120 00000000 00000000 00000000
mmc1: starting CMD41 arg 40300000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 00ff8000 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000120 00000000 00000000 00000000
mmc1: starting CMD41 arg 40300000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 80ff8000 00000000 00000000 00000000
mmc1: starting CMD2 arg 00000000 flags 00000067
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD2): 0: 02544d53 44303147 32917dbf 31007500
mmc1: starting CMD3 arg 00000000 flags 00000075
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD3): 0: c58b0520 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 2 powermode 2 cs 0 Vdd 20 width 0 timing 0
mmc1: starting CMD9 arg c58b0000 flags 00000007
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD9): 0: 005d0032 5b5983b2 ffffff80 16400000
mmc1: starting CMD7 arg c58b0000 flags 00000015
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD7): 0: 00000700 00000000 00000000 00000000
mmc1: starting CMD55 arg c58b0000 flags 00000095
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000920 00000000 00000000 00000000
mmc1: starting CMD51 arg 00000000 flags 000000b5
mmc1:     blksz 8 blocks 1 flags 00000200 tsac 50 ms nsac 0
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000002
mmc1: req done (CMD51): 0: 00000920 00000000 00000000 00000000
mmc1:     8 bytes transferred: 0
mmc1: unrecognised SCR structure version 1
mmc1: clock 0Hz busmode 1 powermode 0 cs 0 Vdd 0 width 0 timing 0
mmc1: error -22 whilst initialising SD card
mmc1: clock 0Hz busmode 1 powermode 0 cs 0 Vdd 0 width 0 timing 0
Comment 7 Pierre Ossman 2008-07-03 05:47:21 UTC
The kernel didn't really check that the SCR data was sensible until 2.6.24, so bugs like this went unnoticed.
Comment 8 Pierre Ossman 2008-07-03 05:48:42 UTC
Created attachment 16720 [details]
scrdebug.patch

Please apply the attached patch and rerun your test. It prints a bit more debugging that hopefully allows us to get to the bottom of this.
Comment 9 Ali Abdallah 2008-07-03 09:39:20 UTC
Ok i've applied that patch and here dmesg

sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000040
mmc1: clock 0Hz busmode 1 powermode 1 cs 0 Vdd 21 width 0 timing 0
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 21 width 0 timing 0
mmc1: clock 128906Hz busmode 1 powermode 2 cs 1 Vdd 21 width 0 timing 0
mmc1: starting CMD0 arg 00000000 flags 000000c0
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD0): 0: 00000000 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 21 width 0 timing 0
mmc1: starting CMD8 arg 000001aa flags 000002f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
mmc1: starting CMD5 arg 00000000 flags 000002e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req failed (CMD5): -110, retrying...
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req failed (CMD5): -110, retrying...
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req failed (CMD5): -110, retrying...
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00018000
mmc1: req done (CMD5): -110: 00000000 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00400120 00000000 00000000 00000000
mmc1: starting CMD41 arg 00000000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 00ff8000 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 20 width 0 timing 0
mmc1: clock 128906Hz busmode 1 powermode 2 cs 1 Vdd 20 width 0 timing 0
mmc1: starting CMD0 arg 00000000 flags 000000c0
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD0): 0: 00000000 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 1 powermode 2 cs 0 Vdd 20 width 0 timing 0
mmc1: starting CMD8 arg 000001aa flags 000002f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD8): 0: 000001aa 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000120 00000000 00000000 00000000
mmc1: starting CMD41 arg 40300000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 00ff8000 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000120 00000000 00000000 00000000
mmc1: starting CMD41 arg 40300000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 00ff8000 00000000 00000000 00000000
mmc1: starting CMD55 arg 00000000 flags 000000f5
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000120 00000000 00000000 00000000
mmc1: starting CMD41 arg 40300000 flags 000000e1
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD41): 0: 80ff8000 00000000 00000000 00000000
mmc1: starting CMD2 arg 00000000 flags 00000067
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD2): 0: 02544d53 44303147 32917dbf 31007500
mmc1: starting CMD3 arg 00000000 flags 00000075
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD3): 0: c58b0520 00000000 00000000 00000000
mmc1: clock 128906Hz busmode 2 powermode 2 cs 0 Vdd 20 width 0 timing 0
mmc1: starting CMD9 arg c58b0000 flags 00000007
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD9): 0: 005d0032 5b5983b2 ffffff80 16400000
mmc1: starting CMD7 arg c58b0000 flags 00000015
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD7): 0: 00000700 00000000 00000000 00000000
mmc1: starting CMD55 arg c58b0000 flags 00000095
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
mmc1: req done (CMD55): 0: 00000920 00000000 00000000 00000000
Reading SCR e657f960 e657f960
mmc1: starting CMD51 arg 00000000 flags 000000b5
mmc1:     blksz 8 blocks 1 flags 00000200 tsac 50 ms nsac 0
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000001
sdhci [sdhci_irq()]: *** sdhc1:slot0 got interrupt: 0x00000002
mmc1: req done (CMD51): 0: 00000920 00000000 00000000 00000000
mmc1:     8 bytes transferred: 0
Raw data: 0271011802710118
Swapped: 1801710218017102
Decoding 1801710218017102
mmc1: unrecognised SCR structure version 1
mmc1: clock 0Hz busmode 1 powermode 0 cs 0 Vdd 0 width 0 timing 0
mmc1: error -22 whilst initialising SD card
mmc1: clock 0Hz busmode 1 powermode 0 cs 0 Vdd 0 width 0 timing 0
Comment 10 Pierre Ossman 2008-07-03 10:15:38 UTC
That wasn't even close to a valid SCR. :)

It seems that the upper four bytes got overwritten by the lower four. Considering that ENE controllers have been consistently crap, I'd suspect a bug in their DMA engine.

I'm not sure which kernel version you're running, but try loading sdhci with debug_quirks=0x31. That should disable DMA so we can determine if that is indeed the problem.
Comment 11 Ali Abdallah 2008-07-03 10:26:29 UTC
(In reply to comment #10)
> That wasn't even close to a valid SCR. :)
> 
> It seems that the upper four bytes got overwritten by the lower four.
> Considering that ENE controllers have been consistently crap, I'd suspect a
> bug
> in their DMA engine.
> 
> I'm not sure which kernel version you're running, but try loading sdhci with
> debug_quirks=0x31. That should disable DMA so we can determine if that is
> indeed the problem.
>
 
I'm running 2.6.25.5

If i pass that option to sdhci, the card works.
Comment 12 Pierre Ossman 2008-07-03 11:09:39 UTC
That's a real shame. I guess we'll have to disable the DMA engine for the newer ENE controllers as well.

Please contact your laptop vendor and ENE though, as until they start cooperating ENE will keep on being the worst supported controller.
Comment 13 Pierre Ossman 2008-07-03 11:10:29 UTC
Created attachment 16723 [details]
enedma.patch

Please try this patch. It should make sure that DMA is always disabled for ENE controllers.
Comment 14 Ali Abdallah 2008-07-03 11:38:27 UTC
With this patch and without passing any option to sdhci, the card works. 

I'll contact ENE and my laptop vendor, but i don't think "Acer" they will like to hear my sound again, since i was calling them every day to get the refund of the bundled OS wish comes with the laptop ( with success).

Anyway many thanks for your support.
Comment 15 Adrian Bunk 2008-07-04 14:40:13 UTC
fixed by commit bf5b1935d8e42b36a34645788eb261461fe07f2e