Bug 88801

Summary: SCM PCMCIA CompactFlash Adapter seems not to be recognized
Product: Drivers
Reporter: Elmar Stellnberger (estellnb)
Component: PCMCIA
Assignee: linux-pcmcia
Status: RESOLVED CODE_FIX
Severity: normal
CC: alan, ingvarthorvald, szg00000
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.8.0-rc1-ARCH-26316-g65ea11e
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg
lspci -tv (when CF+adapter is plugged)
lspci -vvv (when CF+adapter is plugged)
lspci -vvxx (when CF+adapter is plugged)
ADP-CF2-C datasheet
dmesg with 4.8.0-rc1-ARCH-26316-g65ea11e
Test Patch
... backtrace on insertion - device otherwise working fine
Fix Ordering of Hosting Bus Resources
dmesg after applying patches 1 + 2
Null Pointer Test Fix
dmesg for patches 1 + 3: neither resolved nor broken
Ninja32 Test Fix for Init Routine
double backtrace: ninja32-Fix-init-function.patch
Irq Line Null Pointer Test Fix
4.8.0-rc1-ARCH-26316-g65ea11e dmesg without any patch: works
latest patch applied on vanilla: backtraces as always while rw works
Pci Null Pointer Fix
Ninja32 Register Fix
Pci Flags Test Fix
Test Port on Card is Not Null
screenshot of the backtrace (your last patch 0009)
Port Setup when allocating host structure test fix
dmesg with the proposed fix: issue resolved / no backtrace

Description Elmar Stellnberger 2014-11-23 18:31:31 UTC
When I plug in my SCM PCMCIA CompactFlash adapter together with a working CF card, dmesg shows nothing and the CF disk is not recognized. The SCM PC Card adapter should not only allow access to CF media from cameras but also provide SSD emulation at 100/133 MB/s. You may contact the company for technical details via info@scm-pc-card.de or hotline@scm-pc-card.de, TEL.: +49 8441 788 840. Please also consider the following report: Bug 43192. It may affect multiple PCMCIA cards.
Comment 1 Elmar Stellnberger 2014-11-23 18:33:21 UTC
Created attachment 158581 [details]
dmesg
Comment 2 Elmar Stellnberger 2014-11-23 18:34:03 UTC
Created attachment 158591 [details]
lspci -tv (when CF+adapter is plugged)
Comment 3 Elmar Stellnberger 2014-11-23 18:34:28 UTC
Created attachment 158601 [details]
lspci -vvv (when CF+adapter is plugged)
Comment 4 Elmar Stellnberger 2014-11-23 18:34:57 UTC
Created attachment 158611 [details]
lspci -vvxx (when CF+adapter is plugged)
Comment 5 Elmar Stellnberger 2014-11-23 19:25:05 UTC
Created attachment 158621 [details]
ADP-CF2-C datasheet

The article number is 775-5001-01; the product is ADP-CF2-C. It supports the PC Card standard release 8.0.
Comment 6 Alan 2014-12-08 19:50:22 UTC
OK, this is CardBus, not PCMCIA.

PCMCIA ATA adapters follow a standard, cardbus ones don't. You would need to find out what chip is on the device. Almost all of them use the INIC 162x chip (I think it must be the only cardbus ATA adapter manufactured).

First, though, your cardbus controller doesn't seem to be seeing the device at all, which is probably bug 43192, and I suspect the problem is on the machine side, not the card side.

What happens if you boot with the card already inserted ?
Comment 7 Elmar Stellnberger 2014-12-08 21:02:55 UTC
Unfortunately it does not even boot. The last message it gives me with debug ignore_loglevel is "pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0". Not even the SysRq keys work.
However, I now recall that this card worked with FreeBSD.
Comment 8 Alan 2014-12-08 22:09:26 UTC
Ok in which case I don't think there's anything that can be done until the cardbus bug is resolved by someone.
Comment 9 Elmar Stellnberger 2016-08-21 06:16:17 UTC
Created attachment 229531 [details]
dmesg with 4.8.0-rc1-ARCH-26316-g65ea11e

Now with 4.8.0-rc1-ARCH-26316-g65ea11e many bugs have improved, and also this one: no more crashes on card insertion - now I get a clean backtrace.
Comment 10 Elmar Stellnberger 2016-08-21 06:17:20 UTC
switched from Linux localhost 3.18.0-rc3 #1 SMP Sun Nov 9 08:47:09 CET 2014 i686 i686 i686 GNU/Linux -> 4.8.0-rc1-ARCH-26316-g65ea11e.
Comment 11 [account disabled by the administrator] 2016-08-23 18:18:21 UTC
See if the below patch helps out at all.
Comment 12 [account disabled by the administrator] 2016-08-23 18:18:38 UTC
Created attachment 229891 [details]
Test Patch
Comment 13 Elmar Stellnberger 2016-08-23 19:50:33 UTC
Created attachment 229931 [details]
... backtrace on insertion - device otherwise working fine

Groovy; that has resolved the issue and I can now exuberantly make use of my SCM PC Card readers; copying, comparing and creating files from and on CF now works fine through this card. Still, a backtrace shows up in the dmesg on insertion of the reader (including a CF) ...
Comment 14 [account disabled by the administrator] 2016-08-23 21:50:04 UTC
Leave the first patch applied. I am sending a second patch to attempt to fix the offending secondary issue you are currently facing.
Comment 15 [account disabled by the administrator] 2016-08-23 21:50:32 UTC
Created attachment 229961 [details]
Fix Ordering of Hosting Bus Resources
Comment 16 Elmar Stellnberger 2016-08-24 08:12:10 UTC
Created attachment 229981 [details]
dmesg after applying patches 1 + 2

Unfortunately, patch 2 did not seem to change much.
Comment 17 [account disabled by the administrator] 2016-08-24 13:12:21 UTC
See if the patch removes the warning that is still there. Keep Patch 1, as that fixes the original issue; however, this one may break the driver, so let me know if that happens for you.
Comment 18 [account disabled by the administrator] 2016-08-24 13:12:44 UTC
Created attachment 230031 [details]
Null Pointer Test Fix
Comment 19 Elmar Stellnberger 2016-08-25 09:24:35 UTC
Created attachment 230151 [details]
dmesg for patches 1 + 3: neither resolved nor broken

The application of patches 1 + 3 neither breaks the driver nor would it resolve the backtrace. Copying/Reading data on CF still works fine.
Comment 20 [account disabled by the administrator] 2016-08-25 13:57:57 UTC
Remove all patches and see if the below patch causes another trace or fixes the NULL pointer altogether.
Comment 21 [account disabled by the administrator] 2016-08-25 13:58:29 UTC
Created attachment 230191 [details]
Ninja32 Test Fix for Init Routine
Comment 22 Elmar Stellnberger 2016-08-25 19:13:47 UTC
Created attachment 230251 [details]
double backtrace: ninja32-Fix-init-function.patch

Now that has produced a double backtrace; the CF adapter is still confirmed to work (read+write) with this patch.
Comment 23 [account disabled by the administrator] 2016-08-25 20:38:05 UTC
Keep the patch that works and apply this on top. I am very doubtful that the NULL pointer is the irq line passed, but after looking more closely it seems worth a try.
Comment 24 [account disabled by the administrator] 2016-08-25 20:40:05 UTC
Created attachment 230261 [details]
Irq Line Null Pointer Test Fix
Comment 25 Elmar Stellnberger 2016-08-26 07:56:29 UTC
Created attachment 230291 [details]
4.8.0-rc1-ARCH-26316-g65ea11e dmesg without any patch: works

Since the previous test results started to look strange to me, I retested with 4.8.0-rc1-ARCH-26316-g65ea11e, completely unpatched, and indeed the CF adapter was working correctly (while still spitting out the usual backtrace). I hope the previous tests were not in vain because of this, though I wonder how the recognition of /dev/sdb1 could have slipped past me the last time. As the card is sometimes not fully recognized on insertion, I should probably have tested it twice ...
... concerning your latest patch, I will now test it on top of a vanilla rc2.
Comment 26 Elmar Stellnberger 2016-08-26 09:28:34 UTC
Created attachment 230301 [details]
latest patch applied on vanilla: backtraces as always while rw works

That patch does not seem to change much either.
Comment 27 [account disabled by the administrator] 2016-08-26 16:01:14 UTC
See if the below patch helps out. If not, we will need to find out what is actually at that address causing issues.
Comment 28 [account disabled by the administrator] 2016-08-26 16:01:37 UTC
Created attachment 230341 [details]
Pci Null Pointer Fix
Comment 29 Elmar Stellnberger 2016-08-27 08:34:44 UTC
Absolutely no difference in the backtrace compared with the attachment from comment #26 "latest patch applied on vanilla: backtraces as always while rw works" / insertion-patch5.dmesg with kernel 4.8.0-rc2-ARCH-00351-ga5539a8.
Comment 30 [account disabled by the administrator] 2016-08-27 14:29:44 UTC
See if the below patch helps out at all.
Comment 31 [account disabled by the administrator] 2016-08-27 14:30:11 UTC
Created attachment 230531 [details]
Ninja32 Register Fix
Comment 32 Elmar Stellnberger 2016-08-27 18:20:06 UTC
Oops; there is something wrong with the patch:
ap->ioaddr.altstatus_addr = base + ??;

What value would you suggest to assign here?
I have seen different values being assigned here in different locations like base + 0x8a, 0xe, 0xca or ap->ioaddr.ctl_addr like before.
Comment 33 [account disabled by the administrator] 2016-08-28 02:26:49 UTC
That was incorrect. See if this patch helps out on top of the other patch that fixes your original issue of the card not being readable/writable after boot.
Comment 34 [account disabled by the administrator] 2016-08-28 02:27:13 UTC
Created attachment 230871 [details]
Pci Flags Test Fix
Comment 35 Elmar Stellnberger 2016-08-29 14:34:31 UTC
Ingvar; are you sure that your latest patch does not break the whole kernel IO subsystem? Last time I got stuck in the initrd (not even the keyboard was working). Please test that patch yourself on your system and check whether it boots for you.
Comment 36 [account disabled by the administrator] 2016-08-29 15:50:34 UTC
Most drivers need to check that; I was just making sure it wasn't this one. I don't have this hardware; otherwise I would not have asked you to test for me, and would only have needed your traces to debug this on my own hardware. Anyhow, there is a patch below that is going to let me see whether the ports the driver uses during port setup are pointing to NULL.
Comment 37 [account disabled by the administrator] 2016-08-29 15:51:11 UTC
Created attachment 231151 [details]
Test Port on Card is Not Null
Comment 38 Elmar Stellnberger 2016-08-29 17:04:17 UTC
Created attachment 231221 [details]
screenshot of the backtrace (your last patch 0009)

Now that definitely worsens things: after giving the backtrace the whole machine is hanging without any reaction on SysRq keys.
Comment 39 [account disabled by the administrator] 2016-08-29 17:55:55 UTC
That patch was not meant as a fix; it was to see if there was an issue with a NULL pointer dereference due to ports not being set up correctly. Please test the below patch and see if it helps, since the port setup is possibly incorrect.
Comment 40 [account disabled by the administrator] 2016-08-29 17:56:33 UTC
Created attachment 231271 [details]
Port Setup when allocating host structure test fix
Comment 41 Elmar Stellnberger 2016-08-29 18:23:37 UTC
dropped to initrd, no fs mountable, keyboard not working.
Comment 42 Elmar Stellnberger 2016-08-29 18:36:20 UTC
Isn't there any better way to debug this? If the backtrace does not give you enough information, isn't there something like kdb that can generate core files / analyse variables etc.? Finally, I wonder if it is worth all of the work, because the SCM PCMCIA adapter seems to work quite well since 4.8.0-rc1-ARCH-26316-g65ea11e (comment #9); you can mount, read and write CF cards with it. - and I have a couple of other bugs that are pending to be filed ...
... for Xorg and my own programs I had at least written a program that could annotate functions in backtraces with line numbers, which I remember having been of great help (if you do not even know the line where it fails, the backtrace may be misleading).
Comment 43 Elmar Stellnberger 2016-08-29 18:53:41 UTC
Normally you have one, two or in rare cases three attempts at a fix and then it needs to work (at least that was the way it was for me when hacking the Modula-3 compiler). I know it was different when I did my last kernel patch (https://bugs.mageia.org/show_bug.cgi?id=19231), but that was apparently a result of my limited ability to analyse the C code statically (e.g. to follow the line of execution between modules) and of scarce debugging features. I did not have anything like gdb for the M3 compiler either, but it was easy to find out how the code would execute (probably a feature of that programming language, as I just had a plain text editor). Nonetheless, for the radeon and drm modules I still had no idea how the flow of execution would proceed through them, not even at the end when I came up with a perfectly working patch.
Any comments on these issues will be highly appreciated, Ingvar (you can email me outside this thread, of course).
Comment 44 [account disabled by the administrator] 2016-08-29 19:17:46 UTC
First, yes, you can use kgdb if you have a second system and can connect to the debugged system over serial. On the other hand, remove this line:
ap->pflags = ATA_PFLAG_PIO32 | ATA_PFLAG_PIO32CHANGE;
from ninja32_init_one. If this doesn't fix it, then close the bug, as we fixed the original issue, and open another one along with your other bug reports, as this is a secondary issue.
Comment 45 Alan 2016-08-29 20:31:32 UTC
From the trace and the subsequent poking around Ingvar did, I think the problem is actually a bit different.

ata_port_desc checks ATA_PFLAG_INITIALIZING which should be set by port_alloc and cleared by port_probe.

ninja32_init_one sets ap->pflags rather than updating it.

Please try removing the other patches and just changing pata_ninja32.c where it says

        ap->pflags = ATA_PFLAG_PIO32 | ATA_PFLAG_PIO32CHANGE;

to

        ap->pflags |= ATA_PFLAG_PIO32 | ATA_PFLAG_PIO32CHANGE;

And if it works let us know.

Alan
Comment 46 [account disabled by the administrator] 2016-08-29 21:12:08 UTC
That probably is the issue. It seems that either the ports or the flags were not being set correctly. After tracing for a bit that was my conclusion; I was just checking whether removing the line fixes the issue or whether we need to actually bitwise shift it. On the other hand, I will send a patch for the first issue unless someone wants to write up a patch for it themselves.
Comment 47 Alan 2016-08-30 10:45:46 UTC
I'll submit the patch for the flags one once Elmar confirms it's a fix.
Comment 48 Elmar Stellnberger 2016-08-30 12:57:46 UTC
Created attachment 231431 [details]
dmesg with the proposed fix: issue resolved / no backtrace

Yes, that indeed does resolve the issue (no backtrace any more).
However, this error message persists: Synchronize Cache(10) failed: ...
Comment 49 Alan 2016-08-30 15:03:15 UTC
The "Synchronize Cache failed" message is unrelated - it may even be a drive firmware funny, and in the cases where you hot unplug the card I'd expect to see it in the trace anyway.

I'll get the pflags patch upstreamed ASAP