Bug 88801
Description
Elmar Stellnberger
2014-11-23 18:31:31 UTC
Created attachment 158581 [details]
dmesg
Created attachment 158591 [details]
lspci -tv (when CF+adapter is plugged)
Created attachment 158601 [details]
lspci -vvv (when CF+adapter is plugged)
Created attachment 158611 [details]
lspci -vvxx (when CF+adapter is plugged)
Created attachment 158621 [details]
ADP-CF2-C datasheet
article no is: 775-5001-01, product is: ADP-CF2-C. It supports the PC card standard release 8.0.
OK, this is CardBus, not PCMCIA. PCMCIA ATA adapters follow a standard; CardBus ones don't. You would need to find out what chip is on the device. Almost all of them use the INIC 162x chip (I think it must be the only CardBus ATA adapter manufactured). First, though, your CardBus controller doesn't seem to be seeing the device at all, which is probably bug 43192, and I suspect your machine side, not the card side, of things. What happens if you boot with the card already inserted?
Unfortunately it does not even boot. The last message it gives me with debug ignore_loglevel is "pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0". Not even the SysRq keys work. However, I now remember this card having worked with FreeBSD.
OK, in which case I don't think there's anything that can be done until the CardBus bug is resolved by someone.
Created attachment 229531 [details]
dmesg with 4.8.0-rc1-ARCH-26316-g65ea11e
Now with 4.8.0-rc1-ARCH-26316-g65ea11e many bugs have improved, and also this one - no more crashes on card insertion - now I get a clean backtrace.
Switched from Linux localhost 3.18.0-rc3 #1 SMP Sun Nov 9 08:47:09 CET 2014 i686 i686 i686 GNU/Linux -> 4.8.0-rc1-ARCH-26316-g65ea11e.
See if the below patch helps out at all.
Created attachment 229891 [details]
Test Patch
Created attachment 229931 [details]
... backtrace on insertion - device otherwise working fine
Groovy; that has resolved the issue and I can now exuberantly make use of my SCM PC Card readers; copying, comparing and creating files from and on CF now works fine through this card. Still, a backtrace remains in the dmesg on insertion of the reader (including a CF) ...
Leave the first patch applied. I am sending a second patch to attempt to fix the secondary issue you are currently facing.
Created attachment 229961 [details]
Fix Ordering of Hosting Bus Resources
Created attachment 229981 [details]
dmesg after applying patches 1 + 2
Patch 2 unfortunately did not seem to change much.
See if this patch removes the warning that is still there. Keep patch 1, as that fixes the original issue; however, this one may break the driver, so let me know if that happens for you.
Created attachment 230031 [details]
Null Pointer Test Fix
Created attachment 230151 [details]
dmesg for patches 1 + 3: neither resolved nor broken
Applying patches 1 + 3 neither breaks the driver nor resolves the backtrace. Copying/reading data on CF still works fine.
Remove all patches and see if the below patch causes another trace or fixes the NULL pointer altogether.
Created attachment 230191 [details]
Ninja32 Test Fix for Init Routine
Created attachment 230251 [details]
double backtrace: ninja32-Fix-init-function.patch
Now that has produced a double backtrace; the CF adapter is still confirmed to work (read + write) with this patch.
Keep the patch that works and apply this on top. I am very doubtful that the NULL pointer is the IRQ line passed, but after looking more closely it seems a good idea to try it.
Created attachment 230261 [details]
Irq Line Null Pointer Test Fix
Created attachment 230291 [details]
4.8.0-rc1-ARCH-26316-g65ea11e dmesg without any patch: works
Now, as the previous test results started to appear strange to me, I have retested with 4.8.0-rc1-ARCH-26316-g65ea11e, completely unpatched - and see: the CF adapter was indeed working correctly (however spitting out the usual backtrace). I hope that the previous tests were not in vain because of this, while I wonder how the recognition of /dev/sdb1 could slip through the last time. As the card is sometimes not fully recognized on insertion, I should probably have double-tested it ...
... concerning your latest patch I will now test it on top of a vanilla rc2.
Created attachment 230301 [details]
latest patch applied on vanilla: backtraces as always while rw works
That patch does not seem to change much either.
See if the below patch helps out; if not, we will need to find out what is actually at that address causing issues.
Created attachment 230341 [details]
Pci Null Pointer Fix
Absolutely no difference in the backtrace compared to attachment #26 "latest patch applied on vanilla: backtraces as always while rw works" / insertion-patch5.dmesg with kernel 4.8.0-rc2-ARCH-00351-ga5539a8.
See if the below patch helps out at all.
Created attachment 230531 [details]
Ninja32 Register Fix
Oops; there is something wrong with the patch: ap->ioaddr.altstatus_addr = base + ??; What value would you suggest assigning here? I have seen different values being assigned in different locations, like base + 0x8a, 0xe, 0xca, or ap->ioaddr.ctl_addr like before.
That was incorrect; see if this patch helps out on top of the other patch that fixes your original issue of the card not being readable/writable after boot.
Created attachment 230871 [details]
Pci Flags Test Fix
Ingvar; are you sure that your latest patch does not break the whole kernel IO subsystem? Last time I got stuck in the initrd (not even the keyboard was working). Please test that patch yourself on your system and check whether it boots for you.
Most drivers need to check that; I was just making sure it wasn't this one. I don't have this hardware; otherwise I would not have asked you to test for me and would just have used your traces to debug this on my own hardware. Anyhow, there is a patch below that is going to allow me to see whether the ports the driver is using during port setup are pointing to NULL.
Created attachment 231151 [details]
Test Port on Card is Not Null
Created attachment 231221 [details]
screenshot of the backtrace (your last patch 0009)
Now that definitely worsens things: after giving the backtrace the whole machine hangs without any reaction to the SysRq keys.
That patch was not meant as a fix; it was to see if there was an issue with a NULL pointer dereference due to ports not being set up correctly. Please test the below patch and see if it helps, as the port setup is possibly incorrect.
Created attachment 231271 [details]
Port Setup when allocating host structure test fix
Dropped to initrd, no fs mountable, keyboard not working. Isn't there any better way to debug this? If the backtrace does not give you enough information, isn't there something like kdb that can generate core files / analyse variables etc.? Lastly, I wonder if it is worth all of the work, because the SCM PCMCIA adapters seem to work quite well since 4.8.0-rc1-ARCH-26316-g65ea11e (comment #9); you can mount, read and write CF cards with it. - and I have a couple of other bugs that are pending to be filed ...
... for Xorg and my own programs I had at least written a program that could annotate line numbers for functions in backtraces, which I remember to have been of great help already (if you do not even know the line where it fails, the backtrace may be misleading). Normally you have one, two or in rare cases three attempts to fix and then it needs to work (at least that was the way it was for me when hacking the Modula-3 compiler). I know it was different when I did my last kernel patch (https://bugs.mageia.org/show_bug.cgi?id=19231), but that was apparently a result of my poor possibilities to analyse the C code statically (for instance, following the line of execution between modules) and scarce debugging features. I did not have anything like gdb for the M3 compiler either, but it was easy to find out how the code would execute (probably a feature of that programming language, as I just had a plain text editor). Nonetheless, for the radeon and drm modules I still had no idea how the flow of execution would proceed through them, not even at the end when I came up with a perfectly working patch. Any comments on these issues will be highly appreciated by me, Ingvar (you can email me outside this thread, of course).
First, yes, you can use kgdb if you have a second system and can attach to the debugged system over serial. On the other hand, remove this line: ap->pflags = ATA_PFLAG_PIO32 | ATA_PFLAG_PIO32CHANGE; from ninja32_init_one.
If this doesn't fix it, then close the bug, as we fixed the original issue, and open another one with your other bug reports, as this is a secondary issue.
From the trace and the subsequent poking around Ingvar did, I think the problem is actually a bit different: ata_port_desc checks ATA_PFLAG_INITIALIZING, which should be set by port_alloc and cleared by port_probe. ninja32_init_one sets ap->pflags rather than updating it. Please try removing the other patches and just changing pata_ninja32.c where it says ap->pflags = ATA_PFLAG_PIO32 | ATA_PFLAG_PIO32CHANGE; to ap->pflags |= ATA_PFLAG_PIO32 | ATA_PFLAG_PIO32CHANGE; And if it works, let us know. Alan
That probably is the issue. It seems that it was either ports or flags not being set correctly. After tracing for a bit, that was the conclusion; I was just checking whether removing the line fixes the issue or whether we actually need to bitwise-OR the flags in. On the other hand, I will send a patch for the first issue unless someone wants to write up a patch for it themselves. I'll submit the patch for the flags one once Elmar confirms it's a fix.
Created attachment 231431 [details]
dmesg with the proposed fix: issue resolved / no backtrace
Yes, that does indeed resolve the issue (no backtrace any more).
However, that error message persists: Synchronize Cache(10) failed: ...
The synchronize cache failure is unrelated - it may even be a drive firmware funny, and in the cases where you hot-unplug the card I'd expect to see it in the trace anyway. I'll get the pflags patch upstreamed ASAP.