Bug 7306

Summary: Yenta-socket causes oops on insertion of any PCMCIA card
Product: Drivers
Reporter: Jon Sharp (jon)
Component: PCMCIA
Assignee: Jon Sharp (jon)
Status: CLOSED CODE_FIX
Severity: high
CC: benh, fuzzyTew, linux-pcmcia, osl2008, protasnb
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.26
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: Image of stacktrace
Results of lspcmcia etc. after booting without reserve=0x...
Results of lspcmcia etc. after booting with reserve=0x...
file /etc/pcmcia/config.opts as discussed
dmesg straight after booting without "reserve=" etc.
Remove the ISA hole from the bridge resources

Description Jon Sharp 2006-10-11 07:17:25 UTC
Most recent kernel where this bug did not occur: 2.6.12
Distribution: Gentoo 2006.1
Hardware Environment: Apple PowerBook G3 (Lombard)
Software Environment: Gentoo Linux 2006.1 (GCC 4.1.1)
Problem Description: The machine panics with an "Oops: ...machine check..." message when any PCMCIA card is inserted (or when a card is already in the slot at boot) while the yenta-socket driver is in use, whether built as a module or compiled in.
According to the stacktrace, the failure appears to be caused by the pcmcia_read_cis_mem call.

Steps to reproduce: Compile yenta-socket support either into the kernel or as a module, then insert a PCMCIA card.
Comment 1 Jon Sharp 2006-10-11 07:23:43 UTC
Created attachment 9221 [details]
Image of stacktrace
Comment 2 Jon Sharp 2006-10-16 08:24:21 UTC
Found a workaround: passing the "reserve=0xfd000000,0xffffff" parameter to the kernel via yaboot. (BootX would apparently work on oldworld systems that exhibit this behavior.)
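
As a rough illustration of what this parameter covers, here is a minimal Python sketch. It assumes the kernel's documented "reserve=<base>,<size>[,<base>,<size>...]" format, in which each region runs from base to base + size - 1. The helper name parse_reserve is made up for this example and is not part of any kernel or yaboot tooling.

```python
def parse_reserve(arg):
    """Parse 'reserve=<base>,<size>[,<base>,<size>...]' into (start, end) pairs."""
    _, _, value = arg.partition("=")
    # int(tok, 0) accepts both decimal and 0x-prefixed hexadecimal values.
    numbers = [int(tok, 0) for tok in value.split(",")]
    if len(numbers) % 2:
        raise ValueError("expected base,size pairs")
    # Each region spans base .. base + size - 1 (inclusive).
    return [(base, base + size - 1)
            for base, size in zip(numbers[0::2], numbers[1::2])]


regions = parse_reserve("reserve=0xfd000000,0xffffff")
for start, end in regions:
    print(f"{start:#010x}-{end:#010x}")  # prints 0xfd000000-0xfdfffffe
```

Under that assumption, a size of 0xffffff reserves 0xfd000000 through 0xfdfffffe, which is one byte short of a full 16 MB region.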

References:

http://lists.infradead.org/pipermail/linux-pcmcia/2006-March/003321.html
http://lists.infradead.org/pipermail/linux-pcmcia/2006-April/003487.html
http://lists.infradead.org/pipermail/linux-pcmcia/2006-April/003488.html

I don't know how I missed these posts before filing this bug, but I did.  Still, it seems that perhaps this 
is still a bug that can be resolved?  Maybe it's not in the PCMCIA code, necessarily, but perhaps in the 
ppc32 platform code?  Is there a way to mark this memory range reserved without requiring this 
workaround?

I will leave this bug deferred for now, until someone else can answer these questions.
Comment 3 Natalie Protasevich 2007-07-06 10:11:50 UTC
Any update on this problem so far?
Thanks.
Comment 4 Natalie Protasevich 2008-03-30 19:20:54 UTC
Closing the bug. Please reopen if still a problem and/or someone resumes the work on it.
Comment 5 Richard Meek 2008-07-15 07:24:28 UTC
I'm seeing this with kernel 2.6.25.9 (OpenSuSE 11.0) on a Lombard PowerBook. The fix given above still works but is annoying.

Please can the bug be re-opened?

Cheers
Richard.
Comment 6 Larry Finger 2008-07-15 08:38:42 UTC
We had a similar problem a year ago with bcm43xx and b43legacy. In that case, the cause was attempted reads of memory that was only present on certain versions of the wireless adapter. On x86 platforms, the read returned all ones, but on ppc one got machine checks. Nasty to find, as I only have x86.
Comment 7 Dominik Brodowski 2008-07-15 12:11:47 UTC
What's the output of "lspcmcia -vvv" (without any card inserted), of "lspci -v" and of /proc/iomem (without the "reserve=" boot option)?
Comment 8 Richard Meek 2008-07-16 00:47:26 UTC
Created attachment 16832 [details]
Results of lspcmcia etc. after booting without reserve=0x...
Comment 9 Richard Meek 2008-07-16 00:48:03 UTC
Created attachment 16833 [details]
Results of lspcmcia etc. after booting with reserve=0x...
Comment 10 Richard Meek 2008-07-16 00:56:04 UTC
These files don't look very different but were definitely created with / without the "reserve=0xfd000000,0xffffff" kernel boot option as hinted by the file names. Also the wireless LAN card (Netgear MA401) worked fine in the second case.
Comment 11 Dominik Brodowski 2008-07-16 03:25:17 UTC
Oh. ppc doesn't have /proc/iomem output? (this _is_ a PPC box, right?) Also, this is a strange box as it doesn't define which iomem area to use for PCI devices. Therefore, pcmcia uses /etc/pcmcia/config.opts. So the proper fix would be to comment out the line in /etc/pcmcia/config.opts which reads

include memory 0xfd000000-0xfdffffff

Or what am I missing here?
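
A quick way to check a config.opts file for such a line is sketched below in Python. This is only an illustration: the function name overlapping_includes and the sample file contents are hypothetical, and the parsing assumes the usual "include memory 0x<start>-0x<end>" form rather than the full config.opts grammar.

```python
import re

# Matches "memory 0x<start>-0x<end>" inside an include line.
MEMORY_RANGE = re.compile(r"memory\s+(0x[0-9a-fA-F]+)\s*-\s*(0x[0-9a-fA-F]+)")


def overlapping_includes(config_text, window_start, window_end):
    """Return active 'include' lines whose memory range overlaps the window."""
    hits = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("include"):
            continue  # skips commented-out (#...) and exclude lines
        for lo, hi in MEMORY_RANGE.findall(stripped):
            lo, hi = int(lo, 16), int(hi, 16)
            if lo <= window_end and hi >= window_start:
                hits.append(stripped)
                break
    return hits


# Hypothetical file contents, for illustration only.
sample = """\
# sample config.opts
include port 0x100-0x4ff, port 0x800-0x8ff
include memory 0xc0000-0xfffff
include memory 0xfd000000-0xfdffffff
"""
print(overlapping_includes(sample, 0xfd000000, 0xfdffffff))
```

An empty result would mean the file does not feed the 0xfd000000 window to the driver.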
Comment 12 Richard Meek 2008-07-17 00:01:15 UTC
I see what you're suggesting, but there's no relevant line in /etc/pcmcia/config.opts - file attached.
Comment 13 Richard Meek 2008-07-17 00:03:08 UTC
Created attachment 16856 [details]
file /etc/pcmcia/config.opts as discussed
Comment 14 Richard Meek 2008-07-17 00:07:53 UTC
And yes, it certainly _is_ a PPC box! More completely, it's an Apple PowerBook G3 (Lombard) with a PPC 740 running at 333 MHz.
Comment 15 Dominik Brodowski 2008-07-17 01:09:15 UTC
Strange. Could you post a full dmesg output (without the "reserve=" option), please?
Comment 16 Richard Meek 2008-07-17 01:38:51 UTC
Created attachment 16858 [details]
dmesg straight after booting without "reserve=" etc.
Comment 17 Dominik Brodowski 2008-07-17 01:45:44 UTC
Now this contains interesting information:

pcmcia: parent PCI bridge Memory window: 0xfd000000 - 0xfdffffff

means the PCI host bridge is configured to allow "downstream" devices to use this memory area. However, when the PCMCIA socket tries to do so, you get the machine check. So my question would be to the powerpc folks: why is the PCI host bridge configured this way, even if this memory area is not usable?
Comment 18 Richard Meek 2008-07-29 05:49:44 UTC
No response then from the PPC maintainers? Is anyone there following this thread?

I note this bug is still shown as CLOSED WILL_FIX_LATER too :-(
Comment 19 Benjamin Herrenschmidt 2008-07-30 00:28:26 UTC
So I got paulus's Lombard here. I booted 2.6.27-rc1 (with a couple of totally unrelated fixes to make it build :-), with yenta built-in (couldn't be bothered netbooting with modules), orinoco built-in, prism54 (cardbus) built-in, 8250_cs built-in.

I tried 3 cards: the orinoco card worked just fine, the prism54 card worked just fine, and the 8250 modem didn't work because it apparently failed to match the 8250_cs driver. Maybe it really wants modules there; I haven't looked too closely.

I'll try to investigate more tomorrow. It looks like yenta didn't pick up the fd000000 area on my machine and that worked, but I'll see if I can make it pick it up. It's possible that the firmware is lying and that this range hasn't actually been enabled on the bridge. I'll have a look.

Also make sure your config file doesn't try to feed ranges to the kernel driver. That's obsolete; the driver should be able to find free ones all by itself.
Comment 20 Benjamin Herrenschmidt 2008-07-30 21:40:29 UTC
I think I see the problem. In your dmesg, one can see:

PCI host bridge /pci@80000000 (primary) ranges:
  IO 0x00000000fe000000..0x00000000fe7fffff -> 0x0000000000000000
 MEM 0x00000000fd000000..0x00000000fdffffff -> 0x0000000000000000 
 MEM 0x0000000080000000..0x00000000fcffffff -> 0x0000000080000000 

That means that the region at 0xfd000000 is mapped to 0, not to 0xfd000000, on the PCI side. I.e. it's not a 1:1 mapping, but rather a way to access the ISA memory space on the PCI side.

As such it should -not- have been added to the PCI host bridge resources.

I'll see if I can find why the kernel is getting that wrong.

Cheers,
Ben.
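
To make the check described above concrete, here is a small Python sketch that reads the "ranges" lines quoted from the dmesg and lists MEM windows whose CPU address differs from their PCI address. It is only an illustration of the reasoning and not the logic of the kernel patch; the function name non_identity_mem_ranges is made up for this example.

```python
import re

# Matches lines such as " MEM 0x...fd000000..0x...fdffffff -> 0x...0".
RANGE_LINE = re.compile(
    r"^\s*(IO|MEM)\s+(0x[0-9a-fA-F]+)\.\.(0x[0-9a-fA-F]+)\s*->\s*(0x[0-9a-fA-F]+)"
)


def non_identity_mem_ranges(dmesg_text):
    """Return (cpu_start, cpu_end, pci_start) for MEM ranges not mapped 1:1."""
    found = []
    for line in dmesg_text.splitlines():
        m = RANGE_LINE.match(line)
        if not m or m.group(1) != "MEM":
            continue  # ignore IO windows and unrelated lines
        cpu_start, cpu_end, pci_start = (int(m.group(i), 16) for i in (2, 3, 4))
        if cpu_start != pci_start:
            found.append((cpu_start, cpu_end, pci_start))
    return found


dmesg = """\
PCI host bridge /pci@80000000 (primary) ranges:
  IO 0x00000000fe000000..0x00000000fe7fffff -> 0x0000000000000000
 MEM 0x00000000fd000000..0x00000000fdffffff -> 0x0000000000000000
 MEM 0x0000000080000000..0x00000000fcffffff -> 0x0000000080000000
"""
for cpu_start, cpu_end, pci_start in non_identity_mem_ranges(dmesg):
    print(f"MEM {cpu_start:#x}..{cpu_end:#x} maps to PCI {pci_start:#x} (not 1:1)")
```

On the ranges quoted in this comment, only the 0xfd000000 window is flagged. The 0x80000000 window is mapped 1:1 and is left alone.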
Comment 21 Benjamin Herrenschmidt 2008-07-30 22:01:58 UTC
Created attachment 17035 [details]
Remove the ISA hole from the bridge resources

This patch should fix the problem. Please let me know.
Comment 22 Richard Meek 2008-07-30 23:37:06 UTC
OK will try as soon as I can, but the machine is at home and I'm now at work (for the next 9 hours or so ;=( )

I assume it's good for kernel 2.6.25.9 ? I'm not quite as bleeding-edge as you guys! Also not too experienced in kernel building either, but I have a friend who is. Will report back as soon as I can...

Thanks
Richard.
Comment 23 Richard Meek 2008-08-01 12:03:45 UTC
Ben - Looks good.

dmesg shows "Removing ISA hole at 0x00000000fd000000" otherwise very much as previously. BUT - no OOPS on PCMCIA card insertion, WiFi seems to be OK too.

Will test further over coming days but I think this fixes the problem, many thanks. 

PS - It took over 8 hours to build (remember this is only a PPC 740 with 333 MHz clock and 384 M RAM) and I kept running out of HD space but eventually I got there.

Is this patch likely to make it into the next kernel release? 
Comment 24 Benjamin Herrenschmidt 2008-08-01 15:45:01 UTC
Yes, I'll also send it to -stable for .25 and .26 after I've fully convinced myself that it won't hurt anything.
Comment 25 Natalie Protasevich 2008-08-01 21:24:01 UTC
Ouch, fixing the status...