Bug 12006

Summary: GN-WMKG PCMCIA card (based on Ralink RT2500) doesn't work
Product: Drivers Reporter: Maciej Grela (acct-bugzilla.kernel.org)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: RESOLVED INVALID    
Severity: normal CC: bjorn.helgaas, IvDoorn, linux, ylonen
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27 Subsystem:
Regression: No Bisected commit-id:
Attachments: my dmesg
lspci -vvv
lspci -xxx
dmesg output after plugging in the card
lspci -vvxx after plugging in the card
dmesg output after plugging in the card (uber verbose)
dmesg output after plugging in the card (uber verbose)
dmesg output after plugging in the card (uber verbose)
__pci_read_base tracing patch for GN-WMKG PCMCIA card resource alloc
complain about broken devices
Dmesg with applied broken device detection patch

Description Maciej Grela 2008-11-10 13:26:50 UTC
Latest working kernel version: unknown
Earliest failing kernel version: 2.6.27
Distribution: Gentoo
Hardware Environment: Dell D600 + GN-WMKG (RT2500 based card)
Software Environment: 
Problem Description:

The mentioned card doesn't work after inserting into the slot. The error in dmesg says:

pccard: CardBus card inserted into slot 0
rt2500pci 0000:03:00.0: PCI INT A -> Link[LNKD] -> GSI 11 (level, low) -> IRQ 11
rt2500pci 0000:03:00.0: setting latency timer to 64
rt2x00pci -> rt2x00pci_alloc_reg: Error - Failed to allocate registers.
rt2500pci 0000:03:00.0: PCI INT A disabled
rt2500pci: probe of 0000:03:00.0 failed with error -12

This problem is also mentioned on Ubuntu lanuchpad: https://bugs.launchpad.net/ubuntu/+source/rt2x00/+bug/189711

I've attached my dmesg and lspci information.
Comment 1 Maciej Grela 2008-11-10 13:27:21 UTC
Created attachment 18788 [details]
my dmesg
Comment 2 Maciej Grela 2008-11-10 13:27:42 UTC
Created attachment 18789 [details]
lspci -vvv
Comment 3 Maciej Grela 2008-11-10 13:28:03 UTC
Created attachment 18790 [details]
lspci -xxx
Comment 4 John W. Linville 2008-11-11 11:02:58 UTC
The error you are seeing looks straight-forward.  Either your machine can't allocate memory for some data structures or (more likely) it can't remap the I/O memory for your device.  See the "Region 0" line below -- it looks like your machine is failing to allocate PCI resources for the device.

03:00.0 Network controller: RaLink RT2500 802.11g Cardbus/mini-PCI (rev 01)
	Subsystem: Giga-byte Technology Device 6831
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at <ignored> (32-bit, non-prefetchable) [disabled]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Kernel modules: rt2500pci

I really don't think this is an rt2500pci problem...
Comment 5 Maciej Grela 2008-11-12 13:26:40 UTC
(In reply to comment #4)
> The error you are seeing looks straight-forward.  Either your machine can't
> allocate memory for some data structures or (more likely) it can't remap the
> I/O memory for your device.  See the "Region 0" line below -- it looks like
> your machine is failing to allocate PCI resources for the device.
> 
> 03:00.0 Network controller: RaLink RT2500 802.11g Cardbus/mini-PCI (rev 01)
>         Subsystem: Giga-byte Technology Device 6831
>         Control: I/O- Mem- BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=slow >TAbort-
>         <TAbort-
> <MAbort- >SERR- <PERR- INTx-
>         Interrupt: pin A routed to IRQ 11
>         Region 0: Memory at <ignored> (32-bit, non-prefetchable) [disabled]
>         Capabilities: [40] Power Management version 2
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                 Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>         Kernel modules: rt2500pci
> 
> I really don't think this is an rt2500pci problem...
> 

Does this mean, that the card is broken ?
Comment 6 Juha Ylönen 2008-11-21 05:39:24 UTC
I've got the same card, and the same problem. Quickly googling the issue it seems there are some reports of the same issue, with different card but same chipset. They also report that it some times work, if you boot your machine with the card connected.

Could it be this a normal case of HW failure for this chipset?
Comment 7 Dominik Brodowski 2010-03-06 14:14:40 UTC
There are some resource handling bugfixes merged for 2.6.33 and 2.6.34-rc1. Might be worth to try it out once 2.6.34-rc1 is released?
Comment 8 Maciej Grela 2010-03-14 14:21:12 UTC
Tried on 2.6.34-rc1 - didn't help. Here is the dmesg output after plugging in the card (PCI debugging enabled):

[   68.847174] pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0
[   68.847234] pci 0000:16:00.0: found [1814:0201] class 000280 header type 00
[   68.847474] pci 0000:16:00.0: calling quirk_resource_alignment+0x0/0x19d
[   68.925482] calling  rt2500pci_init+0x0/0x20 [rt2500pci] @ 2049
[   68.925571] rt2500pci 0000:16:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   68.925585] rt2500pci 0000:16:00.0: enabling bus mastering
[   68.925601] rt2500pci 0000:16:00.0: setting latency timer to 64
[   68.925623] rt2500pci 0000:16:00.0: enabling Mem-Wr-Inval
[   68.925646] ------------[ cut here ]------------
[   68.925658] WARNING: at drivers/pci/pci.c:104 pci_ioremap_bar+0x2e/0x66()
[   68.925664] Hardware name: 7665D71
[   68.925668] Modules linked in: rt2500pci(+) rt2x00pci rt2x00lib eeprom_93cx6 sco rfcomm bridge bnep l2cap fbcon tileblit font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor tun fuse 8021q garp stp llc loop kvm_intel kvm btusb bluetooth snd_hda_codec_analog nouveau ttm usb_storage drm_kms_helper iwlagn snd_hda_intel drm snd_hda_codec snd_hwdep fb ohci1394 snd_pcm sr_mod pcmcia ieee1394 e1000e iwlcore snd_timer cdrom i2c_algo_bit intel_agp sg snd soundcore snd_page_alloc cfbcopyarea cfbimgblt cfbfillrect tpm_tis yenta_socket uhci_hcd rsrc_nonstatic psmouse ehci_hcd i2c_i801 tpm tpm_bios
[   68.925801] Pid: 2049, comm: modprobe Not tainted 2.6.34-rc1-t61-debug #3
[   68.925807] Call Trace:
[   68.925821]  [<ffffffff8103dec5>] warn_slowpath_common+0x77/0x8f
[   68.925831]  [<ffffffff8103deec>] warn_slowpath_null+0xf/0x11
[   68.925840]  [<ffffffff811bcabe>] pci_ioremap_bar+0x2e/0x66
[   68.925852]  [<ffffffffa044b871>] rt2x00pci_probe+0x197/0x25a [rt2x00pci]
[   68.925864]  [<ffffffff811bcd03>] local_pci_probe+0x12/0x16
[   68.925873]  [<ffffffff811bdb8e>] pci_device_probe+0x60/0x8f
[   68.925886]  [<ffffffff8121f647>] ? driver_sysfs_add+0x47/0x6c
[   68.925896]  [<ffffffff8121f78b>] driver_probe_device+0xa6/0x137
[   68.925907]  [<ffffffff8121f866>] __driver_attach+0x4a/0x66
[   68.925917]  [<ffffffff8121f81c>] ? __driver_attach+0x0/0x66
[   68.925927]  [<ffffffff8121eddf>] bus_for_each_dev+0x4e/0x83
[   68.925938]  [<ffffffff8121f5fe>] driver_attach+0x19/0x1b
[   68.925948]  [<ffffffff8121f258>] bus_add_driver+0xb1/0x1fd
[   68.925958]  [<ffffffff8121fb24>] driver_register+0x98/0x109
[   68.925969]  [<ffffffff811bdde9>] __pci_register_driver+0x53/0xc3
[   68.925982]  [<ffffffffa0458000>] ? rt2500pci_init+0x0/0x20 [rt2500pci]
[   68.925995]  [<ffffffffa045801e>] rt2500pci_init+0x1e/0x20 [rt2500pci]
[   68.926037]  [<ffffffff810001ef>] do_one_initcall+0x59/0x149
[   68.926048]  [<ffffffff8106c735>] sys_init_module+0xd1/0x22e
[   68.926059]  [<ffffffff81002a6b>] system_call_fastpath+0x16/0x1b
[   68.929620] ---[ end trace 41f70445da978be0 ]---
[   68.929628] rt2x00pci -> rt2x00pci_alloc_reg: Error - Failed to allocate registers.
[   68.929652] rt2500pci 0000:16:00.0: PCI INT A disabled
[   68.929665] rt2500pci: probe of 0000:16:00.0 failed with error -12
[   68.929719] initcall rt2500pci_init+0x0/0x20 [rt2500pci] returned 0 after 4119 usecs

New dmesg and lspci attached.

Best regards,
Maciej Grela
Comment 9 Maciej Grela 2010-03-14 14:22:08 UTC
Created attachment 25511 [details]
dmesg output after plugging in the card
Comment 10 Maciej Grela 2010-03-14 14:22:55 UTC
Created attachment 25512 [details]
lspci -vvxx after plugging in the card
Comment 11 Dominik Brodowski 2010-03-14 14:56:35 UTC
I tend to agree with comment #4 . You may want to try out some kernel boot options to modify the resource handling, such as:

pci=nocrs

pci=cbmemsize=128M

pci=cbmemsize=256M

pci=cbiosize=1K
Comment 12 Maciej Grela 2010-03-14 19:42:58 UTC
Hi,

Thanks for the info. I've tried the following options:

pci=nocrs
pci=nocrs,cbmemsize=128M,cbiosize=1K
pci=nocrs,cbmemsize=256M,cbiosize=1K
pci=cbmemsize=256M,cbiosize=1K
pci=cbmemsize=128M
pci=cbmemsize=256M
pci=cbiosize=1K

Unfortunately, without success. The resource allocation error is the same every time. If you are interested I can post dmesg and lspci for each combination I tried. Can I enable more debug for the resource allocation ? I already have CONFIG_PCI_DEBUG.

Best regards,
Maciej Grela
Comment 13 Dominik Brodowski 2010-03-15 14:27:23 UTC
What about yenta_socket's module option override_bios=1 ?
Comment 14 Maciej Grela 2010-03-15 17:38:03 UTC
No change. dmesg output after modprobe yenta_socket override_bios=1 & insert card:

[  540.109303] calling  yenta_socket_init+0x0/0x20 [yenta_socket] @ 2714
[  540.109401] yenta_cardbus 0000:15:00.0: CardBus bridge found [17aa:20c6]
[  540.109503] yenta_cardbus 0000:15:00.0: CardBus bridge to [bus 16-17]
[  540.109511] yenta_cardbus 0000:15:00.0:   bridge window [io  0x8000-0x81ff]
[  540.109527] yenta_cardbus 0000:15:00.0:   bridge window [io  0x8400-0x85ff]
[  540.109544] yenta_cardbus 0000:15:00.0:   bridge window [mem 0xf4000000-0xf43fffff pref]
[  540.109561] yenta_cardbus 0000:15:00.0:   bridge window [mem 0xf8400000-0xf87fffff]
[  540.232127] yenta_cardbus 0000:15:00.0: ISA IRQ mask 0x0cb8, PCI irq 16
[  540.232138] yenta_cardbus 0000:15:00.0: Socket status: 30000826
[  540.232152] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge I/O window: 0x8000 - 0xbfff
[  540.232161] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xf8100000 - 0xfbffffff
[  540.232170] pcmcia_socket pcmcia_socket0: cs: memory probe 0xf8100000-0xfbffffff: excluding 0xf8100000-0xf88dffff
[  540.232213] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xf4000000 - 0xf7ffffff
[  540.232221] pcmcia_socket pcmcia_socket0: cs: memory probe 0xf4000000-0xf7ffffff: excluding 0xf4000000-0xf7ffffff
[  540.233122] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge I/O window: 0x0 - 0xcf7
[  540.233130] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge I/O window: 0xd00 - 0xffff
[  540.233139] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xa0000 - 0xbffff
[  540.233146] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0a0000-0x0bffff: clean.
[  540.233186] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xd4000 - 0xd7fff
[  540.233194] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d4000-0x0d7fff: clean.
[  540.233206] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xd8000 - 0xdbfff
[  540.233214] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d8000-0x0dbfff: clean.
[  540.233227] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xdc000 - 0xdffff
[  540.233235] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0dc000-0x0dffff: clean.
[  540.233248] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0x7e000000 - 0xfebfffff
[  540.234808] pcmcia_socket pcmcia_socket0: cs: memory probe 0x7e000000-0xfebfffff: excluding 0xcc000000-0xcdffffff 0xd0000000-0xd1ffffff 0xd4000000-0xd6ffffff 0xd8000000-0xd9ffffff 0xdc000000-0xf87fffff 0xfc000000-0xfeffffff
[  540.235327] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xfed40000 - 0xfed4bfff
[  540.235330] pcmcia_socket pcmcia_socket0: cs: memory probe 0xfed40000-0xfed4bfff: excluding 0xfed44000-0xfed4bfff
[  540.235671] initcall yenta_socket_init+0x0/0x20 [yenta_socket] returned 0 after 123391 usecs
[  548.341171] pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0
[  548.341755] rt2500pci 0000:16:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[  548.341778] rt2500pci 0000:16:00.0: setting latency timer to 64
[  548.341817] ------------[ cut here ]------------
[  548.341831] WARNING: at drivers/pci/pci.c:104 pci_ioremap_bar+0x2e/0x66()
[  548.341837] Hardware name: 7665D71
[  548.341841] Modules linked in: yenta_socket rsrc_nonstatic rt2500pci rt2x00pci rt2x00lib eeprom_93cx6 hidp hid sco bridge rfcomm bnep l2cap fbcon tileblit font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor tun fuse 8021q garp stp llc loop kvm_intel kvm snd_hda_codec_analog btusb bluetooth nouveau snd_hda_intel ttm snd_hda_codec drm_kms_helper drm snd_hwdep pcmcia iwlagn snd_pcm ohci1394 e1000e snd_timer fb iwlcore snd ehci_hcd ieee1394 soundcore uhci_hcd sr_mod cdrom i2c_i801 i2c_algo_bit snd_page_alloc tpm_tis psmouse intel_agp tpm tpm_bios sg cfbcopyarea cfbimgblt cfbfillrect [last unloaded: rsrc_nonstatic]
[  548.341991] Pid: 2715, comm: pccardd Tainted: G        W  2.6.34-rc1-t61 #2
[  548.341997] Call Trace:
[  548.342011]  [<ffffffff8103dec5>] warn_slowpath_common+0x77/0x8f
[  548.342021]  [<ffffffff8103deec>] warn_slowpath_null+0xf/0x11
[  548.342030]  [<ffffffff811bc944>] pci_ioremap_bar+0x2e/0x66
[  548.342043]  [<ffffffffa046f871>] rt2x00pci_probe+0x197/0x25a [rt2x00pci]
[  548.342053]  [<ffffffff811bcb8b>] local_pci_probe+0x12/0x16
[  548.342063]  [<ffffffff811bda16>] pci_device_probe+0x60/0x8f
[  548.342075]  [<ffffffff8121f38f>] ? driver_sysfs_add+0x47/0x6c
[  548.342085]  [<ffffffff8121f4d3>] driver_probe_device+0xa6/0x137
[  548.342096]  [<ffffffff8121f5ff>] __device_attach+0x35/0x3a
[  548.342105]  [<ffffffff8121f5ca>] ? __device_attach+0x0/0x3a
[  548.342115]  [<ffffffff8121e87e>] bus_for_each_drv+0x51/0x88
[  548.342125]  [<ffffffff8121f685>] device_attach+0x57/0x77
[  548.342134]  [<ffffffff8121e6ee>] bus_probe_device+0x1f/0x36
[  548.342143]  [<ffffffff8121d3a9>] device_add+0x36a/0x4d5
[  548.342153]  [<ffffffff811bb4dc>] ? pci_set_cacheline_size+0x72/0xc6
[  548.342164]  [<ffffffff811b8a3b>] pci_bus_add_device+0x16/0x3c
[  548.342173]  [<ffffffff811b8a8a>] pci_bus_add_devices+0x29/0x121
[  548.342184]  [<ffffffff8135f244>] cb_alloc+0xc4/0xdf
[  548.342194]  [<ffffffff812507bd>] socket_insert+0xc2/0x104
[  548.342204]  [<ffffffff81250ad6>] pccardd+0x1d4/0x366
[  548.342213]  [<ffffffff81250902>] ? pccardd+0x0/0x366
[  548.342223]  [<ffffffff81056677>] kthread+0x7a/0x82
[  548.342233]  [<ffffffff81003814>] kernel_thread_helper+0x4/0x10
[  548.342243]  [<ffffffff810565fd>] ? kthread+0x0/0x82
[  548.342251]  [<ffffffff81003810>] ? kernel_thread_helper+0x0/0x10
[  548.342292] ---[ end trace 002fb76d67ebaa6d ]---
[  548.342299] rt2x00pci -> rt2x00pci_alloc_reg: Error - Failed to allocate registers.
[  548.342324] rt2500pci 0000:16:00.0: PCI INT A disabled
[  548.342336] rt2500pci: probe of 0000:16:00.0 failed with error -12

Best regards,
Maciej Grela
Comment 15 Maciej Grela 2010-03-15 17:43:12 UTC
Hi,

BTW, the initial post mentioned that I am testing the card on a Dell D600. This is no longer the case and since comment #7 all testing is done on a Lenovo T61. Just so you know.

Br,
Maciej Grela
Comment 16 Dominik Brodowski 2010-03-15 18:26:36 UTC
Björn, do you -- as a PCI resource expert -- have any idea on what to do?
Comment 17 Dominik Brodowski 2010-03-15 18:28:00 UTC
Maciej: It'd be much better if you posted the WARNING for a non-tainted kernel...
Comment 18 Maciej Grela 2010-03-15 19:00:12 UTC
Sorry, the taint flag was raised by my previous attempt to plug in the card. Here is a non-tainted warning from a freshly booted kernel with a modprobed yenta_socket override_bios=1. It also has CONFIG_PCI_DEBUG enabled (-debug in kernel label):

[  137.937879] yenta_cardbus 0000:15:00.0: CardBus bridge found [17aa:20c6]
[  137.937976] yenta_cardbus 0000:15:00.0: CardBus bridge to [bus 16-17]
[  137.937984] yenta_cardbus 0000:15:00.0:   bridge window [io  0x8000-0x81ff]
[  137.938001] yenta_cardbus 0000:15:00.0:   bridge window [io  0x8400-0x85ff]
[  137.938020] yenta_cardbus 0000:15:00.0:   bridge window [mem 0xf4000000-0xf43fffff pref]
[  137.938037] yenta_cardbus 0000:15:00.0:   bridge window [mem 0xf8400000-0xf87fffff]
[  138.061112] yenta_cardbus 0000:15:00.0: ISA IRQ mask 0x0cb8, PCI irq 16
[  138.061123] yenta_cardbus 0000:15:00.0: Socket status: 30000006
[  138.061136] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge I/O window: 0x8000 - 0xbfff
[  138.061144] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xf8100000 - 0xfbffffff
[  138.061153] pcmcia_socket pcmcia_socket0: cs: memory probe 0xf8100000-0xfbffffff: excluding 0xf8100000-0xf88dffff
[  138.061195] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xf4000000 - 0xf7ffffff
[  138.061202] pcmcia_socket pcmcia_socket0: cs: memory probe 0xf4000000-0xf7ffffff: excluding 0xf4000000-0xf7ffffff
[  138.062114] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge I/O window: 0x0 - 0xcf7
[  138.062122] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge I/O window: 0xd00 - 0xffff
[  138.062129] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xa0000 - 0xbffff
[  138.062137] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0a0000-0x0bffff: clean.
[  138.062177] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xd4000 - 0xd7fff
[  138.062184] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d4000-0x0d7fff: clean.
[  138.062196] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xd8000 - 0xdbfff
[  138.062204] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d8000-0x0dbfff: clean.
[  138.062217] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xdc000 - 0xdffff
[  138.062224] pcmcia_socket pcmcia_socket0: cs: memory probe 0x0dc000-0x0dffff: clean.
[  138.062237] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0x7e000000 - 0xfebfffff
[  138.062245] pcmcia_socket pcmcia_socket0: cs: memory probe 0x7e000000-0xfebfffff: excluding 0xcc000000-0xcdffffff 0xd0000000-0xd1ffffff 0xd4000000-0xd6ffffff 0xd8000000-0xd9ffffff 0xdc000000-0xf87fffff 0xfc000000-0xfeffffff
[  138.064554] yenta_cardbus 0000:15:00.0: pcmcia: parent PCI bridge Memory window: 0xfed40000 - 0xfed4bfff
[  138.064562] pcmcia_socket pcmcia_socket0: cs: memory probe 0xfed40000-0xfed4bfff: excluding 0xfed44000-0xfed4bfff
[  140.005185] pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0
[  140.005244] pci 0000:16:00.0: found [1814:0201] class 000280 header type 00
[  140.005482] pci 0000:16:00.0: calling quirk_resource_alignment+0x0/0x19d
[  140.085441] rt2500pci 0000:16:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[  140.085455] rt2500pci 0000:16:00.0: enabling bus mastering
[  140.085472] rt2500pci 0000:16:00.0: setting latency timer to 64
[  140.085493] rt2500pci 0000:16:00.0: enabling Mem-Wr-Inval
[  140.085515] ------------[ cut here ]------------
[  140.085528] WARNING: at drivers/pci/pci.c:104 pci_ioremap_bar+0x2e/0x66()
[  140.085534] Hardware name: 7665D71
[  140.085538] Modules linked in: rt2500pci(+) rt2x00pci rt2x00lib eeprom_93cx6 yenta_socket rsrc_nonstatic hidp hid sco bridge bnep rfcomm l2cap fbcon tileblit font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor tun fuse 8021q garp stp llc loop kvm_intel kvm snd_hda_codec_analog nouveau ttm snd_hda_intel drm_kms_helper snd_hda_codec usb_storage btusb bluetooth drm snd_hwdep snd_pcm fb sr_mod cdrom pcmcia snd_timer e1000e iwlagn i2c_algo_bit snd ohci1394 cfbcopyarea cfbimgblt sg psmouse soundcore snd_page_alloc intel_agp tpm_tis tpm tpm_bios ehci_hcd cfbfillrect iwlcore uhci_hcd ieee1394 i2c_i801 [last unloaded: rsrc_nonstatic]
[  140.085674] Pid: 2600, comm: modprobe Not tainted 2.6.34-rc1-t61-debug #4
[  140.085679] Call Trace:
[  140.085693]  [<ffffffff8103dec5>] warn_slowpath_common+0x77/0x8f
[  140.085704]  [<ffffffff8103deec>] warn_slowpath_null+0xf/0x11
[  140.085713]  [<ffffffff811bce8e>] pci_ioremap_bar+0x2e/0x66
[  140.085725]  [<ffffffffa049e871>] rt2x00pci_probe+0x197/0x25a [rt2x00pci]
[  140.085737]  [<ffffffff811bd0d3>] local_pci_probe+0x12/0x16
[  140.085746]  [<ffffffff811bdf5e>] pci_device_probe+0x60/0x8f
[  140.085758]  [<ffffffff8121fa17>] ? driver_sysfs_add+0x47/0x6c
[  140.085769]  [<ffffffff8121fb5b>] driver_probe_device+0xa6/0x137
[  140.085779]  [<ffffffff8121fc36>] __driver_attach+0x4a/0x66
[  140.085789]  [<ffffffff8121fbec>] ? __driver_attach+0x0/0x66
[  140.085798]  [<ffffffff8121f1af>] bus_for_each_dev+0x4e/0x83
[  140.085808]  [<ffffffff8121f9ce>] driver_attach+0x19/0x1b
[  140.085818]  [<ffffffff8121f628>] bus_add_driver+0xb1/0x1fd
[  140.085827]  [<ffffffff8121fef4>] driver_register+0x98/0x109
[  140.085837]  [<ffffffff811be1b9>] __pci_register_driver+0x53/0xc3
[  140.085850]  [<ffffffffa04ab000>] ? rt2500pci_init+0x0/0x20 [rt2500pci]
[  140.085863]  [<ffffffffa04ab01e>] rt2500pci_init+0x1e/0x20 [rt2500pci]
[  140.085875]  [<ffffffff81000203>] do_one_initcall+0x6d/0x175
[  140.085885]  [<ffffffff8106c735>] sys_init_module+0xd1/0x22e
[  140.085895]  [<ffffffff81002a6b>] system_call_fastpath+0x16/0x1b
[  140.085903] ---[ end trace e3cfc819bf0c52ce ]---
[  140.085909] rt2x00pci -> rt2x00pci_alloc_reg: Error - Failed to allocate registers.
[  140.085931] rt2500pci 0000:16:00.0: PCI INT A disabled
[  140.085944] rt2500pci: probe of 0000:16:00.0 failed with error -12
Comment 19 Bjorn Helgaas 2010-03-15 19:41:49 UTC
There's only one pci_ioremap_bar() call in the rt2x00 driver, and it's for BAR 0, which lspci claims is a memory BAR.  But this:

> WARNING: at drivers/pci/pci.c:104 pci_ioremap_bar+0x2e/0x66()

means IORESOURCE_MEM is not set in the resource flags.  We should be setting that in __pci_read_base(), but I don't see the debug output I would expect from that function.  Can you instrument it and see what we're reading from the BARs?
Comment 20 Maciej Grela 2010-03-15 21:41:05 UTC
I haven't gone this deep yet into tracing the kernel - can you give me some pointers what to use to instrument this call ? 
The only thing I could think of was to add some printks to the __pci_get_base with the following patch:

diff -urp /usr/src/linux-2.6.34-rc1/drivers/pci/probe.c /usr/src/linux-2.6.34-rc1.mod/drivers/pci/probe.c
--- /usr/src/linux-2.6.34-rc1/drivers/pci/probe.c	2010-03-15 22:26:07.000000000 +0100
+++ /usr/src/linux-2.6.34-rc1.mod/drivers/pci/probe.c	2010-03-15 22:19:00.000000000 +0100
@@ -169,6 +169,10 @@ int __pci_read_base(struct pci_dev *dev,
 
 	res->name = pci_name(dev);
 
+	dev_printk(KERN_DEBUG, &dev->dev, 
+		"__pci_read_base called: dev=%p, pci_bar_type=%u, resource=%pR, pos=%u",
+		dev, type, res, pos);
+
 	pci_read_config_dword(dev, pos, &l);
 	pci_write_config_dword(dev, pos, l | mask);
 	pci_read_config_dword(dev, pos, &sz);
@@ -256,8 +260,12 @@ int __pci_read_base(struct pci_dev *dev,
 	}
 
  out:
+ 	dev_printk(KERN_DEBUG, &dev->dev, "__pci_get_base returning: res=%pR, rval=%u", res, (type == pci_bar_mem64) ? 1 : 0);
+
 	return (type == pci_bar_mem64) ? 1 : 0;
  fail:
+ 	dev_printk(KERN_DEBUG, &dev->dev, "__pci_get_base got to fail:");
+
 	res->flags = 0;
 	goto out;
 }

I also enabled CONFIG_DEBUG_DRIVER and CONFIG_DEBUG_DEVRES. The whole dmesg from bootup until inserting the card is attached.
Comment 21 Maciej Grela 2010-03-15 21:43:26 UTC
Created attachment 25532 [details]
dmesg output after plugging in the card (uber verbose)
Comment 22 Bjorn Helgaas 2010-03-15 22:12:29 UTC
Nice work; your patch is exactly the sort of thing I was looking for.  The interesting thing here is that we got to fail.  I think if you move your printks to right after the pci_read_config_dword() calls and print the values we read, we'll get enough information to deduce exactly what path we're taking.

Do you have any other CardBus cards?  Do they work?  The fact that we were able to read the card vendor and device IDs and bind a driver means that at least *some* config accesses work, so I wonder if the card is just broken somehow.  I did notice an msleep() somewhere in the card insertion path, so I guess you could blindly add another one and see if the card eventually starts responding.
Comment 23 Maciej Grela 2010-03-16 08:11:32 UTC
I have two Ethernet Cardbus cards and they work properly. They are both RTL-8139. I've added more prinkts to the patch and I see now it's failing on the pci_size call. The tracing patch is as follows. New dmesg attached.

diff -urp /usr/src/linux-2.6.34-rc1.orig/drivers/pci//probe.c /usr/src/linux-2.6.34-rc1/drivers/pci//probe.c
--- /usr/src/linux-2.6.34-rc1.orig/drivers/pci//probe.c 2010-03-15 22:26:07.000000000 +0100
+++ /usr/src/linux-2.6.34-rc1/drivers/pci//probe.c      2010-03-16 08:54:12.000000000 +0100
@@ -169,11 +169,18 @@ int __pci_read_base(struct pci_dev *dev,
 
        res->name = pci_name(dev);
 
+       dev_printk(KERN_DEBUG, &dev->dev, 
+               "__pci_read_base called: dev=%p, pci_bar_type=%u, resource=%pR, pos=%u",
+               dev, type, res, pos);
+
        pci_read_config_dword(dev, pos, &l);
        pci_write_config_dword(dev, pos, l | mask);
        pci_read_config_dword(dev, pos, &sz);
        pci_write_config_dword(dev, pos, l);
 
+       dev_printk(KERN_DEBUG, &dev->dev, "__pci_read_base BAR size @ pos=%u is %08x", pos, sz);
+       dev_printk(KERN_DEBUG, &dev->dev, "__pci_read_base BAR addr @ pos=%u is %08x", pos, l);
+
        /*
         * All bits set in sz means the device isn't working properly.
         * If the BAR isn't implemented, all bits must be 0.  If it's a
@@ -192,6 +199,10 @@ int __pci_read_base(struct pci_dev *dev,
 
        if (type == pci_bar_unknown) {
                type = decode_bar(res, l);
+               dev_printk(KERN_DEBUG, &dev->dev, 
+                          "__pci_read_base unknown BAR res=%pR @ addr=%08x decoded to be type=%u", 
+                          res, l, type);
+
                res->flags |= pci_calc_resource_flags(l) | IORESOURCE_SIZEALIGN;
                if (type == pci_bar_io) {
                        l &= PCI_BASE_ADDRESS_IO_MASK;
@@ -200,10 +211,17 @@ int __pci_read_base(struct pci_dev *dev,
                        l &= PCI_BASE_ADDRESS_MEM_MASK;
                        mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
                }
+               dev_printk(KERN_DEBUG, &dev->dev,
+                          "__pci_read_base after fixup: mask=%08x, addr=%08x, res=%pR",
+                          mask, l, res);
+
        } else {
                res->flags |= (l & IORESOURCE_ROM_ENABLE);
                l &= PCI_ROM_ADDRESS_MASK;
                mask = (u32)PCI_ROM_ADDRESS_MASK;
+               dev_printk(KERN_DEBUG, &dev->dev, 
+                          "__pci_read_base known BAR type=%u after fixup: mask=%08x, addr=%08x, res=%pR",
+                          type, mask, l, res);
        }
 
        if (type == pci_bar_mem64) {
@@ -245,6 +263,9 @@ int __pci_read_base(struct pci_dev *dev,
                }
        } else {
                sz = pci_size(l, sz, mask);
+               dev_printk(KERN_DEBUG, &dev->dev, 
+                       "__pci_get_base pci_size for res=%pR @ addr=%08x, mask=%08x is %u", 
+                       res, l, mask, sz);
 
                if (!sz)
                        goto fail;
@@ -256,8 +277,12 @@ int __pci_read_base(struct pci_dev *dev,
        }
 
  out:
+       dev_printk(KERN_DEBUG, &dev->dev, "__pci_get_base returning: res=%pR, rval=%u", res, (type == pci_bar_mem64) ? 1 : 0);
+
        return (type == pci_bar_mem64) ? 1 : 0;
  fail:
+       dev_printk(KERN_DEBUG, &dev->dev, "__pci_get_base got to fail:");
+
        res->flags = 0;
        goto out;
 }

I will try later today to add another msleep to the rt2500pci code.
Comment 24 Maciej Grela 2010-03-16 08:12:14 UTC
Created attachment 25547 [details]
dmesg output after plugging in the card (uber verbose)
Comment 25 Bjorn Helgaas 2010-03-16 17:59:54 UTC
> pci 0000:16:00.0: __pci_read_base BAR size @ pos=16 is 7fffe000

We wrote 0xffffffff to the BAR and read back 0x7fffe000.  This is illegal for PCI devices, where the MSB would have to be set.

I don't know much about CardBus, but the MindShare CardBus book says one difference between PCI and CardBus is that software learns about PCI resource requirements from config space (BARs), but learns about CardBus resource requirements from CIS tuples.

I see some dev_dbg()s in drivers/pcmcia/cistpl.c; can you figure out how to turn those on and see if we're looking at CIS?  (BTW, you can just attach dmesg logs without compressing them.)

Sorry if I'm wasting your time on a wild-goose chase; I don't know anything about CIS at all.
Comment 26 Dominik Brodowski 2010-03-16 18:14:18 UTC
No, cistplc.c is unused for CardBus (32-bit) devices; it is only used for PCMCIA (16-bit) devices.
Comment 27 Bjorn Helgaas 2010-03-16 18:41:22 UTC
OK.  What does that mean?  Is CIS irrelevant for 32-bit CardBus devices?

The device has 0x601 in the CIS pointer at offset 0x28 in PCI config space.  It looks like that means the CIS is at offset 0x600 in the memory pointed to by BAR 0.  The book suggests (p. 309) we should use standard PCI probing to determine the size, then program the BAR, then read the CIS out of it.

But it seems like BAR 0 doesn't follow the PCI spec, so we can't size it.  As an experiment, I suppose we could try setting that MSB to see if we get any farther, e.g.,

    if (dev->bus == 0x16 && dev->devfn == PCI_DEVFN(0, 0) && pos == 16)
        sz |= 0x80000000;
Comment 28 Maciej Grela 2010-03-16 20:40:40 UTC
We have progress :), adding the quirk you've written helped the driver to initialize the card. Unfortunately, it doesn't work. When I plugged it in (T=47s) it logged some I/O errors but it *did* manage to at least read the EEPROM which probably means we are on the right track. Trying to do `ifconfig wlan1 up` resulted in:

[  116.815682] phy1 -> rt2x00pci_regbusy_read: Error - Indirect register access failed: offset=0x000000f0, value=0xffffffff
[  116.816615] phy1 -> rt2x00pci_regbusy_read: Error - Indirect register access failed: offset=0x000000f0, value=0xffffffff
[  116.817500] phy1 -> rt2x00pci_regbusy_read: Error - Indirect register access failed: offset=0x000000f0, value=0xffffffff
[  116.818155] phy1 -> rt2x00pci_regbusy_read: Error - Indirect register access failed: offset=0x000000f0, value=0xffffffff
[  116.819190] phy1 -> rt2x00pci_regbusy_read: Error - Indirect register access failed: offset=0x000000f0, value=0xffffffff
[  116.819494] phy1 -> rt2500pci_wait_bbp_ready: Error - BBP register access failed, aborting.
[  116.819503] phy1 -> rt2500pci_set_device_state: Error - Device failed to enter state 4 (-5).

Both the tracing patch and dmesg attached (uncompressed this time).
Comment 29 Maciej Grela 2010-03-16 20:41:42 UTC
Created attachment 25558 [details]
dmesg output after plugging in the card (uber verbose)
Comment 30 Maciej Grela 2010-03-16 20:42:36 UTC
Created attachment 25559 [details]
__pci_read_base tracing patch for GN-WMKG PCMCIA card resource alloc
Comment 31 Bjorn Helgaas 2010-03-16 21:07:07 UTC
We tried to put the device at 0x80000000, but when we wrote that to the BAR, it didn't stick:

  pci 0000:16:00.0: BAR 0: assigned [mem 0x80000000-0x80001fff]
  pci 0000:16:00.0: BAR 0: error updating (0x80000000 != 0x000000)
  pci 0000:16:00.0: BAR 0: set to [mem 0x80000000-0x80001fff] (PCI address [0x80000000-0x80001fff]

That explains why the card still doesn't work -- our accesses go to 0x80000000, but the card is listening at 0x0, so nobody responds, and we get 0xffffffff.

I think you have a bad address line, either on the card or in the socket.  You mentioned two RTL-8139 cards that do work.  If they got programmed to an address above 0x80000000 (you can check with lspci -v), that would suggest that the socket works.
Comment 32 Maciej Grela 2010-03-16 22:21:44 UTC
Indeed, the RTL8139 card has memory @ 0x80000000:

16:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
	Subsystem: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+
	Flags: bus master, medium devsel, latency 64, IRQ 16
	I/O ports at 8000 [size=256]
	Memory at 80000000 (32-bit, non-prefetchable) [size=512]
	Capabilities: [50] Power Management version 2
	Kernel driver in use: 8139too
	Kernel modules: 8139too, 8139cp

Dmesg output after plugging in:

[ 6067.643299] pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0
[ 6067.643355] pci 0000:16:00.0: reg 10: [io  0x0000-0x00ff]
[ 6067.643367] pci 0000:16:00.0: reg 14: [mem 0x00000000-0x000001ff]
[ 6067.643454] pci 0000:16:00.0: supports D1 D2
[ 6067.643457] pci 0000:16:00.0: PME# supported from D1 D2 D3hot
[ 6067.643464] pci 0000:16:00.0: PME# disabled
[ 6067.643481] pci 0000:16:00.0: BAR 1: assigned [mem 0x80000000-0x800001ff]
[ 6067.643490] pci 0000:16:00.0: BAR 1: set to [mem 0x80000000-0x800001ff] (PCI address [0x80000000-0x800001ff]
[ 6067.643495] pci 0000:16:00.0: BAR 0: assigned [io  0x8000-0x80ff]
[ 6067.643503] pci 0000:16:00.0: BAR 0: set to [io  0x8000-0x80ff] (PCI address [0x8000-0x80ff]
[ 6067.643519] pci 0000:16:00.0: cache line size of 64 is not supported
[ 6067.669384] calling  cp_init+0x0/0x35 [8139cp] @ 3004
[ 6067.669392] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[ 6067.669446] 8139cp 0000:16:00.0: This (id 10ec:8139 rev 10) is not an 8139C+ compatible chip, use 8139too
[ 6067.670002] initcall cp_init+0x0/0x35 [8139cp] returned 0 after 587 usecs
[ 6067.680944] calling  rtl8139_init_module+0x0/0x2e [8139too] @ 3004
[ 6067.680952] 8139too: 8139too Fast Ethernet driver 0.9.28
[ 6067.681024] 8139too 0000:16:00.0: enabling device (0000 -> 0003)
[ 6067.681047] 8139too 0000:16:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 6067.681074] 8139too 0000:16:00.0: setting latency timer to 64
[ 6067.683413] 8139too 0000:16:00.0: eth1: RealTek RTL8139 at 0x8000, 00:a1:b0:ec:6c:26, IRQ 16
[ 6067.683468] initcall rtl8139_init_module+0x0/0x2e [8139too] returned 0 after 2448 usecs
[ 6067.696392] udev: renamed network interface eth1 to eth4
[ 6067.700060] eth4: link down
[ 6067.700582] ADDRCONF(NETDEV_UP): eth4: link is not ready

Sad, this would suggest that the card is indeed broken.
Comment 33 Bjorn Helgaas 2010-03-16 22:33:55 UTC
Maybe you can try the card in a different system, or under Windows?  If it works anywhere, take a look at the lspci -v output or Windows Device Manager to see what resources it's using.
Comment 34 Maciej Grela 2010-03-16 22:53:13 UTC
I've tried under Ubuntu but the result is the same (not able to allocate registers). I don't have a Windows laptop with PCMCIA to test but I will find someone who has and check for sure. 
Now that I think about all this it makes perfect sense - the bit 31 on the data bus is broken and cannot be set to 1. This is why we have read 0x7fffe000 from the BAR instead of 0xffffe000 and also why the base address cannot be set to 0x80000000 by the kernel. But I will test anyway on Windows to be 100 % sure. 

Thanks for help, I really appreciate the amount of time you've spent assisting me in hunting down this stuff :)
Comment 35 Maciej Grela 2010-03-17 09:52:08 UTC
Checked the card under Windows - the same error allocating the resources. I have another confirmation, that the HW is broken - the .inf file of Windows driver mentiones the subsystem ID of the card to be 1458:E831 but in the configuration space we see 1458:6831. 0xE831 = 0x6831 + 0x8000 -> bit 31 is stuck to 0.

From my pov we can close this ticket. Juha, can you post your lspci -vx so we can check if the case is the same for your card ?
Comment 36 Bjorn Helgaas 2010-03-18 18:45:07 UTC
Created attachment 25593 [details]
complain about broken devices

Here's a possible Linux patch to complain about devices that appear to be broken.  This doesn't fix anything; the idea is only to make it easier to debug problems like this in the future.  If you try this please attach the complete dmesg so I can make sure I included enough information and the format is right.
Comment 37 Maciej Grela 2010-03-19 09:41:10 UTC
Dmesg attached, the patch works.
Comment 38 Maciej Grela 2010-03-19 09:42:10 UTC
Created attachment 25606 [details]
Dmesg with applied broken device detection patch
Comment 39 Bjorn Helgaas 2010-03-19 14:03:55 UTC
That looks reasonable to me.  Right now, I don't want to put too much code in Linux to deal with broken devices, just enough to give us a pointer in the right direction.  But it'd be interesting to know how Windows reported the problem.  Did it just refuse to deal with the card at all?  Or did it bind a driver and then the driver failed somehow?
Comment 40 Maciej Grela 2010-03-19 15:08:52 UTC
First of all it didn't want to install the driver because the subsystem ID didn't match the one in driver .inf file (see comment #35). 
After I hacked the .inf the driver installed but later the system reported that it "cannot allocate resources for the device" in the hw manager device info window. The driver was "bound" in the sense that it was visible in the hardware manager but I don't know if it was initialized or not. I should've done a screenshot of the window but unfortunately I haven't though about this.