Bug 16111 - hostap_pci: infinite registered netdevice wifi0
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks: 15310
Reported: 2010-06-02 20:55 UTC by Petr Pisar
Modified: 2010-08-02 18:25 UTC

See Also:
Kernel Version: 2.6.34
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
0001-hostap_pci-set-dev-base_addr-during-probe.patch (1.16 KB, patch)
2010-07-13 18:13 UTC, John W. Linville
Fix backport for 2.6.34 (6.59 KB, patch)
2010-07-14 22:50 UTC, Petr Pisar

Description Petr Pisar 2010-06-02 20:55:19 UTC
Hardware: PCI 802.11b network card ZCOMAX XI-626 based on Intersil Prism 2.5 chipset
Driver: hostap_pci
Architecture: i586 (Pentium TSC), i686 (AMD Thunderbird)

Loading the hostap_pci module on a 2.6.34 kernel sends the kernel into an infinite loop. The latest known working kernel (more precisely, the latest one where the module can be loaded) is 2.6.32.14. A transcription of the kernel messages follows (text in brackets is abbreviated):

# modprobe hostap_pci
lib80211: common routine for IEEE802.11 drivers
ACPI: PCI Interrupt Link [...] enabled at IRQ 10
hostap_pci: [PCI ... -> IRQ 10]
hostap_pci: registered netdevice wifi0
net_ratelimit: 609587 callbacks suppressed

The last message repeats ad infinitum (with a different but similar number of callbacks each time).

I'm pretty sure it's not a user-space problem, because I tested it with /bin/bash as the init process and no other daemons or init scripts running.
Comment 1 Petr Pisar 2010-06-03 08:03:30 UTC
It works on Fedora's kernel 2.6.33.5-112.fc13.x86_64. Thus it seems like a regression between 2.6.33 and 2.6.34.
Comment 2 John W. Linville 2010-06-03 19:14:03 UTC
Seems very strange...could you try reverting the following commits?

fbc87d67af5ccd733f894273b215564c67e3a749
15920d8afc87861672e16fa95ae2764b065d6dd3

Does that address the problem?
Comment 3 Petr Pisar 2010-06-04 11:30:52 UTC
First, I tested vanilla 2.6.34 with the Fedora .config on the same machine where I had already tested 2.6.33. It's a 4-core SMP system. The module loaded, but card initialization failed:

hostap_pci 0000:10:09.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
hostap_pci: Registered netdevice wifi0
wifi0: Original COR value: 0x0
wifi0: Interrupt, but dev not configured
prism2_hw_init: initialized in 196 ms
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
[...]
wifi0: hfa384x_cmd: entry still in list? (entry=ffff880117744900, type=0, res=0)
wifi0: hfa384x_cmd: command was not completed (res=0, entry=ffff880117744900, type=0, cmd=0x0021, param0=0xfd0b, EVSTAT=8010 INTEN=0010)
wifi0: interrupt delivery does not seem to work
wifi0: hfa384x_get_rid: CMDCODE_ACCESS failed (res=-110, rid=fd0b, len=8)
Could not get RID for component NIC
hostap_pci: Initialization failed
hostap_pci: hardware initialization failed
hostap_pci 0000:10:09.0: PCI INT A disabled

The device did not appear in the network device list, and the interrupt did not remain assigned.

After reverting only 15920d8afc87861672e16fa95ae2764b065d6dd3, the problem disappears: device initialization succeeds and the radio works very well:

hostap_pci 0000:10:09.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
hostap_pci: Registered netdevice wifi0
wifi0: Original COR value: 0x0
prism2_hw_init: initialized in 195 ms
wifi0: NIC: id=0x8013 v1.0.0
wifi0: PRI: id=0x15 v1.1.1
wifi0: STA: id=0x1f v1.7.4
wifi0: Intersil Prism2.5 PCI: mem=0xf3300000, irq=21
wifi0: registered netdevice wlan0

I tested this only on the SMP machine. The aforementioned 2.6.34 kernel loop was seen on single-CPU systems. I guess it's a race between card initialization and IRQ handling, with different effects on SMP and single-CPU systems.
Comment 4 Colin Ian King 2010-06-08 14:31:57 UTC
The underlying issue is that prism2_config() is flawed. It registers the prism2_interrupt handler which can oops when dev->base_addr is not configured. This oopsing happens when an interrupt occurs some time between the registration of the handler and the configuration of dev->base_addr.

This looks like a catch-22. We can only be sure the oops won't happen once we have configured dev->base_addr, but that can only be configured after we've registered the interrupt handler and configured the I/O windows and interrupt mapping with pcmcia_request_configuration().

Not sure about a sane way to fix this.
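
A minimal user-space sketch of the guard that the later log lines ("Interrupt, but dev not configured") suggest, assuming the handler simply refuses to touch the hardware while base_addr is still zero. The names fake_netdev and fake_interrupt are hypothetical stand-ins for illustration, not the driver's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct net_device; only the fields discussed here. */
struct fake_netdev {
    const char *name;
    unsigned long base_addr;   /* 0 means "not configured yet" (assumption) */
};

/* Hypothetical handler: returns true only if it would have serviced the device. */
static bool fake_interrupt(const struct fake_netdev *dev)
{
    assert(dev != NULL);

    if (dev->base_addr == 0) {
        /* The interrupt arrived between handler registration and
         * base_addr setup: bail out instead of using an unset I/O base. */
        printf("%s: Interrupt, but dev not configured\n", dev->name);
        return false;
    }

    /* A real driver would read its event registers relative to base_addr here. */
    return true;
}
```

Calling fake_interrupt() on a device whose base_addr is 0 returns false and logs the message; once base_addr is set, the same call returns true. The guard avoids the oops, but it also means every interrupt is ignored for as long as base_addr stays unset.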
Comment 5 Rafael J. Wysocki 2010-06-21 22:23:28 UTC
Patch : https://patchwork.kernel.org/patch/105008/
Handled-By : Tim Gardner <tim.gardner@canonical.com>
Comment 6 Rafael J. Wysocki 2010-06-27 22:14:47 UTC
Fixed by d6a574ff6bfb842bdb98065da053881ff527be46 .
Comment 7 Petr Pisar 2010-07-13 16:49:50 UTC
I tested 2.6.35-rc5 and my own backport to 2.6.34 on SMP x86_64, and the patch does not work. Has anybody tested the patch successfully?

The kernel log from 2.6.35-rc5 follows:

hostap_pci 0000:10:09.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
hostap_pci: Registered netdevice wifi0
wifi0: Original COR value: 0x0
net_ratelimit: 376381 callbacks suppressed
wifi0: Interrupt, but dev not configured
prism2_hw_init: initialized in 195 ms
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: Interrupt, but dev not configured
wifi0: hfa384x_cmd: entry still in list? (entry=ffff8801178f88a0, type=0, res=0)
wifi0: hfa384x_cmd: command was not completed (res=0, entry=ffff8801178f88a0, type=0, cmd=0x0021, param0=0xfd0b, EVSTAT=8010 INTEN=0010)
wifi0: interrupt delivery does not seem to work
wifi0: hfa384x_get_rid: CMDCODE_ACCESS failed (res=-110, rid=fd0b, len=8)
Could not get RID for component NIC
hostap_pci: Initialization failed
hostap_pci: hardware initialization failed
hostap_pci 0000:10:09.0: PCI INT A disabled
Comment 8 John W. Linville 2010-07-13 18:12:11 UTC
I suspect this is due to Tim, Colin, and I all having PCMCIA devices rather than PCI ones.  Patch to follow -- does it help?
Comment 9 John W. Linville 2010-07-13 18:13:17 UTC
Created attachment 27090
0001-hostap_pci-set-dev-base_addr-during-probe.patch
Comment 10 Petr Pisar 2010-07-14 12:43:39 UTC
I'm happy to confirm that the "hostap_pci: set dev->base_addr during probe" patch fixes the PCI driver. I have tested it successfully on 2.6.34 and 2.6.35-rc5 kernels on SMP x86_64. Both initialization and subsequent usage work.

I will try to verify it on a single-CPU system in the next few days.
Comment 11 Petr Pisar 2010-07-14 22:47:55 UTC
OK, initialization works on a single-CPU i686 system with the 2.6.34 kernel and all necessary patches applied.
Comment 12 Petr Pisar 2010-07-14 22:50:43 UTC
Created attachment 27109
Fix backport for 2.6.34

This is the set of patches that must be applied to 2.6.34 to get hostap_pci initialization working again.
Comment 13 Rafael J. Wysocki 2010-07-23 11:40:36 UTC
Handled-By :  Petr Pisar <petr.pisar@atlas.cz>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=27109
Comment 14 John W. Linville 2010-07-23 13:37:36 UTC
Final patch is in 2.6.35-rc6...

commit 0f4da2d77e1bf424ac36424081afc22cbfc3ff2b
Author: John W. Linville <linville@tuxdriver.com>
Date:   Tue Jul 13 14:06:32 2010 -0400

    hostap_pci: set dev->base_addr during probe
    
    "hostap: Protect against initialization interrupt" (which reinstated
    "wireless: hostap, fix oops due to early probing interrupt")
    reintroduced Bug 16111.  This is because hostap_pci wasn't setting
    dev->base_addr, which is now checked in prism2_interrupt.  As a result,
    initialization was failing for PCI-based hostap devices.  This corrects
    that oversight.
    
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
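
The effect described in the commit message can be sketched in user space as follows, assuming that hardware initialization only completes if at least one interrupt is actually serviced. All names (fake_netdev, fake_interrupt, fake_pci_probe) and the set_base_addr switch are hypothetical and exist only to contrast the two behaviours:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; counters record what the handler did. */
struct fake_netdev {
    unsigned long base_addr;   /* 0 means "not configured" (assumption) */
    int handled;               /* interrupts serviced */
    int ignored;               /* interrupts dropped by the guard */
};

/* Hypothetical handler with the base_addr guard in place. */
static void fake_interrupt(struct fake_netdev *dev)
{
    if (dev->base_addr == 0) {
        dev->ignored++;
        return;
    }
    dev->handled++;
}

/* Hypothetical probe. With set_base_addr == false it models a probe that
 * never stores the mapped address; with true it models storing it first. */
static bool fake_pci_probe(struct fake_netdev *dev, unsigned long mem,
                           bool set_base_addr)
{
    assert(dev != NULL && mem != 0);

    if (set_base_addr)
        dev->base_addr = mem;

    /* Simulated hardware init: the card raises an interrupt to signal
     * command completion, and init succeeds only if it is serviced. */
    fake_interrupt(dev);
    return dev->handled > 0;
}
```

With set_base_addr false, the simulated interrupt is dropped by the guard and the probe reports failure, which mirrors the "interrupt delivery does not seem to work" log above. With it true, the interrupt is serviced and the probe succeeds.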
Comment 15 Petr Pisar 2010-08-02 18:25:17 UTC
I cannot see the "hostap_pci: set dev->base_addr during probe" patch in 2.6.34.2, and the string base_addr does not occur anywhere in hostap_pci.c. There is only the combined patch made from the three older patches.

I have not checked the kernel on hardware, but I guess 2.6.34.2 still does not work with hostap PCI cards.

Did somebody forget to push the patch?
