Bug 84821 - Compex WLE900VX-7A not recognized
Status: RESOLVED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
Reported: 2014-09-19 16:08 UTC by Matteo Croce
Modified: 2021-04-17 19:19 UTC
Kernel Version: 3.16
Regression: No

Attachments
first boot bootlog (48.82 KB, text/plain), 2014-09-19 16:09 UTC, Matteo Croce
first boot lspci (21.18 KB, text/plain), 2014-09-19 16:09 UTC, Matteo Croce
second boot bootlog (52.72 KB, text/plain), 2014-09-19 16:11 UTC, Matteo Croce
second boot lspci (24.15 KB, text/plain), 2014-09-19 16:11 UTC, Matteo Croce
bootlog with working WLE900VX (25.22 KB, text/plain), 2014-09-22 19:59 UTC, drserge
lspci of working WLE900VX (2.94 KB, text/plain), 2014-09-22 19:59 UTC, drserge
Wireless card status at Linux Kernel 3.10 (364.85 KB, image/png), 2015-02-17 10:06 UTC, Winston Ho
Wireless card status at Linux Kernel 3.16 (45.22 KB, image/png), 2015-02-17 10:08 UTC, Winston Ho
Wireless card status at Linux Kernel 3.18 (152.92 KB, image/png), 2015-03-03 04:16 UTC, Winston Ho
Add Quirk for Qualcom Atheros QCA9882 No Bus Reset (1.42 KB, patch), 2016-06-05 00:59 UTC, Christian Lamparter

Description Matteo Croce 2014-09-19 16:08:40 UTC
I have some mini PCI-express cards that fail to be recognized.

The card is a Compex WLE900VX-7A http://www.compex.com.sg/productdetailinfo.asp?model=WLE900VX

and it can't be enumerated as a PCI device.

On a Lenovo Thinkpad notebook it can never be enumerated; on my Samsung NP350V5C-S09IT it appears in the device list only after a reboot (i.e. I must boot, then reboot).

I'm not sure it's a Linux bug; it could be a hardware bug, but maybe it's possible to bring the interface up with some quirk.

I attach the syslog and lspci output of the first and second boot.
Comment 1 Matteo Croce 2014-09-19 16:09:05 UTC
Created attachment 150921
first boot bootlog
Comment 2 Matteo Croce 2014-09-19 16:09:22 UTC
Created attachment 150931
first boot lspci
Comment 3 Matteo Croce 2014-09-19 16:11:00 UTC
Created attachment 150941
second boot bootlog
Comment 4 Matteo Croce 2014-09-19 16:11:15 UTC
Created attachment 150951
second boot lspci
Comment 5 Bjorn Helgaas 2014-09-19 22:07:36 UTC
I don't see a Linux PCI issue yet.  Could this be a card seating problem?  Does it work reliably in any laptop?  Under Windows?  Under a previous version of Linux?
Comment 6 Matteo Croce 2014-09-20 00:26:53 UTC
No, it's not a seating issue: after a reboot it works without my having to move it.

It doesn't work with Windows either. I have only 2 notebooks to test it on.
Comment 7 Matteo Croce 2014-09-22 13:42:16 UTC
It's probably a HW issue and the cards are broken by design. The manufacturer is making silly excuses like "drivers are not released by Atheros"

http://www.compex.com.sg/forum/topic.asp?TOPIC_ID=2391

while the driver, ath10k, actually exists and is actively developed by Qualcomm Atheros.
Comment 8 Bjorn Helgaas 2014-09-22 13:49:38 UTC
OK.  If they don't want their card to work, I'm not going to argue with that :(

I'm going to resolve this as "invalid" on the assumption that this is non-conforming PCIe hardware.
Comment 9 drserge 2014-09-22 19:59:20 UTC
Created attachment 151411
bootlog with working WLE900VX
Comment 10 drserge 2014-09-22 19:59:58 UTC
Created attachment 151421
lspci of working WLE900VX
Comment 11 drserge 2014-09-22 20:00:22 UTC
I have a working WLE900VX (not sure it is a 7A; 168c:003c) under 3.16.3 (and under some older kernels before that).
I have used it for a month and had only FW issues at the beginning.

bootlog & lspci attached.
Comment 12 Bjorn Helgaas 2014-09-22 20:15:15 UTC
Matteo's device looks the same as yours, drserge:

  pci 0000:03:00.0: [168c:003c] type 00 class 0x028000
  pci 0000:03:00.0: reg 0x10: [mem 0xc0000000-0xc01fffff 64bit]
  pci 0000:03:00.0: reg 0x30: [mem 0xc0200000-0xc020ffff pref]

What FW issues did you have?  How did you resolve them?  Is there FW on the card that Matteo might need to update?
Comment 13 drserge 2014-09-22 20:29:05 UTC
The FW is loaded by the driver (ath10k), so Matteo's problem is not with the FW.
My FW issue was FW crashes ('ath10k: firmware crashed!' in dmesg), but this was solved by replacing the official FW from linuxwireless.org with a custom one from CandelaTech (http://www.candelatech.com/ath10k.php).
Comment 14 Matteo Croce 2014-09-22 21:43:49 UTC
My issue is that the card fails to show up as a PCI device in some notebooks, like a Thinkpad x200s.

In the lspci output the device isn't there, as if it were not plugged in; the FW is another issue.

The card is working almost fine in another Samsung notebook.
I say almost because I have to boot the computer twice to get the card enumerated as a PCI device. Basically I have to boot and then reboot to get the card working, until the next power cycle.

On an nVidia ION2 motherboard the card behaves even worse: the system is stuck in a reboot loop in the BIOS.

I had firmware issues too, that's normal. Use Candelatech's firmware, which works better and reports tx rate feedback :)
Comment 15 Winston Ho 2015-02-17 09:57:14 UTC
We would like to investigate the problem Matteo encountered. What are the configurations e.g. wireless driver, Linux version etc.? WLE900VX is identical to QCA's reference design XB140. The difficulties Matteo experienced could be present with XB140 too.
Comment 16 Winston Ho 2015-02-17 10:06:06 UTC
Created attachment 167311
Wireless card status at Linux Kernel 3.10

Linux kernel version 3.10 can detect the WLE900VX radio. Kindly see the attachment for reference.

The problem may be caused by kernel version 3.16, which the customer is using. We'll install the same kernel version on a PC to check the status of the wireless card.
Comment 17 Winston Ho 2015-02-17 10:06:51 UTC
Linux kernel version 3.10 can detect the WLE900VX radio. Kindly see the attachment for reference: Wireless card status at Linux Kernel 3.10.

The problem may be caused by kernel version 3.16, which the customer is using. We'll install the same kernel version on a PC to check the status of the wireless card.
Comment 18 Winston Ho 2015-02-17 10:08:20 UTC
Created attachment 167321
Wireless card status at Linux Kernel 3.16

It is a kernel version issue. We tested with kernel version 3.16, which can't detect the WLE900VX radio. I suggest that Matteo downgrade the kernel version.
Comment 19 Winston Ho 2015-02-17 10:08:51 UTC
It is a kernel version issue. We tested with kernel version 3.16, which can't detect the WLE900VX radio. I suggest that Matteo downgrade the kernel version.
Attachment: Wireless card status at Linux Kernel 3.16.
Comment 20 Matteo Croce 2015-02-17 10:14:48 UTC
It's still broken in 3.19.

What kernel version should I test?
It doesn't work even on Windows (I need to boot twice there too).
Comment 21 Bjorn Helgaas 2015-02-17 16:05:33 UTC
Let me summarize.  Matteo reports:

- on Lenovo Thinkpad with Linux v3.16, card not enumerated
- on Lenovo Thinkpad with Windows, card doesn't work after first boot, but does work after second boot
- on Samsung NP350V5C-S09IT with Linux v3.16, card not enumerated after first boot, but is enumerated and driver works after second boot 
- on nVidia ION2, BIOS gets stuck in reboot loop, so no OS can work

Drserge reports:

- on unspecified system with Linux v3.16.3, card is enumerated and driver works

Winston reports:

- on unspecified system with Linux v3.10, card is enumerated
- on unspecified system with Linux v3.16, card not enumerated

The reboot loop on the nVidia ION2 strongly suggests an issue with the card or its firmware, since there's no OS involved.  I think the best way to resolve this is for someone to put a PCIe analyzer on it.

The only other option I can see is if somebody has a system where it works reliably on Linux v3.10 and fails reliably on v3.16, we *could* bisect it and try to find a commit that broke it.  With that information, we might be able to figure out a quirk.  But it is a fair amount of work and there's no guarantee that we'd learn anything useful.  And of course it wouldn't help the nVidia system at all.
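
The bisection mentioned above could be driven roughly as follows. This is only a sketch under stated assumptions: the helper name `bisect_verdict` is made up for illustration, the ID 168c003c is the one reported elsewhere in this thread, and each candidate kernel still has to be built, installed and cold-booted by hand before the check means anything.

```sh
# Print "good" if the PCI ID is listed, "bad" otherwise.
# $1: 8-hex-digit vendor+device ID, e.g. 168c003c (taken from this thread)
# $2: device list to read; defaults to /proc/bus/pci/devices
bisect_verdict() {
    id=$1
    list=${2:-/proc/bus/pci/devices}
    # Field 2 of each line holds the vendor and device ID in lowercase hex.
    if awk -v id="$id" '$2 == id { found = 1 } END { exit !found }' "$list"; then
        echo good
    else
        echo bad
    fi
}

# Manual bisection flow, run inside a kernel git tree:
#   git bisect start
#   git bisect bad v3.16
#   git bisect good v3.10
#   ...build, install and cold-boot the commit git checked out, then:
#   git bisect "$(bisect_verdict 168c003c)"
```

Because the failure reported here depends on cold versus warm boot, every step would need a full power cycle, and an intermittent result would make the bisection unreliable.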
Comment 22 Matteo Croce 2015-02-17 16:15:32 UTC
not really:

- on Lenovo Thinkpad with Linux v3.16, card not enumerated
- on Lenovo Thinkpad with Windows: no Windows on such notebook
- on Samsung NP350V5C-S09IT with Linux v3.16 or Windows, card not enumerated after first boot, but is enumerated and driver works after second boot
- on nVidia ION2, BIOS gets stuck in reboot loop, so no OS can work
Comment 23 drserge 2015-02-17 17:09:37 UTC
After I reported the working card under 3.16.3, the kernel was upgraded step by step to the current 3.19.0, and no enumeration or other problems with the card were detected (using the latest Candelatech firmware).
I can provide any needed info about the system (Norco MITX-6932 based).
Comment 24 Winston Ho 2015-03-03 04:14:58 UTC
Thank you very much, @drserge@inbox.ru. I installed Ubuntu Vivid Vervet Alpha 2 with Linux Kernel 3.18.3.

The WLE900VX card is able to be enumerated properly as shown in the lspci screenshot:
Wireless card status at Linux Kernel 3.18.png

I used a miniPCIe to PCIe adapter card.

(I had earlier made a mistake of mounting the card on the motherboard's PCIe FMC slot, causing a strange pattern of detecting the card only when I insert the card and restart once.)

With the miniPCIe converter card, the WLE900VX card can still be detected even with many reboots.
Comment 25 Winston Ho 2015-03-03 04:16:57 UTC
Created attachment 168631
Wireless card status at Linux Kernel 3.18

Wireless card status at Linux Kernel 3.18.png
WLE900VX card detected:
Network controller: Qualcomm Atheros QCA988x 802.11ac Wireless Network Adapter
Comment 26 Max Roder 2015-03-07 22:29:40 UTC
I can confirm this issue on a Lenovo Thinkpad X201. Kernel version: 3.18.6, 3.19 (-ck and not-ck). The WLE900VX is not shown in dmesg/lspci, while another (Qualcomm Atheros) is.
Might be a Lenovo issue?
Comment 27 Winston Ho 2015-03-25 03:13:10 UTC
@Max, what is the other Qualcomm Atheros card that you are saying is shown in dmesg/lspci?
Comment 28 Max Roder 2015-03-29 15:43:57 UTC
I used the AR9280 (Half Size Mini PCI-E) prior to the new one without any problem for four years. 

I returned the WLE900VX and got another ac-capable card (Intel 7260) - no trouble. Unfortunately I can't give any further information, though.
Comment 29 Winston Ho 2015-07-22 04:43:09 UTC
Summarizing ...

Based on user feedback and tests, we find that the WLE900VX card works with ath10k in some PCs but not in others.

The Atheros 11ac radio can't be used in Windows because Atheros didn't release an 11ac driver for Windows.

From a number of tests with updated Linux kernels, we find that:
WLE900VX card is detected for Linux Kernel 3.18 as:
Network controller: Qualcomm Atheros QCA988x 802.11ac Wireless Network Adapter

Note that the slot needs to be a miniPCIe slot. Otherwise, please use a PCIe card adapter.

Therefore, the WLE900VX card can be used with ath10k for 11ac radios. There is no official support for this scenario with ath10k, because the WLE900VX card is designed to work well in a Compex board with CompexWRT firmware, which combines the Atheros proprietary wireless drivers with OpenWRT.
Comment 30 Matteo Croce 2015-07-22 09:42:38 UTC
No luck with 4.0 and 4.1, nor with Windows 8.1 or Windows 10.

BTW product page was moved here:
http://compex.com.sg:809/WPproductdetailinfo.asp?model=WLE900VX

and forum here:
http://forum.compex.com.sg/topic.asp?TOPIC_ID=2391
Comment 31 Christian Lamparter 2016-06-05 00:59:45 UTC
Created attachment 219051
Add Quirk for Qualcom Atheros QCA9882 No Bus Reset

There's a patch on the linux-pci mailing list [0] which adds a quirk
to get this device to work with PCI passthrough (I have attached it as a patch).

The submitter also added a fix for the older AR9485. So it might
very well be the case that these issues were handed down to their next
generation of cards as well.

Furthermore, I looked around. There seems to be at least one piece of arch code
(ppc4xx_pci.c) which describes the issue as well and works around it.

see arch/powerpc/sysdev/ppc4xx_pci.c Line 1323 (linux 4.6) [1]

> /*
>  * Only reset the PHY when no link is currently established.
>  * This is for the Atheros PCIe board which has problems to establish
>  * the link (again) after this PHY reset. All other currently tested
>  * PCIe boards don't show this problem.
>  * This has to be re-tested and fixed in a later release!
>  */

[0] <https://patchwork.ozlabs.org/patch/627782/>
[1] <http://lxr.free-electrons.com/source/arch/powerpc/sysdev/ppc4xx_pci.c#L1323>

Please test and report your findings!
Comment 32 Matteo Croce 2016-06-05 11:56:42 UTC
Thanks.

I've tried it with no luck.

BTW the product page was moved here: http://www.compex.com.sg/product/wle900vx/
and the forum post was removed. Compex even keeps reverting the "known issues" entry on the wikidevi product page that links to this bug.

There must of course be some hardware bug that they are aware of. Don't spend too much time on it; I've worked around it myself by putting this in /etc/rc.local:

grep -q 168c003c /proc/bus/pci/devices || exec reboot -nf
Comment 33 Jeff 2016-12-05 05:24:00 UTC
(In reply to Matteo Croce from comment #32)
> Thanks.
> 
> I've tried it with no luck.
> 
> BTW the product page was moved here:
> http://www.compex.com.sg/product/wle900vx/
> and  the forum post removed, Compex continues even reverting on the wikidevi
> product page the "known issues" linking to this bug.
> 
> There should be of course some hardware bug they are aware of, don't spend
> too much time on it, I've solved myself putting this in /etc/rc.local:
> 
> grep -q 168c003c /proc/bus/pci/devices || exec reboot -nf

Funny line for rc.local. It should probably come with a note that if the card never comes up, this will just reboot the system indefinitely.
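
One way to keep the rc.local workaround from looping forever is to count the automatic reboots in a file. The following is a minimal sketch, not anything proposed in this thread: the function name `card_boot_action`, the retry limit and the counter path are all assumptions for illustration.

```sh
# Decide what rc.local should do; prints "ok", "reboot" or "give-up".
# $1: PCI ID to look for (168c003c is the ID used in this thread)
# $2: maximum number of automatic reboots (hypothetical limit)
# $3: counter file; it must survive a reboot, so not on a tmpfs
# $4: device list, defaults to /proc/bus/pci/devices
card_boot_action() {
    id=$1 max=$2 counter=$3 list=${4:-/proc/bus/pci/devices}
    if grep -q "$id" "$list"; then
        rm -f "$counter"            # card is present: reset the retry count
        echo ok
        return 0
    fi
    tries=$(cat "$counter" 2>/dev/null || echo 0)
    tries=${tries:-0}               # treat an empty counter file as zero
    if [ "$tries" -lt "$max" ]; then
        echo $((tries + 1)) > "$counter"
        echo reboot
    else
        echo give-up
    fi
}

# Possible use in /etc/rc.local (path and limit are assumptions):
#   case $(card_boot_action 168c003c 3 /var/lib/wle900vx-retries) in
#       reboot) sync; exec reboot -f ;;
#   esac
```

The usage drops the `-n` flag from the original line and calls `sync` first, since `-n` skips the sync and the counter could otherwise be lost before it reaches the disk.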

In any case, I have this card sort of working.  I say sort of because I'm not entirely sure how I managed to do it, as I am having trouble duplicating the effort on a new machine.  The motherboard is a Gigabyte MDH11HI, the Compex card is a WLE900VX 7AA (so Qualcomm Atheros QCA9880), and the kernel is 4.4.4-040404-generic with Ubuntu 14.04 LTS Server.

Below is the output from lshw -C network with the MAC address X'ed out.

  *-network
       description: Wireless interface
       product: QCA986x/988x 802.11ac Wireless Network Adapter
       vendor: Qualcomm Atheros
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: wlan0
       version: 00
       serial: XX:XX:XX:XX:XX:XX
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list rom ethernet physical wireless
       configuration: broadcast=yes driver=ath10k_pci driverversion=4.4.4-040404-generic firmware=10.2.4.70.9-2 latency=0 link=yes promiscuous=yes wireless=IEEE 802.11abgn
       resources: irq:35 memory:f7800000-f79fffff memory:f7a00000-f7a0ffff

Since I have a bunch of these cards, does anybody want to try and help me debug this once and for all?
Comment 34 Christian Lamparter 2016-12-06 00:08:21 UTC
(In reply to Jeff from comment #33)
> In any case, I have this card sort of working.  I say sort of because I'm
> not entirely sure how I managed to do it, as I am having trouble duplicating
> the effort on a new machine.  The motherboard is Gigabyte MDH11HI, The
> Compex Card is WLE900VX 7AA so Qualcom Atheros QCA9880, and
> 4.4.4-040404-generic kernel with Ubuntu 14.04 LTS Server.

Well, the fix I posted earlier made it into 4.4-stable.
So there's a good chance Canonical picked it up for their
4.4.4-040404-generic kernel.

> Below is the output from lshw -C network with the mac address XX out.
> 
>   *-network
>        description: Wireless interface
>        product: QCA986x/988x 802.11ac Wireless Network Adapter
>        vendor: Qualcomm Atheros
>        physical id: 0
>        bus info: pci@0000:01:00.0
>        logical name: wlan0
>        version: 00
>        serial: XX:XX:XX:XX:XX:XX
>        width: 64 bits
>        clock: 33MHz
>        capabilities: pm msi pciexpress bus_master cap_list rom ethernet
> physical wireless
>        configuration: broadcast=yes driver=ath10k_pci
> driverversion=4.4.4-040404-generic firmware=10.2.4.70.9-2 latency=0 link=yes
> promiscuous=yes wireless=IEEE 802.11abgn
>        resources: irq:35 memory:f7800000-f79fffff memory:f7a00000-f7a0ffff
> 
> Since I have a bunch of these cards, does anybody want to try and help me
> debug this once and for all?
AFAIK the bring-up issue was fixed in the later QCA9887.
I guess, there's a bit of a firesale for the older QCA9880 now, because the
vendors want to get rid of inventory. That said, I'll go for the
wave-2 QCA998X.

As for debugging it: Do you have access to a proper PCIe protocol analyzer or
a way to debug Gigabyte's PCIe init code in the PEI/DXE?

Because from what I can tell, older PCIE Atheros chips had the
same issues. There's even this helpful comment in the ppc4xx_pci[0].

/*
 * Only reset the PHY when no link is currently established.
 * This is for the Atheros PCIe board which has problems to establish
 * the link (again) after this PHY reset. All other currently tested
 * PCIe boards don't show this problem.
 * This has to be re-tested and fixed in a later release!
 */
(This particular issue is still around today. I tested it with
a WNDR4700, which has an APM82181 (PowerPC 464) SoC and two AR9580 chips:
if I force the PHY reset, the AR9580 won't show up.)

[0] http://lxr.free-electrons.com/source/arch/powerpc/sysdev/ppc4xx_pci.c#L1322
Comment 35 Jeff 2016-12-06 22:02:31 UTC
Thanks for getting back.  I have a feeling I have a few of the old cards, maybe even mixed in with some new ones, which is why this issue is intermittent.  Is there a revision number, or some other visible marking on the product label, that tells which cards are the newer revision?
Comment 36 Winston Ho 2018-03-13 10:00:03 UTC
The WLE900VX radio card is working well with the Linux kernel version 4.13.0.

We would like to present some detailed results.

The WLP1200 miniPCIe to PCIe adapter card was used.

Intel(R) 4xCore i5-3570 CPU @ 3.40GHz:
http://pastebin.com/u/doctech
  WLE900VX lspci, uname -a
  WLE900VX iw wlp2s0 info, iw list
  WLE900VX dmesg grep ath10k
  WLE900VX kern.log grep ath10k

The commands to run are:

grep -i ath10k /var/log/kern.log
dmesg | grep -i ath10k
iw wlp2s0 info
iw list
lspci
uname -a

(For "iw wlp2s0 info", use "dmesg | grep -i ath10k" or "iwconfig" to check what wlan0 was renamed to.)
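
As an alternative to searching dmesg for the renamed interface, sysfs can be asked which interfaces are bound to the driver. A minimal sketch, assuming the driver name `ath10k_pci` (as shown in the lshw output in comment 33); the function name is made up for illustration.

```sh
# List network interfaces whose bound driver is ath10k_pci.
# $1: sysfs net directory, defaults to /sys/class/net
ath10k_interfaces() {
    netdir=${1:-/sys/class/net}
    for dev in "$netdir"/*; do
        # device/driver is a symlink to the bound driver's sysfs directory;
        # virtual interfaces such as lo have no such link and are skipped.
        drv=$(readlink "$dev/device/driver") || continue
        [ "${drv##*/}" = ath10k_pci ] && echo "${dev##*/}"
    done
    return 0
}

# Example use:
#   for ifname in $(ath10k_interfaces); do iw dev "$ifname" info; done
```

If the card was not enumerated at all, the function prints nothing, which is itself a quick check for the problem discussed in this bug.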
Comment 37 Matteo Croce 2018-03-13 10:28:26 UTC
It's not a kernel issue, it's hardware.

On my Samsung and Lenovo notebooks it still doesn't work with kernel 4.15.9.
Comment 38 Jeff 2018-03-14 04:06:57 UTC
Hi Matteo,

I tend to agree with you, but I really want a mini-pcie ac card, which can
run in both 2.4 and 5g ap modes simultaneously, to actually enumerate in
linux.  It has been quite frustrating.  Compex has provided proof of
something working.  I just need to figure out how exactly they managed to
do it.  Maybe it's some kind of script that has to power cycle / reset the
pci bus until it comes up.  If that is the case, then hopefully it can be
shared.  If it's something else then hopefully that can be shared as well.

Regards with thanks,
Jeff.

On Tue, Mar 13, 2018 at 6:28 AM, <bugzilla-daemon@bugzilla.kernel.org>
wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=84821
>
> --- Comment #37 from Matteo Croce (rootkit85@yahoo.it) ---
> It's not a kernel issue, it's hardware.
>
> On my Samsung and Lenovo notebook it still doesn't work with kernel 4.15.9
Comment 39 Winston Ho 2018-03-14 09:07:12 UTC
Dear Jeff,

The WLE900VX card cannot run in 2.4GHz and 5GHz bands simultaneously, only one band at a time.

We installed Ubuntu 16.04.4 LTS (Xenial Xerus) on an ordinary desktop PC.

There is no need for a script that has to power cycle / reset the pci bus until it comes up.

There's no special configuration required, just automatic. The WLE900VX was able to connect wirelessly in station mode.

You mentioned that you have another Atheros Card in the M.2 slot. For testing purposes I recommend that you don't have other wireless cards in the PC at the same time.

There was no need to install the ath10k drivers / firmware. I assume they come with Ubuntu automatically.

Our PC hardware is - Intel Desktop Board DH77KC - BIOS Version KCH7710H.86A.0069.2012.0224.1825. I have some photos here:

https://photos.app.goo.gl/IQm0OxvuyNRnavFY2
Intel Desktop PC Splash and BIOS Screens

Can you let us know your PC hardware details please?

Thanks,
Winston
Comment 40 Matteo Croce 2018-03-14 10:04:45 UTC
(In reply to Jeff from comment #38)
> I tend to agree with you, but I really want a mini-pcie ac card, which can
> run in both 2.4 and 5g ap modes simultaneously, to actually enumerate in
> linux.  It has been quite frustrating.

Well, the problem is not Linux: the card doesn't enumerate in Windows either, or even in the BIOS itself.
The problem should be some bug in the HW which makes the card slow to bring up, and depending on how fast (or slow) your BIOS is, it can or can't be enumerated.

I've tried this crappy card on many machines and I have observed the following behaviors:
- the card just works (PC Engines APU)
- the card doesn't work at all (Most Lenovo)
- the card works only from the second boot (Samsung NP350V5C), probably because it remains powered on system reset
- the BIOS won't post at all with the card inserted (Zotac ION ITX)

I assume that Compex just tested the card on their SoC, so I suggest that you not use any Compex card with standard PC hardware, as it seems to be untested there.
Comment 41 Sheila Gabe 2018-03-14 12:54:22 UTC
(In reply to Matteo Croce from comment #40)
> Well the problem is not Linux, the card doesn't enumerate also in Windows or
> even in the BIOS itself.
> The problem should be some bug in the HW which makes the card slow to
> bringup, and depending on how fast (or slow) your BIOS is it can or can't be
> enumerated.

[1] says that *"The general consensus at work is - BIOSes are buggy and don't
necessarily reset the PCI bus correctly. So either you can do your own PCI bus reset post-boot (and re-enumerate all the PCI devices, including initialising their BARs) or smack your vendor to fix their BIOSes"*.

[2] says that *"the version of AMI that was on this board did a reset at power-on and then the required one later. This first reset interferes with serial eeprom loader and causes it to stop in the middle of initialization. So when the second reset comes along the cards do not properly enumerate on the bus properly and you end up with cards in the reported state"*.

I guess that the problem may be fixed by patching the BIOS. [2] says that "the fix the manufacturer did was to remove the first Root Bridge reset from the BIOS code, after that our cards would initialize and enumerate onto the bus properly".    

If you have SBCs like Banana Pi R1 / R2, or embedded boards with mini PCI express slot(s), you can try to see if the card works. Bootloaders like Das U-Boot may be less buggy.

**I wonder if the Lenovo computers / Zotac computer are using American Megatrends (AMI) BIOSes?**

[1] http://ath9k-devel.ath9k.narkive.com/1PgCqOyP/sparklan-wpea-121n-ar9382-168c-abcd
[2] https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07533.html
Comment 42 Winston Ho 2018-03-15 01:45:31 UTC
(In reply to Sheila Gabe from comment #41)
> 
> [1] says that *"The general consensus at work is - BIOSes are buggy and don't
> necessarily reset the PCI bus correctly. So either you can do your own PCI
> bus reset post-boot (and re-enumerate all the PCI devices, including
> initialising their BARs) or smack your vendor to fix their BIOSes"*.
> 
> [2] says that *"the version of AMI that was on this board did a reset at
> power-on and then the required one later. This first reset interferes with
> serial eeprom loader and causes it to stop in the middle of initialization.
> So when the second reset comes along the cards do not properly enumerate on
> the bus properly and you end up with cards in the reported state"*.
> 
> I guess that the problem may be fixed by patching the BIOS. [2] says that
> "the fix the manufacturer did was to remove the first Root Bridge reset from
> the BIOS code, after that our cards would initialize and enumerate onto the
> bus properly".    
> 
> If you have SBCs like Banana Pi R1 / R2, or embedded boards with mini PCI
> express slot(s), you can try to see if the card works. Bootloaders like Das
> U-Boot may be less buggy.
> 
> **I wonder if the Lenovo computers / Zotac computer are using American
> Megatrends (AMI) BIOSes?**
> 
> [1]
> http://ath9k-devel.ath9k.narkive.com/1PgCqOyP/sparklan-wpea-121n-ar9382-168c-
> abcd
> [2] https://www.mail-archive.com/ath9k-devel@lists.ath9k.org/msg07533.html

Many thanks for your advice, Sheila.

I understand that the BIOS is causing the PCI bus to reset incorrectly. If they wanted to use the QCA9880 module, they need to do something with the BIOS. The vendors are Samsung and Lenovo.

Thanks,
Winston
Comment 43 Jeff 2018-03-15 03:18:04 UTC
Thank you for the comments.  Instead of pointing fingers elsewhere (i.e. "it's clearly the BIOS's fault"), let's focus on how best to code a workaround for slow cards (from any manufacturer), terrible BIOSes (from any manufacturer), flaky motherboards (from any manufacturer), etc.  Linux is wonderful, let's use it.

I am perfectly happy scanning for something that should be there and, if it's not there or not properly enumerated, power cycling the PCI bus, along with whatever else is needed to let the card ultimately be detected.  Yes, that is not technically PCIe compliant, but I am past that concern at this point, and there really is not anything else out there that, at least theoretically, offers this functionality in this package.

Sheila, I read the link.  I think you are referring to:
i2cset -y 14 0x20 0x0
sleep 1
i2cset -y 14 0x20 0x1f

which is cool, but I have no way to know what the registers will be / should be if the card is not at least partially enumerated.  The card should be on PCI 02.

Winston,

Our hardware:
SMBIOS 3.0.0 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: GIGABYTE
        Product Name: MDH11HI-SI
        Version: 1.x
        Serial Number: Default string
        Asset Tag: Default string
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: Default string
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

The BIOS is an AMI BIOS, which was upgraded from Gigabyte's V5 to V9.  It made no difference.  So whatever the coding problem happens to be, if the problem is with the BIOS, then it managed to go through 9 revisions without a fix.

The cheap and dirty fix is an OS-side script to scan and do this in the background.  I am an ok programmer with decent Linux skills, but I do not have the skills to dig in this deep.  I have bricked four machines already and I'll brick some more trying whatever.  Anybody game to give it a shot?

Cheers!
Jeff.
Comment 44 Sheila Gabe 2018-03-15 05:27:29 UTC
(In reply to Jeff from comment #43)

After installing i2c-tools, can you probe the `i2c-dev` module?

Afterwards try `grep . /sys/bus/i2c/devices/i2c-*/name` to narrow down which bus belongs to your PCIe device.

However, I am not sure how "0x20" and "0x0", and "0x1f" are determined. Ask some experts or Compex about the details of i2c.
Comment 45 Matteo Croce 2018-03-15 10:36:42 UTC
(In reply to Winston Ho from comment #42)
> Many thanks for your advice, Sheila.
> 
> I understand that the BIOS is causing the PCI bus to reset incorrectly. If
> they wanted to use the QCA9880 module, they need to do something with the
> BIOS. The vendors are Samsung and Lenovo.
> 
> Thanks,
> Winston

Nope.
That card DOES NOT WORK on 8 out of 10 machines I've tested it on.
Samsung, Zotac and Lenovo are just the ones I have more details about, as I have them locally.
This is the *only* card that is not recognized.

As a double check, any other card from any vendor (I've tested 3 Intel, 5 Qualcomm, a Ralink, a Mediatek and 2 Broadcom) works perfectly fine with all the hardware, *including* Samsung and Lenovo, so I assume that those BIOSes work.

To summarize, we have a dozen motherboards (embedded, notebook and desktop) and a dozen cards from different vendors.
All the motherboards can recognize any card but the Compex one, and you are blaming the BIOS for being buggy?

BIOSes have been the cause of so many issues for a long time, but this time blaming the BIOS is inappropriate.

That card has something really broken.
Can we work around it? Maybe, not yet.
Do you have a proposed workaround? I'll accept it.
Is it a BIOS fault? Not this time :)

Cheers,
Matteo
Comment 46 Sheila Gabe 2018-03-15 11:55:08 UTC
(In reply to Matteo Croce from comment #45)
In other words, why are most cards designed to be compatible with most BIOSes / systems, but cards from Compex are so troublesome? Is it a hardware flaw?

For me, I have several mini PCIe cards taken from different laptop vendors. They work fine on old BIOSes and new EFI BIOSes, and also on different laptop and PC models.
Comment 47 Jeff 2018-03-15 12:26:22 UTC
The interesting feature here is Compex cards behave better (sometimes getting partial enumeration) with the EFI settings on than when they are off (AMI calls this Windows 7 vs Windows 8/10).  On my dev setup, which is sort of like the production setup, this compex card worked - or at least appeared to work.  Now I have a pile of tin.  Looking at going corebios, but I need to figure out how to do a from zero port as nobody has started anything with this MB, or Chipset yet.

Sheila,

Here is the i2c-detect probe:

i2cdetect -l | sort
i2c-0   i2c             i915 gmbus dpc                          I2C adapter
i2c-1   i2c             i915 gmbus dpb                          I2C adapter
i2c-2   i2c             i915 gmbus dpd                          I2C adapter
i2c-3   i2c             DPDDC-B                                 I2C adapter
i2c-4   i2c             DPDDC-C                                 I2C adapter

It does not say much beyond this as the card is completely dark.  I can swap it out for something else that does enumerate and figure it out.  Should be 0000:02 etc. But I can chase this down further tonight.

As a note: I started with a completely new box last night with the compex card in place during the fresh install.  Ubuntu found something as my ethernet went from enp3s0 during the install to enp2s0 after reboot.

Mar 15 08:42:13 kernel: [    0.174676] pci 0000:02:00.0: [168c:003e] type 00 class 0x028000
Mar 15 08:42:13 kernel: [    0.174705] pci 0000:02:00.0: reg 0x10: [mem 0xdf000000-0xdf1fffff 64$
Mar 15 08:42:13 kernel: [    0.174856] pci 0000:02:00.0: PME# supported from D0 D3hot D3cold
Mar 15 08:42:13 kernel: [    0.174923] pci 0000:02:00.0: System wakeup disabled by ACPI

This is probably the other Qualcomm card, but something caused ethernet to jump to 02 during the install, which pushed Ethernet to 03.  Sadly no smoking gun in any of the logs I have searched so far.
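
The log lines above show the ID 168c:003e, whereas the WLE900VX is reported as 168c:003c in comments 11 and 12, which fits the guess that this entry is the other card. A small sketch for pulling address and ID pairs out of such logs (the function name is made up for illustration):

```sh
# Print "address vendor:device" for each PCI enumeration line on stdin.
# Lines that do not carry a [vendor:device] pair are ignored.
pci_ids_from_log() {
    sed -n 's/.*pci \([0-9a-f:.]*\): \[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\] type.*/\1 \2/p'
}

# Example use:
#   dmesg | pci_ids_from_log | grep ' 168c:'
```

Filtering on the vendor prefix 168c lists every Atheros device the kernel enumerated, so a missing 003c entry stands out immediately.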

Hi Winston,

Is there an OS side helper script for this card?  This cannot possibly be a surprise for Compex, it's all over the internet.  I don't care if there is, or why that does not make it PCIe compliant, just want to try it out, instead of having to re-invent the wheel.  For example, why did you pick that hardware, what features were important?  It does not fit in my application, but I will go get the same setup just to try.

Regards,
Jeff.
Comment 48 Winston Ho 2018-03-22 10:23:30 UTC
(In reply to Matteo Croce from comment #45)

Dear Matteo,

Thanks for posting.

Could you let us know the details of the hardware models that you tested?

Compex has been investigating and has some unconfirmed ideas about the issues that you have seen.

The wireless card is basically a Qualcomm Atheros QCA9880 reference design. It has no problems working with Compex embedded boards as this is the intended purpose of high-volume customers. There is no guarantee that it would work on other types of embedded boards or Linux PCs, although we would very much wish that it works.

The QCA9880 chipset sends out a polling frame after a PCIe reset when the motherboard is powered up. The polling frame is usually long enough for CPU-PCIe-TX’s detect frame to detect it. Once detected, the QCA9880 will be recognized by the kernel. For most motherboards this is never a problem. However, we have a motherboard that fails to recognize a small percentage of the WLE900VX.

QCA had helped to look at the problem and found that the incompatible motherboard’s CPU-PCIe-TX signal came out 150ms after PCIe-reset, unlike common Intel motherboards in the market, which had the CPU-PCIe-TX signal out in about 30ms after PCIe-reset. QCA believed that the timing of the CPU-PCIe-TX signal can be altered through BIOS.

We currently have a WLE900VX card with a long polling frame, which Compex believes can be detected by such incompatible motherboards as well. I'm not sure if you would like to test it.

Another suggested workaround is :
After the motherboard sends out the CPU-PCIe-TX, perform another PCIe-Reset (this time CPU-PCIe-TX must not stop working), allowing the QCA9880 to restart polling.
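
The timing argument above can be written down as a toy model. This is only a sketch of the reasoning, not a description of the actual link training logic: it assumes the card starts polling right after PCIe reset and stops after a fixed window, and that detection succeeds only if the host's CPU-PCIe-TX signal appears inside that window. The function name and the `polling_window_ms` parameter are hypothetical, and the real polling duration of the card is not stated in this thread.

```python
def link_detected(tx_delay_ms, polling_window_ms):
    """Toy model: the card polls for `polling_window_ms` after PCIe reset.

    The host's CPU-PCIe-TX signal appears `tx_delay_ms` after the same
    reset. Detection is assumed to succeed only if the host starts
    transmitting while the card is still polling.
    """
    return tx_delay_ms < polling_window_ms


# 30 ms and 150 ms are the delays quoted above; the 100 ms window is a
# made-up placeholder, NOT a measured value for the WLE900VX.
ASSUMED_WINDOW_MS = 100
typical_board = link_detected(30, ASSUMED_WINDOW_MS)
slow_board = link_detected(150, ASSUMED_WINDOW_MS)
```

Under this model both proposed remedies do the same thing: a longer polling frame raises `polling_window_ms`, and a second PCIe reset issued while CPU-PCIe-TX is already running brings the effective `tx_delay_ms` down to roughly zero.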

Thanks,
Winston
Comment 49 Jeff 2018-03-22 20:42:58 UTC
Created attachment 274881 [details]
attachment-28985-0.html

I would certainly like to test this. Can we reflash the cards?  If so, what
equipment is needed?  I did not notice a BIOS timing adjustment, but will
look for it tonight.

Regards with thanks,
Jeff.


Comment 50 Sheila Gabe 2018-03-23 01:45:39 UTC
(In reply to Winston Ho from comment #48)
> Another suggested workaround is :
> After the motherboard sends out the CPU-PCIe-TX, perform another PCIe-Reset
> (this time CPU-PCIe-TX must not stop working), allowing the QCA9880 to
> restart polling.

I have a WLE650V5-18A 7A suffering from this PCIe reset problem. As a personal consumer, what can be done to solve the problem? How do I perform a PCIe-Reset?

> QCA believed that the timing of the CPU-PCIe-TX signal can be altered through BIOS.

I have American Megatrends (AMI) BIOSes (Aptio). What is that option called?
Comment 51 Matteo Croce 2018-03-25 16:51:18 UTC
Here is some information about my notebook:

# i2cdetect -l |sort -nk2 -t-
i2c-0   smbus           SMBus I801 adapter at 4040              SMBus adapter
i2c-1   i2c             i915 gmbus ssc                          I2C adapter
i2c-2   i2c             i915 gmbus vga                          I2C adapter
i2c-3   i2c             i915 gmbus panel                        I2C adapter
i2c-4   i2c             i915 gmbus dpc                          I2C adapter
i2c-5   i2c             i915 gmbus dpb                          I2C adapter
i2c-6   i2c             i915 gmbus dpd                          I2C adapter
i2c-7   i2c             Radeon i2c bit bus 0x90                 I2C adapter
i2c-8   i2c             Radeon i2c bit bus 0x91                 I2C adapter
i2c-9   i2c             Radeon i2c bit bus 0x92                 I2C adapter
i2c-10  i2c             Radeon i2c bit bus 0x93                 I2C adapter
i2c-11  i2c             Radeon i2c bit bus 0x94                 I2C adapter
i2c-12  i2c             Radeon i2c bit bus 0x95                 I2C adapter
i2c-13  i2c             Radeon i2c bit bus 0x96                 I2C adapter
i2c-14  i2c             Radeon i2c bit bus 0x97                 I2C adapter
i2c-15  i2c             DPDDC-B                                 I2C adapter

lshw:
    description: Laptop
    product: 350V5C/351V5C/3540VC/3440VC (P09ABE.012.CP)
    vendor: SAMSUNG ELECTRONICS CO., LTD.
    version: P09ABE.012.CP
    serial: HXYC98FD2D1VA6
    width: 64 bits
    capabilities: smbios-2.7 dmi-2.7 smp
    configuration: boot=normal chassis=laptop family=Eureka sku=P09ABE.012.CP uuid=B328858D-702A-11E2-9E12-2089841BC4DD
  *-core
       description: Motherboard
       product: NP350V5C-S09IT
       vendor: SAMSUNG ELECTRONICS CO., LTD.
       physical id: 0
       version: BOARD REVISION 00
       serial: 123490EN400015
       slot: Middle
     *-firmware
          description: BIOS
          vendor: American Megatrends Inc.
          physical id: 0
          version: P09ABE
          date: 07/04/2013
          size: 64KiB
          capacity: 4608KiB
          capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification uefi
Comment 52 Denis Odintsov 2019-03-07 09:18:41 UTC
Hi.

I think I solved the mystery of Compex cards.

After numerous tries and deep debugging of the PCI bus, I have a solid pair of working/non-working cases, with proof of what makes the difference.
I have one of the ARM boards where almost all hardware setup and state is described by a .dts file, so it is fairly easy to try things out.
The core thing here is the rfkill pin for the card. It seems the state of the rfkill pin during PCI initialisation completely determines whether this card is detected by the PCI bridge. If the rfkill pin is low, the card is not detected by the PCI bridge at all. Manipulating that pin afterwards, even with a triggered PCI rescan, has no effect. But having this pin explicitly hogged high during boot by a description in the dts, before pcieport initializes the PCIe devices, makes the card work fine and be detected by the bridge.

So all in all, that explains the volatile results, which can even differ from slot to slot. It all depends on the state of the pin during the PCI init phase.
Comment 53 Matthias Klein 2019-03-27 12:18:34 UTC
(In reply to Denis Odintsov from comment #52)
> Hi.
> 
> I think I solved the mystery of Compex cards.
> 
> So all in all that explains volatile results, that can even differ from slot
> to slot. It all depends on the state of the pin during pci init phase.

I can confirm your observation that a WLE600VX card is only detected if W_DISABLE# (pin 20) is set accordingly. Otherwise detection fails for me:

pci_hotplug: PCI Hot Plug PCI Core version: 0.5
imx-pcie 1ffc000.pcie: no reserved region node.
1ffc000.pcie supply epdev_on not found, using dummy regulator
OF: PCI: host bridge /soc/pcie@0x01000000 ranges:
OF: PCI:   No bus range found for /soc/pcie@0x01000000, using [bus 00-ff]
imx-pcie 1ffc000.pcie: phy link never came up
imx-pcie 1ffc000.pcie: Link never came up
imx-pcie 1ffc000.pcie: failed to initialize host
imx-pcie 1ffc000.pcie: unable to add pcie port.
imx-pcie: probe of 1ffc000.pcie failed with error -110
ehci-pci: EHCI PCI platform driver

Environment: NXP i.MX6U7 on a self developed hardware with kernel 4.9.123


Thanks a lot for the info!
Comment 54 Kirill Raevskiy 2019-10-28 12:42:56 UTC
(In reply to Denis Odintsov from comment #52)
> Hi.
> 
> I think I solved the mystery of Compex cards.
> 

Is it possible to fix it? For example, solder 3.3v to pin 20?
Comment 55 Dennis Bland 2019-12-21 23:52:34 UTC
A simple solution that worked for me is to tape over pin 20 on the miniPCIe card.  I used clear packing tape because it was thinner than electrical tape and I was concerned a thicker tape would separate the PCB from the nearby pins.  Pin 20 is on the bottom side of the card, second pin to the left of the middle notch.

The catch, of course, is RFKILL won't work anymore (the OS can't turn off the radio).  Not an issue for my desktop, but might be for laptops. 

My details:

I had a WLE900VX card that worked flawlessly on an old Lenovo T420 running Ubuntu 18.04 (4.15 kernel), using a PCIExpress adapter.  Then I moved the WLE900VX to a new desktop with the exact same OS/kernel version with an M.2 SSD (boots in <5 seconds), and the PCI bus would never ever find the card.

After taping the pin, it gets enumerated 100% of the time.

I'm using an Asus motherboard on my desktop.  I believe Asus uses weak pull-up resistors to 3.3 volts on that line, so if the miniPCIe card isn't able to sink that line (because there is tape in the way), then the motherboard reads the input as high and assumes the hardware wants to be enumerated and enabled. 

As others have mentioned, it appears to be a timing issue in the WLE900VX hardware based on how fast the motherboard probes the PCI bus after boot.  It seems the WLE900VX hardware keeps pin 20 low for a long time after it has finished initializing and is ready to be enumerated.  Because my old laptop had a slow hard drive and a PCIExpress adapter adding a small amount of latency, I imagine the delay was long enough to avoid the problem.  With my new desktop, not so much.
Comment 56 Pali Rohár 2021-04-17 19:19:59 UTC
Just to note that in the ath9k-devel links above from comment #41, i2c was mentioned in this context:

"The units have a special controller (I^2C) able to power off and power on the card slots. I was told that this does not handle the PCI reset line correctly ..."

This was some special HW connected via the i2c bus which can power a PCIe slot on/off. Such a thing is not found in laptops or other common boards. So you do not have to try anything with i2c; your laptop surely does not have special HW connected to SMBus which can control the power of a PCIe slot.

But the original intention was not to power the slot on/off, but rather to do an (out-of-band) PCIe reset. This is done via the PERST# signal (pin 22 on the mPCIe slot) and is called PCIe Warm Reset.

In most cases the PERST# pin can be controlled by software via GPIO on ARM boards. I do not know if Intel PCIe controllers on x86 motherboards have some way to control PERST# (= control PCIe Warm Reset).

And I confirm that if a Compex WLE900VX is in this "broken" non-detectable state, doing a PCIe Warm Reset via the PERST# pin makes it visible to the kernel again!

So based on the above comments, it seems that the SW workaround for this card is to disable rfkill (enable the radio via mPCIe pin 20), then do a PCIe Warm Reset (via mPCIe pin 22) and rescan the PCI devices in the system.
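
As a rough sketch of that sequence in Python: the two pin operations are board-specific, so they are passed in as callables (`enable_radio` and `warm_reset` are hypothetical placeholders that you would implement with whatever GPIO access your board provides). The rescan and the device lookup use the standard sysfs files `/sys/bus/pci/rescan` and `/sys/bus/pci/devices/*/vendor`. The helper names, the `sysfs_root` parameter and the settle delay are my own assumptions, and the script needs root to write to the rescan file.

```python
import os
import time

ATHEROS_VENDOR_ID = "0x168c"  # vendor ID as it appears in the logs in this bug


def find_devices(vendor_id, sysfs_root="/sys"):
    """Return PCI addresses whose sysfs 'vendor' file matches vendor_id."""
    devices_dir = os.path.join(sysfs_root, "bus", "pci", "devices")
    matches = []
    if not os.path.isdir(devices_dir):
        return matches
    for addr in sorted(os.listdir(devices_dir)):
        try:
            with open(os.path.join(devices_dir, addr, "vendor")) as f:
                if f.read().strip().lower() == vendor_id:
                    matches.append(addr)
        except OSError:
            continue  # entry vanished or is unreadable, skip it
    return matches


def rescan_pci(sysfs_root="/sys"):
    """Ask the kernel to rescan all PCI buses (requires root on a real system)."""
    with open(os.path.join(sysfs_root, "bus", "pci", "rescan"), "w") as f:
        f.write("1")


def recover_card(enable_radio, warm_reset, sysfs_root="/sys", settle_s=0.5):
    """Pin 20 high, PERST# pulse, rescan; return matching PCI addresses."""
    enable_radio()        # hypothetical: drive W_DISABLE# (pin 20) high
    warm_reset()          # hypothetical: pulse PERST# (pin 22)
    time.sleep(settle_s)  # assumed delay to let the link come up
    rescan_pci(sysfs_root)
    return find_devices(ATHEROS_VENDOR_ID, sysfs_root)
```

An empty list from `recover_card` means no device with that vendor ID is visible after the rescan. The order of the first two steps follows the summary above; I have not verified whether the reverse order also works.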


PS: Not to be confused with the (in-band) PCIe Hot Reset, also called PCI Secondary Bus Reset. This type of reset causes the WLE900VX card to disappear and therefore must be avoided by the kernel, as was done via the quirk patch from comment #31.
