Bug 12394
Summary: | 2.6.28 and greater: ath5k and p54usb: no association to access point (regression) | ||
---|---|---|---|
Product: | Drivers | Reporter: | Jan Bücken (jb.faq) |
Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | mcgrof, me |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.28 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
strace iwlist wlan0 scan
with debugging enabled and some tests
log of the bisect for the "ap connection bug" |
Description
Jan Bücken
2009-01-09 09:16:57 UTC
Is SSID hidden?

(In reply to comment #1)
> Is SSID hidden?

no, and I tested it with an open network now, no difference

Can you post output of 'strace iwlist wlan0 scan'?

Created attachment 19786 [details]
strace iwlist wlan0 scan
strace iwlist wlan0 scan &> strace_iwlist_wlan0_scan.
New info: It seems to me that
iwlist wlan0 scan
print_scanning_info: Allocation failed
is a bug which happens only from time to time, too:
Sometimes it scans, sometimes it doesn't.
But the "association bug" remains...
greetings
Jan
Very weird. Which version of wireless-tools?

Here is where things look broken:

> ioctl(3, SIOCGIWSCAN, 0x7fff75ae7cb0) = -1 E2BIG (Argument list too long)
> mremap(0x7fc65d2e2000, 134221824, 268439552, MREMAP_MAYMOVE) = 0x7fc64d2e1000
> ioctl(3, SIOCGIWSCAN, 0x7fff75ae7cb0) = -1 E2BIG (Argument list too long)
[...]

So we're asking for scan results with a 134 *meg* buffer; it fails, so we reallocate with 268 megs.

> mremap(0x7fc5ed2df000, 1073745920, 18446744071562072064, MREMAP_MAYMOVE) = -1 EFAULT (Bad address)
> mmap(NULL, 18446744071562072064, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

We keep doubling until we wrap 32-bit int, then it goes negative so you get ENOMEM.

Looking at mac80211 (net/mac80211/scan.c) ieee80211_scan_results, I don't see right away how you would get -E2BIG with any of those sizes, unless ieee80211_scan_result is hosed. But there weren't major changes to it in 2.6.28.

As for association, can you turn on CONFIG_MAC80211_DEBUG_MENU and CONFIG_MAC80211_VERBOSE_DEBUG then post whatever shows up in 'dmesg' (if anything) when you try to associate?

Created attachment 19938 [details]
with debugging enabled and some tests

Sorry, I was busy with an exam...

(In reply to comment #5)
> Very weird. Which version of wireless-tools?

I use wireless-tools 29

> As for association, can you turn on CONFIG_MAC80211_DEBUG_MENU and
> CONFIG_MAC80211_VERBOSE_DEBUG then post whatever shows up in 'dmesg'
> (if anything) when you try to associate?

Steps I did:
0) debugging enabled (CONFIG_MAC80211_VERBOSE_DEBUG and ath5k) in the 2.6.28 vanilla (not 2.6.28.1).
1) fresh reboot with this kernel
effect: no connection to any access point
NEW:
2) disabled the wireless LAN with the keyboard button (seems to be hardware based; things like rfkill are disabled)
3) waited some seconds and enabled the wlan again.
effect: get a connection to an access point, but lose it shortly, especially if I scan with "iwlist wlan0 scan" (sometimes it scans correctly)
4) Repeat steps 2) and 3): it is reproducible.
You can see this in the dmesg output.

New info: The first failing kernel version is 2.6.28-rc1. Is it safe to do a (git) bisect between 2.6.27 and 2.6.28-rc1? I mean, can it damage my hardware if I start my system with such a kernel? (And these are 3800 patches; is there an easy way?)

New info: I told you in my previous comment that the card connects to an access point if I disable and enable it. If I start a ping to a website, then the connection doesn't break.

Bisecting won't hurt your hardware, and really it's the only thing I can think of at the moment. You can try restricting it to changes in net/ via:

$ git bisect start -- net

Any news on this one?

(In reply to comment #10)
> Any news on this one?

I'm sorry, I ran into trouble with the bisect, but I'm hanging in there. This is what I have found out up to now:

First, the kernel in the bisect doesn't compile, something with

drivers/built-in.o: In function `rtl8169_gset_xmii':
r8169.c:(.text+0x7e4b8): undefined reference to `mii_ethtool_gset'
make: *** [.tmp_vmlinux1] Fehler 1

and the modules do not build. After skipping some such kernels, I decided to do the bisect without the Realtek 8169 and with no modules (everything built in).

But after testing 2.6.27 and 2.6.28-rc1 again, I found out that both problems are not reproducible every time: at the university the problems appear more often than at home. At the university there are up to 80 access points, at home up to ten. Maybe this is one reason.

Next problem: After making sure that 2.6.28-rc1 has the bug and 2.6.27 does not, the first kernel between them in the bisect gets a kernel panic at boot. More exactly: I tested it at the university, and the kernel panic appears if and only if I activate the chip.
Until now I skipped some kernels, but all get a kernel panic (tested at the university). At home some of them boot up and I can test. Maybe it is the same problem: too many access points near the tuning range. Hence it's amazing to reboot the laptop every time, and so I have to spend more time on it.

(In reply to comment #8)
> Bisecting won't hurt your hardware,

Why are you so sure?
http://www.phoronix.com/scan.php?page=news_item&px=Njc0Nw
This can happen every time...
But I'll do the bisect, if there are no more unexpected problems.

> Why are you so sure?
> http://www.phoronix.com/scan.php?page=news_item&px=Njc0Nw

Well, because there are no known ath5k bugs that brick the device. If there are any unknown ones, then you might as well hit it using a stable kernel :) Of course if you have e1000, that's another story.

> This can happen every time...
> But I'll do the bisect, if there are no more unexpected problems.

Actually I believe the issue has to do with large information elements in the scan results, combined with the fact that ath5k exports lots of channels so scans take a considerable time. This can interrupt normal function of the card. There are some changes in the pipeline to address some of this. Though I don't think either of those issues are regressions, so there may be something else.

1) The problem connecting to access points: I believe bisecting is not useful, because the bug is not reproducible (at friends' places I get a connection every time) and the behavior of the bug changes during the bisect. I get this result:

a40c24a13366e324bc0ff8c3bb107db89312c984 is first bad commit
commit a40c24a13366e324bc0ff8c3bb107db89312c984
Author: David S. Miller <davem@davemloft.net>
Date: Thu Sep 11 04:51:14 2008 -0700

net: Add SKB DMA mapping helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 2ab13c7cac689f67d97cb8f7ca42343713c53ca0 15a1e0f81f6e8f7eb7e6659a0f7b6b983eeda420 M include
:040000 040000 ff3568bfc0848c00927e97f7c6005a7857f9c0af c877f9af828cab1c62785ead7cf3571202ab27a7 M net

2) For the "iwlist bug" I have to do a second bisect. The bug split (I get a connection to an access point, but iwlist wlan0 scan fails)

3) I'll test what happens if I revert the commit above in 2.6.28-rc1

Created attachment 20304 [details]
log of the bisect for the "ap connection bug"
only the log
>
> 3) I'll test what happens if I revert the commit above in 2.6.28-rc1
>
It is not possible to revert this commit in 2.6.28-rc1 (too many dependencies).
After testing this commit again (booting with this kernel), I could connect to an AP. As I said, it is not reproducible all the time.
But: I use wpa_supplicant and I have all networks disabled by default. It seems to me that the faster I enable the network with wpa_cli after a reboot, the more likely I am to connect to an AP.
(In reply to comment #15)
> 2) For the "iwlist bug" I have to do a second bisect. The bug split (I get a
> connection to an access point, but iwlist wlan0 scan fails)

I will wait for 2.6.29 now and test both problems then; maybe they are gone. I will do a new bisect if and only if the problems are still present then.

Both are still present with 2.6.29 (gentoo-sources).

Please post the dmesg of the attempt to associate with AP.

You can also try this patch in the meantime:

http://marc.info/?l=linux-wireless&m=123841474910111&w=2

(In reply to comment #20)
> Please post the dmesg of the attempt to associate with AP.

It shows nothing... (tested with gentoo-sources-2.6.29-r1, CONFIG_MAC80211_DEBUG_MENU and CONFIG_MAC80211_VERBOSE_DEBUG turned on)

> You can also try this patch in the meantime:
>
> http://marc.info/?l=linux-wireless&m=123841474910111&w=2

I will do this next.

Oh, I forgot: only wpa_cli repeats "CTRL-EVENT-SCAN-RESULTS" regularly.

Happy Easter!
Today I installed Gentoo on an old desktop system. I have an external "Siemens Gigaset 54 Usb" adapter. I installed the 2.6.27-r8 and 2.6.29-r5 kernels (gentoo-sources). And now an important new piece of info: I thought this bug was a problem with ath5k, but I have the same bug with p54usb. Everything is OK with 2.6.27, but with 2.6.29 wpa_cli does not connect to the AP (an AP with WPA2)!
My friend has a bcm4318 and uses the b43 module. He is not affected by this bug.
Do ath5k and p54usb share any dependencies or code that the b43 module does not use? Maybe we can narrow down the regression / patch now.
I haven't tested this bug with 2.6.30 yet.

(In reply to comment #23)
> I haven't tested this bug with 2.6.30 yet.

Tested it on both systems (with gentoo-sources-2.6.30-r1). It seems to be FIXED.
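
For reference, the buffer sizes in the strace output quoted in comment #5 can be reproduced with a short calculation. This is only a sketch of the arithmetic, not the wireless-tools code: it assumes the scan buffer length is a signed 32-bit int that is doubled after each E2BIG, as that comment describes, and it assumes the extra 4096 bytes seen in every mremap/mmap size is allocator overhead (inferred from the numbers, not confirmed). The names `wrap_int32`, `as_size_t`, `ALLOC_OVERHEAD` and `doubling_trace` are made up for this illustration.

```python
ALLOC_OVERHEAD = 4096  # assumption: inferred from the sizes in the strace


def wrap_int32(value):
    """Interpret value modulo 2**32 as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def as_size_t(value):
    """Interpret a (possibly negative) integer as an unsigned 64-bit size."""
    return value & 0xFFFFFFFFFFFFFFFF


def doubling_trace(start=1 << 27):
    """Return (buflen, mapped_size) pairs, doubling until buflen goes negative."""
    trace = []
    buflen = start
    while True:
        # The signed length is converted to an unsigned size before allocating.
        trace.append((buflen, as_size_t(as_size_t(buflen) + ALLOC_OVERHEAD)))
        if buflen < 0:
            return trace
        buflen = wrap_int32(buflen * 2)


for buflen, mapped in doubling_trace():
    print(f"buflen={buflen:>12}  mapped size={mapped}")
```

Starting from 128 MiB, the mapped sizes come out as 134221824, 268439552, 536875008 and 1073745920. The next doubling wraps to a negative length, which becomes 18446744071562072064 when treated as an unsigned 64-bit size. That is the value the final mremap and mmap calls in the strace were asked for.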