Bug 206961
| Summary: | iwlwifi: AX200: channel switching many times causes Tx queue alloc failed | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Jeff Schuler (jschuler) |
| Component: | network-wireless-intel | Assignee: | Default virtual assignee for network-wireless-intel (drivers_network-wireless-intel) |
| Status: | NEW | | |
| Severity: | blocking | CC: | jschuler, linuxwifi, thomas.f.steeples, ZeroBeat |
| Priority: | P1 | | |
| Hardware: | ARM | | |
| OS: | Linux | | |
| Kernel Version: | 5.1.0 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | system infos, kernel logs, and reproduction bash script; picture of the wifi card; firmware trace for iwlwifi; solution patch | | |
Description
Jeff Schuler
2020-03-25 19:05:48 UTC
Created attachment 288059 [details]
picture of the wifi card
Created attachment 288061 [details]
firmware trace for iwlwifi
Started the trace capture before running the reproduction script and stopped it once iwlwifi crashed: `sudo trace-cmd record -e iwlwifi`
I can confirm this issue, because I received a similar issue report here: https://github.com/ZerBea/hcxdumptool/issues/105
Switching channels here (and we are doing this really fast): https://github.com/ZerBea/hcxdumptool/blob/master/hcxdumptool.c#L3812 causes a crash.

(In reply to Michael from comment #3)
> I can confirm this issue, because I received a similar issue report here:
> https://github.com/ZerBea/hcxdumptool/issues/105
> Switching channels here (and we are doing this really fast):
> https://github.com/ZerBea/hcxdumptool/blob/master/hcxdumptool.c#L3812
> causes a crash.

This is me. Running Arch with kernel 5.5.11, x86_64 architecture. Same card and firmware.

@thomas - race condition while reporting the issue. Anyway, I can confirm it from the hcxdumptool debug printf() output. The stay time on an empty channel before we try to set a new channel is 200000 usec, which is enough to crash the driver/firmware. You can increase or decrease it here: https://github.com/ZerBea/hcxdumptool/blob/master/include/hcxdumptool.h#L55
The stay time on a busy channel is one second.
BTW: iw uses NETLINK (--debug), while hcxdumptool uses pure ioctl() system calls.

Yeah, Thomas, we were not able to reproduce this on x86_64. However, we did find the bug. It's a memory leak in iwlmvm. Initial testing looks promising. Once we have completed our due diligence and are happy with the solution, we will post a patch here!

Created attachment 288131 [details]
solution patch
This patches the memory leak. The PCIe Tx queue is allocated over and over again and never freed! Intel should probably fix the root cause of this problem, as I doubt this is the best place to do it...
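
For readers who want to see why a queue that is allocated on every channel switch and never freed ends in "Tx queue alloc failed", here is a toy model in Python. It is only a sketch and not the driver code or the attached patch: the `TxQueuePool` class, the pool size of 16, and the assumption that each channel switch allocates exactly one queue are all hypothetical. The only value taken from this report is the 200000 usec stay time on an empty channel.

```python
"""Toy model of a Tx queue leak on channel switch (not iwlwifi code)."""

# Stay time on an empty channel, as quoted in this report for hcxdumptool.
STAY_TIME_EMPTY_USEC = 200_000


class TxQueuePool:
    """Fixed-size pool of Tx queue slots. The size is a hypothetical value."""

    def __init__(self, num_queues):
        self.num_queues = num_queues
        self.in_use = set()

    def alloc(self):
        """Return a free queue id, or None when every slot is taken."""
        for qid in range(self.num_queues):
            if qid not in self.in_use:
                self.in_use.add(qid)
                return qid
        return None  # stands in for "Tx queue alloc failed"

    def free(self, qid):
        """Release a queue slot so it can be allocated again."""
        self.in_use.discard(qid)


def switches_until_failure(num_queues, max_switches, free_old_queue):
    """Count channel switches that succeed before an allocation fails.

    Assumes (hypothetically) one queue allocation per channel switch.
    Returns max_switches if no allocation failed.
    """
    pool = TxQueuePool(num_queues)
    current = None
    for done in range(max_switches):
        if free_old_queue and current is not None:
            pool.free(current)  # the step the leaky path never performs
        current = pool.alloc()
        if current is None:
            return done
    return max_switches


def usec_until_failure(num_queues, stay_usec=STAY_TIME_EMPTY_USEC):
    """Rough time on empty channels before the leaky path runs out of queues."""
    leaked = switches_until_failure(num_queues, 10 * num_queues, False)
    return leaked * stay_usec


if __name__ == "__main__":
    print("leaky:", switches_until_failure(16, 100, free_old_queue=False))
    print("fixed:", switches_until_failure(16, 100, free_old_queue=True))
    print("usec until failure:", usec_until_failure(16))
```

In this model the leaky variant exhausts the pool after as many switches as there are slots, while the variant that frees the old queue first never fails. With short stay times the pool would run out within seconds, which would be consistent with fast channel hopping triggering the failure.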