Bug 200821
Summary: iwlwifi: 9260: TX on unused queue
Product: Drivers
Reporter: Jan Fuchs (oposum)
Component: network-wireless
Assignee: DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: L.Bonnaud, oposum
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.15.0-30-generic
Subsystem:
Regression: No
Bisected commit-id:
Attachments: requested trace during the testcase; trace-tx-on-unused-queue2
Description
Jan Fuchs
2018-08-15 16:10:34 UTC
>are you able to reproduce this? If yes, what is the exact scenario?

Yes, my testcase is fully automated, so it is reproducible every time the testcase gets executed. The scenario is a self-written testcase, based on a testcase from a well-known "Alliance", where a Channel Switch Announcement is triggered on our access point. As you can see in the dmesg/syslog output, the crash happens directly after "wlp4s0: deauthenticating from 00:a0:57:1c:05:58 by local choice (Reason: 3=DEAUTH_LEAVING)".

I noticed you recently made some changes in the MASTER branch regarding the handling of a CSA IE [0], which seem related to (or the same as?) my issue. Should I give this a try?

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi.git/commit/?id=a78ef0d62b8810282113fea579b60d70f0fd972d

The patch here shouldn't be relevant since it relates only to cases where we don't do a channel switch but rather disconnect. I don't understand why you have a disconnection. Can you please record tracing during the test? I am going on vacation now for 2 weeks.

Created attachment 277987 [details]
requested trace during the testcase
Sure, see the zipped trace file. I used "sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg" as trace command.
Enjoy your vacation! :)
Hm, I think my trace won't help. I need to activate "CONFIG_IWLWIFI_TRACING" first, right? It's iwlwifi from Ubuntu, not one compiled from GitHub. I need more time to do this. Sorry.

Dear Emmanuel, have you had time to look at my attached trace? :)

You said you didn't have CONFIG_IWLWIFI_TRACING enabled, right? That's why I didn't look at this.

I'll try to find time for this on Sunday.

This data is unusable unfortunately. Please don't remove the module while you record the data. I tried to parse the data in several ways. The data looks mostly ASCII and 99% of it is a symbol list, which is not what is expected here. Are you sure you followed the instructions? What flag did you add to the trace-cmd recording command? Please look at https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging#tracing Thanks.

Created attachment 278591 [details] trace-tx-on-unused-queue2

Thanks for the feedback!

>What flag did you add to the trace-cmd recording command?
As mentioned in comment #4: "sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e iwlwifi_msg"

>Are you sure you followed the instructions?
I think you are right. Removing the iwlwifi module was the reason for the unusable trace. I've attached another trace (trace-tx-on-unused-queue2.zip), where the iwlwifi module is not removed while recording.
The corresponding dmesg-output:

[1728883.442825] wlp4s0: authenticate with 00:a0:57:1c:05:58
[1728883.448608] wlp4s0: send auth to 00:a0:57:1c:05:58 (try 1/3)
[1728883.486993] wlp4s0: authenticated
[1728883.491025] wlp4s0: associate with 00:a0:57:1c:05:58 (try 1/3)
[1728883.492428] wlp4s0: RX AssocResp from 00:a0:57:1c:05:58 (capab=0x1111 status=0 aid=1)
[1728883.494894] wlp4s0: associated
[1728883.577431] wlp4s0: Limiting TX power to 13 (23 - 10) dBm as advertised by 00:a0:57:1c:05:58
[1728900.327512] wlp4s0: deauthenticating from 00:a0:57:1c:05:58 by local choice (Reason: 3=DEAUTH_LEAVING)
[1728900.331370] ------------[ cut here ]------------
[1728900.331372] TX on unused queue 2
[1728900.331431] WARNING: CPU: 2 PID: 22 at /build/linux-81MBYC/linux-4.15.0/drivers/net/wireless/intel/iwlwifi/pcie/tx.c:2296 iwl_trans_pcie_tx+0x830/0xab0 [iwlwifi]
[1728900.331433] Modules linked in: iwlmvm mac80211 iwlwifi cfg80211 btrfs zstd_compress xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c cpuid cmac md4 nls_utf8 cifs ccm fscache bnep nls_iso8859_1 snd_hda_codec_hdmi arc4 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_pcm kvm irqbypass rtsx_usb_ms crct10dif_pclmul memstick crc32_pclmul snd_seq_midi ghash_clmulni_intel snd_seq_midi_event pcbc snd_rawmidi snd_seq snd_seq_device snd_timer aesni_intel aes_x86_64 btusb crypto_simd snd glue_helper btrtl cryptd intel_cstate btbcm btintel intel_rapl_perf input_leds wmi_bmof bluetooth soundcore mei_me ecdh_generic mei shpchp mac_hid tpm_crb acpi_pad sch_fq_codel parport_pc
[1728900.331563] ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid rtsx_usb_sdmmc rtsx_usb i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvme ahci drm r8169 libahci nvme_core mii wmi video [last unloaded: cfg80211]
[1728900.331615] CPU: 2 PID: 22 Comm: ksoftirqd/2 Not tainted 4.15.0-33-generic #36-Ubuntu
[1728900.331618] Hardware name: LENOVO 10NK002MGE/3102, BIOS M16KT49A 04/24/2018
[1728900.331640] RIP: 0010:iwl_trans_pcie_tx+0x830/0xab0 [iwlwifi]
[1728900.331643] RSP: 0018:ffffa75a80da7c10 EFLAGS: 00010286
[1728900.331649] RAX: 0000000000000000 RBX: ffff93d05f8a2000 RCX: 0000000000000006
[1728900.331652] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff93d09ed16490
[1728900.331655] RBP: ffffa75a80da7c78 R08: 0000000000000001 R09: 0000000000000a3f
[1728900.331659] R10: ffffa75a80da7c90 R11: 0000000000000000 R12: ffff93d00a6a5528
[1728900.331662] R13: ffff93d029c44600 R14: 0000000000000002 R15: ffff93cfd5162180
[1728900.331667] FS: 0000000000000000(0000) GS:ffff93d09ed00000(0000) knlGS:0000000000000000
[1728900.331671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1728900.331674] CR2: 00007fcd68f63650 CR3: 00000001c060a001 CR4: 00000000003606e0
[1728900.331677] Call Trace:
[1728900.331707] ? iwl_mvm_set_tx_params+0x152/0x3f0 [iwlmvm]
[1728900.331733] iwl_mvm_tx_skb_non_sta+0x187/0x340 [iwlmvm]
[1728900.331755] iwl_mvm_mac_tx+0x104/0x1d0 [iwlmvm]
[1728900.331809] ieee80211_tx_frags+0x14f/0x230 [mac80211]
[1728900.331856] __ieee80211_tx+0x74/0x140 [mac80211]
[1728900.331905] ieee80211_tx_pending+0x14d/0x200 [mac80211]
[1728900.331912] ? __qdisc_run+0x61/0x320
[1728900.331921] tasklet_action+0x64/0x110
[1728900.331927] __do_softirq+0xe4/0x2bb
[1728900.331934] run_ksoftirqd+0x22/0x60
[1728900.331941] smpboot_thread_fn+0xfc/0x170
[1728900.331949] kthread+0x121/0x140
[1728900.331954] ? sort_range+0x30/0x30
[1728900.331961] ? kthread_create_worker_on_cpu+0x70/0x70
[1728900.331968] ret_from_fork+0x35/0x40
[1728900.331973] Code: 80 3d 99 89 02 00 00 b8 ea ff ff ff 0f 85 bb fb ff ff 44 89 f6 48 c7 c7 d1 b3 81 c0 89 45 d0 c6 05 7a 89 02 00 01 e8 f0 01 49 fd <0f> 0b 8b 45 d0 e9 98 fb ff ff 41 0f b6 84 24 90 00 00 00 83 e0
[1728900.332081] ---[ end trace 8012b4eaf17c5e7f ]---

I think, the trace looks better now.
At least trace-cmd report gives us some output this time.

Something very weird here... I can see that cfg80211_tx_mlme_mgmt gets called, which is strange... Can you please add this:

diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
index f2dc2ad..625abda 100644
--- a/net/wireless/mlme.c
+++ b/net/wireless/mlme.c
@@ -174,6 +174,8 @@ void cfg80211_tx_mlme_mgmt(struct net_device *dev, const u8 *buf, size_t len)
 
 	ASSERT_WDEV_LOCK(wdev);
 
+	dump_stack();
+
 	trace_cfg80211_tx_mlme_mgmt(dev, buf, len);
 
 	if (WARN_ON(len < 2))

So that we'll know where we get there from? Are you starting an AP / P2P thing on the same system?

Another thing worth trying is to take our backport tree and see if you can reproduce there. You can take the master branch of the driver and follow the instructions here: https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/core_release#core_release

Sorry, had to do some rework on the test setup. I've put in another PC with a new Intel 9260 + iwlwifi MASTER branch + firmware version 38.c0e03d94.0. The problem is gone: I can't reproduce it with the MASTER branch, while the other (unmodified) PC still has the "tx unused queue" crash. So at least the master branch fixes this. Furthermore, I don't know which iwlwifi version is running on the (unmodified) Ubuntu 18.04 LTS PC (ethtool says firmware v34.0.0), so I don't know which CoreXY release branch I should choose to track down the version that initially fixed this issue.

Then I suggest we close this. This bug is just about a WARNING. Spewing a WARNING isn't nice, but if that's the only problem, I am not going to hunt if it was fixed in master already. Thanks. Please reopen if you see it again on master.

*** Bug 204643 has been marked as a duplicate of this bug. ***
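
For anyone checking whether a machine still hits this warning (for example when comparing a distribution kernel against the master branch, as was done above), a saved dmesg log can be scanned with a short script. This is only a minimal sketch, not part of the driver or of any tool mentioned in this thread: the function name find_unused_queue_warnings is made up for illustration, and it assumes the log has one entry per line in the same format as the dmesg output quoted in this report.

```python
import re

# Kernel log prefix such as "[1728900.331372] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# The warning text exactly as it appears in the log quoted in this report.
WARNING = re.compile(r"TX on unused queue (\d+)")
# Call-trace entries such as "iwl_mvm_mac_tx+0x104/0x1d0 [iwlmvm]",
# optionally prefixed with "? " for entries the kernel is unsure about.
FRAME = re.compile(r"^(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def find_unused_queue_warnings(log_text):
    """Return one dict per 'TX on unused queue' warning found in log_text.

    Hypothetical helper: each dict holds the queue number and the function
    names from the call trace that follows the warning.
    """
    reports = []
    current = None
    in_trace = False
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw.strip())
        match = WARNING.search(line)
        if match:
            current = {"queue": int(match.group(1)), "frames": []}
            reports.append(current)
            in_trace = False
        elif current is not None and line.startswith("Call Trace:"):
            in_trace = True
        elif in_trace:
            frame = FRAME.match(line)
            if frame is None:
                # The first line that is not a frame ends the call trace.
                in_trace = False
            elif not frame.group(1):
                # Keep only entries without the "? " marker.
                current["frames"].append(frame.group(2))
    return reports
```

Calling it as find_unused_queue_warnings(open("dmesg.txt").read()), where dmesg.txt is a placeholder file name, returns an empty list when the warning does not occur in the log.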