Bug 219447
Description
Rahul
2024-10-31 04:45:05 UTC
(In reply to Rahul from comment #0)
> Wi-Fi does not work after waking from sleep with the latest 6.11 kernel.
> This regression started with 6.11, as it works correctly on 6.10. To restore
> functionality, I need to reload the iwlmvm module driver. Disabling power
> management did not resolve the issue either. Wi-Fi networks are visible, but
> I am unable to connect to any network, even open ones.

Operating System: Fedora Workstation 41
Kernel Version: 6.11.5-300.fc41.x86_64
Hardware: Mi Notebook Ultra with Intel Tiger Lake CPU (no dedicated GPU)
Wi-Fi Card: Intel Wi-Fi 6 AX201 (rev 20)

Can you please share kernel log? Would be useful to get tracing..

dmesg output
https://pastebin.com/8fsQtKGa

note: It occurs randomly but most often happens when waking from sleep. Additionally, changing the frequency of the access point seems to increase the chances of reproducing the issue. In my case, I switch from dynamic to a fixed 5 GHz frequency. After failing to connect to the 5 GHz network, it also becomes unable to connect to any network, even if it is open.

Ok... interesting...

can you record tracing of this?

sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_dbg -e iwlwifi_dbg

?

(In reply to Emmanuel Grumbach from comment #4)
> Ok... interesting...
>
> can you record tracing of this?
>
> sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_dbg -e
> iwlwifi_dbg
>
> ?

it shows

sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_dbg -e iwlwifi_dbg
trace-cmd: No such file or directory
No events enabled with iwlwifi

This is the current output when I am unable to connect to Wi-Fi.

can I ask you to recompile the kernel with IWLWIFI_TRACING?

let me try

I am having the exact same issue with my Lenovo T15 G2 and F40.
Adapter: Intel Wi-Fi 6E AX210/AX1675 2x2 [Typhoon Peak]
driver: iwlwifi
Also for me, 6.11.4 does not fix it, the issue remains. Booting into 6.10 makes it work normally again.
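Editor's note: the "No events enabled with iwlwifi" message above suggests that the running kernel did not expose the requested trace event systems. The following is a minimal sketch for checking this before recording. It assumes tracefs is mounted at /sys/kernel/tracing (on some systems it is /sys/kernel/debug/tracing) and that each event system appears as a directory under events/. The helper name missing_event_systems is hypothetical, and the default list is simply the one typed in the command above.

```python
import os

# Event systems exactly as requested in the trace-cmd command above.
WANTED = ["iwlwifi", "mac80211", "cfg80211", "mac80211_dbg", "iwlwifi_dbg"]


def missing_event_systems(events_dir, wanted=WANTED):
    """Return the names in `wanted` that have no directory under events_dir."""
    # os.path.isdir returns False (rather than raising) when the path is
    # missing or unreadable, so this is safe to run without root.
    return [name for name in wanted
            if not os.path.isdir(os.path.join(events_dir, name))]


if __name__ == "__main__":
    # Assumed tracefs location; reading it usually requires root.
    missing = missing_event_systems("/sys/kernel/tracing/events")
    if missing:
        print("not visible in this kernel:", ", ".join(missing))
    else:
        print("all requested event systems are visible")
```

A name reported as missing may mean either that the option providing it is not compiled in or that the directory is not readable by the current user, so the result is only a hint.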
@Omer, please share your kernel log, I want to make sure it is really the same issue.

(In reply to Emmanuel Grumbach from comment #7)
> can I ask you to recompile the kernel with IWLWIFI_TRACING?

CONFIG_IWLWIFI_DEBUG=y
CONFIG_IWLWIFI_DEBUGFS=y
CONFIG_IWLWIFI_DEVICE_TRACING=y

Will this work?

yes.

Thanks!

(In reply to Emmanuel Grumbach from comment #13)
> yes.
>
> Thanks!

Done. What do I do next?

(In reply to Emmanuel Grumbach from comment #4)
> Ok... interesting...
>
> can you record tracing of this?
>
> sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_dbg -e
> iwlwifi_dbg
>
> ?

After doing this, I ran trace-cmd report > trace-cmd.txt and uploaded the result as an attachment.

Created attachment 307100 [details]
trace-cmd report > trace-cmd.txt
Did you reproduce the bug while recording tracing?

Note that tracing needs to be running while you reproduce the bug.

(In reply to Emmanuel Grumbach from comment #17)
> Did you reproduce the bug while recording tracing?
>
> Note that tracing needs to be running while you reproduce the bug.

Yes, I did, while reproducing the issue.

Can you please compress and attach the raw trace.dat instead of parsing it yourself?

Thanks!

Created attachment 307101 [details]
trace.dat
(In reply to Emmanuel Grumbach from comment #19) > Can you please compress and attach the raw trace.dat instead of parsing it > yourself? > > Thanks! done (In reply to Emmanuel Grumbach from comment #10) > @Omer, please share your kernel log, I want to make sure it is really the > same issue. Sorry for the late reply. Meanwhile Fedora released kernel 6.11.5 to the update channel and I upgraded, but the problem persists. However, if I set "Sleep state" in UEFI to "Windows and Linux" instead of "Linux S3", the problem is gone. But as I understand this consumes more power in sleep when not connected to AC, so it would be great if I could go back to S3. Here's my dmesg log that you asked for: https://pastebin.com/B3XhvSZ2 @Omar, this is not related to the issue reported by Rahul. Please open a new bug, CC me, and add the firmware debug dump. The title of the bug should be: iwlwifi AX210 ASSERT 87 upon resume from suspend. Information on how to create such a dump can be found here: https://wireless.docs.kernel.org/en/latest/en/users/drivers/iwlwifi/debugging.html#firmware-debugging No need to trigger the dump through debugfs, it'll be created automatically because of the ASSERT. All you need is to create /sbin/iwlfwdump.sh and to add the udev rule. Thanks @Rahul, you provided excellent data, we are analyzing it. I assume we'll need more data from you soon when we'll have a a better idea of where to look for the problem. Thank you for your cooperation. (In reply to Emmanuel Grumbach from comment #23) > @Omar Sorry, I mean Omer. (In reply to Emmanuel Grumbach from comment #23) > @Rahul, you provided excellent data, we are analyzing it. > I assume we'll need more data from you soon when we'll have a a better idea > of where to look for the problem. Thank you for your cooperation. Thank you for your assistance. Just let me know if you need anything. (In reply to Emmanuel Grumbach from comment #23) > @Omar, this is not related to the issue reported by Rahul. 
> > Please open a new bug, CC me, and add the firmware debug dump. > The title of the bug should be: > > iwlwifi AX210 ASSERT 87 upon resume from suspend. > > Information on how to create such a dump can be found here: > > https://wireless.docs.kernel.org/en/latest/en/users/drivers/iwlwifi/ > debugging.html#firmware-debugging > > No need to trigger the dump through debugfs, it'll be created automatically > because of the ASSERT. > All you need is to create /sbin/iwlfwdump.sh and to add the udev rule. > > Thanks > > @Rahul, you provided excellent data, we are analyzing it. > I assume we'll need more data from you soon when we'll have a a better idea > of where to look for the problem. Thank you for your cooperation. I believe I am having the same bug as @Omer would you be able to confirm this? If so should I go ahead and follow the same direction? > I believe I am having the same bug as @Omer would you be able to confirm > this? If so should I go ahead and follow the same direction? Sorry here is the paste... https://pastebin.com/cHVtBcMh @Stars, I can confirm you seem to face the same issue as Rahul. (In reply to Emmanuel Grumbach from comment #28) > @Stars, I can confirm you seem to face the same issue as Rahul. Okay, thank you for confirming. If you require a second set of debug data please let me know if it helps. @Stars, I guess you can provide the same data as Rahul, it'll allow to get more evidence of the problem. I'll probably need to add more debug data in the code (will require to patch the driver) to see what's really going wrong. It looks like a race in the Tx scheduling. Thanks Actually... sudo trace-cmd record -T -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_msg -e iwlwifi_dbg That will be better. Thanks All right, I need to know why mac80211 stopped the queues. 
Can you please add this patch:

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 2150496130ff..2a09ef8510fc 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3853,6 +3853,7 @@ begin:
 	spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags);
 
 	if (unlikely(q_stopped)) {
+		pr_err("%s - q_stopped 0x%08x\n", __func__, q_stopped);
 		/* mark for waking later */
 		set_bit(IEEE80211_TXQ_DIRTY, &txqi->flags);
 		return NULL;

This print will appear in the kernel logs, but the best would be to capture them along with the other data through tracing:

sudo trace-cmd record -T -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_msg -e iwlwifi_dbg -e console

Please also make sure you have CONFIG_MAC80211_MESSAGE_TRACING selected in the kernel compilation.

Thanks!

Created attachment 307121 [details]
Trace w/ driver patch
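Editor's note: several kernel options have been requested in this thread so far (CONFIG_IWLWIFI_DEBUG, CONFIG_IWLWIFI_DEBUGFS, CONFIG_IWLWIFI_DEVICE_TRACING and CONFIG_MAC80211_MESSAGE_TRACING). A minimal sketch for verifying them in a kernel config file before rebuilding or tracing follows. The function names are hypothetical, and the config location (for example /boot/config-<release> or /proc/config.gz) is an assumption that depends on the distribution.

```python
import gzip
import re

# Options requested in the comments above.
REQUIRED = [
    "CONFIG_IWLWIFI_DEBUG",
    "CONFIG_IWLWIFI_DEBUGFS",
    "CONFIG_IWLWIFI_DEVICE_TRACING",
    "CONFIG_MAC80211_MESSAGE_TRACING",
]

# Matches lines such as "CONFIG_FOO=y" or "CONFIG_FOO=m"; lines of the form
# "# CONFIG_FOO is not set" do not match.
ENABLED_LINE = re.compile(r"^(CONFIG_\w+)=(?:y|m)$", re.MULTILINE)


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled in config_text."""
    enabled = set(ENABLED_LINE.findall(config_text))
    return [opt for opt in required if opt not in enabled]


def read_config(path):
    """Read a plain or gzip-compressed kernel config file as text."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read()
```

For example, missing_options(read_config(path)) returns an empty list when all four options are set, where path points at the config of the kernel that will actually be booted.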
(In reply to Emmanuel Grumbach from comment #32)
> All right, I need to know why mac80211 stopped the queues.
> Can you please add this patch:
> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index 2150496130ff..2a09ef8510fc 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> @@ -3853,6 +3853,7 @@ begin:
> spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags);
>
> if (unlikely(q_stopped)) {
> + pr_err("%s - q_stopped 0x%08x\n", __func__, q_stopped);
> /* mark for waking later */
> set_bit(IEEE80211_TXQ_DIRTY, &txqi->flags);
> return NULL;
>
>
> this print will appear in the kernel logs, but the best would be to capture
> them along with the other data through tracing:
>
> sudo trace-cmd record -T -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_msg
> -e iwlwifi_dbg -e console
>
> Please also make sure you have CONFIG_MAC80211_MESSAGE_TRACING selected in
> the kernel compilation.
>
> Thanks!

I am hoping I have done everything right, this was the first time I've ever compiled a kernel before. CONFIG_MAC80211_MESSAGE_TRACING was selected, and patch was added. File should be signed to you.

First, you did everything right.
Second, I can see a different failure, or to be more precise, an additional failure:

0x00000087 | ADVANCED_SYSASSERT

Because of that, we go into a mess of FW recovery, and then, we have the queues stuck:

wpa_supplicant-650 [010] 145.756114: console: wlp9s0: Inserted STA XX
wpa_supplicant-650 [010] 145.756119: console: wlp9s0: authenticate with XX (local address=)
wpa_supplicant-650 [010] 145.759106: console: wlp9s0: send auth to XX (try 1/3)
kworker/10:1-166 [010] 145.760108: console: ieee80211_tx_dequeue - q_stopped 0x00000001

This means that we have a bug in iwlwifi. I'll dig.
In the meantime, if you can get a reproduction without the firmware crash (which needs to be debugged regardless..)

BTW, the firmware crash you saw is exactly what Omer is seeing..

Ahh.. Ok. I think I understand. Sorry for the noise.
No need for more data at this stage. Thanks (In reply to Emmanuel Grumbach from comment #23) > @Omar, this is not related to the issue reported by Rahul. > > Please open a new bug, CC me, and add the firmware debug dump. > The title of the bug should be: > > iwlwifi AX210 ASSERT 87 upon resume from suspend. > > Information on how to create such a dump can be found here: > > https://wireless.docs.kernel.org/en/latest/en/users/drivers/iwlwifi/ > debugging.html#firmware-debugging > > No need to trigger the dump through debugfs, it'll be created automatically > because of the ASSERT. > All you need is to create /sbin/iwlfwdump.sh and to add the udev rule. > > Thanks > > @Rahul, you provided excellent data, we are analyzing it. > I assume we'll need more data from you soon when we'll have a a better idea > of where to look for the problem. Thank you for your cooperation. I hope I did it ok, here's the report: https://bugzilla.kernel.org/show_bug.cgi?id=219460 I initially forgot to add you to CC and did it after it was already open :) I don't want to interfere with the above, but I have the same issue. The Wi-Fi controler doesn't appear in the GUI Network Manager. Can't connect to wifi. System: Kernel: 6.11.0-9-generic arch: x86_64 bits: 64 Desktop: MATE v: 1.26.2 Distro: Ubuntu 24.10 (Oracular Oriole) Machine: Type: Convertible System: HP product: HP ENVY x360 Convertible 15-ed1xxx v: Type1ProductConfigId serial: CND0506KLY Mobo: HP model: 8826 v: 48.37 serial: PKGKVD31WEKG0V UEFI: Insyde v: F.15 date: 04/28/2021 If I do 'dmesg | grep iwlwifi' I get: [ 14.038750] iwlwifi_compat: loading out-of-tree module taints kernel. 
[ 14.046028] Loading modules backported from iwlwifi [ 14.046034] iwlwifi-stack-public:release/core89:12325:dcfcbdc0 [ 14.188838] iwlwifi 0000:00:14.3: Detected crf-id 0x3617, cnv-id 0x20000302 wfpm id 0x80000000 [ 14.188868] iwlwifi 0000:00:14.3: PCI dev a0f0/0074, rev=0x351, rfid=0x10a100 [ 14.188873] iwlwifi 0000:00:14.3: Detected Intel(R) Wi-Fi 6 AX201 160MHz [ 14.199951] iwlwifi 0000:00:14.3: TLV_FW_FSEQ_VERSION: FSEQ Version: 89.3.35.37 [ 14.200345] iwlwifi 0000:00:14.3: loaded firmware version 77.85be44d3.0 QuZ-a0-hr-b0-77.ucode op_mode iwlmvm [ 16.681831] iwlwifi 0000:00:14.3: SecBoot CPU1 Status: 0x5893, CPU2 Status: 0x3 [ 16.681866] iwlwifi 0000:00:14.3: WFPM_LMAC1_PD_NOTIFICATION: 0x0 [ 16.681896] iwlwifi 0000:00:14.3: HPM_SECONDARY_DEVICE_STATE: 0x42 [ 16.681927] iwlwifi 0000:00:14.3: WFPM_MAC_OTP_CFG7_ADDR: 0x0 [ 16.681956] iwlwifi 0000:00:14.3: WFPM_MAC_OTP_CFG7_DATA: 0x0 [ 16.681960] iwlwifi 0000:00:14.3: UMAC CURRENT PC: 0xa05c18 [ 16.681964] iwlwifi 0000:00:14.3: LMAC1 CURRENT PC: 0xa05c1c [ 16.681969] iwlwifi 0000:00:14.3: WRT: Collecting data: ini trigger 13 fired (delay=0ms). 
[ 16.682073] iwlwifi 0000:00:14.3: Start IWL Error Log Dump: [ 16.682077] iwlwifi 0000:00:14.3: Transport status: 0x00000042, valid: 6 [ 16.682082] iwlwifi 0000:00:14.3: Loaded firmware version: 77.85be44d3.0 QuZ-a0-hr-b0-77.ucode [ 16.682087] iwlwifi 0000:00:14.3: 0x00000034 | NMI_INTERRUPT_WDG [ 16.682092] iwlwifi 0000:00:14.3: 0x000022F0 | trm_hw_status0 [ 16.682096] iwlwifi 0000:00:14.3: 0x00000000 | trm_hw_status1 [ 16.682100] iwlwifi 0000:00:14.3: 0x004C94FA | branchlink2 [ 16.682104] iwlwifi 0000:00:14.3: 0x0001531C | interruptlink1 [ 16.682108] iwlwifi 0000:00:14.3: 0x0001531C | interruptlink2 [ 16.682111] iwlwifi 0000:00:14.3: 0x00014ECA | data1 [ 16.682115] iwlwifi 0000:00:14.3: 0x0BADCAFE | data2 [ 16.682119] iwlwifi 0000:00:14.3: 0x00000000 | data3 [ 16.682122] iwlwifi 0000:00:14.3: 0x00000000 | beacon time [ 16.682126] iwlwifi 0000:00:14.3: 0x00000000 | tsf low [ 16.682130] iwlwifi 0000:00:14.3: 0x00000000 | tsf hi [ 16.682134] iwlwifi 0000:00:14.3: 0x00000000 | time gp1 [ 16.682137] iwlwifi 0000:00:14.3: 0x0003D5AB | time gp2 [ 16.682141] iwlwifi 0000:00:14.3: 0x00000001 | uCode revision type [ 16.682145] iwlwifi 0000:00:14.3: 0x0000004D | uCode version major [ 16.682149] iwlwifi 0000:00:14.3: 0x85BE44D3 | uCode version minor [ 16.682153] iwlwifi 0000:00:14.3: 0x00000351 | hw version [ 16.682157] iwlwifi 0000:00:14.3: 0x00C89001 | board version [ 16.682161] iwlwifi 0000:00:14.3: 0x00000000 | hcmd [ 16.682164] iwlwifi 0000:00:14.3: 0x00020000 | isr0 [ 16.682168] iwlwifi 0000:00:14.3: 0x00000000 | isr1 [ 16.682172] iwlwifi 0000:00:14.3: 0x08F00002 | isr2 [ 16.682175] iwlwifi 0000:00:14.3: 0x00C0001C | isr3 [ 16.682179] iwlwifi 0000:00:14.3: 0x00000000 | isr4 [ 16.682183] iwlwifi 0000:00:14.3: 0x00000000 | last cmd Id [ 16.682186] iwlwifi 0000:00:14.3: 0x00014ECA | wait_event [ 16.682190] iwlwifi 0000:00:14.3: 0x00000000 | l2p_control [ 16.682194] iwlwifi 0000:00:14.3: 0x00000000 | l2p_duration [ 16.682197] iwlwifi 0000:00:14.3: 0x00000000 | l2p_mhvalid 
[ 16.682201] iwlwifi 0000:00:14.3: 0x00000000 | l2p_addr_match [ 16.682205] iwlwifi 0000:00:14.3: 0x0000004B | lmpm_pmg_sel [ 16.682208] iwlwifi 0000:00:14.3: 0x00000000 | timestamp [ 16.682212] iwlwifi 0000:00:14.3: 0x0000F81C | flow_handler [ 16.682258] iwlwifi 0000:00:14.3: Start IWL Error Log Dump: [ 16.682261] iwlwifi 0000:00:14.3: Transport status: 0x00000042, valid: 7 [ 16.682265] iwlwifi 0000:00:14.3: 0x20000070 | NMI_INTERRUPT_LMAC_FATAL [ 16.682270] iwlwifi 0000:00:14.3: 0x00000000 | umac branchlink1 [ 16.682274] iwlwifi 0000:00:14.3: 0x804561E2 | umac branchlink2 [ 16.682278] iwlwifi 0000:00:14.3: 0x804737A2 | umac interruptlink1 [ 16.682281] iwlwifi 0000:00:14.3: 0x804661EC | umac interruptlink2 [ 16.682285] iwlwifi 0000:00:14.3: 0x00000400 | umac data1 [ 16.682289] iwlwifi 0000:00:14.3: 0x804661EC | umac data2 [ 16.682292] iwlwifi 0000:00:14.3: 0x00000000 | umac data3 [ 16.682296] iwlwifi 0000:00:14.3: 0x0000004D | umac major [ 16.682299] iwlwifi 0000:00:14.3: 0x85BE44D3 | umac minor [ 16.682303] iwlwifi 0000:00:14.3: 0x0003D622 | frame pointer [ 16.682307] iwlwifi 0000:00:14.3: 0xC0887EE4 | stack pointer [ 16.682310] iwlwifi 0000:00:14.3: 0x00000000 | last host cmd [ 16.682314] iwlwifi 0000:00:14.3: 0x00200040 | isr status reg [ 16.682339] iwlwifi 0000:00:14.3: IML/ROM dump: [ 16.682343] iwlwifi 0000:00:14.3: 0x00000003 | IML/ROM error/state [ 16.682370] iwlwifi 0000:00:14.3: 0x00005893 | IML/ROM data1 [ 16.682399] iwlwifi 0000:00:14.3: 0x00000080 | IML/ROM WFPM_AUTH_KEY_0 [ 16.682427] iwlwifi 0000:00:14.3: Fseq Registers: [ 16.682448] iwlwifi 0000:00:14.3: 0x60000000 | FSEQ_ERROR_CODE [ 16.682474] iwlwifi 0000:00:14.3: 0x00290033 | FSEQ_TOP_INIT_VERSION [ 16.682499] iwlwifi 0000:00:14.3: 0x00090006 | FSEQ_CNVIO_INIT_VERSION [ 16.682523] iwlwifi 0000:00:14.3: 0x0000A482 | FSEQ_OTP_VERSION [ 16.682548] iwlwifi 0000:00:14.3: 0x00000003 | FSEQ_TOP_CONTENT_VERSION [ 16.682574] iwlwifi 0000:00:14.3: 0x4552414E | FSEQ_ALIVE_TOKEN [ 16.682599] iwlwifi 
0000:00:14.3: 0x20000302 | FSEQ_CNVI_ID [ 16.682624] iwlwifi 0000:00:14.3: 0x01300504 | FSEQ_CNVR_ID [ 16.682664] iwlwifi 0000:00:14.3: 0x20000302 | CNVI_AUX_MISC_CHIP [ 16.682686] iwlwifi 0000:00:14.3: 0x01300504 | CNVR_AUX_MISC_CHIP [ 16.682712] iwlwifi 0000:00:14.3: 0x05B0905B | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM [ 16.682737] iwlwifi 0000:00:14.3: 0x0000025B | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR [ 16.682759] iwlwifi 0000:00:14.3: 0x00000000 | FSEQ_PREV_CNVIO_INIT_VERSION [ 16.682784] iwlwifi 0000:00:14.3: 0x00290033 | FSEQ_WIFI_FSEQ_VERSION [ 16.682810] iwlwifi 0000:00:14.3: 0x00290033 | FSEQ_BT_FSEQ_VERSION [ 16.682835] iwlwifi 0000:00:14.3: 0x000000D4 | FSEQ_CLASS_TP_VERSION [ 16.682867] iwlwifi 0000:00:14.3: UMAC CURRENT PC: 0x804732b0 [ 16.682889] iwlwifi 0000:00:14.3: LMAC1 CURRENT PC: 0xd0 [ 16.682914] iwlwifi 0000:00:14.3: Failed to start RT ucode: -110 [ 16.682919] iwlwifi 0000:00:14.3: WRT: Collecting data: ini trigger 13 fired (delay=0ms). [ 17.902943] iwlwifi 0000:00:14.3: Failed to run INIT ucode: -110 There is 13 'iwlwifi-ty-a0-gf-a0-xx.ucode' in /lib/firmware. And the above shows '13 fired' I have no idea how to solve it. Same phenomena, but don't know if same problem/issue then with your above discussion. Thanks for any advice. W @Wadford, this is a different issue. Please open a new bug if you want, but we must not mix issues in the same ticket. Created attachment 307129 [details]
patch - don't dump FW state upon RFKILL in suspend
@Rahul,
can you please add the patch attached?
It'll avoid printing the FW state when it's not really needed and can possibly fix the firmware load failure that you see later on.
Thanks
(In reply to Emmanuel Grumbach from comment #41) > Created attachment 307129 [details] > patch - don't dump FW state upon RFKILL in suspend > > @Rahul, > > can you please add the patch attached? > It'll avoid printing the FW state when it's not really needed and can > possibly fix the firmware load failure that you see later on. > > Thanks any other things to add in compile kernel other than CONFIG_IWLWIFI_DEBUG=y CONFIG_IWLWIFI_DEBUGFS=y CONFIG_IWLWIFI_DEVICE_TRACING=y and this patch. Yes :) The debug print patch from Comment#32 and CONFIG_MAC80211_MESSAGE_TRACING Thanks!! (In reply to Emmanuel Grumbach from comment #36) > BTW, the firmware crash you saw is exactly what Omer is seeing.. Shall I migrate over to Omer's bug report? Or was the crash just coincidence? (In reply to Stars from comment #44) > (In reply to Emmanuel Grumbach from comment #36) > > BTW, the firmware crash you saw is exactly what Omer is seeing.. > > Shall I migrate over to Omer's bug report? Or was the crash just coincidence? Please do (In reply to Emmanuel Grumbach from comment #43) > Yes :) > > The debug print patch from Comment#32 > > and CONFIG_MAC80211_MESSAGE_TRACING > > Thanks!! thanks compiling (In reply to Rahul from comment #46) > (In reply to Emmanuel Grumbach from comment #43) > > Yes :) > > > > The debug print patch from Comment#32 > > > > and CONFIG_MAC80211_MESSAGE_TRACING > > > > Thanks!! > > thanks compiling is both patch up to date with latest kernel 6.11.6 because it not applying Created attachment 307130 [details]
don't dump FW log on rfkill in wowlan
Hi,
sorry, this is the patch based on 6.11.6
The previous one was based on our internal tree.
(In reply to Emmanuel Grumbach from comment #40)
> @Wadford, this is a different issue. Please open a new bug if you want, but
> we must not mix issues in the same ticket.

OK, but it seems I won't have to.

After trying a few other things, last night I did this:

sudo mv /usr/lib/firmware/iwlwifi-ty-a0-gf-a0.pnvm.zst /usr/lib/firmware/iwlwifi-ty-a0-gf-a0.bak

I found it here on bugzilla.kernel.org/, where it was said:
''After upgrading linux-firmware lib,.... always have to use this command `sudo mv /usr/lib/firmware/iwlwifi-ty-a0-gf-a0.pnvm /usr/lib/firmware/iwlwifi-ty-a0-gf-a0.bak` (https://askubuntu.com/questions/1360175/intel-wifi-6-ax210-wifi-not-working-after-update), and after this command, Wi-Fi module works.''

When firing up the laptop minutes ago, I was back to normal, as if the weekend had been just a bad dream.

I'll come back here and issue a new ticket if the solution doesn't persist (i.e. if the issue comes back).

All my thanks for pointing me in the right direction... for the amateur that I am.

W

Hi,

This is really strange... you shouldn't have to disable the PNVM file (which is what you do here) to have things working..
In any case, that's a different issue.

(In reply to Emmanuel Grumbach from comment #48)
> Created attachment 307130 [details]
> don't dump FW log on rfkill in wowlan
>
> Hi,
>
> sorry, this is the patch based on 6.11.6
> The previous one was based on our internal tree.

After applying both patches, the output of "sudo trace-cmd record -T -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_msg -e iwlwifi_dbg -e console" is attached.

Created attachment 307137 [details]
sudo trace-cmd record -T -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_msg -e iwlwifi_dbg -e console
Created attachment 307140 [details]
fix candidate
Can you please apply the patch attached?
I believe it should fix the issue, or at least certain occurrences of the issues.
At this point, you can remove all the other patches I asked you to apply and use only this one.
Thanks for your cooperation!
(In reply to Emmanuel Grumbach from comment #53)
> Created attachment 307140 [details]
> fix candidate
>
> Can you please apply the patch attached?
>
> I believe it should fix the issue, or at least certain occurrences of the
> issues.
> At this point, you can remove all the other patches I asked you to apply and
> use only this one.
>
> Thanks for your cooperation!

It looks like it's fixed the issue, but I will wait for a few days before jumping to a conclusion.

(In reply to Rahul from comment #54)
> (In reply to Emmanuel Grumbach from comment #53)
> > Created attachment 307140 [details]
> > fix candidate
> >
> > Can you please apply the patch attached?
> >
> > I believe it should fix the issue, or at least certain occurrences of the
> > issues.
> > At this point, you can remove all the other patches I asked you to apply
> and
> > use only this one.
> >
> > Thanks for your cooperation!
>
> It looks like it's fixed the issue, but I will wait for a few days before
> jumping to a conclusion.

No, it's not fixed.

Ok, can you please tell us what this patch prints to the logs:

Note, it is very much like the previous one, but not exactly the same.
Please keep the patch attached in comment#53 and add this one on top.
Thanks!

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 2150496130ff..e92932b20227 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3854,6 +3854,7 @@ begin:
 
 	if (unlikely(q_stopped)) {
 		/* mark for waking later */
+		pr_err("%s - q_stopped 0x%08x\n", __func__, local->queue_stop_reasons[q]);
 		set_bit(IEEE80211_TXQ_DIRTY, &txqi->flags);
 		return NULL;
 	}

(In reply to Emmanuel Grumbach from comment #56)
> Ok can you please tell us what this patch prints to the logs:
>
> Note, it is very much like the previous one, but not exactly the same.
> Please keep the patch attached in comment#53 and add this one on top.
> Thanks!
>
>
> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index 2150496130ff..e92932b20227 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> @@ -3854,6 +3854,7 @@ begin:
>
> if (unlikely(q_stopped)) {
> /* mark for waking later */
> + pr_err("%s - q_stopped 0x%08x\n", __func__,
> local->queue_stop_reasons[q]);
> set_bit(IEEE80211_TXQ_DIRTY, &txqi->flags);
> return NULL;
> }

Is it compatible with 6.11.6? It's not applying.

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 2150496130ff..2a09ef8510fc 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3830,6 +3830,7 @@
 	spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags);
 
 	if (unlikely(q_stopped)) {
+		pr_err("%s - q_stopped 0x%08x\n", __func__, local->queue_stop_reasons[q]);
 		/* mark for waking later */
 		set_bit(IEEE80211_TXQ_DIRTY, &txqi->flags);
 		return NULL;

Is this one okay?

yes, sorry..

(In reply to Emmanuel Grumbach from comment #59)
> yes, sorry..

Same loop of "Nov 06 05:51:07 kernel: iwlwifi 0000:00:14.3: Not associated and the session protection is over already..."

Created attachment 307150 [details]
sudo trace-cmd record -T -e iwlwifi -e mac80211 -e cfg80211 -e mac80211_msg -e iwlwifi_dbg -e console
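Editor's note: the debug patches discussed above print a line of the form "<function> - q_stopped 0x%08x". The following is a minimal sketch for pulling those values out of a trace or dmesg line and listing which bits are set. The function name stop_reason_bits is hypothetical. With the second patch the printed value is local->queue_stop_reasons[q], which is understood to be a bitmask of stop reasons. The mapping from bit position to reason name is not given in this thread, so it is left to be looked up in the kernel tree being tested.

```python
import re

# Format produced by the pr_err() added in the debug patches above.
Q_STOPPED = re.compile(r"(\w+) - q_stopped 0x([0-9a-fA-F]{8})")


def stop_reason_bits(line):
    """Return (function name, list of set bit positions), or None if absent."""
    match = Q_STOPPED.search(line)
    if match is None:
        return None
    mask = int(match.group(2), 16)
    bits = [bit for bit in range(32) if mask & (1 << bit)]
    return match.group(1), bits


# Line taken from the trace excerpt quoted earlier in this thread.
example = ("kworker/10:1-166 [010] 145.760108: console: "
           "ieee80211_tx_dequeue - q_stopped 0x00000001")
print(stop_reason_bits(example))
```

Note that the example line comes from the first patch, which printed the q_stopped variable rather than the full mask, so it only illustrates the parsing.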
Can you please start tracing before going to suspend?
I can see that you started tracing upon resume.
Thanks!

(In reply to Emmanuel Grumbach from comment #62)
> Can you please start tracing before going to suspend?
> I can see that you started tracing upon resume.
> Thanks!

The file is above the size limit, so I am sharing a Drive link:
https://drive.google.com/file/d/1Z_wAd_xwdoWVd3EjS5RgfDI80ot8JfzY/view?usp=drive_link

It works sometimes after waking up, but after that, it stops working, which is a known behavior from the initial bug report.

Thanks, I'll take a look.

We need to focus on specific behaviors.

For now, I focus on the authentication frame not being sent.

If there are cases where we associate successfully and fail later, it's a different issue.

(In reply to Emmanuel Grumbach from comment #65)
> Thanks, I'll take a look.
>
> We need to focus on specific behaviors.
>
> For now, I focus on the authentication frame not being sent.
>
> If there are cases where we associate successfully and fail later, it's a
> different issue.

I think the issue happens on the 5 GHz channel.

Thanks for the tracing. I can see strange things there...
First, your AP seems to be switching channel all the time, and every time it changes channel, it prevents us from transmitting data.
This can cause lots of problems, but I do wonder how come it worked on 6.10...

I need to dig more in the data you provided.

Created attachment 307193 [details]
more prints in queue tracing
Hi,
I am afraid I have to ask you for more logs :(
I can see something that I cannot explain, hence the need to get more data.
Can you please apply the patch attached and reproduce the problem while recording tracing?
You can compress the trace.dat file if you want.
Thanks
Does tracing need to be enabled during sleep to wake?
And only this patch needs to be applied, right?

(In reply to Rahul from comment #70)
> Does tracing need to be enabled during sleep to wake?

Yes please

> And only this patch needs to be applied, right?

Please keep the diff from comment#58.

I'll prepare a patch that includes both changes.

Created attachment 307194 [details]
more prints in queue tracing
(In reply to Emmanuel Grumbach from comment #72) > (In reply to Rahul from comment #70) > > Does tracing need to be enabled during sleep to wake? > > Yes please > > > and only this patch only right? > > Please keep the diff from comment#58. > > I'll prepare a patch that includes both changes. It's without sleep-to-wake because I tried it a few times, and it worked. However, this time, after I stopped debugging, the issue occurred, but I had already deleted the old trace by then. Created attachment 307195 [details]
trace.dat without wake from sleep
I can't see the start of the channel switch :( From what I see, the problem is not related to resume from sleep, but rather to the AP switching channel, and this happens only on the 5.2 GHz band Created attachment 307196 [details]
how about this
(In reply to Emmanuel Grumbach from comment #77) > From what I see, the problem is not related to resume from sleep, but rather > to the AP switching channel, and this happens only on the 5.2 GHz band But after encountering this issue, I tried with 2.4 GHz, but it was also unable to connect after the issue occurred. (In reply to Rahul from comment #78) > Created attachment 307196 [details] > how about this nope. Channel switch happened before. I suggest that you disable wifi. Start tracing, enable wifi and wait until the problem happens. Created attachment 307198 [details]
check1
check+1
You haven't disabled wifi.

Please do the following.

1) Disable wifi
2) unload iwlmvm iwlwifi mac80211 cfg80211
3) load iwlwifi # that will load all the rest.
4) while Wifi is still disabled, start tracing
5) enable wifi

All the last logs didn't contain valuable information unfortunately.

(In reply to Emmanuel Grumbach from comment #82)
> You haven't disabled wifi.
>
> Please do the following.
>
> 1) Disable wifi
> 2) unload iwlmvm iwlwifi mac80211 cfg80211
> 3) load iwlwifi # that will load all the rest.
> 4) while Wifi is still disabled, start tracing
> 5) enable wifi
>
> All the last logs didn't contain valuable information unfortunately.

https://drive.google.com/file/d/1vmPkS3LD9uUFGtr_W5jHcp2RpqiZ-Kja/view?usp=drive_link

If you don't find it here, it might be because, in the previous case, it didn't connect with the 5 GHz, so I changed the AP frequency to dynamic. That could be the reason.

Yes - that one is good. Will analyze a bit later.

Created attachment 307199 [details]
fix candidate
Can you please try the patch attached?
I believe it should fix the problem.
I still need to discuss this with someone though.
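Editor's note: for reference, the five-step capture procedure requested in comment #82 can be sketched as a dry-run plan that only prints the commands. This is a sketch under assumptions: the thread does not say how Wi-Fi should be disabled, so the use of nmcli here is an assumption (rfkill or a desktop toggle would serve the same purpose), and the function names are hypothetical. The trace-cmd event list is the one given earlier in this thread.

```python
import shlex

# Event systems from the "sudo trace-cmd record -T ..." command in this thread.
TRACE_EVENTS = ["iwlwifi", "mac80211", "cfg80211",
                "mac80211_msg", "iwlwifi_dbg", "console"]


def trace_record_argv(events=TRACE_EVENTS):
    """Build the trace-cmd invocation: record -T plus one -e per event system."""
    argv = ["trace-cmd", "record", "-T"]
    for event in events:
        argv += ["-e", event]
    return argv


def capture_plan():
    """Return the five steps from comment #82 as argv lists, in order."""
    return [
        ["nmcli", "radio", "wifi", "off"],   # 1) disable wifi (nmcli is an assumption)
        ["modprobe", "-r", "iwlmvm", "iwlwifi", "mac80211", "cfg80211"],  # 2) unload
        ["modprobe", "iwlwifi"],             # 3) load iwlwifi, which pulls in the rest
        trace_record_argv(),                 # 4) start tracing while wifi is disabled
        ["nmcli", "radio", "wifi", "on"],    # 5) enable wifi
    ]


# Dry run: print the commands instead of executing them.
for step in capture_plan():
    print("sudo " + shlex.join(step))
```

Step 4 keeps running until it is interrupted, so in practice it would be started in a separate terminal before step 5 and stopped once the problem has been reproduced.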
(In reply to Emmanuel Grumbach from comment #86)
> Created attachment 307199 [details]
> fix candidate
>
> Can you please try the patch attached?
>
> I believe it should fix the problem.
> I still need to discuss this with someone though.

Yes, it fixes the issue.

Great. I'll keep this bug open until I provide a fix that passes code review.

Created attachment 307205 [details]
fix candidate
This is the (probably) final version of the fix.
Can you please give it a try?
(In reply to Emmanuel Grumbach from comment #89)
> Created attachment 307205 [details]
> fix candidate
>
> This is the (probably) final version of the fix.
> Can you please give it a try?

Now the Wi-Fi isn't completely dead, but the connection is unstable. After some time, it disconnects and won't reconnect until the AP is restarted. After restarting the AP, Wi-Fi works for a while, but then the issue repeats.

Sometimes, reconnecting also solves the issue.

Can you try again the previous version of the fix to see if that one worked better?
I am very surprised because the fixes are equivalent, at least, they seem so.

(In reply to Emmanuel Grumbach from comment #92)
> Can you try again the previous version of the fix to see if that one worked
> better?
> I am very surprised because the fixes are equivalent, at least, they seem so.

One note: the previous test was on 6.11.6 and now I tested with 6.11.7. Does that make a difference?

Wait...

Did you recompile and reinstall the backport driver?

Or did you change and recompile the whole kernel?

(In reply to Emmanuel Grumbach from comment #94)
> Wait...
>
> Did you recompile and reinstall the backport driver?

I don't know how to do this.

> Or did you change and recompile the whole kernel?

Yes, this way.

So kernel version doesn't matter.

Can you please record tracing of that?

(In reply to Emmanuel Grumbach from comment #96)
> So kernel version doesn't matter.
>
> Can you please record tracing of that?

Wait, let me test more.

(In reply to Emmanuel Grumbach from comment #96)
> So kernel version doesn't matter.
>
> Can you please record tracing of that?

It happened only once, immediately after the first boot of this kernel, and now it's not happening anymore. It looks good.

Thanks

I'll close this bug.
The fix is in final stages of code review and it'll make its way upstream.

I'd like to thank you for your report and cooperation.
I had to make you work hard to provide the data I needed, but thanks to your efforts, the wifi stack in Linux is getting better.

Thank you!

Closing. I'll still see any new comment added to this ticket.

(In reply to Emmanuel Grumbach from comment #99)
> Thanks
>
> I'll close this bug.
> The fix is in final stages of code review and it'll make its way upstream.
>
> I'd like to thank you for your report and cooperation.
> I had to make you work hard to provide the data I needed, but thanks to your
> efforts, the wifi stack in Linux is getting better.
>
> Thank you!

Linux FTW

(In reply to Emmanuel Grumbach from comment #100)
> Closing. I'll still see any new comment added to this ticket.

Has it been merged upstream, and has it been backported to the 6.11 series?

Not yet. Our maintainer is going to start sending patches soon :)

Hi. What is the status on this one? I still have this problem with kernel version 6.11.10

The patches were sent just before the merge window, hence it takes time..
They've been applied to wireless.git, on their way to 6.13 and then they'll be backported:

https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless.git/commit/?id=11ac0d7c3b5ba58232fb7dacb54371cbe75ec183