Bug 205187

Summary: iwlwifi 9260 (Killer): hardware error detected (Queue stuck)
Product: Drivers
Reporter: Bas Vermeulen (bvermeul)
Component: network-wireless-intel
Assignee: DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi)
Status: NEW
Severity: high
CC: bugzilla, bvermeul, linuxwifi, luca, mark.harfouche, rdiezmail-kernelbugzilla, thomas.natschlaeger
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.3.2-5.3.5
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: dmesg output from two boots (fw 43 and 46)

Description Bas Vermeulen 2019-10-14 08:49:15 UTC
Oct 14 10:11:30 area51m kernel: [  109.184156] iwlwifi 0000:46:00.0: Queue 0 is inactive on fifo 2 and stuck for 2500 ms. SW [215, 216] HW [162, 162] FH TRB=0x0a5a5a5a2
Oct 14 10:11:30 area51m kernel: [  109.185494] iwlwifi 0000:46:00.0: Hardware error detected. Restarting.
Oct 14 10:11:30 area51m kernel: [  109.185639] iwlwifi 0000:46:00.0: HW error, resetting before reading
Oct 14 10:11:33 area51m kernel: [  109.193260] iwlwifi 0000:46:00.0: Start IWL Error Log Dump:
Oct 14 10:11:33 area51m kernel: [  109.193262] iwlwifi 0000:46:00.0: Status: 0x00000080, count: 267877053
Oct 14 10:11:33 area51m kernel: [  109.193263] iwlwifi 0000:46:00.0: Loaded firmware version: 46.6bf1df06.0
Oct 14 10:11:33 area51m kernel: [  109.193264] iwlwifi 0000:46:00.0: 0x6C6E52C6 | ADVANCED_SYSASSERT    
Oct 14 10:11:33 area51m kernel: [  109.193265] iwlwifi 0000:46:00.0: 0xBB1EF0DF | trm_hw_status0
Oct 14 10:11:33 area51m kernel: [  109.193266] iwlwifi 0000:46:00.0: 0xC327C613 | trm_hw_status1
Oct 14 10:11:33 area51m kernel: [  109.193266] iwlwifi 0000:46:00.0: 0x66C6DE95 | branchlink2
Oct 14 10:11:33 area51m kernel: [  109.193267] iwlwifi 0000:46:00.0: 0xFF18FDE8 | interruptlink1
Oct 14 10:11:33 area51m kernel: [  109.193267] iwlwifi 0000:46:00.0: 0xD70ED0D1 | interruptlink2
Oct 14 10:11:33 area51m kernel: [  109.193268] iwlwifi 0000:46:00.0: 0x3A53D5A3 | data1
Oct 14 10:11:33 area51m kernel: [  109.193268] iwlwifi 0000:46:00.0: 0xC6B4D5FC | data2
Oct 14 10:11:33 area51m kernel: [  109.193269] iwlwifi 0000:46:00.0: 0x22CBEDEB | data3
Oct 14 10:11:33 area51m kernel: [  109.193270] iwlwifi 0000:46:00.0: 0x21C31B20 | beacon time
Oct 14 10:11:33 area51m kernel: [  109.193270] iwlwifi 0000:46:00.0: 0x0C40FF99 | tsf low
Oct 14 10:11:33 area51m kernel: [  109.193271] iwlwifi 0000:46:00.0: 0x90E4ABFB | tsf hi
Oct 14 10:11:33 area51m kernel: [  109.193271] iwlwifi 0000:46:00.0: 0xF34C5F4A | time gp1
Oct 14 10:11:33 area51m kernel: [  109.193272] iwlwifi 0000:46:00.0: 0xD57BFDB2 | time gp2
Oct 14 10:11:33 area51m kernel: [  109.193273] iwlwifi 0000:46:00.0: 0x03AFB48A | uCode revision type
Oct 14 10:11:33 area51m kernel: [  109.193273] iwlwifi 0000:46:00.0: 0xF76F9FEB | uCode version major
Oct 14 10:11:33 area51m kernel: [  109.193274] iwlwifi 0000:46:00.0: 0xADDFC56F | uCode version minor
Oct 14 10:11:33 area51m kernel: [  109.193274] iwlwifi 0000:46:00.0: 0x994BBFE6 | hw version
Oct 14 10:11:33 area51m kernel: [  109.193275] iwlwifi 0000:46:00.0: 0x9AB13AA7 | board version
Oct 14 10:11:33 area51m kernel: [  109.193275] iwlwifi 0000:46:00.0: 0x693FAF3A | hcmd
Oct 14 10:11:33 area51m kernel: [  109.193276] iwlwifi 0000:46:00.0: 0x9CDFD781 | isr0
Oct 14 10:11:33 area51m kernel: [  109.193277] iwlwifi 0000:46:00.0: 0x6F56CB7A | isr1
Oct 14 10:11:33 area51m kernel: [  109.193277] iwlwifi 0000:46:00.0: 0xBD6EDF7A | isr2
Oct 14 10:11:33 area51m kernel: [  109.193278] iwlwifi 0000:46:00.0: 0xA26C5723 | isr3
Oct 14 10:11:33 area51m kernel: [  109.193278] iwlwifi 0000:46:00.0: 0xD443EA77 | isr4
Oct 14 10:11:33 area51m kernel: [  109.193279] iwlwifi 0000:46:00.0: 0xB09A6AE9 | last cmd Id
Oct 14 10:11:33 area51m kernel: [  109.193279] iwlwifi 0000:46:00.0: 0xBE737ABB | wait_event
Oct 14 10:11:33 area51m kernel: [  109.193280] iwlwifi 0000:46:00.0: 0x1297C630 | l2p_control
Oct 14 10:11:33 area51m kernel: [  109.193281] iwlwifi 0000:46:00.0: 0xDFEFB9A0 | l2p_duration
Oct 14 10:11:33 area51m kernel: [  109.193281] iwlwifi 0000:46:00.0: 0xFCF8D773 | l2p_mhvalid
Oct 14 10:11:33 area51m kernel: [  109.193282] iwlwifi 0000:46:00.0: 0x7413BFBC | l2p_addr_match
Oct 14 10:11:33 area51m kernel: [  109.193282] iwlwifi 0000:46:00.0: 0x5DAB9F55 | lmpm_pmg_sel
Oct 14 10:11:33 area51m kernel: [  109.193283] iwlwifi 0000:46:00.0: 0x6A74C0F4 | timestamp
Oct 14 10:11:33 area51m kernel: [  109.193283] iwlwifi 0000:46:00.0: 0x83A6EACD | flow_handler
Oct 14 10:11:33 area51m kernel: [  109.193539] iwlwifi 0000:46:00.0: Start IWL Error Log Dump:
Oct 14 10:11:33 area51m kernel: [  109.193540] iwlwifi 0000:46:00.0: Status: 0x00000080, count: 1719284251
Oct 14 10:11:33 area51m kernel: [  109.193541] iwlwifi 0000:46:00.0: 0x664B99DF | ADVANCED_SYSASSERT
Oct 14 10:11:33 area51m kernel: [  109.193541] iwlwifi 0000:46:00.0: 0x9C985A76 | umac branchlink1
Oct 14 10:11:33 area51m kernel: [  109.193542] iwlwifi 0000:46:00.0: 0xBBA6C907 | umac branchlink2
Oct 14 10:11:33 area51m kernel: [  109.193543] iwlwifi 0000:46:00.0: 0xE75F56AC | umac interruptlink1
Oct 14 10:11:33 area51m kernel: [  109.193543] iwlwifi 0000:46:00.0: 0xF86198E9 | umac interruptlink2
Oct 14 10:11:33 area51m kernel: [  109.193544] iwlwifi 0000:46:00.0: 0x648A1CC3 | umac data1
Oct 14 10:11:33 area51m kernel: [  109.193544] iwlwifi 0000:46:00.0: 0xBECDE773 | umac data2
Oct 14 10:11:33 area51m kernel: [  109.193545] iwlwifi 0000:46:00.0: 0xDFFB4CAC | umac data3
Oct 14 10:11:33 area51m kernel: [  109.193545] iwlwifi 0000:46:00.0: 0xA879F663 | umac major
Oct 14 10:11:33 area51m kernel: [  109.193546] iwlwifi 0000:46:00.0: 0xE79B3CF2 | umac minor
Oct 14 10:11:33 area51m kernel: [  109.193546] iwlwifi 0000:46:00.0: 0x979799ED | frame pointer
Oct 14 10:11:33 area51m kernel: [  109.193547] iwlwifi 0000:46:00.0: 0xF87F330A | stack pointer
Oct 14 10:11:33 area51m kernel: [  109.193548] iwlwifi 0000:46:00.0: 0x1D93A5DF | last host cmd
Oct 14 10:11:33 area51m kernel: [  109.193548] iwlwifi 0000:46:00.0: 0x6D57F1A4 | isr status reg
Oct 14 10:11:33 area51m kernel: [  109.193723] iwlwifi 0000:46:00.0: Fseq Registers:
Oct 14 10:11:33 area51m kernel: [  109.193841] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_ERROR_CODE
Oct 14 10:11:33 area51m kernel: [  109.193976] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_TOP_INIT_VERSION
Oct 14 10:11:33 area51m kernel: [  109.194111] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_CNVIO_INIT_VERSION
Oct 14 10:11:33 area51m kernel: [  109.194246] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_OTP_VERSION
Oct 14 10:11:33 area51m kernel: [  109.194382] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_TOP_CONTENT_VERSION
Oct 14 10:11:33 area51m kernel: [  109.194517] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_ALIVE_TOKEN
Oct 14 10:11:33 area51m kernel: [  109.194652] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_CNVI_ID
Oct 14 10:11:33 area51m kernel: [  109.194788] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | FSEQ_CNVR_ID
Oct 14 10:11:33 area51m kernel: [  109.194923] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | CNVI_AUX_MISC_CHIP
Oct 14 10:11:33 area51m kernel: [  109.195058] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | CNVR_AUX_MISC_CHIP
Oct 14 10:11:33 area51m kernel: [  109.195194] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM
Oct 14 10:11:33 area51m kernel: [  109.195329] iwlwifi 0000:46:00.0: 0xA5A5A5A2 | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR
Oct 14 10:11:33 area51m kernel: [  109.195333] iwlwifi 0000:46:00.0: Collecting data: trigger 2 fired.
Oct 14 10:11:33 area51m kernel: [  109.195334] ieee80211 phy0: Hardware restart was requested
Oct 14 10:11:49 area51m kernel: [  127.957605] nvidia-gpu 0000:01:00.3: i2c timeout error e0000040
Oct 14 10:11:50 area51m kernel: [  128.955900] ucsi_ccg 0-0008: i2c_transfer failed -110
Oct 14 10:11:50 area51m kernel: [  128.955903] ucsi_ccg 0-0008: failed to reset PPM!
Oct 14 10:11:50 area51m kernel: [  128.955904] ucsi_ccg 0-0008: PPM init failed (-110)
Oct 14 10:11:59 area51m kernel: [  138.185036] watchdog: BUG: soft lockup - CPU#0 stuck for 25s! [kworker/0:2:241]
Oct 14 10:11:59 area51m kernel: [  138.185038] Modules linked in: msr xfrm_user xfrm4_tunnel tunnel4 l2tp_ppp l2tp_netlink ipcomp xfrm_ipcomp l2tp_core ip6_udp_tunnel udp_tunnel esp4 pppox ah4 af_key xfrm_algo rfcomm ccm xt_CHECKSUM iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter cmac bnep binfmt_misc btusb btrtl btbcm btintel bluetooth ecdh_generic ecc uvcvideo nls_iso8859_1 hid_rmi rmi_core videobuf2_vmalloc snd_hda_codec_hdmi videobuf2_memops sof_pci_dev snd_sof_intel_hda_common videobuf2_v4l2 videobuf2_common snd_sof_intel_byt snd_sof_intel_ipc videodev snd_sof mc snd_sof_nocodec 8250_dw mei_hdcp intel_rapl_msr snd_hda_codec_realtek snd_sof_xtensa_dsp snd_soc_skl snd_hda_codec_generic snd_soc_hdac_hda ledtrig_audio snd_hda_ext_core snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress
Oct 14 10:11:59 area51m kernel: [  138.185055]  ac97_bus snd_pcm_dmaengine dell_wmi x86_pkg_temp_thermal intel_powerclamp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core iwlmvm snd_hwdep snd_pcm mac80211 kvm irqbypass snd_seq_midi libarc4 snd_seq_midi_event dell_smbios intel_cstate snd_rawmidi dcdbas intel_rapl_perf snd_seq joydev input_leds iwlwifi dell_wmi_descriptor snd_seq_device alienware_wmi idma64 snd_timer serio_raw intel_wmi_thunderbolt wmi_bmof mxm_wmi virt_dma cfg80211 r8125(OE) ucsi_ccg snd mei_me ucsi_acpi soundcore mei typec_ucsi processor_thermal_device intel_lpss_pci cros_ec_ishtp typec intel_rapl_common intel_lpss cros_ec_core intel_soc_dts_iosf intel_pch_thermal int3403_thermal int340x_thermal_zone intel_hid int3400_thermal ie31200_edac acpi_thermal_rel sparse_keymap mac_hid acpi_pad nvidia_uvm(OE) sch_fq_codel nfsd coretemp auth_rpcgss parport_pc nfs_acl ppdev lockd lp grace parport sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq libcrc32c algif_skcipher af_alg dm_crypt
Oct 14 10:11:59 area51m kernel: [  138.185072]  hid_logitech_hidpp hid_logitech_dj usbhid hid_sensor_custom hid_sensor_hub hid_generic intel_ishtp_loader intel_ishtp_hid nvidia_drm(POE) nvidia_modeset(POE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nvidia(POE) i915 aesni_intel i2c_algo_bit drm_kms_helper aes_x86_64 crypto_simd syscopyarea sysfillrect sysimgblt cryptd fb_sys_fops glue_helper psmouse nvme drm nvme_core thunderbolt ahci intel_ish_ipc ipmi_devintf libahci intel_ishtp ipmi_msghandler i2c_nvidia_gpu i2c_hid hid wmi video pinctrl_cannonlake pinctrl_intel
Oct 14 10:11:59 area51m kernel: [  138.185109] CPU: 0 PID: 241 Comm: kworker/0:2 Tainted: P           OE     5.3.5-050305-generic #201910071830
Oct 14 10:11:59 area51m kernel: [  138.185109] Hardware name: Alienware Alienware Area-51m/Alienware Area-51m, BIOS 1.7.2 8/5/2019
Oct 14 10:11:59 area51m kernel: [  138.185117] Workqueue: events iwl_fw_error_dump_wk [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185119] RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
Oct 14 10:11:59 area51m kernel: [  138.185120] Code: 00 e9 78 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 49 89 f8 b8 00
Oct 14 10:11:59 area51m kernel: [  138.185121] RSP: 0018:ffffb5b7805b7cb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Oct 14 10:11:59 area51m kernel: [  138.185135] RAX: ffffffffc20d5480 RBX: ffff9ef075c20018 RCX: 0000000000000003
Oct 14 10:11:59 area51m kernel: [  138.185135] RDX: 0000000008040005 RSI: 0000000000000246 RDI: 0000000000000246
Oct 14 10:11:59 area51m kernel: [  138.185136] RBP: ffffb5b7805b7cb8 R08: 0000000000003a98 R09: 0000000000000011
Oct 14 10:11:59 area51m kernel: [  138.185136] R10: ffffec631da88a08 R11: 0000000000005ebd R12: 00000000fffffff7
Oct 14 10:11:59 area51m kernel: [  138.185136] R13: ffffb5b7805b7cf0 R14: 0000000000400000 R15: ffff9ef082afacb8
Oct 14 10:11:59 area51m kernel: [  138.185137] FS:  0000000000000000(0000) GS:ffff9ef08b800000(0000) knlGS:0000000000000000
Oct 14 10:11:59 area51m kernel: [  138.185137] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 14 10:11:59 area51m kernel: [  138.185138] CR2: 000000fa1e2d4000 CR3: 000000031040a005 CR4: 00000000003606f0
Oct 14 10:11:59 area51m kernel: [  138.185138] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 14 10:11:59 area51m kernel: [  138.185138] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 14 10:11:59 area51m kernel: [  138.185139] Call Trace:
Oct 14 10:11:59 area51m kernel: [  138.185143]  iwl_trans_pcie_release_nic_access+0x61/0x70 [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185146]  iwl_trans_pcie_read_mem+0x94/0xc0 [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185150]  iwl_fw_dump_mem.isra.0.part.0+0x50/0x90 [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185153]  iwl_fw_error_dump_file.isra.0+0x436/0xf80 [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185156]  iwl_fw_dbg_collect_sync+0x35c/0x910 [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185157]  ? __switch_to_asm+0x40/0x70
Oct 14 10:11:59 area51m kernel: [  138.185158]  ? __switch_to+0x7f/0x470
Oct 14 10:11:59 area51m kernel: [  138.185159]  ? __switch_to_asm+0x40/0x70
Oct 14 10:11:59 area51m kernel: [  138.185175]  iwl_fw_error_dump_wk+0x59/0x80 [iwlwifi]
Oct 14 10:11:59 area51m kernel: [  138.185177]  process_one_work+0x1db/0x380
Oct 14 10:11:59 area51m kernel: [  138.185177]  worker_thread+0x4d/0x400
Oct 14 10:11:59 area51m kernel: [  138.185179]  kthread+0x104/0x140
Oct 14 10:11:59 area51m kernel: [  138.185179]  ? process_one_work+0x380/0x380
Oct 14 10:11:59 area51m kernel: [  138.185180]  ? kthread_park+0x80/0x80
Oct 14 10:11:59 area51m kernel: [  138.185181]  ret_from_fork+0x35/0x40
Oct 14 10:12:00 area51m kernel: [  138.695590] iwlwifi 0000:46:00.0: Failing on timeout while stopping DMA channel 8 [0xa5a5a5a2]
Oct 14 10:12:00 area51m kernel: [  138.721332] iwlwifi 0000:46:00.0: Applying debug destination EXTERNAL_DRAM
Oct 14 10:12:00 area51m kernel: [  138.849691] iwlwifi 0000:46:00.0: Applying debug destination EXTERNAL_DRAM
Oct 14 10:12:00 area51m kernel: [  138.928978] iwlwifi 0000:46:00.0: FW already configured (0) - re-configuring
Oct 14 10:12:00 area51m kernel: [  138.944244] iwlwifi 0000:46:00.0: BIOS contains WGDS but no WRDS
Oct 14 10:12:00 area51m kernel: [  138.993855] wlp70s0: deauthenticated from 80:2a:a8:12:0e:06 (Reason: 6=CLASS2_FRAME_FROM_NONAUTH_STA)
Oct 14 10:12:00 area51m kernel: [  139.115309] wlp70s0: authenticate with 80:2a:a8:12:0e:06
Oct 14 10:12:00 area51m kernel: [  139.119098] wlp70s0: send auth to 80:2a:a8:12:0e:06 (try 1/3)
Oct 14 10:12:00 area51m kernel: [  139.129635] iwlwifi 0000:46:00.0: Unhandled alg: 0x707
Oct 14 10:12:00 area51m kernel: [  139.160560] wlp70s0: authenticated
Oct 14 10:12:00 area51m kernel: [  139.163537] wlp70s0: associate with 80:2a:a8:12:0e:06 (try 1/3)
Oct 14 10:12:00 area51m kernel: [  139.166051] wlp70s0: RX AssocResp from 80:2a:a8:12:0e:06 (capab=0x411 status=0 aid=3)
Oct 14 10:12:00 area51m kernel: [  139.170306] wlp70s0: associated

This keeps happening in different incarnations. Debian's 4.15 works fine, as did 5.1.x and, I believe, 5.2.x. I can check 5.2 if necessary, but I don't have it installed at the moment.
Comment 1 Bas Vermeulen 2019-10-14 08:56:22 UTC
Created attachment 285495
dmesg output from two boots (fw 43 and 46)

kern.log from Debian, covering two boots, one with firmware 43 and one with firmware 46.
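
To compare the two boots in a kern.log like the attached one, a small script along the following lines could pull out the loaded firmware version, the "Queue ... stuck" events, and the register dump lines. This is only a sketch: the regular expressions are assumptions based on the message wording visible in the log above, the `summarize` helper is hypothetical, and other kernel versions may format these messages differently.

```python
import re
import sys

# Patterns assume the message formats seen in the log in this report.
FW_RE = re.compile(r"iwlwifi \S+: Loaded firmware version: (\S+)")
STUCK_RE = re.compile(
    r"iwlwifi \S+: Queue (\d+) is (?:in)?active on fifo (\d+) "
    r"and stuck for (\d+) ms"
)
REG_RE = re.compile(r"iwlwifi \S+: (0x[0-9A-Fa-f]{8}) \| (\S.*?)\s*$")


def summarize(lines):
    """Collect firmware versions, stuck-queue events and register dump lines."""
    summary = {"firmware": [], "stuck": [], "registers": []}
    for line in lines:
        m = FW_RE.search(line)
        if m:
            summary["firmware"].append(m.group(1))
            continue
        m = STUCK_RE.search(line)
        if m:
            queue, fifo, ms = (int(g) for g in m.groups())
            summary["stuck"].append({"queue": queue, "fifo": fifo, "ms": ms})
            continue
        m = REG_RE.search(line)
        if m:
            # Stored as (register name, value) in the order they were logged.
            summary["registers"].append((m.group(2), m.group(1)))
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 summarize_iwl.py kern.log
    with open(sys.argv[1], errors="replace") as log:
        result = summarize(log)
    print("firmware versions:", result["firmware"])
    print("stuck-queue events:", result["stuck"])
    print("register lines:", len(result["registers"]))
```

Running it once per boot (for example after splitting kern.log at each boot) would make it easier to see whether the stuck-queue events show up with both firmware versions.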