Created attachment 277929 [details]
dmesg

I see this oops every couple of days on my Intel 8265 (in master mode) running a vanilla 4.14.52 kernel (Alpine Linux 3.8); dmesg attached. Some searching turned up a very similar oops that someone had posted on pastebin, as well as on GitHub (https://gist.github.com/aplund/7ba82370be0388abfa1974d13102ae9a), but I was unable to find a matching issue in the issue tracker.
Created attachment 277931 [details] iwlwifi.ko
So we fail here (last line):

0000000000008e7d <iwl_trans_pcie_txq_enable>:
    8e7d: e8 00 00 00 00          callq  8e82 <iwl_trans_pcie_txq_enable+0x5>
    8e82: 41 57                   push   %r15
    8e84: 41 56                   push   %r14
    8e86: 49 89 fe                mov    %rdi,%r14
    8e89: 41 55                   push   %r13
    8e8b: 41 54                   push   %r12
    8e8d: 49 89 cd                mov    %rcx,%r13
    8e90: 55                      push   %rbp
    8e91: 53                      push   %rbx
    8e92: 41 89 d4                mov    %edx,%r12d
    8e95: 89 f3                   mov    %esi,%ebx
    8e97: 48 83 ec 20             sub    $0x20,%rsp
    8e9b: 65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
    8ea2: 00 00
    8ea4: 48 89 44 24 18          mov    %rax,0x18(%rsp)
    8ea9: 31 c0                   xor    %eax,%eax
    8eab: 48 63 c6                movslq %esi,%rax
    8eae: 66 89 54 24 02          mov    %dx,0x2(%rsp)
    8eb3: 4c 8b bc c7 08 7e 00    mov    0x7e08(%rdi,%rax,8),%r15
    8eba: 00
    8ebb: f0 48 0f ab 87 08 8e    lock bts %rax,0x8e08(%rdi)
    8ec2: 00 00
    8ec4: 73 28                   jae    8eee <iwl_trans_pcie_txq_enable+0x71>
    8ec6: 80 3d 00 00 00 00 00    cmpb   $0x0,0x0(%rip)        # 8ecd <iwl_trans_pcie_txq_enable+0x50>
    8ecd: 75 1f                   jne    8eee <iwl_trans_pcie_txq_enable+0x71>
    8ecf: 48 c7 c7 00 00 00 00    mov    $0x0,%rdi
    8ed6: 44 89 44 24 04          mov    %r8d,0x4(%rsp)
    8edb: c6 05 00 00 00 00 01    movb   $0x1,0x0(%rip)        # 8ee2 <iwl_trans_pcie_txq_enable+0x65>
    8ee2: e8 00 00 00 00          callq  8ee7 <iwl_trans_pcie_txq_enable+0x6a>
    8ee7: 0f 0b                   ud2
    8ee9: 44 8b 44 24 04          mov    0x4(%rsp),%r8d
    8eee: 44 89 c7                mov    %r8d,%edi
    8ef1: e8 00 00 00 00          callq  8ef6 <iwl_trans_pcie_txq_enable+0x79>
    8ef6: 4d 85 ed                test   %r13,%r13
    8ef9: 49 89 47 70             mov    %rax,0x70(%r15)

Clearly, r15 is 0. r15 is loaded by

    mov    0x7e08(%rdi,%rax,8),%r15

which tells me that r15 must be the pointer to the txq. rdi is the first parameter to the function (trans), and rax holds txq_id. rax is not a parameter register in the calling convention, but movslq %esi,%rax has copied the second parameter (txq_id) into it.

The txq assignment is:

    struct iwl_txq *txq = trans_pcie->txq[txq_id];

Bottom line, txq is NULL...

Note that we tried (and failed) to open an AMPDU session shortly before the crash, and this is clearly not a classic scenario. I really don't see how trans_pcie->txq[txq_id] could be NULL...
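To make the address arithmetic above easier to follow, here is a minimal C sketch of what the two instructions compute. The constants 0x7e08, 0x8e08, 8 and 0x70 come straight from the disassembly; the function names are hypothetical, and the slot count assumes that the bitmap at 0x8e08 directly follows the txq pointer array, which the disassembly alone does not prove.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets read from the disassembly of iwl_trans_pcie_txq_enable. */
enum {
    TXQ_ARRAY_OFFSET = 0x7e08, /* displacement in: mov 0x7e08(%rdi,%rax,8),%r15 */
    TXQ_SLOT_SIZE    = 8,      /* scale factor in the same instruction */
    QUEUE_BITMAP_OFF = 0x8e08, /* displacement in: lock bts %rax,0x8e08(%rdi) */
    TXQ_FIELD_OFFSET = 0x70    /* displacement in: mov %rax,0x70(%r15) */
};

/* Address of the pointer slot that gets loaded into r15 (hypothetical helper). */
uint64_t txq_slot_address(uint64_t trans, int32_t txq_id)
{
    /* movslq %esi,%rax sign-extends the 32-bit txq_id to 64 bits. */
    return trans + TXQ_ARRAY_OFFSET + (uint64_t)(int64_t)txq_id * TXQ_SLOT_SIZE;
}

/* Address written by the faulting store, given the loaded txq pointer. */
uint64_t store_address(uint64_t txq)
{
    return txq + TXQ_FIELD_OFFSET;
}

/* ASSUMPTION: the bitmap starts right after the pointer array. */
uint64_t txq_slot_count(void)
{
    return (QUEUE_BITMAP_OFF - TXQ_ARRAY_OFFSET) / TXQ_SLOT_SIZE;
}

/* Sanity checks on the arithmetic; call from main() if desired. */
void check_model(void)
{
    assert(txq_slot_address(0, 0) == 0x7e08);
    assert(store_address(0) == 0x70); /* NULL txq: the store targets address 0x70 */
}
```

Under that assumption the array would hold 512 slots, so a txq_id at or above 512 would read past the array instead of from an unallocated slot. This is only a model of the arithmetic, not the driver's code.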
If only we knew what the value of txq_id was... Can you load iwlwifi with debug=0x80000000?
Ping? Is this still reproducible?
Please re-open if you have the data we asked for.