Bug 197281 - iwlwifi: Oops in iwl_mvm_set_tx_cmd
Status: CLOSED DUPLICATE of bug 197279
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: Intel Linux
Importance: P1 normal
Assignee: DO NOT USE - assign "network-wireless-intel" component instead
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-14 12:24 UTC by David Weber
Modified: 2017-11-13 08:14 UTC

See Also:
Kernel Version: 4.13.6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
pstore (50.87 KB, text/plain)
2017-10-14 12:24 UTC, David Weber
iwlmvm.ko (344.62 KB, application/x-object)
2017-10-15 09:51 UTC, David Weber
iwlmvm.o (343.16 KB, application/x-object)
2017-10-15 09:52 UTC, David Weber
tx.o (23.66 KB, application/x-object)
2017-10-15 09:52 UTC, David Weber
tdls.o (7.88 KB, application/x-object)
2017-10-15 09:53 UTC, David Weber
iwlwifi.ko (283.54 KB, application/x-object)
2017-10-15 09:59 UTC, David Weber
iwlwifi.ko (2.06 MB, application/x-object)
2017-10-15 13:24 UTC, David Weber
dmesg (183.85 KB, text/plain)
2017-10-17 15:29 UTC, David Weber
dmesg (101.59 KB, text/plain)
2017-10-19 14:33 UTC, David Weber

Description David Weber 2017-10-14 12:24:26 UTC
Created attachment 258827
pstore

Hello,

I own a Dell XPS 13 9343 with an Intel Wireless 7265 chip.
Everything works fine on all Wi-Fi networks except the one at my university (PEAP authentication). After some time (anywhere from a few minutes to a few hours) no data goes through the network anymore. Nothing interesting shows up in dmesg, and NetworkManager reports that I'm still connected. If I press disconnect, I immediately get an Oops. I attached the data that was stored in pstore.

Please tell me if you need more information.

Cheers
David
Comment 1 Emmanuel Grumbach 2017-10-14 17:26:06 UTC
I understand that there are two issues here: the lack of traffic and the Oops on disconnection, right?

Regarding the Oops, we'd need the actual object file so that we can check exactly where it is failing.

Thanks.
Comment 2 David Weber 2017-10-15 09:51:27 UTC
Created attachment 258835
iwlmvm.ko
Comment 3 David Weber 2017-10-15 09:52:05 UTC
Created attachment 258837
iwlmvm.o
Comment 4 David Weber 2017-10-15 09:52:42 UTC
Created attachment 258839
tx.o
Comment 5 David Weber 2017-10-15 09:53:02 UTC
Created attachment 258841
tdls.o
Comment 6 Emmanuel Grumbach 2017-10-15 09:53:57 UTC
iwlmvm.ko is enough :)
No need for all the .o :)
Comment 7 David Weber 2017-10-15 09:55:38 UTC
(In reply to Emmanuel Grumbach from comment #1)
> I understand that there are two issues here: the lack of traffic and the
> Oops on disconnection, right?
They are related. Disconnection usually works fine on this network. The Oops only happens when I disconnect after traffic stops.

> 
> Regarding the Oops, we'd need the actual object file so that we can check
> exactly where it is failing.

I attached iwlmvm.o, tx.o, tdls.o and iwlmvm.ko. Tell me if you need anything else.
Comment 8 Emmanuel Grumbach 2017-10-15 09:57:29 UTC
Actually, the Oops is in iwlwifi.ko. Please attach that one.

Thanks.
Comment 9 David Weber 2017-10-15 09:59:17 UTC
Created attachment 258843
iwlwifi.ko
Comment 10 Emmanuel Grumbach 2017-10-15 12:29:41 UTC
There is no debug info in this object file. Can you please enable frame pointers in the kernel compilation?

I can't see much w/o more debug from the compiler. The offset in iwl_trans_pcie_tx is quite big (0x8b9), so I can't even guess where the crash happened.
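As an illustration of the offset mentioned above, here is a minimal Python sketch that splits a `symbol+0xOFFSET/0xSIZE` string, the form used in kernel Oops traces, into its parts. The helper name `parse_oops_symbol` is hypothetical, and the function size is treated as optional because it is not quoted in this comment.

```python
import re

# "symbol+0xOFFSET" with an optional "/0xSIZE" suffix, as printed in Oops traces.
_SYM_RE = re.compile(
    r"^(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-fA-F]+)"
    r"(?:/0x(?P<size>[0-9a-fA-F]+))?$"
)


def parse_oops_symbol(text):
    """Return (symbol, offset, size_or_None) for a 'func+0xoff[/0xsize]' string."""
    match = _SYM_RE.match(text.strip())
    if match is None:
        raise ValueError("not a symbol+offset string: %r" % text)
    size = match.group("size")
    return (
        match.group("sym"),
        int(match.group("off"), 16),
        int(size, 16) if size else None,
    )


# The offset quoted in this comment; the size is unknown here, so it is omitted.
print(parse_oops_symbol("iwl_trans_pcie_tx+0x8b9"))
# → ('iwl_trans_pcie_tx', 2233, None)
```

Once the module is built with debug info, a symbol and offset like this can be resolved to a source line, for example with gdb's `list *(iwl_trans_pcie_tx+0x8b9)` on the .ko file or with the kernel's `scripts/faddr2line`.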
Comment 11 David Weber 2017-10-15 12:38:29 UTC
(In reply to Emmanuel Grumbach from comment #10)
> There is no debug info in this object file. Can you please enable frame
> pointers in the kernel compilation?

CONFIG_FRAME_POINTER=y was already set, but CONFIG_DEBUG_INFO=y was missing. I'll try to catch an Oops with this option enabled.
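The check described in this comment can be sketched in Python as follows. This is only an illustration: the helper names and the sample config text are hypothetical, and in practice the text would come from the kernel's `.config` file.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* names set to 'y' or 'm' in .config-style text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value in ("y", "m"):
            enabled.add(name)
    return enabled


def missing_debug_options(
    config_text, wanted=("CONFIG_FRAME_POINTER", "CONFIG_DEBUG_INFO")
):
    """List the wanted options that are not enabled in the given config text."""
    on = enabled_options(config_text)
    return [opt for opt in wanted if opt not in on]


# Hypothetical excerpt matching the situation described in this comment.
sample = """\
CONFIG_FRAME_POINTER=y
# CONFIG_DEBUG_INFO is not set
"""
print(missing_debug_options(sample))  # → ['CONFIG_DEBUG_INFO']
```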
Comment 12 David Weber 2017-10-15 13:24:44 UTC
Created attachment 258845
iwlwifi.ko

Here's the new iwlwifi.ko with debug symbols. I guess the offset changed because of the recompilation. I'll try to get a new Oops.
Comment 13 David Weber 2017-10-17 15:29:11 UTC
Created attachment 260245
dmesg
Comment 14 David Weber 2017-10-17 15:33:43 UTC
This time the traffic stopped as usual, but I could disconnect and connect again. I hope the stuff in dmesg is useful. This also happened at my university. On every other Wi-Fi network everything works fine.
Comment 15 Emmanuel Grumbach 2017-10-17 15:51:32 UTC
Nope - not the same problem.

I do see a SYSASSERT that reminds me of another FW bug.

Can you please move to -31.ucode?
Comment 16 David Weber 2017-10-17 16:20:56 UTC
(In reply to Emmanuel Grumbach from comment #15)
> Nope - not the same problem.
> 
> I do see a SYSASSERT that reminds me of another FW bug.
> 
> Can you please move to -31.ucode?

Where can I get this? The newest I can see is iwlwifi-7265D-29.ucode
I checked https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/linux-firmware.git/tree/ and https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/
Comment 17 Luca Coelho 2017-10-17 17:19:45 UTC
iwlwifi-7265D-29.ucode is the latest version for 7265D.  The -31 is not supported for this NIC.

But you could try the newer one that I published last week from here: https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/linux-firmware.git/plain/iwlwifi-7265D-29.ucode

It includes some fixes compared to the version in the official linux-firmware.git tree.  I already sent a pull request, but it will take a while for it to be merged there.
Comment 18 David Weber 2017-10-17 17:29:24 UTC
OK, I updated to this version (29.588277.0) and will see if the problem appears again.
Comment 19 David Weber 2017-10-19 14:33:03 UTC
Created attachment 260293
dmesg

Again the traffic stopped but I could disconnect and connect again. Attached is the dmesg output.
Comment 20 Emmanuel Grumbach 2017-10-20 08:09:25 UTC
This now really looks like https://bugzilla.kernel.org/show_bug.cgi?id=197279.
Comment 21 Luca Coelho 2017-11-13 08:13:54 UTC
Yes, now we have the same BAD_COMMAND happening.  Marking as duplicate.

*** This bug has been marked as a duplicate of bug 197279 ***
