Created attachment 274979 [details]
dmesg from 4.16-rc7
4.16 regresses iwlwifi on the Intel Corporation Wireless 8260 (rev 3a) wireless card in a Dell Precision 7510. The iwlwifi driver consistently fails to initialize immediately after boot with "iwlwifi 0000:02:00.0: swiotlb: coherent allocation failed, size=4096" (full dmesg with stack trace attached).
Allocating 4K of coherent memory can't be something overwhelmingly complicated... Does this work on 4.15? I doubt it is really a bug in iwlwifi... This is very early in the init process and this flow hasn't changed for a while. Nevertheless, if we see that this is a regression, it may help to nail down the problem.
> Does this work on 4.15?
Yes, it does. 4.15.0 is fine. For the record, this is off an Ubuntu mainline build (http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc7/), but past experience has shown these to be pretty reliable: the changes are entirely in configuration/packaging and the patchset is not very big.
Anything else I can help with that would point to a more specific problem?
Can you try with 4.15 and our master branch from https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi.git/ ? If that works, we'll be able to bisect maybe.
Created attachment 275097 [details]
dmesg from 4.15 + backport-iwlwifi
backport-iwlwifi from commit f75f445080d1eb6059cc29ff5ab55ad12d80b937 fails in new and exciting ways. dmesg attached - something to do with the firmware. You know this code better - is this before or after the coherent allocation that fails on 4.16?
Way after :) But it is... weird. It means that we have a mismatch in the features that are advertised by the firmware. The firmware is angry at the driver because it didn't expect the TIME_QUOTA_CMD, and that is because the firmware has this logic offloaded now. But if that's the case, the firmware should have advertised this... Anyway, separate issue. Should be fine with:
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/quota.c b/drivers/net/wireless/intel/iwlwifi/mvm/quota.c
index 03cd22e88ab0..af837a91fe53 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/quota.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/quota.c
@@ -279,7 +279,7 @@ int iwl_mvm_update_quotas(struct iwl_mvm *mvm,
 	lockdep_assert_held(&mvm->mutex);
 
 	if (fw_has_capa(&mvm->fw->ucode_capa,
-			IWL_UCODE_TLV_CAPA_DYNAMIC_QUOTA))
+			IWL_UCODE_TLV_CAPA_DYNAMIC_QUOTA) || true)
 		return 0;
 
 	/* update all upon completion */
What we do see is that the firmware is loaded... So I am fearing that we need to look in the swiotlb code rather than in iwlwifi....
> So I am fearing that we need to look in the swiotlb code rather than in
> iwlwifi....
Oh, fun. How do I help? :)
First I'd like to know that it works with the small patch inline in my previous comment.
Created attachment 275099 [details]
dmesg from 4.15 + backport-iwlwifi + patch
Nope, fails with microcode error.
Luca just said there is a problem with the firmware version. Please upgrade the firmware from https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/linux-firmware.git/ thanks.
4.15.15 (gentoo-sources)
iwlwifi-stack-public:master:6932:7803aa0b
firmware version 36.e91976c0.0
This works for me.
4.16.1 (gentoo-sources)
iwlwifi-stack-public:master:6932:7803aa0b
firmware version 36.e91976c0.0
This still fails with "coherent allocation failed".
That does kind of point to it likely being a DMA/swiotlb regression. It looks like there was a refactor of coherent buffer allocation.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2382dc9a3eca644147be83dd2cd0dd64dc9e3e8c
I am moving this bug to the relevant people (CC'ed Christoph). I am pretty sure that IOMMU isn't the right component, OTOH, I couldn't find anything related to SWIOTLB in the components... Intel WiFi folks are still CC'ed to this bug.
I just noticed this on the upstream master:
> swiotlb: fix unexpected swiotlb_alloc_coherent failures
> The code refactoring by commit 0176adb00406 ("swiotlb: refactor coherent
> buffer allocation") made swiotlb_alloc_buffer almost always failing due
> to a thinko: namely, the function evaluates the dma_coherent_ok call
> incorrectly and dealing as if it's invalid. This ends up with weird
> errors like iwlwifi probe failure or amdgpu screen flickering.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9e7f06c8beee304ee21b791653fefcd713f48b9a
It's not on the stable 4.16.y branch yet, but I manually applied it to 4.16.2, and I can confirm that it fixed the iwlwifi issues for me.
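For readers following along, here is a minimal user-space sketch of the inverted check the commit message describes. It is not the kernel code: `struct fake_dev`, `coherent_ok`, `alloc_buggy` and `alloc_fixed` are hypothetical names, and `coherent_ok` is only a simplified model of what `dma_coherent_ok` decides (whether the whole buffer lies within the device's coherent DMA mask).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device: only its coherent DMA mask matters here. */
struct fake_dev {
	uint64_t coherent_dma_mask;
};

/* Simplified model of dma_coherent_ok(): true when the whole buffer is reachable. */
bool coherent_ok(const struct fake_dev *dev, uint64_t addr, size_t size)
{
	return addr + size - 1 <= dev->coherent_dma_mask;
}

/* Inverted check: a perfectly usable address is treated as a failure. */
bool alloc_buggy(const struct fake_dev *dev, uint64_t addr, size_t size)
{
	if (coherent_ok(dev, addr, size))
		return false;	/* bails out exactly when the address is fine */
	return true;
}

/* Negation restored: fail only when the address is out of reach. */
bool alloc_fixed(const struct fake_dev *dev, uint64_t addr, size_t size)
{
	if (!coherent_ok(dev, addr, size))
		return false;
	return true;
}
```

Under these assumptions, a 4096-byte buffer at a low address on a device with a 32-bit mask is rejected by `alloc_buggy` and accepted by `alloc_fixed`, which would match the "coherent allocation failed, size=4096" symptom reported above.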
*** Bug 199447 has been marked as a duplicate of this bug. ***
Created attachment 275465 [details]
Boot with 4.16.3 kernel still fails to load iwlwifi driver
I manually downloaded and compiled the 4.16.3 kernel using the default configuration for openSUSE Tumbleweed. I still get a failure when trying to load the iwlwifi driver.
(In reply to Stuart from comment #15)
> Created attachment 275465 [details]
> Boot with 4.16.3 kernel still fails to load iwlwifi driver
If you look at the 4.16.y stable branch, the fix commit is nowhere to be found.
Thanks, the openSUSE team picked up the change in their 4.16.2 kernel update:
# diff /lib/modules/4.16.2-1-default/source/lib/swiotlb.c /srv/ftp/pub/kernel/swiotlb.c
735c735
< if (!dma_coherent_ok(dev, *dma_handle, size))
---
> if (dma_coherent_ok(dev, *dma_handle, size))
iwlwifi is working now:
[ 5.249858] Intel(R) Wireless WiFi driver for Linux
[ 5.249859] Copyright(c) 2003- 2015 Intel Corporation
[ 5.249894] iwlwifi 0000:03:00.0: enabling device (0000 -> 0002)
[ 5.251436] iwlwifi 0000:03:00.0: loaded firmware version 36.e91976c0.0 op_mode iwlmvm
[ 5.284221] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 8260, REV=0x208
[ 5.367985] iwlwifi 0000:03:00.0: base HW address: 44:85:00:4a:92:9b
[ 5.408014] input: Lid Switch as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:18/PNP0C09:01/PNP0C0D:00/input/input5
[ 5.454412] ieee80211 phy0: Failed to initialize wep: -2
[ 5.454434] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
Fedora still hasn't picked up this fix, and just released a problematic 4.16.3 as an update yesterday.