Created attachment 243241 [details] lspci -vv output The slowest resume device on the Dell XPS13 - 9630 is the 3a:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32) This laptop comes from dell pre-installed with Ubuntu 16.04 and so this is the "out of the box" customer experience.
Created attachment 243251 [details] analyze_suspend output on 4.4.0-23-generic looks like the fun begins with ath10k_pcie_cold_reset()
Created attachment 243261 [details] analyze_suspend output on 4.9.0-994-generic Reproduced with recent upstream-based kernel, as distributed here: http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/2016-10-20/ this kernel calls itself 4.9.0-994-generic I used this kernel because this machine has a new (KBL) graphics controller, which I'm not sure is fully supported in Linus' tree.
Created attachment 243271 [details] analyze_suspend output for freeze on 4.9-rc3 ath10k issue appears to be the same with Linux-4.9-rc3 have reproduced both with suspend to S3, and here, freeze to idle path.
I think the problem here is that starting the firmware takes too long. If Wake on Wireless LAN (WoWLAN) is disabled mac80211 stops the firmware during suspend and starts it again during resume. (But if WoWLAN would be enabled mac80211 would actually leave the firmware running and resume should be faster.) We should try to make starting the firmware faster but no idea how much we can speed it up. Other option is to modify mac80211 not to stop firmware but that will again increase power consumption.
How to enable WoWLAN to verify this theory? I don't see any BIOS SETUP options for this, is there an OS/driver option? Is there an option to mac80211 to allow the firmware to not stop during suspend? If yes, I'll be happy to measure the power before/after. Also, in the case that the firmware does need to be started, the driver could also make this asynchronous. The current synchronous design makes all of user-space wait while this driver sorts out its firmware.
(In reply to Len Brown from comment #5) > How to enable WoWLAN to verify this theory? > I don't see any BIOS SETUP options for this, is there an OS/driver option? I haven't tried WoWLAN myself but it should be possible to configure it with the iw tool. I found some documentation for wl18xx, which hopefully also applies to ath10k as they both use mac80211: http://processors.wiki.ti.com/index.php/WL18xx_Adding_WoWLAN > Is there an option to mac80211 to allow the firmware to not stop during > suspend? If yes, I'll be happy to measure the power before/after. I'm not aware of any way except to make an ugly hack to mac80211. > Also, in the case that the firmware does need to be started, the driver > could also make this asynchronous. The current synchronous design makes all > of user-space wait while this driver sorts out its firmware. I think mac80211 requires firmware bootup to be synchronous so that might be challenging to change. There might be a way to improve this as one guy was able to make the firmware bootup faster in ath10k. He hasn't been able to post a proper patch yet as he is busy with something else at the moment but hopefully he does that soon. Also ath10k devs (and wireless devs for that part) don't really follow bugzilla that closely. It would be easier, and get more people involved, if we could switch to ath10k and linux-wireless mailing lists: ath10k@lists.infradead.org linux-wireless@vger.kernel.org
Ryan now submitted a patch which should speed up the firmware upload from 2 seconds to 0.5 seconds: ath10k: improve the firmware download time for QCA6174 https://patchwork.kernel.org/patch/9575833/ Feedback very welcome. If a full tree is preferred the patch is in master-pending branch of my ath.git tree: https://git.kernel.org/cgit/linux/kernel/git/kvalo/ath.git/log/?h=master-pending
(In reply to Kalle Valo from comment #7) > Ryan now submitted a patch which should speed up the firmware upload from 2 > seconds to 0.5 seconds: > > ath10k: improve the firmware download time for QCA6174 > https://patchwork.kernel.org/patch/9575833/ Very nice. Thank you. It looks like there is a missing ; in there? > Feedback very welcome. If a full tree is preferred the patch is in > master-pending branch of my ath.git tree: > > https://git.kernel.org/cgit/linux/kernel/git/kvalo/ath.git/log/?h=master- > pending It’d be awesome if that commit would be tagged to be backported to the stable releases.
Merging the branch master-pending into the master branch of Linus’ tree, and building that, the WLAN does not work at all on the Dell XPS 13. ``` ath10k_pci 0000:3a:00.0: failed to fetch board data for bus=pci,vendor=168c,device=003e,subsystem-vendor=1a56,subsystem-device=1535,variant=RV_0520 from ath10k/QCA6174/hw3.0/board-2.bin ``` I’ll try to cherry-pick only the commits.
Created attachment 254791 [details] HTML output of `analyze_suspend.py` I cherry picked on top of v4.10-rc8-3-g747ae0a96f1a. ``` 63616c41d242 ath10k: improve the firmware download time for QCA6174 747ae0a96f1a Merge tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media ``` With that resume time went down by 1.5 seconds. ``` ieee80211 @ phy0 {ieee80211} async_device (Total Suspend: 89.181 ms Total Resume: 763.594 ms) ``` Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
(In reply to Paul Menzel from comment #8) > (In reply to Kalle Valo from comment #7) > > Ryan now submitted a patch which should speed up the firmware upload from 2 > > seconds to 0.5 seconds: > > > > ath10k: improve the firmware download time for QCA6174 > > https://patchwork.kernel.org/patch/9575833/ > > Very nice. Thank you. It looks like there is a missing ; in there? That was fixed in v2. I accidentally gave you a link to v1, sorry. > > Feedback very welcome. If a full tree is preferred the patch is in > > master-pending branch of my ath.git tree: > > > > https://git.kernel.org/cgit/linux/kernel/git/kvalo/ath.git/log/?h=master- > > pending > > It’d be awesome if that commit would be tagged to be backported to the > stable releases. At least in my opinion this is more like a new feature, not a bug fix. So I'm not sure if it fulfills the stable patch rules.
(In reply to Paul Menzel from comment #9) > Merging the branch master-pending into the master branch of Linus’ tree, and > building that, the WLAN does not work at all on the Dell XPS 13. > > ``` > ath10k_pci 0000:3a:00.0: failed to fetch board data for > bus=pci,vendor=168c,device=003e,subsystem-vendor=1a56,subsystem-device=1535, > variant=RV_0520 from ath10k/QCA6174/hw3.0/board-2.bin > ``` > > I’ll try to cherry-pick only the commits. I believe that's a regression caused by this comment: f2593cb1b291 ath10k: Search SMBIOS for OEM board file extension A workaround would be to update to the latest board-2.bin: https://github.com/kvalo/ath10k-firmware/commit/8d15818b0f9c7b09f743538ac2d3e1409779f52a But we will work on getting this fixed properly so that the latest kernel will continue to work with the older board-2.bin. Thanks for reporting this!
(In reply to Paul Menzel from comment #10) > Created attachment 254791 [details] > HTML output of `analyze_suspend.py` > > I cherry picked on top of v4.10-rc8-3-g747ae0a96f1a. > > ``` > 63616c41d242 ath10k: improve the firmware download time for QCA6174 > 747ae0a96f1a Merge tag 'media/v4.10-4' of > git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media > ``` > > With that resume time went down by 1.5 seconds. > > ``` > ieee80211 @ phy0 {ieee80211} async_device (Total Suspend: 89.181 ms Total > Resume: 763.594 ms) > ``` > > Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Nice, thanks for testing this. I added your Tested-by to the commit.
(In reply to Kalle Valo from comment #11) > (In reply to Paul Menzel from comment #8) > > (In reply to Kalle Valo from comment #7) […] > > > Feedback very welcome. If a full tree is preferred the patch is in > > > master-pending branch of my ath.git tree: > > > > > > https://git.kernel.org/cgit/linux/kernel/git/kvalo/ath.git/log/?h=master- > > > pending > > > > It’d be awesome if that commit would be tagged to be backported to the > > stable releases. > > At least in my opinion this is more like a new feature, not a bug fix. So > I'm not sure if it fulfills the stable patch rules. From the document `Documentation/process/stable-kernel-rules.rst` [1] I would see the items below justifying backporting it. 1. “It must fix a real bug that bothers people (not a, “This could be a problem...” type thing).” (It bothers all the laptop owners.) 2. “Serious issues as reported by a user of a distribution kernel may also be considered if they fix a notable performance or interactivity issue. As these fixes are not as obvious and have a higher risk of a subtle regression they should only be submitted by a distribution kernel maintainer and include an addendum linking to a bugzilla entry if it exists and additional information on the user-visible impact. New device IDs and quirks are also accepted.” The following points reject it though. 1. “oh, that’s not good” is subjective, but it’s probably *not* critical. 2. “It cannot be bigger than 100 lines, with context.” Is there a way to notify distributions to backport it? [1] https://static.lwn.net/kerneldoc/process/stable-kernel-rules.html
As I don’t see it yet in Linus’ branch, will that commit be included in Linux 4.11?
(In reply to Paul Menzel from comment #14) > (In reply to Kalle Valo from comment #11) > > (In reply to Paul Menzel from comment #8) > > > (In reply to Kalle Valo from comment #7) > > […] > > > > > Feedback very welcome. If a full tree is preferred the patch is in > > > > master-pending branch of my ath.git tree: > > > > > > > > > https://git.kernel.org/cgit/linux/kernel/git/kvalo/ath.git/log/?h=master- > > > > pending > > > > > > It’d be awesome if that commit would be tagged to be backported to the > > > stable releases. > > > > At least in my opinion this is more like a new feature, not a bug fix. So > > I'm not sure if it fulfills the stable patch rules. > > From the document `Documentation/process/stable-kernel-rules.rst` [1] I > would see the items below justifying backporting it. > > 1. “It must fix a real bug that bothers people (not a, “This could be a > problem...” type thing).” (It bothers all the laptop owners.) > 2. “Serious issues as reported by a user of a distribution kernel may also > be considered if they fix a notable performance or interactivity issue. As > these fixes are not as obvious and have a higher risk of a subtle regression > they should only be submitted by a distribution kernel maintainer and > include an addendum linking to a bugzilla entry if it exists and additional > information on the user-visible impact. > New device IDs and quirks are also accepted.” Ah, I haven't realised this. > The following points reject it though. > > 1. “oh, that’s not good” is subjective, but it’s probably *not* critical. > 2. “It cannot be bigger than 100 lines, with context.” Yeah, the first patch is around 400 lines so it is too big for stable. > Is there a way to notify distributions to backport it? I don't know. There's always the backports which people can use to run latest ath10k on older kernels but the project hasn't done a release for some time now so one has to get the latest version from git: https://backports.wiki.kernel.org/index.php/Main_Page > [1] https://static.lwn.net/kerneldoc/process/stable-kernel-rules.html (In reply to Paul Menzel from comment #15) > As I don’t see it yet in Linus’ branch, will that commit be included in > Linux 4.11? If all goes well the first release containing the patches will be 4.12-rc1. https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=ath-next&id=583a6629432ca95813a585a7117331ffe36fe939 https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=ath-next&id=912b6e8850a51a09ec771aedf2b4428ac9b34e20
Marking bug fixed.
Just an update that with Linux 4.13-rc1 it really takes that long, that means 0.5 seconds, to download the firmware. ath10k_bmi_fast_download [ath10k_core] (531.523 ms @ 184.159001) • ath10k_bmi_lz_stream_start [ath10k_core] (0.007 ms @ 184.159001) • ath10k_bmi_lz_data [ath10k_core] (531.359 ms @ 184.159008) • ath10k_bmi_lz_data [ath10k_core] (0.144 ms @ 184.690368) • ath10k_bmi_lz_stream_start [ath10k_core] (0.012 ms @ 184.690512) I created bug #196413 [1] to improve the time further. [1] https://bugzilla.kernel.org/show_bug.cgi?id=196413