Bug 208649
Summary: | Bisected - System wakes up asap after standby | ||
---|---|---|---|
Product: | Drivers | Reporter: | Martin (mwolf) |
Component: | Sound(ALSA) | Assignee: | Jaroslav Kysela (perex) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | kai.heng.feng, tiwai |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.7.x | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Alsa-Info
Test fix patch Debug patch full dmesg __azx_runtime_suspend(chip); debug power-save debug power-save-controller-debug power-save-controller-corrected-debug snd_hda_intel.enable=1,0 snd_hda_intel.enable=0,1 Test fix patch #2 Another try, minimalistic changes Revised patch |
Description
Martin
2020-07-21 16:35:48 UTC
This should have been addressed in the later commit 8d6762af302d69f76fa788a277a56a9d9cd275d5 ALSA: hda: Always use jackpoll helper for jack update after resume that is odd, because I still have it with Kernel 5.7.9 and Kernel 5.8.0-rc5 So it might be another issue? Try to run bisect with applying 8d6762af302 manually when c4c8dd6e is committed. IOW, if you apply 8d6762af302 on top of c4c8dd6e, does the problem still appear? Can you please elaborate in detail how I do that? I don't want to make a mistake here ... You tried bisection and it pointed to the commit c4c8dd6e. And try to cherry-pick 8d6762af302 on top of the commit c4c8dd6e, and see whether this state still shows the problem. If yes, it implies that the fix in commit 8d6762af302 doesn't suffice. Or if it doesn't show the problem, the problem is something different, introduced by another commit at a later point. thank you I hope I did it right. 1. git reset --hard c4c8dd6e 2. git cherry-pick 8d6762af302 3. make localmodconfig 4. make -j10 sadly it does not compile. Did I do something wrong? make[1]: *** [scripts/Makefile.build:101: arch/x86/kernel/asm-offsets.s] Fehler 1 make: *** [Makefile:1113: prepare0] Fehler 2 make: *** Es wird auf noch nicht beendete Prozesse gewartet.... In Datei, eingebunden von ./include/linux/fs.h:27, von ./include/linux/cgroup.h:17, von ./include/linux/memcontrol.h:13, von ./include/linux/swap.h:9, von ./include/linux/suspend.h:5, von arch/x86/kernel/asm-offsets.c:13: ./include/linux/fiemap.h:23:3: Fehler: Unbekannter Typname »get_block_t«; meinten Sie »seqlock_t«? 23 | get_block_t *get_block); | ^~~~~~~~~~~ | seqlock_t HDRINST usr/include/linux/atm.h HDRINST usr/include/linux/minix_fs.h HDRINST usr/include/linux/if_hippi.h HDRINST usr/include/linux/igmp.h HDRINST usr/include/linux/nbd-netlink.h HDRINST usr/include/linux/tls.h In Datei, eingebunden von ./include/linux/cgroup.h:17, von ./include/linux/memcontrol.h:13, von ./include/linux/swap.h:9, von ./include/linux/suspend.h:5, von arch/x86/kernel/asm-offsets.c:13: ./include/linux/fs.h:1746:8: Fehler: Redefinition von »struct fiemap_extent_info« 1746 | struct fiemap_extent_info { | ^~~~~~~~~~~~~~~~~~ In Datei, eingebunden von ./include/linux/fs.h:27, von ./include/linux/cgroup.h:17, von ./include/linux/memcontrol.h:13, von ./include/linux/swap.h:9, von ./include/linux/suspend.h:5, von arch/x86/kernel/asm-offsets.c:13: ./include/linux/fiemap.h:8:8: Anmerkung: ursprünglich hier definiert 8 | struct fiemap_extent_info { | ^~~~~~~~~~~~~~~~~~ In Datei, eingebunden von ./include/linux/cgroup.h:17, von ./include/linux/memcontrol.h:13, von ./include/linux/swap.h:9, von ./include/linux/suspend.h:5, von arch/x86/kernel/asm-offsets.c:13: ./include/linux/fs.h:1753:5: Fehler: In Konflikt stehende Typen für »fiemap_fill_next_extent« 1753 | int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical, | ^~~~~~~~~~~~~~~~~~~~~~~ In Datei, eingebunden von ./include/linux/fs.h:27, von ./include/linux/cgroup.h:17, von ./include/linux/memcontrol.h:13, von ./include/linux/swap.h:9, von ./include/linux/suspend.h:5, von arch/x86/kernel/asm-offsets.c:13: ./include/linux/fiemap.h:18:5: Anmerkung: Vorherige Deklaration von »fiemap_fill_next_extent« war hier 18 | int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical, (In reply to Martin from comment #6) > 1. git reset --hard c4c8dd6e At this point you should try to build the kernel once and confirm that it's broken. Do you get the same build error now? > 2. git cherry-pick 8d6762af302 After confirming the bad kernel, do cherry-pick, build and retest. Hi, I probably have broken something in the git tree while figuring out how to proceed, so I checked it out again and ran both commands. (git reset --hard c4c8dd6e && git cherry-pick 8d6762af302) Now I got a bootable build. (Just for me to double-check, was this the correct way?) Sadly my system wakes up with that kernel. What do I do now? @Takashi I have some free time over the weekend, is there something I can do? This Problem is really annoying for me. Hm, then please double-check whether the commit 9479e75fca does *not* show the problem. % git reset --hard 9479e75fca make, install and test I can confirm, that a kernel built from what you suggested, does NOT show the problem and stays therefore in standby. OK, then it's likely the change to call pm_runtime_force_suspend() in azx_suspend(). Try the change below: --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1026,7 +1026,7 @@ static int azx_suspend(struct device *dev) chip = card->private_data; bus = azx_bus(chip); snd_power_change_state(card, SNDRV_CTL_POWER_D3hot); - pm_runtime_force_suspend(dev); + __azx_runtime_suspend(chip); if (bus->irq >= 0) { free_irq(bus->irq, chip); bus->irq = -1; This may possibly break the resume side, but at least we can know whether that's the cause. BTW, is it about S3 (suspend/resume), or really stand-by mode? Last but not least: this must be very specific to the platform. Please give alsa-info.sh output to Bugzilla. Run the script with --no-upload option and attach to Bugzilla. Created attachment 290495 [details]
Alsa-Info
Yeah, it is about S3. I am patching now and will try again. patch -p0 < ../patch.patch can't find file to patch at input line 4 Perhaps you used the wrong -p or --strip option? The text leading up to this was: -------------------------- | |--- a/sound/pci/hda/hda_intel.c |+++ b/sound/pci/hda/hda_intel.c -------------------------- "azx_runtime_suspend(chip);" is already there did I do something wrong? Apply the patch with -p1. But it'd be faster for you to edit the code directly, it's just oneliner. Oh, maybe it wasn't clear: try to modify the code that is broken (i.e. commit c4c8dd6e and cherry-pick 8d6762af3), then replace the call of pm_runtime_force_suspend(dev); in azx_suspend() with __azx_runtime_suspend(chip); right, I mixed it up. c4c8dd6e + 8d6762af3 + "__azx_runtime_suspend(chip);" keeps my System in Standby. So it's a positive effect? The next test to try is the patch below instead of the previous oneliner. Scratch the previous change and apply it on the top of c4c8dd6e + 8d6762af3. Created attachment 290497 [details]
Test fix patch
With your new patch ontop of c4c8dd6e + 8d6762af3 my system sadly wakes up automatically. Hmm, strange. Could you try to remove or comment out the two lines in azx_runtime_suspend()? if (snd_power_get_state(card) == SNDRV_CTL_POWER_D0) enable_wakeen(chip); The spurious wake up should be driven from this WAKEEN bit. But this must have been cleared in azx_suspend(). If removing enable_wakeen() works, it implies that disable_wakeen() call in azx_suspend() had no effect. if I comment these two out, the module does not build. It can't be. Check it again. You might have deleted too much or too little. you were right, I used # instead of /* */ sorry for that. But sadly it still wakes up. OK, then apply the patch below instead of the previous one. The enable_wakeen() is already commented out there. This patch contains the debug prints during runtime and system-suspend/resume. So, boot with "no_console_suspend" boot option and get the dmesg output after suspend/resume, and attach to Bugzilla. Created attachment 290541 [details]
Debug patch
Created attachment 290543 [details]
full dmesg
just curious, is no_console_suspend or no_console_suspend=1 the correct way to set it? Thanks. It's no_console_suspend, without anything else. There is nothing obviously wrong in the call pattern. The only slight concern is the azx_runtime_resume() call just before azx_runtime_suspend() call at the system suspend time. Could you replace again the call of pm_runtime_force_suspend(dev) with __azx_runtime_suspend(chip) on this already patched kernel, retest (and confirm it working), and give the dmesg output? ok, I did the following: snd_power_change_state(card, SNDRV_CTL_POWER_D3hot); /* pm_runtime_force_suspend(dev); */ __azx_runtime_suspend(chip); now I will install the kernel and reboot reboot. Created attachment 290547 [details]
__azx_runtime_suspend(chip); debug
It stayed in standby, but when I woke it, it took ages to wake up.
Thanks. The result indicated that the latter case didn't contain the runtime resume before suspend. To make it clearer, could you do a couple of more test? Namely: - Go back to c4c8dd6e + 8d6762af3 + debug patch in comment 26. Boot with no_console_suspend option. Then, turn off the runtime suspend (as root): echo 0 > /sys/module/snd_hda_intel/parameters/power_save Test suspend/resume, save the dmesg output. - If the result is positive (no spurious wakeup), try echo 1 > /sys/module/snd_hda_intel/parameters/power_save Then test suspend/resume and confirm that the spurious wakeup returns. Save the dmesg output again. - Now, try another in addition to the previous state: echo 0 > /sys/module/snd_hda_intel/parameters/power_save_controller Then test again suspend/resume, and again save the dmesg output. Created attachment 290571 [details]
power-save debug
it woke up by itself when I set
echo 0 > /sys/module/snd_hda_intel/parameters/power_save
Created attachment 290573 [details]
power-save-controller-debug
fresh boot
(just to be sure)
set echo 1 > /sys/module/snd_hda_intel/parameters/power_save
standby
woke up on itself
set echo 0 > /sys/module/snd_hda_intel/parameters/power_save_controller
standby
stayed asleep
Created attachment 290575 [details]
power-save-controller-corrected-debug
sorry I made an error.
ignore my last post.
I forgot for a moment, that my pc only wakes once on itself. the following standby cycles are normal.
this time I only set
echo 0 > /sys/module/snd_hda_intel/parameters/power_save_controller
and system woke again automatically
Thanks. So the all tests with those options don't seem influencing on the spurious wakeup behavior. Interesting. The problem is much deeper (and more mysterious) than I expected, unfortunately. Your system has two different HD-audio controllers (the onboard audio on PCH and HDMI audio from AMDGPU), and we need to identify which causes the problem. Judging from the pattern, my guess is the latter one. For testing it, please check the following: - The same kernel as the previous test, pass snd_hda_intel.enable=1,0 boot option together with no_console_suspend, and test suspend/resume. This should disable HDMI audio while keeping the onboard audio - Similarly, test with snd_hda_intel.enable=0,1 option. This will disable the onboard audio instead. Created attachment 290579 [details]
snd_hda_intel.enable=1,0
dmesg with snd_hda_intel.enable=1,0
and I forgot to mention, that it woke up automatically. Created attachment 290581 [details]
snd_hda_intel.enable=0,1
with snd_hda_intel.enable=0,1 the system stays in standby.
Also about my Mainboard ... it is some horrible piece of hardware.
I had to disable all wakeup triggers except ps/2 because networking, usb2 and usb3 ports cause the exact same problem as the hd-audio causes now.
"a one time wakeup but on linux AND windows"
I really appreciate your help!
Thanks! Then I must have looked at the wrong direction. The trace difference might be rather a red herring. Now, scratch the previous debug patch, but apply the below instead, and retest. You don't need to take dmesg output at this time because it has no debug prints added. Created attachment 290585 [details]
Test fix patch #2
c4c8dd6e + 8d6762af3 + attachment 290585 [details]
does not fix it. My PC still wakes up.
Too bad. It's really weird that a spurious wakeup is triggered... Does the simple replacement below work instead? Created attachment 290609 [details]
Another try, minimalistic changes
this one does the trick. My PC stays in standby. OK, then let's take this way for now, although I'm not confident what actually caused this spurious wakeup. Below is the slightly revised patch for the formal submission. Please confirm this one. Thanks. Created attachment 290621 [details]
Revised patch
Yes, this patch also works fine. The System stays asleep. Thank you very much! I really appreciate what you did ;) Since you work for Suse, if you are ever in Nuernberg, I will buy you a beer ;) Good to hear, I submitted the fix and will merge it soon. I'm not sure whether I'll send a pull request to Linus for 5.8-final (or rc8), but it'll be certainly in 5.9-rc1. I'm in Nuremberg, but it's no good time to go out now. Let's postpone after Corona crisis :) Now closing as CODE_FIX. Thank you for your patient testing! sure ;) I patched my latest fedora kernel with it. building right now. that should work as well right? Martin, I am currently working on refactoring HDA PM, can you please test the following branch and see if it can keep your system stay in S3? https://github.com/khfeng/linux/tree/hda-pm-refactor Hello, I am quite busy at the moment. I will compile the kernel at the weekend if that is fine? Of course, I'll wait for your testing before sending it to upstream. I had a few minutes so I thought I at least can compile the kernel. Sadly I was unable to build it. I ran: make localmodconfig (accepted the default for all new stuff) make -j12 pager.c: In Funktion »pager_preexec«: pager.c:36:12: Fehler: Argument 2 wird an den mit »restrict« qualifizierten Parameter übergeben, dadurch wird es zum Alias für Argument 4 [-Werror=restrict] 36 | select(1, &in, NULL, &in, NULL); | ^~~ ~~~ cc1: Alle Warnungen werden als Fehler behandelt mv: der Aufruf von stat für '/home/martin/linux/tools/objtool/.pager.o.tmp' ist nicht möglich: Datei oder Verzeichnis nicht gefunden Weird, I am not seeing this issue on my system. I rebased the patches on top of 5.8.15, please see if it works for you: https://github.com/khfeng/linux/tree/hda-pm-refactor-5.8.15 this time it worked. First I can say, that it stays asleep (tested 3 times), but on the first try my system locked up after wakeup. It didnt happen on the two other tries. Can you please attach the log from the locked up session? Do you see the same issue on stable kernel 5.8.15? I am sorry, but I do not see anything related to that in the logs. I have tested now 10 times and it did not occur again. I will test over the next few days and if it occurs again, I will prepare something for you. I rebooted my pc since friday at least 15 times and sent it to standby 40 times. It did not occur again. So maybe it was something else? Martin, Can you please test my v2 series: https://patchwork.kernel.org/project/alsa-devel/list/?series=371143 I will do in the next few days. Will this Patch work with Kernel 5.8.x? Martin, I sent v3 and you were in the cc list. The patch series can be applied to 5.8.x cleanly. Building the kernel now. I will let you know tonight or tomorrow if it is ok. @Kai-Heng Feng it wakes up when it should, have not noticed anything out of the ordinary. I will test till Monday and then report back. I had no issues over the weekend. I think you can send it to production. ty for giving me the chance to test it. Martin, nice to hear it doesn't break your system. These patch will be in v5.10. |