Bug 218658

Summary: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
Product: Linux
Component: Kernel
Reporter: Petri Kaukasoina (petri.kaukasoina)
Assignee: Virtual assignee for kernel bugs (linux-kernel)
Status: NEW
Severity: normal
Priority: P3
CC: padulodeveloper, petri.kaukasoina, philippolson886, regressions
Hardware: All
OS: Linux
Kernel Version: 6.6.23
Subsystem:
Regression: Yes
Bisected commit-id: 5a70baec2294e8a7d0fcc4558741c23e752dad5c
Attachments: dmesg of the broken 6.6.23; bisection log

Description Petri Kaukasoina 2024-03-29 13:38:06 UTC
Created attachment 306057 [details]
dmesg of the broken 6.6.23

With kernel 6.6.23, hibernating usually hangs here: the display stays on, but the mouse pointer does not move and the keyboard does not work. SysRq REISUB still reboots the machine, though. Sometimes it seems to hibernate: the computer powers down and can be woken up, and the previous display becomes visible, but it is stuck there.

With 6.6.22 and earlier, hibernate works ok.

When I rebuilt 6.6.23 with the previous linux-6.6.22/kernel/workqueue.c, hibernating worked again.
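
For readers working from a git clone instead of unpacked tarballs, the same single-file swap could be sketched as follows. This assumes a clone of the stable kernel tree that carries the v6.6.22 and v6.6.23 tags; the helper name use_file_from_good is made up for illustration and is not from the report.

```shell
# Hypothetical helper: check out a "bad" release, then take one file from a
# "good" release, leaving every other file at the bad release.
# usage: use_file_from_good <good-rev> <bad-rev> <file>
use_file_from_good() {
    good_rev=$1 bad_rev=$2 file=$3
    # Detach HEAD at the bad release
    git checkout --quiet --detach "$bad_rev" || return 1
    # Overwrite only the given file with its content at the good release
    git checkout "$good_rev" -- "$file"
}

# For this report the call would be (tag names are an assumption):
#   use_file_from_good v6.6.22 v6.6.23 kernel/workqueue.c
# followed by the usual kernel build, install, and hibernate test.
```

If hibernation works with the swapped file, the regression is confined to changes in that file, which is what justifies a path-limited bisection.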
Comment 1 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-04-01 10:40:51 UTC
Could you please bisect which of the changes that modified that file broke things? And check if mainline is affected as well?

This guide explains the steps: 
https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.html

When actually doing the bisection, you can limit the range to changes that modified said file like this:

git bisect start -- kernel/workqueue.c
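
A fuller session built around that command might look like the sketch below. It assumes a clone of the stable tree with the v6.6.22 and v6.6.23 tags; the helper name bisect_file is hypothetical, while the git subcommands are the standard ones.

```shell
# Hypothetical helper: start a bisection that only considers commits
# touching one file, then mark the known-bad and known-good revisions.
# usage: bisect_file <good-rev> <bad-rev> <file>
bisect_file() {
    good_rev=$1 bad_rev=$2 file=$3
    git bisect start -- "$file" &&
    git bisect bad "$bad_rev" &&
    git bisect good "$good_rev"
}

# For this report the call would be (tag names are an assumption):
#   bisect_file v6.6.22 v6.6.23 kernel/workqueue.c
# Then, for each commit git checks out: build, boot, test hibernation, and run
#   git bisect good     # hibernate works
#   git bisect bad      # hibernate hangs
# until git prints "<sha> is the first bad commit".
#   git bisect log      # prints the steps taken, suitable for attaching
#   git bisect reset    # ends the session and restores the original HEAD
```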
Comment 2 Petri Kaukasoina 2024-04-01 16:36:07 UTC
Thank you for your answer.

5a70baec2294e8a7d0fcc4558741c23e752dad5c is the first bad commit

commit 5a70baec2294e8a7d0fcc4558741c23e752dad5c
Author: Tejun Heo <tj@kernel.org>
Date:   Mon Jan 29 08:11:25 2024 -1000

    workqueue: Implement system-wide nr_active enforcement for unbound workqueues

I also tried mainline (6.9.0-rc2) but it couldn't even mount the root fs for some reason. Probably .config would have needed some work.
Comment 3 Petri Kaukasoina 2024-04-01 16:37:28 UTC
Created attachment 306076 [details]
bisection log
Comment 4 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-04-01 17:28:01 UTC
Could you please give 6.6.24-rc1 a quick try; there was a workqueue change missing that might or might not be related:

https://lore.kernel.org/all/20240401152547.867452742@linuxfoundation.org/
Comment 5 Petri Kaukasoina 2024-04-01 18:05:01 UTC
6.6.24-rc1 did not fix this problem.
Comment 6 Petri Kaukasoina 2024-04-01 18:39:50 UTC
When I revert commit 5a70baec2294e8a7d0fcc4558741c23e752dad5c from 6.6.24-rc1, hibernating and waking up work again.
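
A revert test of this kind could be sketched as below, assuming a git clone in which the 6.6.24-rc1 release candidate is reachable under some revision name. The name v6.6.24-rc1 in the comment is a placeholder, and the helper name revert_on_top is made up for illustration.

```shell
# Hypothetical helper: check out a base revision with a detached HEAD and
# revert a single commit on top of it.
# usage: revert_on_top <base-rev> <commit>
revert_on_top() {
    base_rev=$1 commit=$2
    git checkout --quiet --detach "$base_rev" &&
    # --no-edit keeps the default "Revert ..." commit message
    git revert --no-edit "$commit"
}

# For this report the call would be (base revision name is a placeholder):
#   revert_on_top v6.6.24-rc1 5a70baec2294e8a7d0fcc4558741c23e752dad5c
# followed by a rebuild and a hibernate/resume test.
```

If the revert does not apply cleanly, git stops with conflicts that have to be resolved by hand before building.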
Comment 7 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-04-02 08:29:18 UTC
Forwarded by mail: https://lore.kernel.org/all/ce4c2f67-c298-48a0-87a3-f933d646c73b@leemhuis.info/
Comment 8 padulodeveloper 2024-06-23 17:19:02 UTC
I got the same problem on Debian 11 bullseye after updating the kernel from linux-image-5.10.0-28-amd to linux-image-5.10.0-29-amd64.

From a hibernated system:

- When I select the new kernel (5.10.0-29) in GRUB, the system wakes up and the previous display becomes visible, but it is stuck there;
- When I select the previous kernel (5.10.0-28-amd) in GRUB, the wake-up works.
Comment 9 padulodeveloper 2024-06-23 20:30:18 UTC
As explained in comment #8, the system wake-up doesn't work after hibernation.

I have updated Debian 11 bullseye to the kernel version "linux-image-5.10.0-30-amd64", but the bug persists.

From a hibernated system:

- When I select the new kernel (5.10.0-30) in GRUB, the system wakes up and the previous display becomes visible, but it is stuck there;
- When I select the working kernel (5.10.0-28-amd) in GRUB, the system wakes up correctly.
Comment 10 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-06-24 05:39:34 UTC
padulodeveloper, those versions mean nothing here; they are not upstream kernel versions. Reporting problems in an existing and likely fixed bug is also bad. Both things are listed here: https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/