Created attachment 307529 [details]
otcpl-freeze_hang_on_vt_waitactive.html

In our weekly S2idle stress testing we run a series of suspends over and over via an ssh connection to test the latest kernel release candidates. Two of our machines have an issue where suspend and sleepgraph eventually hang and stress testing fails. Basically sleepgraph refuses to exit and prevents any further suspend attempts until a reboot.

I've traced the issue to the pm_prepare_console call in kernel/power/suspend.c:

$> echo freeze > /sys/power/state

1) pm_suspend ->
2) suspend_prepare ->
3) pm_prepare_console ->
4) vt_move_to_console ->
5) vt_waitactive ->
6) __vt_event_wait.part.0  (ad infinitum)

The vt_waitactive call does not appear to have a timeout, and for some reason the console never becomes active, so suspend literally hangs in vt_waitactive. This leaves the user-space suspend call stuck forever, whether it is "sleepgraph -m freeze", "rtcwake -m freeze", or "echo freeze > /sys/power/state". Killing the process doesn't help because system_transition_mutex remains held and is not unlocked when the process is killed. Any subsequent suspend attempt then fails on the trylock of system_transition_mutex and exits, so suspend is broken until the system is rebooted.

The solution I believe would work best is to add a timeout to vt_waitactive that returns an error after, say, a default of 20 seconds (the same as the freeze_processes timeout). That way the suspend call would fail gracefully and possibly allow a later suspend attempt to succeed. The other solution would be to figure out why a call to vt_waitactive stalls at all.

Our machines have USB serial console connectors and connect to other hosts through minicom. I've noticed that this issue seems to happen only when I issue a suspend through an ssh connection, so it is a rare use case. Also, in sleepgraph I've added a signal handler that prints a stack trace to a file when SIGUSR1 is received. If I send this signal to the hung sleepgraph process, vt_waitactive completes for some reason and stress testing can continue. So this issue is both rare and seems to have unusual workarounds.

I've included a sleepgraph timeline callgraph which shows the failed suspend and the never-ending vt_waitactive call. I've also included a basic dmesg breakdown of what we see during stress testing.
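To illustrate the first option, here is a minimal sketch of what a timed-out wait could look like. This is only an illustration, not a proposed patch: the helper name is hypothetical, the waitqueue/flag names stand in for the existing vt event plumbing, and the 20-second default simply mirrors the freeze_processes timeout mentioned above.

  #include <linux/wait.h>
  #include <linux/jiffies.h>
  #include <linux/errno.h>

  #define VT_WAITACTIVE_TIMEOUT_MS  20000  /* mirror the freeze_processes timeout */

  /* Hypothetical helper: sleep until the queued vt event is posted, but give
   * up after a timeout instead of waiting forever. */
  static int example_vt_event_wait_timeout(struct vt_event_wait *vw)
  {
          long left;

          /* returns 0 on timeout, a negative errno if interrupted by a signal,
           * or the remaining jiffies if the event was posted in time */
          left = wait_event_interruptible_timeout(vt_event_waitqueue, vw->done,
                          msecs_to_jiffies(VT_WAITACTIVE_TIMEOUT_MS));
          if (left == 0)
                  return -ETIMEDOUT;   /* console never became active in time */
          if (left < 0)
                  return left;         /* interrupted by a signal */
          return 0;                    /* event arrived; caller can dequeue it */
  }

With something like this, pm_prepare_console (via vt_move_to_console) could treat -ETIMEDOUT as a soft failure, abort the suspend gracefully, and release system_transition_mutex so a later suspend attempt can still work.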
Created attachment 307530 [details]
dmesg_on_pm_prepare_console_fail.txt

The dmesg log during stress testing when this issue occurs. I added some printks to the various mutexes for debugging purposes.
Created attachment 307531 [details]
vt_waitactive.png

A quick snapshot of the call trace during the failure, taken from the sleepgraph html.
# dmesg from stress testing over this error.
# last successful suspend/resume
[ 738.093610] PM: resume of devices complete after 1215.077 msecs
[ 738.096069] probe of 0000:00:1f.3 returned 19 after 54 usecs
[ 738.096362] sof-audio-pci-intel-ptl 0000:00:1f.3: sof_debug value: 0x1
[ 738.096634] probe of 0000:00:1f.3 returned 517 after 558 usecs
[ 738.096840] OOM killer enabled.
[ 738.096843] Restarting tasks ... done.
[ 738.100022] random: crng reseeded on system resumption
[ 738.100269] mutex_lock3 vt_switch_mutex
[ 738.104210] mutex_unlock3 vt_switch_mutex
[ 738.108944] PM: mutex_unlock1 suspend system_transition_mutex
[ 738.114798] PM: suspend exit
# final resume is done, sleepgraph exits
# The system_transition_mutex is unlocked
[ 738.158311] atkbd serio0: Failed to enable keyboard on isa0060/serio0
[ 743.013191] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
# start of next suspend
[ 753.086604] PM: suspend entry (s2idle)
[ 753.086613] PM: mutex_trylock1 suspend system_transition_mutex
[ 753.100012] Filesystems sync: 0.007 seconds
[ 753.100027] mutex_lock3 vt_switch_mutex
[ 753.103934] mutex_unlock3 vt_switch_mutex
# suspend hangs here at pm_prepare_console in
# kernel/power/suspend.c: line 365, function suspend_prepare
# stress tester times out and tries to run another sleepgraph,
# but system_transition_mutex is still locked.
[ 1359.964769] PM: suspend entry (s2idle)
# specific line is kernel/power/suspend.c: line 576, function enter_state
[ 1359.964778] PM: mutex_trylock1 suspend system_transition_mutex
[ 1359.970693] PM: suspend system_transition_mutex FAILED
[ 1359.975887] PM: suspend exit
[ 1368.186078] PM: suspend entry (s2idle)
[ 1368.186088] PM: mutex_trylock1 suspend system_transition_mutex
[ 1368.192033] PM: suspend system_transition_mutex FAILED
[ 1368.197263] PM: suspend exit
[ 1376.550004] PM: suspend entry (s2idle)
[ 1376.550013] PM: mutex_trylock1 suspend system_transition_mutex
[ 1376.555957] PM: suspend system_transition_mutex FAILED
[ 1376.561182] PM: suspend exit
[ 1384.614296] PM: suspend entry (s2idle)
[ 1384.614304] PM: mutex_trylock1 suspend system_transition_mutex
[ 1384.620213] PM: suspend system_transition_mutex FAILED
[ 1384.625423] PM: suspend exit
[ 1392.802663] PM: suspend entry (s2idle)
[ 1392.802673] PM: mutex_trylock1 suspend system_transition_mutex
[ 1392.808589] PM: suspend system_transition_mutex FAILED
[ 1392.813786] PM: suspend exit
# "echo freeze > /sys/power/state" never exits
NOTE: Upon further testing, ssh appears to have nothing to do with it. I tried running a stresstest -multi block manually on the machine and the same issue occurred, so ssh is not a factor.
NOTE: Of the two machines this occurs on, only one has a USB-serial cable and minicom session attached; the other has no serial cable at all, so the hardware does not appear to be a factor. They do both have a serial console set up on the kernel command line. I will test removing this.

otcpl-lkf:
    BOOT_IMAGE=/boot/vmlinuz-6.13.0-mutex-debug root=UUID=e2a10d9b-0912-45c4-85a3-62a1c7060027 ro quiet splash console=tty0 console=ttyS0,115200n8 loglevel=7 pm_debug_messages resume=UUID=e2a10d9b-0912-45c4-85a3-62a1c7060027 resume_offset=34816 vt.handoff=7

otcpl-ptl-h-4:
    BOOT_IMAGE=/boot/vmlinuz-6.13.0-mutex-debug root=UUID=3335e9e7-923c-417f-bac9-71fdc393ed52 ro console=tty0 console=ttyS0,115200n8 initcall_debug
Here's the kprobe trace of the failure. It looks like Xorg posts an event 73 us after vt_move_to_console is called and a race condition occurs: suspend is preventing Xorg from finishing this event, and Xorg and suspend are stuck waiting:

python3-34762 [005] ..... 1709.166210: vt_move_to_console_cal: (vt_move_to_console+0x4/0xc0)
Xorg-34837 [003] ..... 1709.166283: vt_event_post_cal: (vt_event_post+0x4/0xc0)
Xorg-34837 [003] ..... 1709.166284: vt_event_post_ret: (do_unblank_screen+0xf5/0x180 <- vt_event_post) arg1=0x80000000
Xorg-34837 [003] ..... 1709.166287: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
Xorg-34837 [003] ..... 1709.166289: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
Xorg-34837 [003] ..... 1709.166290: vt_do_kdskbmode_cal: (vt_do_kdskbmode+0x4/0x100)
Xorg-34837 [003] ..... 1709.166291: vt_do_kdskbmode_ret: (vt_ioctl+0xd69/0x13f0 <- vt_do_kdskbmode) arg1=0x0
Xorg-34837 [003] ..... 1709.166292: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
Xorg-34837 [003] ..... 1709.166293: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
Xorg-34837 [003] ..... 1709.166293: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0xfffffdfd
Xorg-34837 [003] ..... 1709.166294: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
Xorg-34837 [003] ..... 1709.166294: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0xfffffdfd
Xorg-34837 [003] ..... 1709.166296: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
Xorg-34837 [003] ..... 1709.166296: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0xfffffdfd
Xorg-34837 [003] ..... 1709.166297: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
python3-34762 [005] .N... 1709.166310: vt_waitactive_cal: (vt_waitactive+0x4/0x180)
python3-34762 [005] ..... 1709.166317: vt_event_wait_cal: (__vt_event_wait.part.0+0x0/0x90)
Xorg-34837 [003] ..... 1709.166327: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
Xorg-34837 [003] ..... 1709.166330: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
Xorg-34837 [003] ..... 1709.166339: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
systemd-logind-615 [004] ..... 1709.166654: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
systemd-logind-615 [004] ..... 1709.166655: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0xfffffdfd
systemd-logind-615 [004] ..... 1709.166657: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
systemd-logind-615 [004] ..... 1709.166658: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
systemd-logind-615 [004] ..... 1709.166680: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
systemd-logind-615 [004] ..... 1709.166681: vt_do_kdskbmode_cal: (vt_do_kdskbmode+0x4/0x100)
systemd-logind-615 [004] ..... 1709.166682: vt_do_kdskbmode_ret: (vt_ioctl+0xd69/0x13f0 <- vt_do_kdskbmode) arg1=0x0
systemd-logind-615 [004] ..... 1709.166682: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
systemd-logind-615 [004] ..... 1709.166683: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
systemd-logind-615 [004] ..... 1709.166684: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
gdm-session-wor-34793 [001] ..... 1709.178048: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
gdm-session-wor-34793 [001] ..... 1709.178051: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
gdm-session-wor-34793 [001] ..... 1709.178053: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
gdm-session-wor-34793 [001] ..... 1709.178053: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
gdm-session-wor-34793 [001] ..... 1709.178055: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
gdm-session-wor-34793 [001] ..... 1709.178056: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
gdm-session-wor-34793 [001] ..... 1709.178056: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
gdm-session-wor-34793 [001] .N... 1709.178061: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
kworker/1:1-95 [001] ..... 1709.178075: vt_set_leds_compute_shiftstate_cal: (vt_set_leds_compute_shiftstate+0x4/0x60)
kworker/1:1-95 [001] .N... 1709.178079: vt_set_leds_compute_shiftstate_ret: (redraw_screen+0x1c7/0x290 <- vt_set_leds_compute_shiftstate) arg1=0x0
kworker/1:1-95 [001] .N... 1709.178080: vt_event_post_cal: (vt_event_post+0x4/0xc0)
kworker/1:1-95 [001] .N... 1709.178084: vt_event_post_ret: (complete_change_console+0x76/0x120 <- vt_event_post) arg1=0x0
gdm-session-wor-34793 [001] ..... 1709.178098: vt_ioctl_cal: (vt_ioctl+0x4/0x13f0)
gdm-session-wor-34793 [001] ..... 1709.178099: vt_waitactive_cal: (vt_waitactive+0x4/0x180)
gdm-session-wor-34793 [001] ..... 1709.178100: vt_waitactive_ret: (vt_ioctl+0xa6f/0x13f0 <- vt_waitactive) arg1=0x0
gdm-session-wor-34793 [001] ..... 1709.178100: vt_ioctl_ret: (tty_ioctl+0x35e/0x910 <- vt_ioctl) arg1=0x0
python3-34762 [000] ..... 1709.178167: vt_event_wait_ret: (vt_waitactive+0x105/0x180 <- __vt_event_wait.part.0) arg1=0x0
python3-34762 [000] ..... 1709.178170: vt_event_wait_cal: (__vt_event_wait.part.0+0x0/0x90)
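For context, this is the pattern the stuck python3 (suspend) process is executing. The sketch below is a simplified, illustrative paraphrase of the vt_waitactive wait loop, not a verbatim copy of drivers/tty/vt/vt_ioctl.c; the helper and field names are approximations, and the real function's short-circuit for an already-active console is omitted. What it shows is that the waiter re-queues and sleeps again until the console it wants becomes active, and none of the sleeps have a timeout, which matches the trace above where python3 returns from __vt_event_wait.part.0 and immediately enters it again:

  /* Illustrative paraphrase of the vt_waitactive wait loop (not actual source).
   * The caller asks to be woken when console 'n' becomes the active console. */
  static int example_vt_waitactive(int n)
  {
          struct vt_event_wait vw;

          do {
                  vw.event.event = VT_EVENT_SWITCH;
                  __vt_event_queue(&vw);      /* register interest in the next switch */
                  __vt_event_wait(&vw);       /* sleep, with no timeout, until posted */
                  __vt_event_dequeue(&vw);
                  if (vw.done == 0)           /* woken by a signal, not by an event */
                          return -EINTR;
          } while (vw.event.newev != n);      /* not our console yet: queue and wait again */

          return 0;
  }

A timeout anywhere in this loop (as sketched in the description above) would let pm_prepare_console give up and fail the suspend instead of blocking forever.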