Bug 214271 - unchecked MSR access error: WRMSR when resuming laptop, which results in sporadic freezes
Summary: unchecked MSR access error: WRMSR when resuming laptop, which results in spor...
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-09-02 06:19 UTC by Reik Keutterling
Modified: 2021-10-09 06:26 UTC (History)
5 users (show)

See Also:
Kernel Version: 5.13.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
ACPICA: Get sleep type data for S0 before suspend (3.34 KB, patch)
2021-09-10 16:23 UTC, Rafael J. Wysocki
Details | Diff

Description Reik Keutterling 2021-09-02 06:19:59 UTC
May related:
https://bugzilla.kernel.org/show_bug.cgi?id=203637
https://bbs.archlinux.org/viewtopic.php?id=269203
https://bbs.archlinux.org/viewtopic.php?id=269141

I currently use a ThinkPad X280.

When resume from deep sleep I sometime get the the unchecked MSR access error and the resume freezes (black screen, fan speeds up, only hard reset works).

When checking the logs from the resume I see the unchecked MSR access error, so this is probably the reason for this issue.

Log:
Sep 01 20:58:08 xpaddy systemd-sleep[3800705]: Entering sleep state 'suspend'...
Sep 01 20:58:08 xpaddy kernel: PM: suspend entry (deep)
Sep 01 20:58:08 xpaddy kernel: Filesystems sync: 0.015 seconds
Sep 02 07:38:06 xpaddy kernel: Freezing user space processes ... (elapsed 0.006 seconds) done.
Sep 02 07:38:06 xpaddy kernel: OOM killer disabled.
Sep 02 07:38:06 xpaddy kernel: Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
Sep 02 07:38:06 xpaddy kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Sep 02 07:38:06 xpaddy kernel: e1000e: EEE TX LPI TIMER: 00000011
Sep 02 07:38:06 xpaddy kernel: ACPI: EC: interrupt blocked
Sep 02 07:38:06 xpaddy kernel: ACPI: Preparing to enter system sleep state S3
Sep 02 07:38:06 xpaddy kernel: ACPI: EC: event blocked
Sep 02 07:38:06 xpaddy kernel: ACPI: EC: EC stopped
Sep 02 07:38:06 xpaddy kernel: PM: Saving platform NVS memory
Sep 02 07:38:06 xpaddy kernel: Disabling non-boot CPUs ...
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 1 is now offline
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 2 is now offline
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 3 is now offline
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 4 is now offline
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 5 is now offline
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 6 is now offline
Sep 02 07:38:06 xpaddy kernel: smpboot: CPU 7 is now offline
Sep 02 07:38:06 xpaddy kernel: ACPI: Low-level resume complete
Sep 02 07:38:06 xpaddy kernel: unchecked MSR access error: WRMSR to 0x83f (tried to write 0x00000000000000f6) at rIP: 0xffffffff86672ad4 (native_write_msr+0x4/0x20)
Sep 02 07:38:06 xpaddy kernel: Call Trace:
Sep 02 07:38:06 xpaddy kernel:  native_apic_msr_write+0x28/0x30
Sep 02 07:38:06 xpaddy kernel:  arch_irq_work_raise+0x26/0x40
Sep 02 07:38:06 xpaddy kernel:  irq_work_queue+0x2e/0x60
Sep 02 07:38:06 xpaddy kernel:  __kfence_alloc+0x80/0x3c0
Sep 02 07:38:06 xpaddy kernel:  ? acpi_get_sleep_type_data+0xaa/0x214
Sep 02 07:38:06 xpaddy kernel:  kmem_cache_alloc_trace+0x1fa/0x2e0
Sep 02 07:38:06 xpaddy kernel:  acpi_get_sleep_type_data+0xaa/0x214
Sep 02 07:38:06 xpaddy kernel:  acpi_hw_legacy_wake_prep+0x57/0x10b
Sep 02 07:38:06 xpaddy kernel:  acpi_leave_sleep_state_prep+0x37/0x69
Sep 02 07:38:06 xpaddy kernel:  acpi_suspend_enter+0xa3/0x1f0
Sep 02 07:38:06 xpaddy kernel:  suspend_devices_and_enter+0x70a/0x7f0
Sep 02 07:38:06 xpaddy kernel:  pm_suspend.cold+0x329/0x374
Sep 02 07:38:06 xpaddy kernel:  state_store+0x71/0xd0
Sep 02 07:38:06 xpaddy kernel:  kernfs_fop_write_iter+0x11c/0x1b0
Sep 02 07:38:06 xpaddy kernel:  new_sync_write+0x159/0x1f0
Sep 02 07:38:06 xpaddy kernel:  vfs_write+0x1db/0x270
Sep 02 07:38:06 xpaddy kernel:  ksys_write+0x67/0xe0
Sep 02 07:38:06 xpaddy kernel:  do_syscall_64+0x61/0x80
Sep 02 07:38:06 xpaddy kernel:  ? syscall_exit_to_user_mode+0x23/0x50
Sep 02 07:38:06 xpaddy kernel:  ? do_syscall_64+0x6e/0x80
Sep 02 07:38:06 xpaddy kernel:  ? handle_mm_fault+0xdb/0x2c0
Sep 02 07:38:06 xpaddy kernel:  ? do_user_addr_fault+0x1e8/0x690
Sep 02 07:38:06 xpaddy kernel:  ? exc_page_fault+0x78/0x180
Sep 02 07:38:06 xpaddy kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Sep 02 07:38:06 xpaddy kernel: RIP: 0033:0x7f085ee12907
Sep 02 07:38:06 xpaddy kernel: Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Sep 02 07:38:06 xpaddy kernel: RSP: 002b:00007ffd71b0eab8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Sep 02 07:38:06 xpaddy kernel: RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f085ee12907
Sep 02 07:38:06 xpaddy kernel: RDX: 0000000000000004 RSI: 00007ffd71b0eba0 RDI: 0000000000000004
Sep 02 07:38:06 xpaddy kernel: RBP: 00007ffd71b0eba0 R08: 0000562aef3d9180 R09: 00007f085eea84e0
Sep 02 07:38:06 xpaddy kernel: R10: 00007f085eea83e0 R11: 0000000000000246 R12: 0000000000000004
Sep 02 07:38:06 xpaddy kernel: R13: 0000562aef3d53c0 R14: 0000000000000004 R15: 00007f085eee4700
Sep 02 07:38:06 xpaddy kernel: ACPI: EC: EC started
Sep 02 07:38:06 xpaddy kernel: PM: Restoring platform NVS memory
Sep 02 07:38:06 xpaddy kernel: Enabling non-boot CPUs ...
Sep 02 07:38:06 xpaddy kernel: x86: Booting SMP configuration:
Sep 02 07:38:06 xpaddy kernel: smpboot: Booting Node 0 Processor 1 APIC 0x2



Distribution: ArchLinux
Kernel: 5.13.13
Comment 1 Thomas Gleixner 2021-09-02 08:28:46 UTC
That's caused by ACPI doing a memory allocation in the very early resume state where almost everything is disabled. The kernel has KFENCE enabled which tries to queue irq work and self IPIs the CPU. That fails simply because the APIC is still shut down.


Reassigned it to ACPI
Comment 2 Rafael J. Wysocki 2021-09-02 17:51:59 UTC
It looks like the values to be written the the "PM control" registers during resume need to be computed when suspending and stored for later use.
Comment 3 Rafael J. Wysocki 2021-09-10 16:23:07 UTC
Created attachment 298743 [details]
ACPICA: Get sleep type data for S0 before suspend

Please test this patch (untested so far) and let me know if it makes a difference.
Comment 4 Reik Keutterling 2021-09-13 15:27:02 UTC
Thanks, I applied the patch to kernel 5.14.3 and it looks promising so far, but I guess this needs some days of testing, as the issue only happens once in while.
Comment 5 Reik Keutterling 2021-09-21 10:04:43 UTC
Tested this patch now for one week without any issues so far. So I guess it made a difference.

Thanks.
Comment 6 Rafael J. Wysocki 2021-09-27 16:53:04 UTC
Thanks!

I'll submit it for review and we'll proceed with applying it if there are no issues.
Comment 7 Rafael J. Wysocki 2021-09-29 18:53:46 UTC
Patch submitted:

https://patchwork.kernel.org/project/linux-acpi/patch/2618833.mvXUDI8C0e@kreacher/

Note You need to log in before you can comment on or make changes to this bug.