Bug 203637

Summary: unchecked MSR access error: WRMSR when resuming laptop
Product: Platform Specific/Hardware
Component: x86-64
Reporter: Laurent Bonnaud (L.Bonnaud)
Assignee: Borislav Petkov (bp)
Status: RESOLVED CODE_FIX
Severity: normal
Priority: P1
CC: bp, promarbler14, shanxifanshi
Hardware: Intel
OS: Linux
Kernel Version: 5.1.9
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  Full kernel log
  Archlinux 5.1.9 kernel config
  Dell Inspiron 15 dmesg
  test patch

Description Laurent Bonnaud 2019-05-18 08:30:06 UTC
Hi,

my Dell Latitude 5590 laptop is working well, including suspend/resume.
However, I noticed the following error message in kernel logs:

[34508.271249] unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) at rIP: 0xffffffff8ec73f58 (native_write_msr+0x8/0x30)
[34508.271251] Call Trace:
[34508.271256]  intel_pmu_cpu_starting+0x87/0x260
[34508.271260]  ? x86_pmu_dead_cpu+0x30/0x30
[34508.271262]  x86_pmu_starting_cpu+0x1a/0x30
[34508.271266]  cpuhp_invoke_callback+0x99/0x540
[34508.271269]  notify_cpu_starting+0x58/0x70
[34508.271272]  start_secondary+0xc3/0x1d0
[34508.271274]  secondary_startup_64+0xa4/0xb0

This looks like something that should be fixed.
Comment 1 Laurent Bonnaud 2019-05-18 08:31:07 UTC
Created attachment 282813
Full kernel log
Comment 2 Adric Blake 2019-06-12 22:53:16 UTC
Likewise, on a Dell Inspiron 5570. (Dell Inc. Inspiron 5570/09YTN7, BIOS 1.2.2 03/11/2019)

Latest Archlinux kernel (5.1.9).

[ 7735.905774] Enabling non-boot CPUs ...
[ 7735.905827] x86: Booting SMP configuration:
[ 7735.905827] smpboot: Booting Node 0 Processor 1 APIC 0x2
[ 7735.907278] unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) at rIP: 0xffffffff8d267924 (native_write_msr+0x4/0x20)
[ 7735.907280] Call Trace:
[ 7735.907283]  intel_set_tfa+0x25/0x30
[ 7735.907285]  intel_pmu_cpu_starting+0x80/0x250
[ 7735.907288]  ? x86_pmu_dead_cpu+0x20/0x20
[ 7735.907289]  x86_pmu_starting_cpu+0x16/0x20
[ 7735.907291]  cpuhp_invoke_callback+0x9b/0x5f0
[ 7735.907294]  ? _raw_spin_lock_irqsave+0x25/0x50
[ 7735.907295]  notify_cpu_starting+0x52/0x70
[ 7735.907297]  start_secondary+0xc7/0x1d0
[ 7735.907299]  secondary_startup_64+0xa4/0xb0
[ 7735.907486] microcode: sig=0x806ea, pf=0x80, revision=0x96
[ 7735.908740] microcode: updated to revision 0xb4, date = 2019-04-01
[ 7735.908819] CPU1 is up
Comment 3 Adric Blake 2019-06-12 22:54:40 UTC
Created attachment 283237
Archlinux 5.1.9 kernel config
Comment 4 Adric Blake 2019-06-12 22:55:45 UTC
Created attachment 283239
Dell Inspiron 15 dmesg
Comment 5 Borislav Petkov 2019-06-13 13:32:11 UTC
Created attachment 283255
test patch

Does that make the warning go away, per chance?

Thx.
Comment 6 Adric Blake 2019-06-14 17:55:23 UTC
Yes. With the patch applied to Linux 5.2-rc4, I ran the reboot-suspend-resume sequence 3 times, and the warning was gone. I also tried 3 suspends in a row for good measure, and still saw no warning.

By comparison, Linux 5.1.9 produced the warning every time I ran the same sequence.
Comment 7 Adric Blake 2019-06-14 17:57:49 UTC
I should clarify: I tested the reboot-suspend-resume sequence on Linux 5.1.9 without the patch. I only tested the patch with Linux 5.2-rc4.
Comment 8 Borislav Petkov 2019-06-14 18:04:04 UTC
Thanks, I'll add your Tested-by to the fix.
Comment 9 Borislav Petkov 2019-06-20 08:20:11 UTC
This is fixed now upstream:

78f4e932f776 ("x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback")
5423f5ce5ca4 ("x86/microcode: Fix the microcode load on CPU hotplug for real")
Comment 10 Laurent Bonnaud 2019-06-20 13:02:08 UTC
I confirm that kernel 5.1.12 fixes the problem on my system.
Thanks a lot!
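
For anyone who wants to repeat the reboot-suspend-resume check described in the comments above, here is a minimal sketch of a log filter. It is not part of the fix and not something used in this report. The regular expression is written against the WRMSR line format shown in the two traces above. The RDMSR branch and the function name find_msr_errors are assumptions added for illustration.

```python
import re
import sys

# Matches the "unchecked MSR access error" line format seen in the traces above.
# Only the WRMSR form appears in this report; the RDMSR alternative is an assumption.
MSR_ERROR = re.compile(
    r"unchecked MSR access error: (?P<op>WRMSR|RDMSR) (?:to|from) (?P<msr>0x[0-9a-fA-F]+)"
    r"(?: \(tried to write (?P<value>0x[0-9a-fA-F]+)\))?"
    r" at rIP: (?P<rip>0x[0-9a-fA-F]+) \((?P<symbol>[^)]+)\)"
)


def find_msr_errors(log_text):
    """Return one dict per unchecked-MSR-access line found in log_text."""
    errors = []
    for line in log_text.splitlines():
        match = MSR_ERROR.search(line)
        if match:
            entry = match.groupdict()
            entry["msr"] = int(entry["msr"], 16)  # MSR index as an integer
            errors.append(entry)
    return errors


if __name__ == "__main__":
    # Reads a kernel log from stdin and prints one summary line per hit.
    found = find_msr_errors(sys.stdin.read())
    for err in found:
        print(f"{err['op']} MSR {err['msr']:#x} via {err['symbol']}")
    sys.exit(1 if found else 0)
```

Saved under a hypothetical name such as check_msr.py, it could be run after each resume as `dmesg | python3 check_msr.py`. A non-zero exit status means the warning is still present.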