Bug 210939

Summary: echo 0 > /sys/devices/system/cpu/cpu16/online causes a null dereference
Product: Platform Specific/Hardware
Reporter: Rafael Kitover (rkitover)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: CLOSED CODE_FIX
Severity: normal
CC: bp, kim.phillips
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.11-rc1
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  debug patch
  debug patch + potential fix
  system logs
  boot log with patch

Description Rafael Kitover 2020-12-28 22:13:07 UTC
On 5.9 this returns 1 immediately; on 5.11-rc1 it hangs the process, and neither kill nor kill -9 removes it.

Here is my /proc/cpuinfo:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 1
model name      : AMD EPYC 7601 32-Core Processor
stepping        : 2
microcode       : 0x8001250
cpu MHz         : 1199.674
cache size      : 512 KB
physical id     : 0
siblings        : 64
core id         : 0
cpu cores       : 32
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 4401.71
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
Comment 1 Rafael Kitover 2020-12-28 22:21:41 UTC
Here is a backtrace from journalctl:

дек 28 22:19:57 epyc sudo[28110]: rkitover : TTY=pts/3 ; PWD=/home/rkitover ; USER=root ; COMMAND=/bin/sh -c echo 0 > /sys/devices/system/cpu/cpu16/online
дек 28 22:19:57 epyc sudo[28110]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
дек 28 22:19:58 epyc kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
дек 28 22:19:58 epyc kernel: #PF: supervisor write access in kernel mode
дек 28 22:19:58 epyc kernel: #PF: error_code(0x0002) - not-present page
дек 28 22:19:58 epyc kernel: PGD 849465067 P4D 849465067 PUD 849466067 PMD 0
дек 28 22:19:58 epyc kernel: Oops: 0002 [#1] SMP NOPTI
дек 28 22:19:58 epyc kernel: CPU: 16 PID: 91 Comm: cpuhp/16 Tainted: P           OE     5.11.0-rc1-x86_64+ #1
дек 28 22:19:58 epyc kernel: Hardware name: Supermicro Super Server/H11DSi, BIOS 2.1 02/21/2020
дек 28 22:19:58 epyc kernel: RIP: 0010:rapl_cpu_offline+0x4c/0xa0 [rapl]
дек 28 22:19:58 epyc kernel: Code: 00 00 00 48 8b 0d f4 2d 00 00 3b 91 28 01 00 00 73 08 48 8b 9c d1 30 01 00 00 f0 48 0f b3 05 db 29 00 00 72 05 31 c0 5b 5d c3 <c7> 43 08 ff ff ff ff 48 c7 c2 48 14 01 00 89 ee 48 8b 04 c5 00 49
дек 28 22:19:58 epyc kernel: RSP: 0018:ffffb01486db3e70 EFLAGS: 00010247
дек 28 22:19:58 epyc kernel: RAX: 0000000000000010 RBX: 0000000000000000 RCX: ffff935a4247b800
дек 28 22:19:58 epyc kernel: RDX: 0000000000000002 RSI: 000000000000009e RDI: 0000000000000010
дек 28 22:19:58 epyc kernel: RBP: 0000000000000010 R08: ffff9359ffa18788 R09: ffff934b000781b0
дек 28 22:19:58 epyc kernel: R10: 0000000000000002 R11: 0000000000000002 R12: 000000000000009e
дек 28 22:19:58 epyc kernel: R13: ffffffff9c4490b0 R14: ffff9352200018c0 R15: 0000000000000010
дек 28 22:19:58 epyc kernel: FS:  0000000000000000(0000) GS:ffff9359ffa00000(0000) knlGS:0000000000000000
дек 28 22:19:58 epyc kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
дек 28 22:19:58 epyc kernel: CR2: 0000000000000008 CR3: 000000085e984000 CR4: 00000000003506e0
дек 28 22:19:58 epyc kernel: Call Trace:
дек 28 22:19:58 epyc kernel:  ? rapl_pmu_event_read+0x10/0x10 [rapl]
дек 28 22:19:58 epyc kernel:  cpuhp_invoke_callback+0x7e/0x3d0
дек 28 22:19:58 epyc kernel:  cpuhp_thread_fun+0xb0/0x110
дек 28 22:19:58 epyc kernel:  smpboot_thread_fn+0xc5/0x160
дек 28 22:19:58 epyc kernel:  ? smpboot_register_percpu_thread+0xf0/0xf0
дек 28 22:19:58 epyc kernel:  kthread+0x11b/0x140
дек 28 22:19:58 epyc kernel:  ? __kthread_bind_mask+0x60/0x60
дек 28 22:19:58 epyc kernel:  ret_from_fork+0x22/0x30
дек 28 22:19:58 epyc kernel: Modules linked in: rfcomm md4 nf_conntrack_netlink nfnetlink xt_addrtype nls_utf8 br_netfilter cifs libarc4 dns_resolver fscache libdes xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c uinput iptable_filter bpfilter bridge stp llc cmac algif_hash algif_skcipher af_alg bnep uvcvideo intel_rapl_msr snd_usb_audio intel_rapl_common videobuf2_vmalloc videobuf2_memops amd64_edac_mod videobuf2_v4l2 snd_usbmidi_lib edac_mce_amd snd_hwdep videobuf2_common snd_rawmidi snd_seq_device kvm_amd snd_pcm pktcdvd amdgpu snd_timer videodev xpad btusb ff_memless btrtl kvm snd btbcm soundcore btintel rapl mfd_core mc iommu_v2 joydev bluetooth gpu_sched drm_ttm_helper pcspkr efi_pstore ttm sp5100_tco ecdh_generic vfat drm_kms_helper rfkill cec fat squashfs ecc k10temp i2c_piix4 mac_hid acpi_cpufreq loop ext4 mbcache jbd2 vfio_pci
дек 28 22:19:58 epyc kernel:  vfio_virqfd vfio_iommu_type1 vfio irqbypass drm fuse backlight configfs ip_tables hid_logitech_dj hid_logitech_hidpp sr_mod sd_mod cdrom zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) crct10dif_pclmul crc32_pclmul crc32c_intel spl(OE) ghash_clmulni_intel aesni_intel nvme igb i2c_algo_bit crypto_simd nvme_core ahci xhci_pci dca cryptd glue_helper libahci xhci_hcd t10_pi ccp i2c_core pinctrl_amd sunrpc dm_mirror dm_region_hash dm_log dm_mod efivarfs ipv6 crc_ccitt autofs4
дек 28 22:19:58 epyc kernel: CR2: 0000000000000008
дек 28 22:19:58 epyc kernel: ---[ end trace ea49142f3bef383d ]---
дек 28 22:19:58 epyc kernel: RIP: 0010:rapl_cpu_offline+0x4c/0xa0 [rapl]
дек 28 22:19:58 epyc kernel: Code: 00 00 00 48 8b 0d f4 2d 00 00 3b 91 28 01 00 00 73 08 48 8b 9c d1 30 01 00 00 f0 48 0f b3 05 db 29 00 00 72 05 31 c0 5b 5d c3 <c7> 43 08 ff ff ff ff 48 c7 c2 48 14 01 00 89 ee 48 8b 04 c5 00 49
дек 28 22:19:58 epyc kernel: RSP: 0018:ffffb01486db3e70 EFLAGS: 00010247
дек 28 22:19:58 epyc kernel: RAX: 0000000000000010 RBX: 0000000000000000 RCX: ffff935a4247b800
дек 28 22:19:58 epyc kernel: RDX: 0000000000000002 RSI: 000000000000009e RDI: 0000000000000010
дек 28 22:19:58 epyc kernel: RBP: 0000000000000010 R08: ffff9359ffa18788 R09: ffff934b000781b0
дек 28 22:19:58 epyc kernel: R10: 0000000000000002 R11: 0000000000000002 R12: 000000000000009e
дек 28 22:19:58 epyc kernel: R13: ffffffff9c4490b0 R14: ffff9352200018c0 R15: 0000000000000010
дек 28 22:19:58 epyc kernel: FS:  0000000000000000(0000) GS:ffff9359ffa00000(0000) knlGS:0000000000000000
дек 28 22:19:58 epyc kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
дек 28 22:19:58 epyc kernel: CR2: 0000000000000008 CR3: 000000085e984000 CR4: 00000000003506e0
Comment 2 Rafael Kitover 2020-12-28 22:24:42 UTC
Sorry, the actual command that causes the bug as in the backtrace above is:

echo 0 > /sys/devices/system/cpu/cpu16/online

I edited the title.
Comment 3 Borislav Petkov 2020-12-28 22:53:13 UTC
AFAICT, that RIP points to

  2a:*  c7 43 08 ff ff ff ff    movl   $0xffffffff,0x8(%rbx)            <-- trapping instruction

and that looks like:

        pmu->cpu = -1;

in rapl_cpu_offline(). And that turns into a NULL ptr deref because
cpu_to_rapl_pmu() gives NULL, most likely.

And that happens, probably because topology_logical_die_id() gives some
dieid which is >= rapl_pmus->maxdie for some unknown reason of confused
die IDs on this particular platform.
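
The failure mode described above can be modelled in user space. The sketch below is a simplified stand-in, not the kernel source: the struct layout, the `_model` names, and the table size are assumptions, chosen so that the `cpu` field sits at offset 8, which matches both the faulting address 0000000000000008 and the `0x8(%rbx)` store with RBX equal to 0.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the per-die PMU state (layout is assumed). */
struct rapl_pmu_model {
    int lock;      /* placeholder for the lock word */
    int n_active;
    int cpu;       /* the field written by "pmu->cpu = -1" */
};

/* A store through a NULL pointer to this field touches address 0x8. */
static_assert(offsetof(struct rapl_pmu_model, cpu) == 8,
              "cpu field expected at offset 8 in this model");

struct rapl_pmus_model {
    unsigned int maxdie;                /* number of valid slots */
    struct rapl_pmu_model *pmus[8];     /* one slot per die */
};

/* Return the PMU for a die, or NULL when the die ID is out of range. */
static struct rapl_pmu_model *
die_to_pmu_model(struct rapl_pmus_model *p, unsigned int dieid)
{
    return dieid < p->maxdie ? p->pmus[dieid] : NULL;
}

/* Offline step with a NULL check: returns -1 instead of faulting. */
static int cpu_offline_model(struct rapl_pmus_model *p, unsigned int dieid)
{
    struct rapl_pmu_model *pmu = die_to_pmu_model(p, dieid);

    if (!pmu)
        return -1;
    pmu->cpu = -1;
    return 0;
}
```

Without the `if (!pmu)` check, a die ID at or above `maxdie` would lead straight to the store through a NULL pointer, which is the pattern suspected here.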

Kim?
Comment 4 Borislav Petkov 2020-12-29 13:12:02 UTC
Created attachment 294395
debug patch
Comment 5 Borislav Petkov 2020-12-29 13:12:55 UTC
Can you try this debug patch to see what the die IDs and maxdies are on your system? Pls upload full dmesg after offlining CPU 16.

Thx.
Comment 6 Rafael Kitover 2020-12-29 13:53:30 UTC
Here you go:

дек 29 13:52:15 epyc kernel: RAPL PMU: cpu_to_rapl_pmu: CPU16, dieid: 2
дек 29 13:52:15 epyc kernel: ------------[ cut here ]------------
дек 29 13:52:15 epyc kernel: WARNING: CPU: 16 PID: 91 at arch/x86/events/rapl.c:556 rapl_cpu_offline+0x6c/0xc0 [rapl]
дек 29 13:52:15 epyc kernel: Modules linked in: rfcomm md4 nls_utf8 cifs libarc4 dns_resolver fscache libdes nf_conntrack_netlink nfnetlink xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter uinput bpfilter bridge stp llc cmac algif_hash algif_skcipher af_alg intel_rapl_msr amdgpu intel_rapl_common bnep amd64_edac_mod edac_mce_amd mfd_core snd_usb_audio snd_usbmidi_lib iommu_v2 snd_hwdep uvcvideo videobuf2_vmalloc kvm_amd gpu_sched snd_rawmidi videobuf2_memops videobuf2_v4l2 drm_ttm_helper snd_seq_device videobuf2_common ttm snd_pcm pktcdvd snd_timer btusb drm_kms_helper btrtl kvm btbcm btintel snd bluetooth xpad videodev soundcore ff_memless mc pcspkr efi_pstore joydev cec ecdh_generic rfkill sp5100_tco rapl ecc squashfs k10temp i2c_piix4 mac_hid vfat fat acpi_cpufreq loop ext4 mbcache jbd2 vfio_pci
дек 29 13:52:15 epyc kernel:  vfio_virqfd vfio_iommu_type1 vfio irqbypass drm fuse backlight configfs ip_tables hid_logitech_dj hid_logitech_hidpp sr_mod sd_mod cdrom zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) crct10dif_pclmul znvpair(POE) crc32_pclmul crc32c_intel ghash_clmulni_intel spl(OE) igb aesni_intel nvme i2c_algo_bit crypto_simd cryptd dca ahci glue_helper nvme_core xhci_pci libahci t10_pi xhci_hcd ccp i2c_core pinctrl_amd sunrpc dm_mirror dm_region_hash dm_log dm_mod efivarfs ipv6 crc_ccitt autofs4
дек 29 13:52:15 epyc kernel: CPU: 16 PID: 91 Comm: cpuhp/16 Tainted: P           OE     5.11.0-rc1-x86_64+ #1
дек 29 13:52:15 epyc kernel: Hardware name: Supermicro Super Server/H11DSi, BIOS 2.1 02/21/2020
дек 29 13:52:15 epyc kernel: RIP: 0010:rapl_cpu_offline+0x6c/0xc0 [rapl]
дек 29 13:52:15 epyc kernel: Code: 29 f1 48 8b 05 a6 2c 00 00 44 3b a0 28 01 00 00 73 08 4a 8b 9c e0 30 01 00 00 f0 4c 0f b3 2d 8c 28 00 00 73 50 48 85 db 75 07 <0f> 0b 83 c8 ff eb 46 c7 43 08 ff ff ff ff 48 c7 c0 48 14 01 00 89
дек 29 13:52:15 epyc kernel: RSP: 0018:ffffbabd86db3e60 EFLAGS: 00010246
дек 29 13:52:15 epyc kernel: RAX: ffff931ced583000 RBX: 0000000000000000 RCX: 0000000000000000
дек 29 13:52:15 epyc kernel: RDX: 0000000000000000 RSI: ffff9324bfa18ac0 RDI: ffff9324bfa18ac0
дек 29 13:52:15 epyc kernel: RBP: 0000000000000010 R08: 0000000000000000 R09: ffffbabd86db3c88
дек 29 13:52:15 epyc kernel: R10: ffffbabd86db3c80 R11: ffff9334bd1fffe8 R12: 0000000000000002
дек 29 13:52:15 epyc kernel: R13: 0000000000000010 R14: ffff931ce198ef90 R15: 0000000000000010
дек 29 13:52:15 epyc kernel: FS:  0000000000000000(0000) GS:ffff9324bfa00000(0000) knlGS:0000000000000000
дек 29 13:52:15 epyc kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
дек 29 13:52:15 epyc kernel: CR2: 000055f5b3f8fb7c CR3: 00000019c3888000 CR4: 00000000003506e0
дек 29 13:52:15 epyc kernel: Call Trace:
дек 29 13:52:15 epyc kernel:  ? rapl_event_update.isra.0.cold+0x11/0x11 [rapl]
дек 29 13:52:15 epyc kernel:  cpuhp_invoke_callback+0x7e/0x3d0
дек 29 13:52:15 epyc kernel:  cpuhp_thread_fun+0xb0/0x110
дек 29 13:52:15 epyc kernel:  smpboot_thread_fn+0xc5/0x160
дек 29 13:52:15 epyc kernel:  ? smpboot_register_percpu_thread+0xf0/0xf0
дек 29 13:52:15 epyc kernel:  kthread+0x11b/0x140
дек 29 13:52:15 epyc kernel:  ? __kthread_bind_mask+0x60/0x60
дек 29 13:52:15 epyc kernel:  ret_from_fork+0x22/0x30
дек 29 13:52:15 epyc kernel: ---[ end trace 867ebc2e8e30d46a ]---
дек 29 13:52:15 epyc kernel: ACPI: \_PR_.C020: Found 2 idle states
дек 29 13:52:27 epyc kernel: RAPL PMU: cpu_to_rapl_pmu: CPU16, dieid: 2
дек 29 13:52:27 epyc kernel: smpboot: CPU 16 is now offline
Comment 7 Borislav Petkov 2020-12-29 14:46:18 UTC
Thanks, but this is not the full dmesg; the maxdie value should be printed there as well.

Do

dmesg | grep -i rapl

and paste that output here pls.
Comment 8 Rafael Kitover 2020-12-29 14:52:48 UTC
Sorry, here it is:

[   10.995489] RAPL PMU: init_rapl_pmus: maxdie: 2
[   11.002998] RAPL PMU: cpu_to_rapl_pmu: CPU0, dieid: 0
[   11.005704] RAPL PMU: cpu_to_rapl_pmu: CPU1, dieid: 0
[   11.008337] RAPL PMU: cpu_to_rapl_pmu: CPU2, dieid: 0
[   11.014429] RAPL PMU: cpu_to_rapl_pmu: CPU3, dieid: 0
[   11.014459] RAPL PMU: cpu_to_rapl_pmu: CPU4, dieid: 0
[   11.020142] RAPL PMU: cpu_to_rapl_pmu: CPU5, dieid: 0
[   11.020205] RAPL PMU: cpu_to_rapl_pmu: CPU6, dieid: 0
[   11.026769] RAPL PMU: cpu_to_rapl_pmu: CPU7, dieid: 0
[   11.026816] RAPL PMU: cpu_to_rapl_pmu: CPU8, dieid: 1
[   11.026848] RAPL PMU: cpu_to_rapl_pmu: CPU9, dieid: 1
[   11.026875] RAPL PMU: cpu_to_rapl_pmu: CPU10, dieid: 1
[   11.026897] RAPL PMU: cpu_to_rapl_pmu: CPU11, dieid: 1
[   11.026940] RAPL PMU: cpu_to_rapl_pmu: CPU12, dieid: 1
[   11.026966] RAPL PMU: cpu_to_rapl_pmu: CPU13, dieid: 1
[   11.026992] RAPL PMU: cpu_to_rapl_pmu: CPU14, dieid: 1
[   11.027020] RAPL PMU: cpu_to_rapl_pmu: CPU15, dieid: 1
[   11.027047] RAPL PMU: cpu_to_rapl_pmu: CPU16, dieid: 2
[   11.027091] RAPL PMU: cpu_to_rapl_pmu: CPU17, dieid: 2
[   11.027118] RAPL PMU: cpu_to_rapl_pmu: CPU18, dieid: 2
[   11.027153] RAPL PMU: cpu_to_rapl_pmu: CPU19, dieid: 2
[   11.027186] RAPL PMU: cpu_to_rapl_pmu: CPU20, dieid: 2
[   11.027220] RAPL PMU: cpu_to_rapl_pmu: CPU21, dieid: 2
[   11.027249] RAPL PMU: cpu_to_rapl_pmu: CPU22, dieid: 2
[   11.027278] RAPL PMU: cpu_to_rapl_pmu: CPU23, dieid: 2
[   11.027314] RAPL PMU: cpu_to_rapl_pmu: CPU24, dieid: 3
[   11.027351] RAPL PMU: cpu_to_rapl_pmu: CPU25, dieid: 3
[   11.027380] RAPL PMU: cpu_to_rapl_pmu: CPU26, dieid: 3
[   11.027449] RAPL PMU: cpu_to_rapl_pmu: CPU27, dieid: 3
[   11.027528] RAPL PMU: cpu_to_rapl_pmu: CPU28, dieid: 3
[   11.027592] RAPL PMU: cpu_to_rapl_pmu: CPU29, dieid: 3
[   11.027621] RAPL PMU: cpu_to_rapl_pmu: CPU30, dieid: 3
[   11.027661] RAPL PMU: cpu_to_rapl_pmu: CPU31, dieid: 3
[   11.027696] RAPL PMU: cpu_to_rapl_pmu: CPU32, dieid: 4
[   11.027732] RAPL PMU: cpu_to_rapl_pmu: CPU33, dieid: 4
[   11.027769] RAPL PMU: cpu_to_rapl_pmu: CPU34, dieid: 4
[   11.027806] RAPL PMU: cpu_to_rapl_pmu: CPU35, dieid: 4
[   11.027835] RAPL PMU: cpu_to_rapl_pmu: CPU36, dieid: 4
[   11.027893] RAPL PMU: cpu_to_rapl_pmu: CPU37, dieid: 4
[   11.027924] RAPL PMU: cpu_to_rapl_pmu: CPU38, dieid: 4
[   11.027956] RAPL PMU: cpu_to_rapl_pmu: CPU39, dieid: 4
[   11.027992] RAPL PMU: cpu_to_rapl_pmu: CPU40, dieid: 5
[   11.028036] RAPL PMU: cpu_to_rapl_pmu: CPU41, dieid: 5
[   11.028071] RAPL PMU: cpu_to_rapl_pmu: CPU42, dieid: 5
[   11.028110] RAPL PMU: cpu_to_rapl_pmu: CPU43, dieid: 5
[   11.028150] RAPL PMU: cpu_to_rapl_pmu: CPU44, dieid: 5
[   11.028188] RAPL PMU: cpu_to_rapl_pmu: CPU45, dieid: 5
[   11.028222] RAPL PMU: cpu_to_rapl_pmu: CPU46, dieid: 5
[   11.028268] RAPL PMU: cpu_to_rapl_pmu: CPU47, dieid: 5
[   11.028314] RAPL PMU: cpu_to_rapl_pmu: CPU48, dieid: 6
[   11.028356] RAPL PMU: cpu_to_rapl_pmu: CPU49, dieid: 6
[   11.028389] RAPL PMU: cpu_to_rapl_pmu: CPU50, dieid: 6
[   11.028415] RAPL PMU: cpu_to_rapl_pmu: CPU51, dieid: 6
[   11.028446] RAPL PMU: cpu_to_rapl_pmu: CPU52, dieid: 6
[   11.028483] RAPL PMU: cpu_to_rapl_pmu: CPU53, dieid: 6
[   11.028509] RAPL PMU: cpu_to_rapl_pmu: CPU54, dieid: 6
[   11.028539] RAPL PMU: cpu_to_rapl_pmu: CPU55, dieid: 6
[   11.028574] RAPL PMU: cpu_to_rapl_pmu: CPU56, dieid: 7
[   11.028607] RAPL PMU: cpu_to_rapl_pmu: CPU57, dieid: 7
[   11.028641] RAPL PMU: cpu_to_rapl_pmu: CPU58, dieid: 7
[   11.028668] RAPL PMU: cpu_to_rapl_pmu: CPU59, dieid: 7
[   11.028700] RAPL PMU: cpu_to_rapl_pmu: CPU60, dieid: 7
[   11.028738] RAPL PMU: cpu_to_rapl_pmu: CPU61, dieid: 7
[   11.028770] RAPL PMU: cpu_to_rapl_pmu: CPU62, dieid: 7
[   11.028822] RAPL PMU: cpu_to_rapl_pmu: CPU63, dieid: 7
[   11.028877] RAPL PMU: cpu_to_rapl_pmu: CPU64, dieid: 0
[   11.028914] RAPL PMU: cpu_to_rapl_pmu: CPU65, dieid: 0
[   11.028933] RAPL PMU: cpu_to_rapl_pmu: CPU66, dieid: 0
[   11.028952] RAPL PMU: cpu_to_rapl_pmu: CPU67, dieid: 0
[   11.040248] RAPL PMU: cpu_to_rapl_pmu: CPU68, dieid: 0
[   11.040278] RAPL PMU: cpu_to_rapl_pmu: CPU69, dieid: 0
[   11.054943] RAPL PMU: cpu_to_rapl_pmu: CPU70, dieid: 0
[   11.060783] RAPL PMU: cpu_to_rapl_pmu: CPU71, dieid: 0
[   11.066462] RAPL PMU: cpu_to_rapl_pmu: CPU72, dieid: 1
[   11.104217] RAPL PMU: cpu_to_rapl_pmu: CPU73, dieid: 1
[   11.141980] RAPL PMU: cpu_to_rapl_pmu: CPU74, dieid: 1
[   11.207703] RAPL PMU: cpu_to_rapl_pmu: CPU75, dieid: 1
[   11.207795] RAPL PMU: cpu_to_rapl_pmu: CPU76, dieid: 1
[   11.226103] RAPL PMU: cpu_to_rapl_pmu: CPU77, dieid: 1
[   11.232535] RAPL PMU: cpu_to_rapl_pmu: CPU78, dieid: 1
[   11.244614] RAPL PMU: cpu_to_rapl_pmu: CPU79, dieid: 1
[   11.251280] RAPL PMU: cpu_to_rapl_pmu: CPU80, dieid: 2
[   11.252529] RAPL PMU: cpu_to_rapl_pmu: CPU81, dieid: 2
[   11.265301] RAPL PMU: cpu_to_rapl_pmu: CPU82, dieid: 2
[   11.265354] RAPL PMU: cpu_to_rapl_pmu: CPU83, dieid: 2
[   11.265377] RAPL PMU: cpu_to_rapl_pmu: CPU84, dieid: 2
[   11.265419] RAPL PMU: cpu_to_rapl_pmu: CPU85, dieid: 2
[   11.265449] RAPL PMU: cpu_to_rapl_pmu: CPU86, dieid: 2
[   11.265492] RAPL PMU: cpu_to_rapl_pmu: CPU87, dieid: 2
[   11.265511] RAPL PMU: cpu_to_rapl_pmu: CPU88, dieid: 3
[   11.265531] RAPL PMU: cpu_to_rapl_pmu: CPU89, dieid: 3
[   11.265634] RAPL PMU: cpu_to_rapl_pmu: CPU90, dieid: 3
[   11.265658] RAPL PMU: cpu_to_rapl_pmu: CPU91, dieid: 3
[   11.265710] RAPL PMU: cpu_to_rapl_pmu: CPU92, dieid: 3
[   11.265879] RAPL PMU: cpu_to_rapl_pmu: CPU93, dieid: 3
[   11.265907] RAPL PMU: cpu_to_rapl_pmu: CPU94, dieid: 3
[   11.266026] RAPL PMU: cpu_to_rapl_pmu: CPU95, dieid: 3
[   11.266046] RAPL PMU: cpu_to_rapl_pmu: CPU96, dieid: 4
[   11.266103] RAPL PMU: cpu_to_rapl_pmu: CPU97, dieid: 4
[   11.266123] RAPL PMU: cpu_to_rapl_pmu: CPU98, dieid: 4
[   11.266140] RAPL PMU: cpu_to_rapl_pmu: CPU99, dieid: 4
[   11.266166] RAPL PMU: cpu_to_rapl_pmu: CPU100, dieid: 4
[   11.266260] RAPL PMU: cpu_to_rapl_pmu: CPU101, dieid: 4
[   11.266340] RAPL PMU: cpu_to_rapl_pmu: CPU102, dieid: 4
[   11.266389] RAPL PMU: cpu_to_rapl_pmu: CPU103, dieid: 4
[   11.266455] RAPL PMU: cpu_to_rapl_pmu: CPU104, dieid: 5
[   11.266489] RAPL PMU: cpu_to_rapl_pmu: CPU105, dieid: 5
[   11.266580] RAPL PMU: cpu_to_rapl_pmu: CPU106, dieid: 5
[   11.266614] RAPL PMU: cpu_to_rapl_pmu: CPU107, dieid: 5
[   11.266716] RAPL PMU: cpu_to_rapl_pmu: CPU108, dieid: 5
[   11.266762] RAPL PMU: cpu_to_rapl_pmu: CPU109, dieid: 5
[   11.266894] RAPL PMU: cpu_to_rapl_pmu: CPU110, dieid: 5
[   11.364059] RAPL PMU: cpu_to_rapl_pmu: CPU111, dieid: 5
[   11.367804] RAPL PMU: cpu_to_rapl_pmu: CPU112, dieid: 6
[   11.371025] RAPL PMU: cpu_to_rapl_pmu: CPU113, dieid: 6
[   11.371037] RAPL PMU: cpu_to_rapl_pmu: CPU114, dieid: 6
[   11.371091] RAPL PMU: cpu_to_rapl_pmu: CPU115, dieid: 6
[   11.371134] RAPL PMU: cpu_to_rapl_pmu: CPU116, dieid: 6
[   11.371355] RAPL PMU: cpu_to_rapl_pmu: CPU117, dieid: 6
[   11.371566] RAPL PMU: cpu_to_rapl_pmu: CPU118, dieid: 6
[   11.371734] RAPL PMU: cpu_to_rapl_pmu: CPU119, dieid: 6
[   11.371915] RAPL PMU: cpu_to_rapl_pmu: CPU120, dieid: 7
[   11.371960] RAPL PMU: cpu_to_rapl_pmu: CPU121, dieid: 7
[   11.372126] RAPL PMU: cpu_to_rapl_pmu: CPU122, dieid: 7
[   11.372161] RAPL PMU: cpu_to_rapl_pmu: CPU123, dieid: 7
[   11.372308] RAPL PMU: cpu_to_rapl_pmu: CPU124, dieid: 7
[   11.372338] RAPL PMU: cpu_to_rapl_pmu: CPU125, dieid: 7
[   11.372368] RAPL PMU: cpu_to_rapl_pmu: CPU126, dieid: 7
[   11.372531] RAPL PMU: cpu_to_rapl_pmu: CPU127, dieid: 7
[   11.372787] RAPL PMU: API unit is 2^-32 Joules, 1 fixed counters, 163840 ms ovfl timer
[   11.372790] RAPL PMU: hw unit of domain package 2^-16 Joules
[   11.882885] intel_rapl_common: Found RAPL domain package
[   11.886067] intel_rapl_common: Found RAPL domain core
[   11.889724] intel_rapl_common: Found RAPL domain package
[   11.895907] intel_rapl_common: Found RAPL domain core
[   11.902607] intel_rapl_common: Found RAPL domain package
[   11.914489] intel_rapl_common: Found RAPL domain core
[   11.921354] intel_rapl_common: Found RAPL domain package
[   11.924479] intel_rapl_common: Found RAPL domain core
[   11.928291] intel_rapl_common: Found RAPL domain package
[   11.931265] intel_rapl_common: Found RAPL domain core
[   11.933535] intel_rapl_common: Found RAPL domain package
[   11.948387] intel_rapl_common: Found RAPL domain core
[   11.954657] intel_rapl_common: Found RAPL domain package
[   11.960257] intel_rapl_common: Found RAPL domain core
[   11.966490] intel_rapl_common: Found RAPL domain package
[   11.972044] intel_rapl_common: Found RAPL domain core
[  139.874182] RAPL PMU: cpu_to_rapl_pmu: CPU16, dieid: 2
[  139.874278] WARNING: CPU: 16 PID: 91 at arch/x86/events/rapl.c:556 rapl_cpu_offline+0x6c/0xc0 [rapl]
[  139.874295] Modules linked in: rfcomm md4 nls_utf8 cifs libarc4 dns_resolver fscache libdes xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink ip6table_mangle ip6table_nat ip6table_filter ip6_tables xt_addrtype br_netfilter iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c uinput iptable_filter bpfilter bridge stp llc cmac algif_hash algif_skcipher af_alg bnep amdgpu snd_usb_audio mfd_core intel_rapl_msr snd_usbmidi_lib intel_rapl_common pktcdvd snd_hwdep amd64_edac_mod snd_rawmidi iommu_v2 edac_mce_amd uvcvideo gpu_sched snd_seq_device kvm_amd btusb videobuf2_vmalloc drm_ttm_helper snd_pcm videobuf2_memops btrtl ttm snd_timer videobuf2_v4l2 btbcm videobuf2_common drm_kms_helper snd btintel kvm bluetooth videodev soundcore xpad ecdh_generic vfat pcspkr efi_pstore cec ff_memless mc sp5100_tco rfkill squashfs rapl joydev ecc fat k10temp i2c_piix4 mac_hid acpi_cpufreq loop ext4 mbcache jbd2 vfio_pci
[  139.874501] RIP: 0010:rapl_cpu_offline+0x6c/0xc0 [rapl]
[  139.874561]  ? rapl_event_update.isra.0.cold+0x11/0x11 [rapl]
Comment 9 Rafael Kitover 2020-12-29 15:02:06 UTC
I should mention that I have two CPUs, with 32 cores and 64 threads each.
Comment 10 Borislav Petkov 2020-12-29 17:33:26 UTC
Right, just as I expected. The maxdie calculation is wrong because the topology setup on AMD is not 100% correct yet. Lemme find a box and try to reproduce your observation.
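
The numbers from comment 8 make the mismatch concrete: maxdie was reported as 2, while the logged die IDs run from 0 to 7. The sketch below is only a user-space illustration in C. The formula `(cpu % 64) / 8` is read off this particular log, and the helper names are made up for the example; they are not kernel functions.

```c
#include <stdbool.h>

/* Die ID as logged in comment 8 for this machine: CPU0-7 -> 0,
 * CPU8-15 -> 1, ..., CPU56-63 -> 7, repeating for CPU64-127. */
static unsigned int logged_dieid(unsigned int cpu)
{
    return (cpu % 64) / 8;
}

/* A die ID is usable only if it indexes inside a maxdie-sized table. */
static bool dieid_in_range(unsigned int dieid, unsigned int maxdie)
{
    return dieid < maxdie;
}

/* Count CPUs (out of ncpus) whose die ID falls outside the table. */
static unsigned int cpus_out_of_range(unsigned int ncpus, unsigned int maxdie)
{
    unsigned int cpu, bad = 0;

    for (cpu = 0; cpu < ncpus; cpu++)
        if (!dieid_in_range(logged_dieid(cpu), maxdie))
            bad++;
    return bad;
}
```

With maxdie equal to 2, only die IDs 0 and 1 are in range, so `cpus_out_of_range(128, 2)` counts 96 of the 128 logical CPUs, CPU16 on die 2 among them. With a table of 8 entries, every CPU in the log would be covered.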

Thx for testing.
Comment 11 Rafael Kitover 2020-12-31 19:08:28 UTC
I wanted to ask you about something on this topic. The reason I do this at all, is some games don't run when there are too many hardware threads, like older builds of portal 2, and this ability Linux has to offline physical CPUs at runtime is extremely useful for this. As far as I know, Windows does not have this ability.

I use this script to offline all but 16 hardware threads:

https://gist.github.com/rkitover/babb9f4477dce80673752cd45e52cbc7

this takes roughly 20 seconds on my hardware.
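
For reference, a per-CPU loop of the kind described can be sketched in C as follows. This is not the linked script. The helpers `cpu_online_path`, `set_cpu_online`, and `offline_range` are hypothetical, and the `base` parameter is exposed only so the code can be pointed at a scratch directory instead of the real `/sys/devices/system/cpu`.

```c
#include <stdio.h>

/* Build "<base>/cpu<N>/online"; returns 0 on success, -1 if buf is too small. */
static int cpu_online_path(char *buf, size_t len, const char *base, int cpu)
{
    int n = snprintf(buf, len, "%s/cpu%d/online", base, cpu);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" or "0" to a CPU's online file; returns 0 on success, -1 on error. */
static int set_cpu_online(const char *base, int cpu, int online)
{
    char path[256];
    FILE *f;
    int rc;

    if (cpu_online_path(path, sizeof(path), base, cpu) != 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    rc = fputs(online ? "1\n" : "0\n", f) == EOF ? -1 : 0;
    /* The buffered write reaches the kernel here, so check fclose() too. */
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}

/* Offline CPUs first..last one at a time; returns the number of failures. */
static int offline_range(const char *base, int first, int last)
{
    int cpu, failed = 0;

    for (cpu = first; cpu <= last; cpu++)
        if (set_cpu_online(base, cpu, 0) != 0)
            failed++;
    return failed;
}
```

Run as root, `offline_range("/sys/devices/system/cpu", 16, 127)` would leave 16 hardware threads online on a 128-thread machine. Each write is a separate hotplug operation.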

I was wondering if it may be worthwhile to make the range file writable, then I could simply do:

echo 0-15 > /sys/devices/system/cpu/online

I imagine offlining and onlining hardware threads in a batch would be faster, and this may be useful as e.g. a powersaving measure on some hardware.
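
As a rough illustration of what accepting a value like `0-15` would involve, here is a small user-space parser in C for CPU list strings. The function name `count_cpus_in_list` and the accepted grammar (comma-separated CPU numbers or `a-b` ranges, with an optional trailing newline) are assumptions made for this example. It is not kernel code, and the writable range file remains a proposal.

```c
#include <stdlib.h>

/* Return nonzero if c is an ASCII decimal digit. */
static int is_digit_char(char c)
{
    return c >= '0' && c <= '9';
}

/* Count the CPUs named by a list such as "0-15" or "0-3,8,10-11".
 * Returns the count, or -1 if the string is malformed.
 * Overflow of very large numbers is not handled in this sketch. */
static int count_cpus_in_list(const char *list)
{
    int count = 0;

    while (*list != '\0' && *list != '\n') {
        char *end;
        long first, last;

        if (!is_digit_char(*list))
            return -1;
        first = last = strtol(list, &end, 10);
        if (*end == '-') {
            if (!is_digit_char(end[1]))
                return -1;
            last = strtol(end + 1, &end, 10);
        }
        if (last < first)
            return -1;
        count += (int)(last - first + 1);
        if (*end == ',') {
            end++;
            if (!is_digit_char(*end))   /* reject a trailing comma */
                return -1;
        } else if (*end != '\0' && *end != '\n') {
            return -1;
        }
        list = end;
    }
    return count;
}
```

For the example above, `count_cpus_in_list("0-15")` yields 16, the number of hardware threads that would stay online.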
Comment 12 Borislav Petkov 2021-01-04 11:21:45 UTC
(In reply to Rafael Kitover from comment #11)
> I wanted to ask you about something on this topic. The reason I do this at
> all, is some games don't run when there are too many hardware threads, like
> older builds of portal 2, and this ability Linux has to offline physical
> CPUs at runtime is extremely useful for this. As far as I know, Windows does
> not have this ability.
> 
> I use this script to offline all but 16 hardware threads:

Why don't you use cpusets for that:

https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/cpusets.html

?

Sounds like your use case without having to offline cores.

> I imagine offlining and onlining hardware threads in a batch would be
> faster, and this may be useful as e.g. a powersaving measure on some
> hardware.

So I'd be surprised if any power consumption change were
measurable because unused CPUs would simply go into idle. The
soft-offlining we do in the kernel is basically the same thing as when
they go idle - we do HLT or MWAIT or some crap with BIOS which ends up
doing the same thing underneath and it puts the cores into a deep idle
state. They still have power and all.
Comment 13 Rafael Kitover 2021-01-04 11:43:33 UTC
I see, thank you for the explanation.

I have actually tried cpusets, but when the game is started from a shell under a cpuset it still sees the same number of hardware threads. Whatever that code does, it is likely C++ and likely uses this API:

https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency

Also, things like /proc/cpuinfo are not changed under a cpuset.

I will take a look at the libstdc++ implementation of hardware_concurrency and see why cpusets have no effect, but I suspect it's because it reads special files like /proc/cpuinfo.
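
One way to see the difference from a program's point of view is to compare the system-wide online CPU count with the calling thread's affinity mask. The sketch below is in C and assumes Linux with glibc. It does not reproduce what libstdc++ or the game actually does, and the helper names are made up for the example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/* CPUs currently online system-wide, as reported by sysconf(). */
static long online_cpu_count(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* CPUs the calling thread is allowed to run on, or -1 on error. */
static int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

A cpuset narrows the affinity mask, so `allowed_cpu_count()` shrinks inside one, while the online count typically stays at the full machine size. A runtime that consults only the online count would therefore look unaffected by a cpuset, whereas offlining CPUs changes both numbers.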
Comment 14 Borislav Petkov 2021-01-04 18:53:36 UTC
Right, that sounds strange.

Btw, here's another version of the debug patch which also has a potential fix. It would be real nice if you ran it to check whether it fixes the issue on your end.

Thx.
Comment 15 Borislav Petkov 2021-01-04 18:54:45 UTC
Created attachment 294491
debug patch + potential fix
Comment 16 Rafael Kitover 2021-01-05 06:30:04 UTC
Created attachment 294499
system logs
Comment 17 Rafael Kitover 2021-01-05 06:30:23 UTC
That fixed it! Logs attached.
Comment 18 Borislav Petkov 2021-01-05 10:49:17 UTC
Thanks for testing, lemme send a proper patch.
Comment 19 Rafael Kitover 2021-01-07 13:49:39 UTC
Created attachment 294543
boot log with patch
Comment 20 Rafael Kitover 2021-01-07 13:50:12 UTC
I am running the latest patch, offlining works correctly, dmesg attached.
Comment 21 Borislav Petkov 2021-01-08 08:57:52 UTC
Thanks, looks good, I'll add your Tested-by.
Comment 22 Borislav Petkov 2021-01-12 17:55:29 UTC
Ok, fix is queued and will go to Linus soon, let's close here:

https://git.kernel.org/tip/76e2fc63ca40977af893b724b00cc2f8e9ce47a4