Bug 218739 - pmu_counters_test kvm-selftest fails with (count != NUM_INSNS_RETIRED)
Summary: pmu_counters_test kvm-selftest fails with (count != NUM_INSNS_RETIRED)
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: Intel Linux
Importance: P3 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-17 14:29 UTC by jarichte
Modified: 2024-04-23 00:21 UTC

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Description jarichte 2024-04-17 14:29:29 UTC
Environment:
CPU Architecture: x86_64, Intel(R) Atom(TM) CPU C2750 @ 2.40GHz
Host OS: Fedora Rawhide
Host kernel: Linux Kernel 6.9.0-rc3
gcc: gcc (GCC) 14.0.1
Host kernel source: https://git.kernel.org/pub/scm/virt/kvm/kvm.git
Branch: master
Commit: 1c3bed8006691f485156153778192864c9d8e14f
Bug Detailed Description:
Assertion failure executing kvm selftest pmu_counters_test.

Reproducing Steps:
git clone https://git.kernel.org/pub/scm/virt/kvm/kvm.git
cd kvm && make headers_install
cd tools/testing/selftests/kvm && make
cd x86_64 && ./pmu_counters_test

Actual Result:
Testing arch events, PMU version 0, perf_caps = 0
Testing GP counters, PMU version 0, perf_caps = 0
Testing fixed counters, PMU version 0, perf_caps = 0
Testing arch events, PMU version 0, perf_caps = 2000
Testing GP counters, PMU version 0, perf_caps = 2000
Testing fixed counters, PMU version 0, perf_caps = 2000
Testing arch events, PMU version 1, perf_caps = 0
==== Test Assertion Failure ====
x86_64/pmu_counters_test.c:107: count == NUM_INSNS_RETIRED
pid=51128 tid=51128 errno=4 - Interrupted system call
1	0x0000000000402c7d: run_vcpu at pmu_counters_test.c:61
2	0x0000000000402ead: test_arch_events at pmu_counters_test.c:307
3	0x0000000000402674: test_arch_events at pmu_counters_test.c:296
4	 (inlined by) test_intel_counters at pmu_counters_test.c:601
5	 (inlined by) main at pmu_counters_test.c:635
6	0x00007f78bd1981c7: ?? ??:0
7	0x00007f78bd19828a: ?? ??:0
8	0x0000000000402924: _start at ??:?
0x12 != 0x11 (count != NUM_INSNS_RETIRED)
Comment 1 Dongli Zhang 2024-04-23 00:21:14 UTC
Perhaps pmu_counters_test could print more information on failure in the future, e.g., msr, msr_ctrl, their values, whether clflush was used, and whether forced emulation was in effect?
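
As a rough illustration of the kind of message meant here, the sketch below formats such a diagnostic line. The struct, its field names, and format_pmu_failure() are hypothetical and standalone; they are not part of the selftest, which would more likely route this through its own assertion macros.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bundle of the state suggested above; not a selftest type. */
struct pmu_fail_info {
	uint32_t pmc_msr;     /* counter MSR index */
	uint32_t ctrl_msr;    /* control (event select) MSR index */
	uint64_t ctrl_value;  /* value written to the control MSR */
	uint64_t count;       /* value read back from the counter */
	uint64_t expected;    /* value the test expected */
	bool clflush;         /* was a cache flush part of the sequence? */
	bool fep;             /* was the forced emulation prefix used? */
};

/* Format one diagnostic line; returns whatever snprintf() returns. */
static int format_pmu_failure(char *buf, size_t len,
			      const struct pmu_fail_info *info)
{
	return snprintf(buf, len,
			"pmc_msr=0x%" PRIx32 " ctrl_msr=0x%" PRIx32
			" ctrl_value=0x%" PRIx64 " count=0x%" PRIx64
			" expected=0x%" PRIx64 " clflush=%d fep=%d",
			info->pmc_msr, info->ctrl_msr, info->ctrl_value,
			info->count, info->expected,
			info->clflush, info->fep);
}
```

A line like this next to the assertion would show right away which counter and which code path (native or emulated) produced the unexpected count.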

Judging from the output alone, the number of instructions counted for GUEST_MEASURE_EVENT() (0x12 = 18) is one more than NUM_INSNS_RETIRED = 17 (0x11).
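
The arithmetic can be spelled out in a few lines. Only the total of 17 is confirmed by this report; the split into 10 loop branches plus 7 other instructions, and the helper insn_count_delta(), are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Assumed breakdown of the expected count: only the total of 17 is
 * confirmed by the report; the 10 + 7 split is an assumption.
 */
#define NUM_BRANCHES      10
#define NUM_EXTRA_INSNS   7
#define NUM_INSNS_RETIRED (NUM_BRANCHES + NUM_EXTRA_INSNS)

/* Signed difference between the observed and expected instruction count. */
static int64_t insn_count_delta(uint64_t observed)
{
	return (int64_t)observed - (int64_t)NUM_INSNS_RETIRED;
}
```

With the value from the failing run, insn_count_delta(0x12) is 1, i.e. the guest counter saw exactly one instruction more than the test expects.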

------------------------

I tried this on an Ice Lake server and could not reproduce the failure on most runs; only once did I hit the following, which is a different assertion (count != 0).

# ./pmu_counters_test 
Testing arch events, PMU version 0, perf_caps = 0
Testing GP counters, PMU version 0, perf_caps = 0
Testing fixed counters, PMU version 0, perf_caps = 0
Testing arch events, PMU version 0, perf_caps = 2000
Testing GP counters, PMU version 0, perf_caps = 2000
Testing fixed counters, PMU version 0, perf_caps = 2000
Testing arch events, PMU version 1, perf_caps = 0
Testing GP counters, PMU version 1, perf_caps = 0
Testing fixed counters, PMU version 1, perf_caps = 0
Testing arch events, PMU version 1, perf_caps = 2000
Testing GP counters, PMU version 1, perf_caps = 2000
Testing fixed counters, PMU version 1, perf_caps = 2000
Testing arch events, PMU version 2, perf_caps = 0
Testing GP counters, PMU version 2, perf_caps = 0
Testing fixed counters, PMU version 2, perf_caps = 0
Testing arch events, PMU version 2, perf_caps = 2000
Testing GP counters, PMU version 2, perf_caps = 2000
Testing fixed counters, PMU version 2, perf_caps = 2000
Testing arch events, PMU version 3, perf_caps = 0
Testing GP counters, PMU version 3, perf_caps = 0
Testing fixed counters, PMU version 3, perf_caps = 0
Testing arch events, PMU version 3, perf_caps = 2000
Testing GP counters, PMU version 3, perf_caps = 2000
Testing fixed counters, PMU version 3, perf_caps = 2000
Testing arch events, PMU version 4, perf_caps = 0
==== Test Assertion Failure ====
  x86_64/pmu_counters_test.c:120: count != 0
  pid=39696 tid=39696 errno=4 - Interrupted system call
     1	0x0000000000402baf: run_vcpu at pmu_counters_test.c:61
     2	0x0000000000402ddd: test_arch_events at pmu_counters_test.c:307
     3	0x0000000000402683: test_arch_events at pmu_counters_test.c:605
     4	 (inlined by) test_intel_counters at pmu_counters_test.c:605
     5	 (inlined by) main at pmu_counters_test.c:635
     6	0x00007fcfeb43ae44: ?? ??:0
     7	0x000000000040288d: _start at ??:?
  0x0 == 0x0 (count == 0)


# cat /sys/module/kvm/parameters/enable_pmu 
Y
# cat /sys/module/kvm/parameters/force_emulation_prefix 
0
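
If the test were to report these settings itself, a small helper along the following lines could read them. This is a sketch only: parse_param_flag() and read_param_flag() are hypothetical names, and the parsing is simplified to a single leading character ("Y"/"N" or "0"/"1"), with anything else treated as unknown.

```c
#include <stdio.h>

/*
 * Interpret the text of a module parameter as a flag.
 * Returns 1 for "Y"/"y"/"1", 0 for "N"/"n"/"0", -1 for anything else.
 */
static int parse_param_flag(const char *text)
{
	switch (text[0]) {
	case 'Y': case 'y': case '1':
		return 1;
	case 'N': case 'n': case '0':
		return 0;
	default:
		return -1;
	}
}

/* Read a parameter file and parse it; returns -1 if it cannot be read. */
static int read_param_flag(const char *path)
{
	char buf[16] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_param_flag(buf);
}
```

On the host above, read_param_flag("/sys/module/kvm/parameters/enable_pmu") would be expected to return 1 and the same call on force_emulation_prefix 0, matching the cat output.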

# cpuid -l 0xa -1
CPU:
   Architecture Performance Monitoring Features (0xa):
      version ID                               = 0x5 (5)
      number of counters per logical processor = 0x8 (8)
      bit width of counter                     = 0x30 (48)
      length of EBX bit vector                 = 0x8 (8)
      core cycle event                         = available
      instruction retired event                = available
      reference cycles event                   = available
      last-level cache ref event               = available
      last-level cache miss event              = available
      branch inst retired event                = available
      branch mispred retired event             = available
      top-down slots event                     = available
      fixed counter  0 supported               = true
      fixed counter  1 supported               = true
      fixed counter  2 supported               = true
      fixed counter  3 supported               = true
      fixed counter  4 supported               = false
      fixed counter  5 supported               = false
      fixed counter  6 supported               = false
      fixed counter  7 supported               = false
      fixed counter  8 supported               = false
      fixed counter  9 supported               = false
      fixed counter 10 supported               = false
      fixed counter 11 supported               = false
      fixed counter 12 supported               = false
      fixed counter 13 supported               = false
      fixed counter 14 supported               = false
      fixed counter 15 supported               = false
      fixed counter 16 supported               = false
      fixed counter 17 supported               = false
      fixed counter 18 supported               = false
      fixed counter 19 supported               = false
      fixed counter 20 supported               = false
      fixed counter 21 supported               = false
      fixed counter 22 supported               = false
      fixed counter 23 supported               = false
      fixed counter 24 supported               = false
      fixed counter 25 supported               = false
      fixed counter 26 supported               = false
      fixed counter 27 supported               = false
      fixed counter 28 supported               = false
      fixed counter 29 supported               = false
      fixed counter 30 supported               = false
      fixed counter 31 supported               = false
      number of contiguous fixed counters      = 0x4 (4)
      bit width of fixed counters              = 0x30 (48)
      anythread deprecation                    = true
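
For reference, the fields printed above come from fixed bit positions in CPUID leaf 0xA. The sketch below decodes them from raw register values; struct pmu_leaf and decode_pmu_leaf() are illustrative names, and the raw values mentioned afterwards are reconstructed from the printed fields, not taken from the machine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of CPUID leaf 0xA (architectural performance monitoring). */
struct pmu_leaf {
	unsigned int version;           /* EAX[7:0]   */
	unsigned int nr_gp_counters;    /* EAX[15:8]  */
	unsigned int gp_bit_width;      /* EAX[23:16] */
	unsigned int ebx_vector_len;    /* EAX[31:24] */
	unsigned int nr_fixed_counters; /* EDX[4:0]   */
	unsigned int fixed_bit_width;   /* EDX[12:5]  */
	bool anythread_deprecated;      /* EDX[15]    */
};

/* Split the raw EAX and EDX values of leaf 0xA into their fields. */
static struct pmu_leaf decode_pmu_leaf(uint32_t eax, uint32_t edx)
{
	struct pmu_leaf leaf = {
		.version           = eax & 0xff,
		.nr_gp_counters    = (eax >> 8) & 0xff,
		.gp_bit_width      = (eax >> 16) & 0xff,
		.ebx_vector_len    = (eax >> 24) & 0xff,
		.nr_fixed_counters = edx & 0x1f,
		.fixed_bit_width   = (edx >> 5) & 0xff,
		.anythread_deprecated = (edx >> 15) & 1,
	};

	return leaf;
}
```

Assuming the remaining EDX bits are zero, the output above corresponds to EAX = 0x08300805 and EDX = 0x8604: PMU version 5, 8 general-purpose counters, and 4 fixed counters, all 48 bits wide.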



-------------------------------------------

I also ran the test in a nested setup, on an L1 hypervisor (older hardware). Most runs pass; it failed only once:

# ./pmu_counters_test
Testing arch events, PMU version 0, perf_caps = 0
Testing GP counters, PMU version 0, perf_caps = 0
Testing fixed counters, PMU version 0, perf_caps = 0
Testing arch events, PMU version 0, perf_caps = 2000
Testing GP counters, PMU version 0, perf_caps = 2000
Testing fixed counters, PMU version 0, perf_caps = 2000
Testing arch events, PMU version 1, perf_caps = 0
Testing GP counters, PMU version 1, perf_caps = 0
Testing fixed counters, PMU version 1, perf_caps = 0
Testing arch events, PMU version 1, perf_caps = 2000
Testing GP counters, PMU version 1, perf_caps = 2000
Testing fixed counters, PMU version 1, perf_caps = 2000
Testing arch events, PMU version 2, perf_caps = 0
Testing GP counters, PMU version 2, perf_caps = 0
Testing fixed counters, PMU version 2, perf_caps = 0
Testing arch events, PMU version 2, perf_caps = 2000
Testing GP counters, PMU version 2, perf_caps = 2000
Testing fixed counters, PMU version 2, perf_caps = 2000
Testing arch events, PMU version 3, perf_caps = 0
==== Test Assertion Failure ====
  x86_64/pmu_counters_test.c:120: count != 0
  pid=9301 tid=9301 errno=4 - Interrupted system call
     1	0x0000000000402bdf: run_vcpu at pmu_counters_test.c:61
     2	0x0000000000402dfd: test_arch_events at pmu_counters_test.c:307
     3	0x00000000004026a3: test_arch_events at pmu_counters_test.c:605
     4	 (inlined by) test_intel_counters at pmu_counters_test.c:605
     5	 (inlined by) main at pmu_counters_test.c:635
     6	0x00007f05e2f60d8f: ?? ??:0
     7	0x00007f05e2f60e3f: ?? ??:0
     8	0x00000000004028b4: _start at ??:?
  0x0 == 0x0 (count == 0)
