Bug 216177

Summary: | kvm-unit-tests vmx fails about 60% of the time | |
---|---|---|---
Product: | Virtualization | Reporter: | Yang Lixiao (lixiao.yang)
Component: | kvm | Assignee: | virtualization_kvm
Status: | NEW | |
Severity: | normal | CC: | seanjc
Priority: | P1 | |
Hardware: | Intel | |
OS: | Linux | |
Kernel Version: | 5.19-rc1 | Subsystem: |
Regression: | No | Bisected commit-id: |
Attachments: | vmx failure log | |
Description
Yang Lixiao 2022-06-27 02:17:17 UTC

--- Comment #1 from Sean Christopherson (seanjc@google.com) ---
It's vmx_preemption_timer_expiry_test, which is known to be flaky (though IIRC it's KVM that's at fault).

Test suite: vmx_preemption_timer_expiry_test
FAIL: Last stored guest TSC (28067103426) < TSC deadline (28067086048)

--- Comment #2 from Nadav Amit (nadav.amit@gmail.com) ---
For the record:
https://lore.kernel.org/kvm/D121A03E-6861-4736-8070-5D1E4FEE1D32@gmail.com/

--- Comment #3 from Yang Lixiao (lixiao.yang@intel.com) ---
(In reply to Nadav Amit from comment #2)
Thanks for your reply. So this is a KVM bug, and you have sent a patch to KVM to fix it, right?
--- Comment #4 from Sean Christopherson (seanjc@google.com) ---
No, AFAIK no one has posted a fix. If it's the KVM issue I'm thinking of, the fix is non-trivial. It'd require scheduling a timer in L0 with a deadline shorter than what L1 requests when emulating the VMX timer, and then busy-waiting in L0 if the host timer fires early. KVM already does this for e.g. L1's TSC deadline timer; that code would need to be adapted for the nested VMX preemption timer.
--- Comment #5 from Nadav Amit (nadav.amit@gmail.com) ---
As I noted, at some point I could no longer reproduce the failure.

The failure on bare metal that I experienced hints that this is either a test bug or (much less likely) a hardware bug. But I do not think it is likely to be a KVM bug.
--- Comment #6 from Sean Christopherson (seanjc@google.com) ---
Oooh, your failure was on bare metal. I didn't grok that. Though it could be both a hardware bug and a KVM bug :-)

--- Comment #7 from Yang Lixiao (lixiao.yang@intel.com) ---
In my tests, I ran kvm-unit-tests vmx on bare metal (not in a VM), and this bug occurred on two different Ice Lake machines and one Cooper Lake machine.

--- Comment #8 from Jim Mattson ---
KVM does not use the VMX-preemption timer to virtualize L1's VMX-preemption timer (and that is why KVM is broken). The KVM bug was introduced with commit f4124500c2c1 ("KVM: nVMX: Fully emulate preemption timer"), which uses an L0 CLOCK_MONOTONIC hrtimer to emulate L1's VMX-preemption timer. There are many reasons that this cannot possibly work, not the least of which is that the CLOCK_MONOTONIC timer is subject to time slew.

Currently, KVM reserves L0's VMX-preemption timer for emulating L1's APIC timer. Better would be to determine whether L1's APIC timer or L1's VMX-preemption timer is scheduled to fire first, and use L0's VMX-preemption timer to trigger a VM-exit on the nearest alarm. Alternatively, as Sean noted, one could perhaps arrange for the hrtimer to fire early enough that it won't fire late, but I don't really think that's a viable solution.

I can't explain the bare-metal failures, but I will note that the test assumes the default treatment of SMIs and SMM. The test will likely fail with the dual-monitor treatment of SMIs and SMM. Aside from the older CPUs with broken VMX-preemption timers, I don't know of any relevant errata.

Of course, it is possible that the test itself is buggy. For the person who reported bare-metal failures on Ice Lake and Cooper Lake: how long was the test in VMX non-root mode past the VMX-preemption timer deadline?
--- Comment #9 from Yang Lixiao (lixiao.yang@intel.com) ---
On the first Ice Lake:
Test suite: vmx_preemption_timer_expiry_test
FAIL: Last stored guest TSC (28067103426) < TSC deadline (28067086048)

On the second Ice Lake:
Test suite: vmx_preemption_timer_expiry_test
FAIL: Last stored guest TSC (27014488614) < TSC deadline (27014469152)

On Cooper Lake:
Test suite: vmx_preemption_timer_expiry_test
FAIL: Last stored guest TSC (29030585690) < TSC deadline (29030565024)

--- Comment #10 from Jim Mattson ---
Wow! Those are *huge* overruns. What is the value of MSR 0x9B on these hosts?

--- Comment #11 from Yang Lixiao (lixiao.yang@intel.com) ---
The value of MSR 0x9B on all three hosts is 0.
--- Comment #12 from Jim Mattson ---
Doh! There is a glaring bug in the test. I'll post a fix soon.

--- Comment #13 from Yang Lixiao (lixiao.yang@intel.com) ---
Thanks!