Bug 203183 - tsc marked unstable after entered PC10 on Intel CoffeeLake
Summary: tsc marked unstable after entered PC10 on Intel CoffeeLake
Status: NEW
Alias: None
Product: Timers
Classification: Unclassified
Component: Other (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: john stultz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-08 06:09 UTC by You-Sheng Yang
Modified: 2021-12-03 08:37 UTC (History)
13 users (show)

See Also:
Kernel Version: 5.1-rc2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
cpuid output (321.63 KB, text/plain)
2019-04-08 06:09 UTC, You-Sheng Yang
Details
syslog (556.59 KB, text/plain)
2019-04-08 06:11 UTC, You-Sheng Yang
Details
csnow/wdnow dumps over long period (1.50 MB, application/x-gzip)
2019-04-16 08:07 UTC, You-Sheng Yang
Details
cpuid output from Dell Inspiron 15 7000 (20.10 KB, text/plain)
2019-04-25 10:08 UTC, You-Sheng Yang
Details
dmidecode output from Dell Inspiron 15 7000 (17.25 KB, text/plain)
2019-04-25 10:10 UTC, You-Sheng Yang
Details
turbostat output from Dell Inspiron 15 7000 (43.49 KB, text/plain)
2019-04-25 10:14 UTC, You-Sheng Yang
Details
tsc output from Dell Inspiron 15 7000 (680 bytes, text/plain)
2019-04-25 10:23 UTC, You-Sheng Yang
Details
dmidecode of Harry's Dell 7440 DVT (34.34 KB, text/plain)
2019-04-26 03:30 UTC, Harry Pan
Details
Untrust the clocksource watchdog when its interval is too small (2.66 KB, patch)
2019-05-16 08:36 UTC, Harry Pan
Details | Diff
ftrace data and debug patch for HPET behavior (2.29 MB, application/gzip)
2019-05-17 08:18 UTC, Harry Pan
Details
dmesg (60.38 KB, text/plain)
2019-11-14 17:15 UTC, Alex
Details
Errors after hpet disabled on kernel 5.3.13 (183.11 KB, text/plain)
2019-11-26 04:45 UTC, Josh Freeno
Details

Description You-Sheng Yang 2019-04-08 06:09:12 UTC
Created attachment 282173 [details]
cpuid output

[Overview]

On Intel CoffeeLake it's observed tsc is always marked unstable after turning off screen and let cpu enter idle state Package C10(PC10), and then clock source is switched to hpet. Following messages are printed in dmesg when this occurs:

  [ 115.454117] clocksource: timekeeping watchdog on CPU2: Marking clocksource 'tsc' as unstable because the skew is too large:
  [ 115.454146] clocksource: 'hpet' wd_now: a44e6960 wd_last: a42fe23b mask: ffffffff
  [ 115.454151] clocksource: 'tsc' cs_now: 69ee1f8d9f cs_last: 699580f6fd mask: ffffffffffffffff
  [ 115.454160] tsc: Marking TSC unstable due to clocksource watchdog
  [ 115.454675] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
  [ 115.454679] sched_clock: Marking unstable (115712298090, -257671127)<-(115460381128, -5706726)
  [ 115.455825] clocksource: Switched to clocksource hpet

[Step to reproduce]

1. use `powertop` to observe if CPU has ever enter PC10,
2. issue `xset dpms force off` to turn off screen and idle for some minutes,
3. check dmesg for above tsc marked unstable messages

[Actual Results]

TSC marked unstable and clock source switched to HPET.

[Expected Results]

TSC remains the clock source across the PC10 event.

[Build Date & Hardware]

This is reproducible on kernel version 4.14 through 5.1-rc2. Order kernel versions are not tested.

[Additional Information]

This was first found on Ubuntu (all versions) that system doesn't go suspend after stayed idle for 20 minutes. Clock source was switched to HPET, the system time is no correctly synced with reality so that the system doesn't consider it has stayed idle long enough to begin suspend process.
Comment 1 You-Sheng Yang 2019-04-08 06:11:44 UTC
Created attachment 282175 [details]
syslog
Comment 2 You-Sheng Yang 2019-04-08 10:43:54 UTC
Patch sent to https://lkml.org/lkml/2019/4/8/322
Comment 3 You-Sheng Yang 2019-04-11 03:06:12 UTC
(paste additional info posted to patch discussion thread)

Usually the differences between wd_nsec and cs_nsec in function clocksource_watchdog in kernel/time/clocksource.c would be less than a few thousand nanoseconds. However, when CPU is entering deeper idle state, PC10, the hpet clocksource readings starts to give inaccurate values for unknown reason, and the differences to cs_nsec varies from a few hundred nanoseconds to several hundred millisecond, which is larger than WATCHDOG_THRESHOLD (62.5ms) and finally results in tsc being marked unreliable. No HPET overflow is found when this occurs.
Comment 4 Tom B 2019-04-15 10:03:54 UTC
I'm not sure if this should be a different bug but it also happens on AMD hardware. 

After suspend I get the following in dmesg:

[14315.373934] tsc: Marking TSC unstable due to check_tsc_sync_source failed
[14315.373940] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.


If I try to manually set the clocksource to tsc after suspend I get:

[20331.176246] clocksource: Override clocksource tsc is unstable and not HRT compatible - cannot switch while in HRT/NOHZ mode
Comment 5 Tom B 2019-04-15 10:05:01 UTC
Apologies, I didn't include my hardware:

AMD Threadripper 1950x
Gigabyte Aourus X399


The issue happens after a suspend/resume cycle.
Comment 6 You-Sheng Yang 2019-04-16 02:13:51 UTC
@Tom, I think that's a different issue.
Comment 7 You-Sheng Yang 2019-04-16 08:07:52 UTC
Created attachment 282345 [details]
csnow/wdnow dumps over long period

cpuid/dmesg/syslog of clocksource readings dump from NBA5-DVT2.1-C2.

This is captured with a modified Ubuntu 4.15.0-48-generic kernel. Read TSC, HPET and TSC again in clocksource_watchdog() and dump the readings accordingly. The meanings of the fields are:

csnow1: TSC counter of the first read
wdnow: HPET counter read
csnow2: TSC counter of the second read
cs_nsec1: nanoseconds between csnow1 and last csnow2
wd_nsec: nanoseconds between two adjacent wdnow read
cs_nsec2: nanoseconds between csnow1 and csnow2

Whenever abs(cs_nsec1 + cs_nsec2 - wd_nsec) > 62.5ms, an additional line is inserted as indication, but clocksource switching is disabled. So TSC served as main clocksource throughout the log capturing, and the system does suspend right after 20 minutes idle.
Comment 8 Harry Pan 2019-04-24 08:26:12 UTC
Hi You-Sheng,

This is Harry from Intel.

Would you mind to document (dump) firmware info through 'dmidecode'?

And I examined the log, I don't think the device entered PC10 before hitting TSC unstable, Linux tool 'turbostat' or MSR 0x632 read shall get the PKG_C10 residency count.


Cheers,
Harry
Comment 9 You-Sheng Yang 2019-04-25 06:24:13 UTC
Hi Harry Pan,

PC10 residency is observed in `powertop`, which is also used to set all the tunable  options to their good setting state to enable PC10.
Comment 10 Harry Pan 2019-04-25 07:22:02 UTC
Hi You-Sheng,

I suggest documenting all configurations to repro issue including firmware version, here is my BIOS [1].

I got a Dell sample internally, I played w/ it couple days including manually do suspend-to-idle and suspend-to-ram, while I could not see this issue (of TSC unstable), this DVT never goes to PC10, mostly I see CC7 and PC2 in the statistic.  

This first patch is equal to adding tsc=reliable in kernel command while I don't think it is the right approach because the root cause is unknown, eventually, it may happen again and then the kernel would mark TSC unstable again.

Alternately, would you mind to share some 'turbostat' dump? let's also perform triage to see whether or not it is really related to the PC10, I know the C-state profiling well, CC and PC states profiling are basically pulled from the MSR, this is in the same mechanism in various Linux tools including powertop and turbostat.


Cheers,
Harry

[1] dmidecode --quiet
BIOS Information
        Vendor: Dell Inc.
        Version: 0.5.1
        Release Date: 12/27/2018
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 32 MB
        Characteristics:
..snip...
        BIOS Revision: 0.5
Comment 11 Harry Pan 2019-04-25 07:34:55 UTC
Given you have the perfect setup to repro issue, you may also consider tweak intel_idle driver to limit the CC state (thru MWAIT hints) to indirectly limit PC states, see [1].
Combine this w/ turbostat dump, you may realize whether or not it takes effect; this tool is small and handy you could easily make it under kernel source pool.


[1] Documentation/admin-guide/kernel-parameters.txt
...
  intel_idle.max_cstate=  [KNL,HW,ACPI,X86]
      0 disables intel_idle and fall back on acpi_idle.
      1 to 9  specify maximum depth of C-state.
Comment 12 You-Sheng Yang 2019-04-25 10:08:08 UTC
Created attachment 282513 [details]
cpuid output from Dell Inspiron 15 7000
Comment 13 You-Sheng Yang 2019-04-25 10:10:47 UTC
Created attachment 282515 [details]
dmidecode output from Dell Inspiron 15 7000

BIOS Information
	Vendor: Dell Inc.
	Version: 0.7.8
	Release Date: 04/01/2019
	Address: 0xF0000
	Runtime Size: 64 kB
	ROM Size: 16 MB
        ...
        BIOS Revision: 0.7
Comment 14 You-Sheng Yang 2019-04-25 10:14:40 UTC
Created attachment 282517 [details]
turbostat output from Dell Inspiron 15 7000

$ cat turbostat.dell-inspiron-15-7000 |awk '{print $41}' | grep \.
Pk%pc10
0.00
0.00
...
Pk%pc10
0.00
0.00
Pk%pc10
35.53
35.53
Pk%pc10
0.00
0.00
Comment 15 You-Sheng Yang 2019-04-25 10:23:37 UTC
Created attachment 282519 [details]
tsc output from Dell Inspiron 15 7000

Hi Harry,

The DVT you have may not be able to enter PC10 by default configuration with the kernel version you installed. You may want to, as the first step, install powertop and run it with `powertop --auto-tune` the enable all the tunable items. This should help the most cases, but if that still doesn't work for you, you may want to install other kernel versions.

turbostat dump is also attached. The capturing was terminated after I saw tsc being marked unstable and PC10 became non-zero on powertop Idle Stats view.
Comment 16 Harry Pan 2019-04-26 03:30:37 UTC
Created attachment 282537 [details]
dmidecode of Harry's Dell 7440 DVT
Comment 17 Harry Pan 2019-04-26 03:33:04 UTC
Okay, I realized my problem is becasuse the firmware locks the PC state

i.e.,
root@u-Precision-7740:~# turbostat -D 2>&1 | grep MSR_PKG_CST_CONFIG_CONTROL
cpu4: MSR_PKG_CST_CONFIG_CONTROL: 0x1e008008 (UNdemote-C3, UNdemote-C1, demote-C3, demote-C1, locked, pkg-cstate-limit=8 (unlimited))

Hi You-Sheng,

Do you have Dell's firmware engineering contact that allows me to upgrade the BIOS (along w/ instructions) close to your testee?
Comment 18 You-Sheng Yang 2019-04-26 07:29:33 UTC
Hi Harry, I'll do as much as I can. Besides this, maybe you can provide some more details, e.g. cpuid/lspic -vvv, so that I may also find a similar sku to yours and try to reproduce this problem on it. And, maybe the easiest way so far would be borrow another device with Intel Coffee Lake instead.
Comment 19 Harry Pan 2019-05-06 06:41:25 UTC
Here is some update of my investigation during the weekend:

 - I am able to see TSC unstable right after PC10 residency increased
 - Hacking idle driver to honor PC8/9/10 IRTL [1] was unable to prevent the issue
 - Hacking idle driver to apply extra large exit_latency/target_residency (in 10000 us) on C10 was unable to prevent the issue 
 - Append "intel_idle.max_cstate=7" is helpful to mitigate the issue, I evaluated it 2+ days the clocksource remaining in TSC, while the max package C-state becomes PC8 against the PC10  

[1]
cpu1: MSR_PKGC8_IRTL: 0x000088fa (valid, 256000 ns)
cpu1: MSR_PKGC9_IRTL: 0x0000894c (valid, 339968 ns)
cpu1: MSR_PKGC10_IRTL: 0x00008bf2 (valid, 1034240 ns)

static struct cpuidle_state skl_cstates[] = {
...
        {
                .name = "C8",
                .desc = "MWAIT 0x40",
                .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
                .exit_latency = 200,
                .target_residency = 800,
                .enter = &intel_idle,
                .enter_s2idle = intel_idle_s2idle, },
        {
                .name = "C9",
                .desc = "MWAIT 0x50",
                .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
                .exit_latency = 480,
                .target_residency = 5000,
                .enter = &intel_idle,
                .enter_s2idle = intel_idle_s2idle, },
        {
                .name = "C10",
                .desc = "MWAIT 0x60",
                .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
                .exit_latency = 890,
                .target_residency = 5000,
                .enter = &intel_idle,
                .enter_s2idle = intel_idle_s2idle, },
...
};
Comment 20 Harry Pan 2019-05-08 03:04:03 UTC
Giving a short update that I've evaluated 'tsc=reliable' (equivalent at the #c2 patch to remove watchdog clocksource) as well.

So, short recap of current two cmdline workarounds
 - intel_idle.max_cstate=7
 - tsc=reliable

There shall be a trade-off in terms of pros and cons; given I have limited resource, I will keep diving into this timekeeping problem in the original setup, leave the workarounds evaluation for you to archive well collaboration.
Comment 21 Harry Pan 2019-05-15 06:35:48 UTC
By now the problem looks to us in HPET (w/ PC10 allowed) rather than TSC; there is timestamp delay when HPET exits from PC10.
Though out the existing dmesg(s) review we should be able to get this hint.

The theory is, HPET uses XTAL (24 MHz) as the main clock, while in PC10 the XTAL is off then its clock domain will offload to the RTC (32KHz).
We suspect the offload scale is not well handled in between fast/slow clock domains which cause the HPET timestamp delay after PC10 exit.

Also, following three exclusive kernel cmdline workarounds (w/o rebuild the kernel) indirectly prove (along w/ some hacking and ftrace) this theory, while non of them perfectly tackles the root cause; they have pros and cons.

- intel_idle.max_cstate=7: this indirectly disables PC10, keeps TSC and HPET, clocksource and watchdog are in low difference within 62.5ms.
- tsc=reliable: this disables Linux kernel clocksource watchdog
- hpet=disable: this makes kernel choose another existing clocksource as watchdog

Since it looks to me likely involving in the firmware domain, I've sent email to explain the above theory in detail to Dell team; hopefully, they can continue the triage and justify the responsibility; this is also respecting Dell's final sign-off in case of the solution in firmware.

As to my side, I am continuing digesting this timekeeping issue in kernel perspective.
Comment 22 Tom B 2019-05-15 10:00:46 UTC
I posted earlier about the same issue with a Threadripper 1950x, I found an alternative workaround but I'm not sure if it's any better than tsc=reliable.

After boot run


# echo hpet > /sys/devices/system/clocksource/clocksource0/unbind_clocksource
# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/unbind_clocksource

This disables hpet and acpi_pm. When the machine resumes from suspend, the clocksources now list jiffies and tsc, with tsc selected.

I've been using this for weeks now with no negative effects that I can see.  I'm not sure if it's relevant here but thought I'd present it as an alternative to try.
Comment 23 Harry Pan 2019-05-16 08:36:25 UTC
Created attachment 282785 [details]
Untrust the clocksource watchdog when its interval is too small

Hi You-Sheng,

May I have your assessment and comment in given patch targeting this timekeeping odd in general w/o hardwired knowledge?

Having said that, this is not fixing the root cause; I am unable to get positive feedback from Dell team to tackle the problematic timer in low power mode.
Comment 24 Harry Pan 2019-05-16 12:43:58 UTC
Some additional ftrace print to support my analysis of the #c32 patch:

In case the kernel selects a buggy timer as the clocksource watchdog (HPET in this case), once the problem happens and the validation timer generates delays, it causes a false alarm incorrectly marks the current main clocksource as unstable; the current mechanism (looks to me) is unable to validate the watchdog timer is unreliable or not (in the runtime) in case that the watchdog timestamps interval is less than the expected timer interval (0.5 sec) too much.

Below I hacked my buggy platform w/ some trace_printk in clocksource.c, the metric of data (diff/cs_nsec/wd_nsec) is in a unit of nanosecond.

<idle>-0     [007] ..s.  3600.314079: clocksource_watchdog: HP: CPU7 wd=hpet diff=16268753
...
<idle>-0     [008] ..s.  3600.794001: clocksource_watchdog: HP: CPU8 wd=hpet diff=107570455
<idle>-0     [008] ..s.  3600.794003: clocksource_watchdog: HP: timekeeping watchdog on CPU8: cs_nsec=479916845 wd_nsec=372346390 diff=107570455

My patch is not targeting on HPET only (as hardwired knowledge), but for any problematic timer been selected as a clocksource watchdog used to validate the main clocksource.
Comment 25 Harry Pan 2019-05-16 12:45:14 UTC
correct my typo: #c32 -> #c23
Comment 26 Harry Pan 2019-05-16 12:56:43 UTC
Hi Tom,

Would you mind to post your dmesg/syslog around these lines [1]?
I think the info is (potentially) helpful to justify the issue in your case. 


[1]
https://github.com/torvalds/linux/blob/master/kernel/time/clocksource.c#L239
...
https://github.com/torvalds/linux/blob/master/kernel/time/clocksource.c#L244
Comment 27 You-Sheng Yang 2019-05-16 13:27:39 UTC
Hi Harry,

Patch in #c23 looks reasonable for me. I'll have this applied tomorrow and verify if there could still be some corner cases. Thank you for the help.
Comment 28 Harry Pan 2019-05-16 13:46:12 UTC
Hi You-Sheng,

Thanks for the reply.

Here is my post on upstream:
https://lkml.org/lkml/2019/5/16/191
Let's collaborate online making the odd follow a good engineering shape.
Comment 29 Harry Pan 2019-05-17 08:18:36 UTC
Created attachment 282801 [details]
ftrace data and debug patch for HPET behavior

I fall back to watch the clocksource watchdog behavior (short as [1], long as enclosed) since I got no help to dive into the firmware.

From the data it looks to me the missing tick goes to the PC10 residency; namely PC10 residency plus the watchdog deviation is very closed to the interval.

I am yet able to understand how the off-load calibrator exactually working.
I am wondering whether it is not off-load'ed at all? or because of the calibrator is not set up?


[1]
<idle>-0     [003] d...    26.285337: cpu_idle: state=4294967295 cpu_id=3
<idle>-0     [003] ..s.    26.285381: clocksource_watchdog: HP: CPU3 wd=hpet diff=95611206 pc10=98850674
<idle>-0     [003] ..s.    26.285382: clocksource_watchdog: HP: timekeeping watchdog on CPU3: cs_nsec=479184013 wd_nsec=383572807 diff=95611206


[   26.262550] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
[   26.262575] clocksource:                       'hpet' wd_now: 2511b605 wd_last: 24853e12 mask: ffffffff
[   26.262579] clocksource:                       'tsc' cs_now: 15fbd8e8f4 cs_last: 15b74cab0a mask: ffffffffffffffff
[   26.262584] clocksource:                       'tsc' pc10_now: 168d4310 pc10_last: 8693e08 mask: ffffffffffffffff
Comment 30 Harry Pan 2019-05-18 14:20:29 UTC
re #23, #28: a little revise in v2:
https://lore.kernel.org/patchwork/patch/1075507/
Comment 31 Harry Pan 2019-05-23 03:03:10 UTC
Hi You-Sheng,

Please drop my patch, I've email discussion w/ my reviewer last weekend.
The main issue is the patch creates a circular dependency on the watchdog.


In another side, I setup an experiment on an old KBL machine years ago I worked for, I watched the system 5 seconds for runtime PC10 residency, the paired of timers (TSC and HPET) are behaved well no big tick lost.

i.e.,
localhost /sys/kernel/debug/tracing # turbostat --quiet --show Pkg%pc8,Pkg%pc9,Pkg%pc10 sleep 5; cat /sys/devices/system/clocksource/clocksource0/current_clocksource; grep -e 'HP:' trace | tail -3
5.003667 sec
Pkg%pc8	Pkg%pc9	Pk%pc10
55.74	1.41	36.03
55.74	1.41	36.03

tsc
          <idle>-0     [001] ..s2  3430.102153: clocksource_watchdog: HP: CPU1 wd=hpet diff=1255
          <idle>-0     [000] .Ns2  3430.601949: clocksource_watchdog: HP: CPU0 wd=hpet diff=1119
          <idle>-0     [001] ..s2  3431.102171: clocksource_watchdog: HP: CPU1 wd=hpet diff=1367
Comment 32 Andreas Kloeckner 2019-10-28 19:18:07 UTC
I have a Dell XPS 7390 2-in-1, an Ice Lake machine. ("Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz") It appears that this machine is affected by the same issue. I observe PC10 residency (enabled by PSR), and I see the same types of messages on Linux 5.4-rc5:

[Mo Okt 28 13:19:50 2019] clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as unstable because the skew is too large:
[Mo Okt 28 13:19:50 2019] clocksource:                       'hpet' wd_now: 4b66035f wd_last: 4aeff372 mask: ffffffff
[Mo Okt 28 13:19:50 2019] clocksource:                       'tsc' cs_now: 19ad4b7abf cs_last: 197f9f2b71 mask: ffffffffffffffff
[Mo Okt 28 13:19:50 2019] tsc: Marking TSC unstable due to clocksource watchdog
[Mo Okt 28 13:19:50 2019] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[Mo Okt 28 13:19:50 2019] sched_clock: Marking unstable (66257968652, 611404)<-(66273921888, -15341837)
[Mo Okt 28 13:19:50 2019] clocksource: Switched to clocksource hpet
Comment 33 Kai-Heng Feng 2019-10-31 08:06:19 UTC
Since there's no response from upstream maintainers, so I send the fix to Ubuntu kernel:
https://lists.ubuntu.com/archives/kernel-team/2019-October/105131.html
https://lists.ubuntu.com/archives/kernel-team/2019-October/105132.html
Comment 34 Alex 2019-11-14 17:09:29 UTC
I'm running into, what seems like, the same problem.
My CPU is Intel E-2136 Hexa-Core Coffee Lake.
Kernel: 4.18.0-80.11.2.el8_0.x86_64
OS: CentOS 8 

Here's a fragment of dmesg output:
clocksource: timekeeping watchdog on CPU7: Marking clocksource 'tsc' as unstable because the skew is too large:
clocksource:                       'hpet' wd_now: 54b0f25 wd_last: 4e9fe44 mask: ffffffff
clocksource:                       'tsc' cs_now: 3e0e2debe0 cs_last: 3daab92440 mask: ffffffffffffffff
tsc: Marking TSC unstable due to clocksource watchdog
TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
sched_clock: Marking unstable (4024083039, -219)<-(4075977403, -51894628)
clocksource: Switched to clocksource hpet

RTC and NTP times are correct, but system time is falling behind ~3 seconds per every minute
Comment 35 Alex 2019-11-14 17:15:18 UTC
Created attachment 285933 [details]
dmesg
Comment 36 Josh Freeno 2019-11-26 04:45:21 UTC
Created attachment 286059 [details]
Errors after hpet disabled on kernel 5.3.13

On kernel 5.3.13 on i7 9750H Coffee Lake I have a new error now that hpet is disabled. 

hpet_acpi_add: no address or irqs in _CRS

See attached journal boot log and  dmesg --level=err,warn

Operating System: Arch Linux 
KDE Plasma Version: 5.17.3
KDE Frameworks Version: 5.64.0
Qt Version: 5.13.2
Kernel Version: 5.3.13-arch1-1
OS Type: 64-bit
Processors: 12 × Intel® Core™ i7-9750H CPU @ 2.60GHz
Memory: 15.4 GiB of RAM
Comment 37 Kai-Heng Feng 2019-11-26 05:07:13 UTC
Josh Freeno,

Can you please describe what the error is?
Comment 38 Josh Freeno 2019-11-26 16:40:56 UTC
Maybe just a warning but others are gonna see this and believe it's a erros too if it isn't removed. I have gave you everything to see it.

hpet_acpi_add: no address or irqs in _CRS
Comment 39 Josh Freeno 2019-11-26 16:48:21 UTC
[josh@archkde ~]$ dmesg --level=err,warn                                  
[    0.452780] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[    0.492242]   #7  #8  #9 #10 #11
[    0.659629] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    1.186440] hpet_acpi_add: no address or irqs in _CRS
[    1.400341] nvme nvme0: missing or invalid SUBNQN field.
[    2.080883] [drm:lspcon_init [i915]] *ERROR* Failed to probe lspcon
[    2.080911] [drm:intel_ddi_init [i915]] *ERROR* LSPCON init failed on port D
[    2.120786] [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS
[    2.512675] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[    2.513107] i8042: Warning: Keylock active
[    2.528019] usb: port power management may be unreliable
[    3.106905] acpi PNP0C14:02: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[    3.106961] wmi_bus wmi_bus-PNP0C14:03: WQBC data block query control method not found
[    3.106962] acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[    3.126997] acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[    3.173537] acpi PNP0C14:05: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[    3.304600] i801_smbus 0000:00:1f.4: Accelerometer lis3lv02d is present on SMBus but its address is unknown, skipping registration
[    3.443119] i2c_hid i2c-DELL0926:00: i2c-DELL0926:00 supply vdd not found, using dummy regulator
[    3.443128] i2c_hid i2c-DELL0926:00: i2c-DELL0926:00 supply vddl not found, using dummy regulator
[    3.509268] ATPX version 1, functions 0x00000033
[    3.515789] CRAT table not found
[    3.778101] sd 1:0:0:0: [sda] Optimal transfer size 33553920 bytes not a multiple of physical block size (4096 bytes)
[    4.166418] thermal thermal_zone10: failed to read out thermal zone (-61)
[    4.386089] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[    4.399460] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[    4.611275] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[    4.624428] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[    5.089202] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[    5.592623] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[    7.484253] kauditd_printk_skb: 16 callbacks suppressed
[   13.162262] kauditd_printk_skb: 2 callbacks suppressed
[   13.374667] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[   13.385519] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[   16.632770] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[   17.132871] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[   18.416895] kauditd_printk_skb: 4 callbacks suppressed
[ 2237.382716] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2237.882807] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2238.463416] IRQ 168: no longer affine to CPU1
[ 2238.467363] IRQ 169: no longer affine to CPU2
[ 2238.471114] IRQ 170: no longer affine to CPU3
[ 2238.475129] IRQ 171: no longer affine to CPU4
[ 2238.479024] IRQ 172: no longer affine to CPU5
[ 2238.483000] IRQ 173: no longer affine to CPU6
[ 2238.487937] IRQ 174: no longer affine to CPU7
[ 2238.491610] IRQ 175: no longer affine to CPU8
[ 2238.495246] IRQ 176: no longer affine to CPU9
[ 2238.499253] IRQ 177: no longer affine to CPU10
[ 2239.012115] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2239.021987] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2240.051135] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2240.551318] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2241.313265] done.
[ 2241.627633] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2241.634287] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2241.826540] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2241.833916] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2244.823144] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2244.839761] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2288.566895] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2289.066887] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2290.195562] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2290.205070] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2291.231488] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2291.731598] [drm:hwss_edp_wait_for_hpd_ready [amdgpu]] *ERROR* hwss_edp_wait_for_hpd_ready: wait timed out!
[ 2292.470088] done.
[ 2292.768371] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2292.776361] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2292.971945] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2292.978692] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
[ 2296.011173] iwlwifi 0000:6e:00.0: FW already configured (0) - re-configuring
[ 2296.027585] iwlwifi 0000:6e:00.0: BIOS contains WGDS but no WRDS
Comment 40 mgutt 2021-06-12 09:40:11 UTC
An acquaintance of mine and I have the same problem after the Gigabyte C246N-WU2 reaches C10. He uses an i3-9350K and I use an i3-8100.

Example output:
Jun 10 09:33:02 nas kernel: clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
Jun 10 09:33:02 nas kernel: clocksource: 'hpet' wd_now: 92274da6 wd_last: 91c6f146 mask: ffffffff
Jun 10 09:33:02 nas kernel: clocksource: 'tsc' cs_now: aa0ab034c867 cs_last: aa0a3983f649 mask: ffffffffffffffff
Jun 10 09:33:02 nas kernel: tsc: Marking TSC unstable due to clocksource watchdog
Jun 10 09:33:02 nas kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
Jun 10 09:33:02 nas kernel: sched_clock: Marking unstable (46630688013441, 955547879)<-(46631645259416, -1687699)
Jun 10 09:33:02 nas kernel: clocksource: Switched to clocksource hpet

Discussion:
https://forums.unraid.net/topic/109990-kein-spindown-nach-hardwareaustausch-mit-aktiviertem-powertop/?do=findComment&comment=1004287

Although the system switched to hpet, the timer is completely wrong. It's 10x slower or so.

In addition I found out that the system isn't responsive anymore. Starting powertop to check the C-States is much slower compared to the system if it's in C9 state.
Comment 41 Thomas Gleixner 2021-06-19 20:36:40 UTC
On Sat, Jun 12 2021 at 09:40, bugzilla-daemon wrote:
> Example output:
> Jun 10 09:33:02 nas kernel: clocksource: timekeeping watchdog on CPU0:
> Marking
> clocksource 'tsc' as unstable because the skew is too large:
> Jun 10 09:33:02 nas kernel: clocksource: 'hpet' wd_now: 92274da6 wd_last:
> 91c6f146 mask: ffffffff
> Jun 10 09:33:02 nas kernel: clocksource: 'tsc' cs_now: aa0ab034c867 cs_last:
> aa0a3983f649 mask: ffffffffffffffff
> Jun 10 09:33:02 nas kernel: tsc: Marking TSC unstable due to clocksource
> watchdog
> Jun 10 09:33:02 nas kernel: TSC found unstable after boot, most likely due to
> broken BIOS. Use 'tsc=unstable'.
> Jun 10 09:33:02 nas kernel: sched_clock: Marking unstable (46630688013441,
> 955547879)<-(46631645259416, -1687699)
> Jun 10 09:33:02 nas kernel: clocksource: Switched to clocksource hpet

CoffeeLake has a broken HPET. Please add "hpet=disable" to the kernel
command line.

Which kernel version are you using?

Also please provide the output of lscpi -nn | grep Host

Thanks,

        tglx
Comment 42 yanghui 2021-08-20 09:08:49 UTC
If clocksource_watchdog can't be schedule in time when happend Softlockup. schedule time of clocksource_watchdog will longer than WATCHDOG_INTERVAL(0.5s).

So current clocksource will be set unstable misjudged.
Comment 43 Xavier Chantry 2021-10-16 17:14:48 UTC
I see a similar problem on a Lenovo P14s, Intel Comet Lake. tsc is regularly marked unstable, but I have no idea if it's related to PC10.

Should I open another ticket with more details, or add them here ? And what do you need exactly ?

It seems to happen on 5.13 and 5.14 kernels, but not on 5.10. Could there be a regression for that CPU ?

Thanks,
Xavier
Comment 44 mgutt 2021-10-17 21:17:52 UTC
Looks like it will be solved in the recent kernel:
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.15-rc5-x86
Comment 45 Xavier Chantry 2021-10-19 08:07:05 UTC
Good news !

But what do you think about that comment from kiffmet on phoronix article :

> Honestly, the changes to clocksource/timers introduced with kernel 5.13 or
> 5.14 are a mess. I did get constant TSC errors and failed recalibrations (3
> retries, then TSC gets disabled) via HPET due to the new WATCHDOG_MAX_SKEW
> (kernel/time/clocksource.c) setting of 50 being way too tight (the limit
> before was set to 100 - double of what it's now).

Is it possible that 2e27e793e280ff12cb5c202a1214c08b0d3a0f26 made the problem occur more often on 5.14 ?

On 23 users of that laptop, the 3 users having the bug are on 5.14, all others (or the majority of them) are on 5.10.

Because if that's the case, it would be a good idea to plan a fix for stable as well.

Note You need to log in before you can comment on or make changes to this bug.