Bug 219295

Summary: CLOCK_MONOTONIC_COARSE time moves backwards due to NTP adjustments
Product: Timers Reporter: Hari Krishna Sunder (hsunder)
Component: OtherAssignee: john stultz (john.stultz)
Status: NEW ---    
Severity: normal CC: hsunder
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: 5.14.0-427.33.1.el9_4.x86_64 Subsystem:
Regression: No Bisected commit-id:

Description Hari Krishna Sunder 2024-09-20 22:03:04 UTC
Overview:
The documentations states:
`All CLOCK_MONOTONIC variants guarantee that the time returned by consecutive calls will not go backwards, but successive calls may—depending on the architecture—return identical (not-increased) time values.`
https://man7.org/linux/man-pages/man2/clock_gettime.2.html


But I see that the time returned by consecutive calls to clock_gettime(CLOCK_MONOTONIC_COARSE, ...) moves backwards due to NTP adjustments by chrony. In my sample program, I can see that every occurance of the downward clock step has a log entry in chrony/tracking.log

On a multi-thread test program I was able to observe that all my threads were detecting the issue together so I do not think this has anything to do with any per CPU core caches, or threads being moved across the physical cores.


===
Machine info:
gcp VM
Machine type: n1-standard-8
CPU platform: Intel Broadwell

Operating System: AlmaLinux 8.10 (Cerulean Leopard)
CPE OS Name: cpe:/o:almalinux:almalinux:8::baseos
Kernel: Linux 4.18.0-553.5.1.el8_10.x86_64
Architecture: x86-64

$ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^* metadata.google.internal      2   3   377     0    -50us[  -52us] +/-  303us

===
Repro:

Sample program:

#include <ctime>
#include <iostream>
#include <string>
#include <time.h>

const std::string now() {
  time_t now = time(0);
  struct tm tstruct;
  tstruct = *localtime(&now);
  char buf[80];
  strftime(buf, sizeof(buf), "%Y-%m-%d %X ", &tstruct);
  return buf;
}

int main() {
  struct timespec prev_time, curr_time;

  // Get the initial timestamp
  clock_gettime(CLOCK_MONOTONIC_COARSE, &prev_time);
  std::cout << now() << "Start timestamp: " << prev_time.tv_sec << "."
            << prev_time.tv_nsec << std::endl;

  while (1) {
    // Get the current timestamp
    clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_time);

    // Check if the current timestamp is earlier than the previous one
    if (curr_time.tv_sec < prev_time.tv_sec ||
        (curr_time.tv_sec == prev_time.tv_sec &&
         curr_time.tv_nsec < prev_time.tv_nsec)) {
      std::cout << now()
                << "CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: "
                << prev_time.tv_sec << "." << prev_time.tv_nsec
                << ", Current timestamp: " << curr_time.tv_sec << "."
                << curr_time.tv_nsec << std::endl;
    }

    // Update the previous timestamp
    prev_time = curr_time;
  }

  return 0;
}

Output:
$ gcc coarse_time.cc -o coarse_time -lrt -lstdc++
$ ./coarse_time 
2024-09-20 20:49:29 Start timestamp: 250585.834298136
2024-09-20 21:00:02 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 251219.278437025, Current timestamp: 251219.278437024
2024-09-20 21:01:07 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 251283.786156290, Current timestamp: 251283.786156289
2024-09-20 21:05:09 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 251526.422668850, Current timestamp: 251526.422668849
2024-09-20 21:08:24 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 251720.513861339, Current timestamp: 251720.513861338
2024-09-20 21:28:13 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 252909.779272086, Current timestamp: 252909.779272085
2024-09-20 21:40:04 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 253621.475844894, Current timestamp: 253621.475844893
2024-09-20 21:43:03 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 253799.534616994, Current timestamp: 253799.534616993
2024-09-20 21:48:10 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 254107.282887909, Current timestamp: 254107.282887908
2024-09-20 21:51:24 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 254301.291093735, Current timestamp: 254301.291093734
2024-09-20 21:53:50 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 254446.764989951, Current timestamp: 254446.764989950
2024-09-20 21:55:03 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 254519.791467210, Current timestamp: 254519.791467209


$ sudo vi /var/log/chrony/tracking.log (cropped)
2024-09-20 21:48:10 169.254.169.254  3    -88.648      0.081 -2.246e-06 N  1  6.587e-06 -5.919e-07  3.364e-04  8.845e-05  2.666e-04
2024-09-20 21:51:24 169.254.169.254  3    -88.666      0.138 -2.015e-06 N  1  7.867e-06 -2.711e-06  3.579e-04  7.582e-05  2.669e-04
2024-09-20 21:53:50 169.254.169.254  3    -88.683      0.090 -1.820e-06 N  1  7.219e-06  1.080e-06  3.814e-04  7.133e-05  2.712e-04
2024-09-20 21:55:03 169.254.169.254  3    -88.703      0.094 -2.791e-06 N  1  7.353e-06  2.197e-06  3.814e-04  7.718e-05  2.794e-04
Comment 1 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-09-25 13:55:38 UTC
4.18.0-553.5.1.el8_10.x86_64 is a old and heavily modified kernel, you need to report the problem to your vendor.

https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/
Comment 2 Hari Krishna Sunder 2024-09-25 19:57:47 UTC
Filed a bug with AlmaLinux. Link for anyone else who might stumble upon this:
https://bugs.almalinux.org/view.php?id=481
Comment 3 Hari Krishna Sunder 2024-09-30 23:30:23 UTC
https://bugs.almalinux.org/view.php?id=481
Comment 4 Hari Krishna Sunder 2024-09-30 23:31:40 UTC
Same issue happens on 5.14.0-427.33.1.el9_4.x86_64 as well.

AlmaLinux team was able to hit this on 6.11 as well.
root@f41:~# uname -r; ./coarse_time
6.11.0
2024-09-28 15:42:44 Start timestamp: 27.942317575
2024-09-28 15:43:03 CLOCK_MONOTONIC_COARSE went backward! Previous timestamp: 47.459618893, Current timestamp: 47.459618881