Bug 112521

Summary: monotonic sem_timedwait
Product: Timers
Reporter: Jay (bugzilla.kernel.org)
Component: Other
Assignee: john stultz (john.stultz)
Status: NEW
Severity: normal
CC: bique.alexandre, Dean_Jenkins, fweimer, grotlek, lorenzo.novara, tomi.kyostila, xihajuan2010
Priority: P1
Hardware: All
OS: Linux
Kernel Version: all
Subsystem:
Regression: No
Bisected commit-id:

Description Jay 2016-02-16 14:08:23 UTC
Currently there is no support for sem_timedwait() using CLOCK_MONOTONIC.
The wait call will be disrupted in various ways if the system clock is changed by another process.


A discussion on the glibc bugzilla reveals that the kernel does not support it. Would it be possible to add it?

https://sourceware.org/bugzilla/show_bug.cgi?id=14717
Comment 1 grotlek 2016-02-16 16:32:09 UTC
Okay, I'll try to make a slightly better description here than the original report, for those who don't want to have to wade through glibc-related things :-).

In the CRT, there are a few ways of waiting on a synchronisation object (condition variables, semaphores and mutexes) that time out if a specific time is reached before the state of the synchronisation object changes.

For pthread condition variables, you can choose the clock on which this timeout is based; for semaphores and mutexes you cannot. They use the "wall time" clock, which is subject to change.
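
A minimal sketch of the workaround this implies: because condition variables accept a clock attribute, a counting semaphore can be emulated with a mutex, a counter, and a condition variable bound to CLOCK_MONOTONIC through pthread_condattr_setclock(). The names mono_sem, mono_sem_init, mono_sem_post and mono_sem_timedwait_ms are hypothetical (they are not part of glibc or the kernel), and unlike sem_post() this emulation is not async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical counting semaphore whose timed wait follows CLOCK_MONOTONIC. */
struct mono_sem {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned int    count;
};

/* Returns 0 on success or an errno-style error number. */
int mono_sem_init(struct mono_sem *s, unsigned int value)
{
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);

    if (rc != 0)
        return rc;
    /* Select the clock that pthread_cond_timedwait() uses for this condvar. */
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0)
        rc = pthread_cond_init(&s->cond, &attr);
    pthread_condattr_destroy(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_init(&s->lock, NULL);
    if (rc != 0) {
        pthread_cond_destroy(&s->cond);
        return rc;
    }
    s->count = value;
    return 0;
}

void mono_sem_post(struct mono_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

/* Waits up to timeout_ms milliseconds. Returns 0 if the semaphore was
 * taken, ETIMEDOUT if the monotonic deadline passed first. */
int mono_sem_timedwait_ms(struct mono_sem *s, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    assert(timeout_ms >= 0);

    /* Absolute deadline on the same clock the condvar was created with. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->lock);
    while (s->count == 0 && rc == 0)
        rc = pthread_cond_timedwait(&s->cond, &s->lock, &deadline);
    if (s->count > 0) {
        /* A post may have arrived just as the timeout fired; take it. */
        s->count--;
        rc = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}
```

Because the deadline and the condition variable share CLOCK_MONOTONIC, setting the wall clock while a thread is blocked should not shorten or lengthen the wait.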

Apparently this requires the underlying kernel futex() calls to support the different system clocks, such as CLOCK_MONOTONIC or CLOCK_MONOTONIC_RAW rather than just the standard CLOCK_REALTIME.

The issue is that other processes (or the user) can change the value of the system clock while a process is waiting (or about to wait) on one of these synchronisation objects, so the wait may return early (if the clock is moved forwards) or late/never (if the clock is moved backwards).

An example case is given in the glibc bug report I filed, linked above.

By allowing the appropriate choice of clocks when the futex() call is made, this undesirable situation can be prevented.

(Myself and 2 other people have seen this issue occur, so it's a little more than hypothetical unfortunately).
Comment 2 Dean Jenkins 2016-02-16 17:41:46 UTC
In the olden days of a UNIX server there was 1 time source namely NTP or a systems administrator that set the system clock a few times a year.

In modern times in an embedded environment such as in a vehicle there can be multiple time sources such as:

a) 1970 EPOCH (initial system clock value)
b) GPS
c) NTP from Internet via Bluetooth
d) NTP from Internet via WiFi
e) Satellite Radio
f) RTC
g) end user (very inaccurate)

You will notice that the time sources are mainly RF based.

All these time sources fight with each other. Imagine the system clock being regularly updated over the course of an hour: it is going to jump backwards and forwards. I agree this looks like a poor design, but it is a scenario that needs a solution.

This makes sem_timedwait() fail because its timeout depends on the system clock, which is subject to arbitrary changes in time.

Therefore, in my opinion it would be better to make sem_timedwait() independent of the system clock altogether.

My recommendation is to allow CLOCK_MONOTONIC to be used with sem_timedwait() to avoid any disruption due to arbitrary changes of the system clock.
Comment 3 wuxiang zhou 2016-08-09 03:29:00 UTC
These days I'm also facing this timeout issue on my embedded device.
The scenario is as below:
-------
clock_gettime(CLOCK_REALTIME, &ts);
...
/* ts = ts + timeout */
sem_timedwait(&sem->sem, &ts);
-------
1. The system boots with the epoch time (1970-01-01).
2. After a while, the time is updated based on the received GPS signal.
3. If the time update happens after clock_gettime() but before sem_timedwait(), there is a time jump and the timeout fires immediately, which is not what we want. By my analysis, simply switching to clock_gettime(CLOCK_MONOTONIC, &ts) does not work either, because sem_timedwait() still interprets the deadline against CLOCK_REALTIME.

 
Apparently, in such a scenario it's better to use the monotonic time (or a relative time). But according to the POSIX standard, sem_timedwait() must use the system real time.
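
A rough sketch of the relative-time idea, using only calls that exist today: compute the deadline on CLOCK_MONOTONIC and poll the semaphore with sem_trywait(), sleeping briefly between attempts. The helper name sem_wait_monotonic_ms and the 1 ms polling step are assumptions made for illustration; polling costs wakeups and latency, so this is a stopgap rather than a substitute for real kernel or libc support.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical helper: wait on sem for at most timeout_ms milliseconds,
 * measured on CLOCK_MONOTONIC. Returns 0 on success, or -1 with errno set
 * (ETIMEDOUT when the deadline passes), mirroring sem_timedwait(). */
int sem_wait_monotonic_ms(sem_t *sem, long timeout_ms)
{
    const struct timespec poll_step = { 0, 1000000L }; /* 1 ms, arbitrary */
    struct timespec now, deadline;

    assert(timeout_ms >= 0);

    /* Deadline on the monotonic clock, so wall-clock jumps do not move it. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    for (;;) {
        if (sem_trywait(sem) == 0)
            return 0;
        if (errno != EAGAIN && errno != EINTR)
            return -1; /* genuine error, e.g. EINVAL */

        clock_gettime(CLOCK_MONOTONIC, &now);
        if (now.tv_sec > deadline.tv_sec ||
            (now.tv_sec == deadline.tv_sec &&
             now.tv_nsec >= deadline.tv_nsec)) {
            errno = ETIMEDOUT;
            return -1;
        }

        /* Relative sleep on the monotonic clock between attempts. */
        clock_nanosleep(CLOCK_MONOTONIC, 0, &poll_step, NULL);
    }
}
```

A caller would replace the clock_gettime()/sem_timedwait() pair with a single sem_wait_monotonic_ms(&sem->sem, timeout_in_ms) call, which also removes the window between reading the clock and starting the wait.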


I found that QNX already has a solution.
QNX has two functions to deal with the timeout:
1. sem_timedwait() uses CLOCK_REALTIME
2. sem_timedwait_monotonic() uses CLOCK_MONOTONIC

sem_timedwait() is POSIX 1003.1 SEM TMO; sem_timedwait_monotonic() is QNX Neutrino

http://www.qnx.co.jp/developers/docs/6.5.0/index.jsp?topic=%2Fcom.qnx.doc.neutrino_lib_ref%2Fs%2Fsem_timedwait.html

maybe we can refer to this and implement our own
----------------
Thanks
Comment 4 john stultz 2016-08-09 03:40:28 UTC
Might be better to propose a patch to lkml. Feature requests in this Bugzilla tend not to get much attention.
Comment 5 Alexandre BIQUE 2016-12-09 13:12:24 UTC
It would be awesome to be able to use sem_timedwait_np(sem, timeout, clock).

Don't give up guys! :)