Bug 10410

Summary: "x86: tsc prevent time going backwards" broke suspend
Product: ACPI
Reporter: Oleksij Rempel (fishor) (bug-track)
Component: Power-Sleep-Wake
Assignee: acpi_power-sleep-wake
Status: CLOSED CODE_FIX
Severity: normal
CC: bunk, rjw, tglx
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.25-rc8
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 9832
Attachments: dmesg
             Test patch which presets the tsc->cycle_last value on resume

Description Oleksij Rempel (fishor) 2008-04-07 02:08:04 UTC
The PC is unable to resume after suspend to RAM.

This commit causes the regression for me:

commit 47001d603375f857a7fab0e9c095d964a1ea0039
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Apr 1 19:45:18 2008 +0200

    x86: tsc prevent time going backwards
    
    We already catch most of the TSC problems by sanity checks, but there
    is a subtle bug which has been in the code for ever. This can cause
    time jumps in the range of hours.
    
    This was reported in:
         http://lkml.org/lkml/2007/8/23/96
    and
         http://lkml.org/lkml/2008/3/31/23
    
    I was able to reproduce the problem with a gettimeofday loop test on a
    dual core and a quad core machine which both have synchronized
    TSCs. The TSCs seem not to be perfectly in sync though, but the
    kernel is not able to detect the slight delta in the sync check. Still
    there exists an extremely small window where this delta can be observed
    with a real big time jump. So far I was only able to reproduce this
    with the vsyscall gettimeofday implementation, but in theory this
    might be observable with the syscall based version as well.
    
    CPU 0 updates the clock source variables under xtime/vsyscall lock and
    CPU1, where the TSC is slightly behind CPU0, is reading the time right
    after the seqlock was unlocked.
    
    The clocksource reference data was updated with the TSC from CPU0 and
    the value which is read from TSC on CPU1 is less than the reference
    data. This results in a huge delta value due to the unsigned
    subtraction of the TSC value and the reference value. This algorithm
    can not be changed due to the support of wrapping clock sources like
    pm timer.
    
    The huge delta is converted to nanoseconds and added to xtime, which
    is then observable by the caller. The next gettimeofday call on CPU1
    will show the correct time again as now the TSC has advanced above the
    reference value.
    
    To prevent this TSC specific wreckage we need to compare the TSC value
    against the reference value and return the latter when it is larger
    than the actual TSC value.
    
    I pondered to mark the TSC unstable when the readout is smaller than
    the reference value, but this would render an otherwise good and fast
    clocksource unusable without a real good reason.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
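
For readers following the commit message above, here is a minimal user-space sketch of the arithmetic it describes. It is not the kernel code: the helper names cycles_delta and tsc_read_clamped are hypothetical, and cycle_last stands in for the clocksource reference value. It only shows how an unsigned subtraction wraps when the TSC read is behind the reference, and how returning the reference value in that case yields a zero delta.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): the delta as an unsigned
 * subtraction. When now < cycle_last the result wraps modulo 2^64 and
 * becomes a huge positive number.
 */
static uint64_t cycles_delta(uint64_t now, uint64_t cycle_last)
{
    return now - cycle_last;
}

/*
 * Hypothetical helper (not a kernel function): the clamp described in
 * the commit message. If the raw TSC read is behind the reference
 * value, return the reference value instead.
 */
static uint64_t tsc_read_clamped(uint64_t raw_tsc, uint64_t cycle_last)
{
    return raw_tsc >= cycle_last ? raw_tsc : cycle_last;
}
```

With a reference of 1000 taken on CPU0 and a read of 990 on CPU1, cycles_delta(990, 1000) wraps to 2^64 - 10, while cycles_delta(tsc_read_clamped(990, 1000), 1000) is 0.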
Comment 1 Adrian Bunk 2008-04-07 13:36:50 UTC
Fixed by commit 5b13d863573e746739ccfc24ac1a9473cfee8df1
Comment 2 Oleksij Rempel (fishor) 2008-04-08 23:40:27 UTC
Created attachment 15686
dmesg

[    2.128371]   Magic number: 0:835:626
[    2.128371]   hash matches drivers/base/power/main.c:207
[    2.128482]   hash matches device 0000:00:1b.0
Comment 3 Thomas Gleixner 2008-04-09 01:28:00 UTC
Created attachment 15687
Test patch which presets the tsc->cycle_last value on resume

Alexey,

can you test the attached patch on top of Linus' latest?

Thanks,
       tglx
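
One way to picture why presetting cycle_last on resume could matter is the toy model below. It is a sketch under stated assumptions, not the attached patch: struct tsc_model and both functions are hypothetical, it assumes the TSC restarts from a small value after suspend to RAM, and it assumes "preset" means setting cycle_last to the current counter value. Under those assumptions a clamped read stays pinned at the stale cycle_last after resume, so the delta stays at zero until the counter catches up.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code. */
struct tsc_model {
    uint64_t counter;    /* stands in for the hardware TSC */
    uint64_t cycle_last; /* reference value kept by the timekeeping code */
};

/* Clamped read: never return a value below the reference. */
static uint64_t model_read(const struct tsc_model *m)
{
    return m->counter >= m->cycle_last ? m->counter : m->cycle_last;
}

/*
 * Assumption: the counter restarts from zero across suspend/resume.
 * With preset_cycle_last set, the reference is reset to the current
 * counter value on resume (an assumed reading of the attachment title).
 */
static void model_suspend_resume(struct tsc_model *m, int preset_cycle_last)
{
    m->counter = 0;
    if (preset_cycle_last)
        m->cycle_last = m->counter;
}
```

In this model, resuming without the preset and advancing the counter by 100 cycles still gives a delta of 0 against a pre-suspend cycle_last of 5000000; with the preset the same 100 cycles show up as a delta of 100.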
Comment 4 Oleksij Rempel (fishor) 2008-04-09 02:28:12 UTC
This patch works for me.
Thanks,
Alex
Comment 5 Thomas Gleixner 2008-04-09 22:50:49 UTC
> ------- Comment #4 from bug-track@fisher-privat.net  2008-04-09 02:28 -------
> This patch works for me.
> Thanks,
> Alex

Alex,

thanks for testing. I'll queue it for 2.6.26.

Thanks,
	tglx