Bug 196205

Summary: Misalignment of timings in perf (annotations)
Product: Tracing/Profiling
Reporter: ux (linux-foekeeng)
Component: Perf tool
Assignee: Arnaldo Carvalho de Melo (acme)
Status: NEW
Severity: normal
CC: jolsa
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.11.6
Subsystem:
Regression: No
Bisected commit-id:

Description ux 2017-06-27 17:13:54 UTC
Very often, the timing column in the perf annotation view looks misaligned with the instructions. The following is a very typical case:

       │    Disassembly of section .text:
       │
       │    0000000000f48a60 <ff_restore_rgb_planes_sse2>:
       │    ff_restore_rgb_planes_sse2():
       │      mov    0x8(%rsp),%rax
       │      mov    0x10(%rsp),%r10
       │      movslq %eax,%rax
       │      add    %rax,%rdi
       │      add    %rax,%rsi
       │      add    %rax,%rdx
       │      neg    %rax
       │      movdqa 0x1392b30,%xmm3
  0.06 │22:   mov    %rax,%r11
  2.07 │25:   movdqa (%rdi,%r11,1),%xmm0
 21.39 │      movdqa (%rsi,%r11,1),%xmm1
 28.50 │      movdqa (%rdx,%r11,1),%xmm2
 28.97 │      psubb  %xmm3,%xmm1
  3.09 │      paddb  %xmm1,%xmm0
  3.93 │      paddb  %xmm1,%xmm2
  4.02 │      movdqa %xmm0,(%rdi,%r11,1)
  4.37 │      movdqa %xmm2,(%rdx,%r11,1)
  3.26 │      add    $0x10,%r11
  0.03 │    ↑ jl     25
  0.06 │      add    %rcx,%rdi
  0.15 │      add    %r8,%rsi
       │      add    %r9,%rdx
  0.09 │      sub    $0x1,%r10d
       │    ↑ jg     22
       │      repz   retq

In this example, the three costly instructions are obviously the three movdqa loads, but for some reason the timing column is off by one: each high percentage is attributed to the instruction that follows the load.

This kind of misalignment appears all the time and is not specific to x86 (I see the same problem on arm and arm64).
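
To make the off-by-one concrete, here is a small Python sketch that re-attributes each percentage in the loop body to the previous instruction. The numbers are copied from the listing above. The helpers `shift_up` and `hottest` are hypothetical names used only for this illustration. This is not something perf does, just a way to check that the three hottest rows would land on the three movdqa loads if the column were shifted up by one.

```python
# (percent, instruction) pairs copied from the loop body of the listing above.
rows = [
    (0.06, "mov    %rax,%r11"),
    (2.07, "movdqa (%rdi,%r11,1),%xmm0"),
    (21.39, "movdqa (%rsi,%r11,1),%xmm1"),
    (28.50, "movdqa (%rdx,%r11,1),%xmm2"),
    (28.97, "psubb  %xmm3,%xmm1"),
    (3.09, "paddb  %xmm1,%xmm0"),
    (3.93, "paddb  %xmm1,%xmm2"),
    (4.02, "movdqa %xmm0,(%rdi,%r11,1)"),
    (4.37, "movdqa %xmm2,(%rdx,%r11,1)"),
    (3.26, "add    $0x10,%r11"),
    (0.03, "jl     25"),
]


def shift_up(rows):
    """Hypothetical helper: credit each percentage to the previous instruction.

    The first percentage is dropped and the last instruction gets 0.0,
    which is a simplification (the loop back-edge is ignored).
    """
    percents = [p for p, _ in rows]
    insns = [i for _, i in rows]
    return list(zip(percents[1:] + [0.0], insns))


def hottest(rows, n=3):
    """Return the n instructions with the highest percentage."""
    ranked = sorted(rows, key=lambda r: r[0], reverse=True)
    return [insn for _, insn in ranked[:n]]


if __name__ == "__main__":
    print("as reported:")
    for insn in hottest(rows):
        print("   ", insn)
    print("shifted up by one:")
    for insn in hottest(shift_up(rows)):
        print("   ", insn)
```

As reported, the hottest row is the psubb. After shifting, the three hottest rows are the three movdqa loads, which is the attribution I would expect.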

The above example was obtained with the following (ffmpeg.git/master @ 4ed7c2bbc3):

$ ./ffmpeg -lavfi testsrc2=hd1080:d=60 -c:v utvideo -pix_fmt rgb24 -y /tmp/testsrc2.avi
$ perf record ./ffmpeg_g -i /tmp/testsrc2.avi -f null -
$ perf report