Bug 202677 - "perf report" is extremely slow without --no-online
Summary: "perf report" is extremely slow without --no-online
Status: RESOLVED CODE_FIX
Alias: None
Product: Tracing/Profiling
Classification: Unclassified
Component: Perf tool (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Arnaldo Carvalho de Melo
URL:
Keywords:
: 208009 (view as bug list)
Depends on:
Blocks:
 
Reported: 2019-02-25 15:27 UTC by Ralf
Modified: 2022-01-22 03:54 UTC (History)
7 users (show)

See Also:
Kernel Version: 4.19
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Ralf 2019-02-25 15:27:07 UTC
Doing "perf report" even on a moderately sized perf dump (say, ~3k events) often takes more than a minute. Doing higher-resolution perf'ing is basically completely out of the question. I thought this was inherent in the tool, but then I learned that passing "--no-online" fixes this problem, and makes "perf report" usable even for traces with >5k events.

This is a very bad default user experience; I would never have guessed that I can get such a speed-up by disabling an on-by-default option. It would be great you could, for example, print a warning when "perf report" is called on a >1k-event file, informing the user that they can speed up "perf report" by passing "--no-inline"?
Comment 1 Jiri Olsa 2019-02-26 22:57:25 UTC
(In reply to Ralf from comment #0)
> Doing "perf report" even on a moderately sized perf dump (say, ~3k events)
> often takes more than a minute. Doing higher-resolution perf'ing is
> basically completely out of the question. I thought this was inherent in the
> tool, but then I learned that passing "--no-online" fixes this problem, and
> makes "perf report" usable even for traces with >5k events.
> 
> This is a very bad default user experience; I would never have guessed that
> I can get such a speed-up by disabling an on-by-default option. It would be
> great you could, for example, print a warning when "perf report" is called
> on a >1k-event file, informing the user that they can speed up "perf report"
> by passing "--no-inline"?

could you please paste in here record/report examples
showing that kind of slow down?

thanks,
jirka
Comment 2 Janno 2019-02-27 10:50:22 UTC
I have had the same problem since March 2018. Here's my email to lkml:

https://lkml.org/lkml/2018/3/20/622

`git bisect` showed that d8a88dd243a170a226aba33e7c53704db2f82aa6 is the first bad commit. This is probably unsurprising given that (IIRC) this commit makes the opposite of `--no-inline` the default.

Note that this bug also affects `perf script` the same way.

> could you please paste in here record/report examples 
> showing that kind of slow down?

Are you looking for perf.data files or just invocations of `perf`? I don't expect the perf.data files to be portable but I uploaded a very tiny one (12MB) anyways. (Note that this is much, much smaller than the usual 1GB-2GB perf.data files I work with.)

http://people.mpi-sws.org/~janno/public/perf.data.tgz

I timed `perf script > /dev/null` on my local machine and it takes about 21s. `perf script --no-inline > /dev/null`, on the other hand, takes 0.3s.


If you are *not* looking for a perf.data file but a CLI invocation of `perf`, it may be a tiny bit tricky. Ralf and I are both using `perf` to profile the Coq proof assistant, which is written in OCaml. If you can install it easily (via your distribution's package manager, for example) I could provide an example input to the proof assistant that generates profile data triggering the slowdown.
Comment 3 Jiri Olsa 2019-03-01 14:25:26 UTC
(In reply to Janno from comment #2)
> I have had the same problem since March 2018. Here's my email to lkml:
> 
> https://lkml.org/lkml/2018/3/20/622
> 
> `git bisect` showed that d8a88dd243a170a226aba33e7c53704db2f82aa6 is the
> first bad commit. This is probably unsurprising given that (IIRC) this
> commit makes the opposite of `--no-inline` the default.
> 
> Note that this bug also affects `perf script` the same way.
> 
> > could you please paste in here record/report examples 
> > showing that kind of slow down?
> 
> Are you looking for perf.data files or just invocations of `perf`? I don't
> expect the perf.data files to be portable but I uploaded a very tiny one
> (12MB) anyways. (Note that this is much, much smaller than the usual 1GB-2GB
> perf.data files I work with.)
> 
> http://people.mpi-sws.org/~janno/public/perf.data.tgz
> 
> I timed `perf script > /dev/null` on my local machine and it takes about
> 21s. `perf script --no-inline > /dev/null`, on the other hand, takes 0.3s.

I got it, but I think to reproduce the issue I also need
'perf archive' data to get the workload binaries, so far
I dont see the slowdown

thanks,
jirka
Comment 4 Janno 2019-03-01 14:41:49 UTC
Here is the archive generated by `perf archive` on the same perf.data file:
http://people.mpi-sws.org/~janno/public/perf.data.tar.bz2

Thanks for looking into this!

Best,
Janno
Comment 5 Alexander 2021-08-19 20:11:32 UTC
I've been looking for a good profiler for Ubuntu. When I tried perf, it looked promising, but I skipped it because it took minutes, and sometimes hours, to load a recording! Eventually I came across "hotspot" and it worked really fast. When I figured that it uses "perf" under the hood, I was puzzled. Eventually I came across this Bug and indeed '--no-inline' helps a lot, making loading almost instant.

Indeed something needs to be done, because default behavior is very slow and it's very non-obvious that '--no-inline' helps.

The steps to reproduce are simple:
----------------------------------
1) Use Ubuntu, at least 20.04 and 21.04 are affected
2) Configure debug infos as described in:
   https://wiki.ubuntu.com/Debug%20Symbol%20Packages
3) Install debug infos for GTK:
   $ apt-get install libgtk-3-0-dbgsym
4) Make a recording:
   $ perf record -F max --call-graph dwarf -- gedit
   Record some 10 seconds of resizing window back and forth.
     [ perf record: Captured and wrote 180,963 MB perf.data (22436 samples) ]
5) Try to open the recording
   $ perf report --call-graph
   After 10 minutes of waiting, I gave up.
   For the entire duration, it prints tons of errors like
     addr2line: DWARF error: could not find variable specification at offset 29448
     addr2line: DWARF error: could not find variable specification at offset 29454
     addr2line: DWARF error: could not find variable specification at offset 2975a
     addr2line: DWARF error: could not find variable specification at offset 29a63
     addr2line: DWARF error: could not find variable specification at offset 29a57
     addr2line: DWARF error: could not find variable specification at offset 29a8e
     addr2line: DWARF error: could not find variable specification at offset 29a9a
     addr2line: DWARF error: could not find variable specification at offset 29aa6
     addr2line: DWARF error: could not find variable specification at offset 290a7
     addr2line: DWARF error: could not find variable specification at offset 20e4b
     addr2line: DWARF error: could not find variable specification at offset 29448
     addr2line: DWARF error: could not find variable specification at offset 29454
6) Try with --no-inline
   $ perf report --call-graph --no-inline
   This immediately shows the progressbar and only takes some 10 seconds to load.

Some analysis:
--------------
$ ps aux | grep addr2line
  addr2line -e /root/.debug/.build-id/07/70ad07862198d4a98e4c97626117aa669d2ddf/debug -i -f 00000000001a09a9

$ readlink /root/.debug/.build-id/07/70ad07862198d4a98e4c97626117aa669d2ddf
  ../../usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2404.21/0770ad07862198d4a98e4c97626117aa669d2ddf
Comment 6 Alexander 2021-08-19 20:19:43 UTC
$ addr2line -e /root/.debug/.build-id/07/70ad07862198d4a98e4c97626117aa669d2ddf/debug -i -f 00000000001a09a9

addr2line: DWARF error: could not find variable specification at offset 29448
addr2line: DWARF error: could not find variable specification at offset 29454
addr2line: DWARF error: could not find variable specification at offset 2975a
addr2line: DWARF error: could not find variable specification at offset 29a63
addr2line: DWARF error: could not find variable specification at offset 29a57
addr2line: DWARF error: could not find variable specification at offset 29a8e
addr2line: DWARF error: could not find variable specification at offset 29a9a
addr2line: DWARF error: could not find variable specification at offset 29aa6
addr2line: DWARF error: could not find variable specification at offset 290a7
addr2line: DWARF error: could not find variable specification at offset 20e4b
parse_selector_pseudo_class.part.0.lto_priv.0
:?
Comment 7 Alexander 2021-08-19 20:21:16 UTC
$ addr2line --version

GNU addr2line (GNU Binutils for Ubuntu) 2.36.1
Copyright (C) 2021 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
Comment 9 Ian Rogers 2022-01-22 03:54:49 UTC
*** Bug 208009 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.