Doing "perf report" even on a moderately sized perf dump (say, ~3k events) often takes more than a minute. Doing higher-resolution perf'ing is basically completely out of the question. I thought this was inherent in the tool, but then I learned that passing "--no-online" fixes this problem, and makes "perf report" usable even for traces with >5k events. This is a very bad default user experience; I would never have guessed that I can get such a speed-up by disabling an on-by-default option. It would be great you could, for example, print a warning when "perf report" is called on a >1k-event file, informing the user that they can speed up "perf report" by passing "--no-inline"?
(In reply to Ralf from comment #0) > Doing "perf report" even on a moderately sized perf dump (say, ~3k events) > often takes more than a minute. Doing higher-resolution perf'ing is > basically completely out of the question. I thought this was inherent in the > tool, but then I learned that passing "--no-online" fixes this problem, and > makes "perf report" usable even for traces with >5k events. > > This is a very bad default user experience; I would never have guessed that > I can get such a speed-up by disabling an on-by-default option. It would be > great you could, for example, print a warning when "perf report" is called > on a >1k-event file, informing the user that they can speed up "perf report" > by passing "--no-inline"? could you please paste in here record/report examples showing that kind of slow down? thanks, jirka
I have had the same problem since March 2018. Here's my email to lkml: https://lkml.org/lkml/2018/3/20/622 `git bisect` showed that d8a88dd243a170a226aba33e7c53704db2f82aa6 is the first bad commit. This is probably unsurprising given that (IIRC) this commit makes the opposite of `--no-inline` the default. Note that this bug also affects `perf script` the same way. > could you please paste in here record/report examples > showing that kind of slow down? Are you looking for perf.data files or just invocations of `perf`? I don't expect the perf.data files to be portable but I uploaded a very tiny one (12MB) anyways. (Note that this is much, much smaller than the usual 1GB-2GB perf.data files I work with.) http://people.mpi-sws.org/~janno/public/perf.data.tgz I timed `perf script > /dev/null` on my local machine and it takes about 21s. `perf script --no-inline > /dev/null`, on the other hand, takes 0.3s. If you are *not* looking for a perf.data file but a CLI invocation of `perf`, it may be a tiny bit tricky. Ralf and I are both using `perf` to profile the Coq proof assistant, which is written in OCaml. If you can install it easily (via your distribution's package manager, for example) I could provide an example input to the proof assistant that generates profile data triggering the slowdown.
(In reply to Janno from comment #2) > I have had the same problem since March 2018. Here's my email to lkml: > > https://lkml.org/lkml/2018/3/20/622 > > `git bisect` showed that d8a88dd243a170a226aba33e7c53704db2f82aa6 is the > first bad commit. This is probably unsurprising given that (IIRC) this > commit makes the opposite of `--no-inline` the default. > > Note that this bug also affects `perf script` the same way. > > > could you please paste in here record/report examples > > showing that kind of slow down? > > Are you looking for perf.data files or just invocations of `perf`? I don't > expect the perf.data files to be portable but I uploaded a very tiny one > (12MB) anyways. (Note that this is much, much smaller than the usual 1GB-2GB > perf.data files I work with.) > > http://people.mpi-sws.org/~janno/public/perf.data.tgz > > I timed `perf script > /dev/null` on my local machine and it takes about > 21s. `perf script --no-inline > /dev/null`, on the other hand, takes 0.3s. I got it, but I think to reproduce the issue I also need 'perf archive' data to get the workload binaries, so far I dont see the slowdown thanks, jirka
Here is the archive generated by `perf archive` on the same perf.data file: http://people.mpi-sws.org/~janno/public/perf.data.tar.bz2 Thanks for looking into this! Best, Janno
I've been looking for a good profiler for Ubuntu. When I tried perf, it looked promising, but I skipped it because it took minutes, and sometimes hours, to load a recording! Eventually I came across "hotspot" and it worked really fast. When I figured that it uses "perf" under the hood, I was puzzled. Eventually I came across this Bug and indeed '--no-inline' helps a lot, making loading almost instant. Indeed something needs to be done, because default behavior is very slow and it's very non-obvious that '--no-inline' helps. The steps to reproduce are simple: ---------------------------------- 1) Use Ubuntu, at least 20.04 and 21.04 are affected 2) Configure debug infos as described in: https://wiki.ubuntu.com/Debug%20Symbol%20Packages 3) Install debug infos for GTK: $ apt-get install libgtk-3-0-dbgsym 4) Make a recording: $ perf record -F max --call-graph dwarf -- gedit Record some 10 seconds of resizing window back and forth. [ perf record: Captured and wrote 180,963 MB perf.data (22436 samples) ] 5) Try to open the recording $ perf report --call-graph After 10 minutes of waiting, I gave up. For the entire duration, it prints tons of errors like addr2line: DWARF error: could not find variable specification at offset 29448 addr2line: DWARF error: could not find variable specification at offset 29454 addr2line: DWARF error: could not find variable specification at offset 2975a addr2line: DWARF error: could not find variable specification at offset 29a63 addr2line: DWARF error: could not find variable specification at offset 29a57 addr2line: DWARF error: could not find variable specification at offset 29a8e addr2line: DWARF error: could not find variable specification at offset 29a9a addr2line: DWARF error: could not find variable specification at offset 29aa6 addr2line: DWARF error: could not find variable specification at offset 290a7 addr2line: DWARF error: could not find variable specification at offset 20e4b addr2line: DWARF error: could not find variable specification at offset 29448 addr2line: DWARF error: could not find variable specification at offset 29454 6) Try with --no-inline $ perf report --call-graph --no-inline This immediately shows the progressbar and only takes some 10 seconds to load. Some analysis: -------------- $ ps aux | grep addr2line addr2line -e /root/.debug/.build-id/07/70ad07862198d4a98e4c97626117aa669d2ddf/debug -i -f 00000000001a09a9 $ readlink /root/.debug/.build-id/07/70ad07862198d4a98e4c97626117aa669d2ddf ../../usr/lib/x86_64-linux-gnu/libgtk-3.so.0.2404.21/0770ad07862198d4a98e4c97626117aa669d2ddf
$ addr2line -e /root/.debug/.build-id/07/70ad07862198d4a98e4c97626117aa669d2ddf/debug -i -f 00000000001a09a9 addr2line: DWARF error: could not find variable specification at offset 29448 addr2line: DWARF error: could not find variable specification at offset 29454 addr2line: DWARF error: could not find variable specification at offset 2975a addr2line: DWARF error: could not find variable specification at offset 29a63 addr2line: DWARF error: could not find variable specification at offset 29a57 addr2line: DWARF error: could not find variable specification at offset 29a8e addr2line: DWARF error: could not find variable specification at offset 29a9a addr2line: DWARF error: could not find variable specification at offset 29aa6 addr2line: DWARF error: could not find variable specification at offset 290a7 addr2line: DWARF error: could not find variable specification at offset 20e4b parse_selector_pseudo_class.part.0.lto_priv.0 :?
$ addr2line --version GNU addr2line (GNU Binutils for Ubuntu) 2.36.1 Copyright (C) 2021 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or (at your option) any later version. This program has absolutely no warranty.
Should be fixed by https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=be8ecc57f180415e8a7c1cc5620c5236be2a7e56 - see the write up at https://eighty-twenty.org/2021/09/09/perf-addr2line-speed-improvement
*** Bug 208009 has been marked as a duplicate of this bug. ***