Bug 216441 - [BISECTED] perf v5.19+ breaks `perf top -p` for multi-threaded processes
Status: ASSIGNED
Alias: None
Product: Tracing/Profiling
Classification: Unclassified
Component: Perf tool
Hardware: All Linux
Importance: P1 normal
Assignee: Arnaldo Carvalho de Melo
 
Reported: 2022-09-02 09:03 UTC by Echo J.
Modified: 2022-10-12 08:51 UTC
CC List: 3 users

See Also:
Kernel Version: 6.0-rc3
Tree: Mainline
Regression: Yes


Attachments
Proposed fix (3.72 KB, patch)
2022-09-03 09:50 UTC, Adrian Hunter

Description Echo J. 2022-09-02 09:03:42 UTC
Hello,

I was trying to figure out why `perf top -p` wasn't working on my Arch Linux system (it fails with "Failed to mmap with 22 (Invalid argument)"), and I eventually found that downgrading to perf v5.18 made `perf top -p` work again.

I then bisected the regression and found that ae4f8ae16a07896403c90305d4b9be27f657c1fc is the problematic commit.

I confirmed that the issue happens on the v5.18.16 and v5.19.4/v5.19.5 Arch kernels. The same error also occurs on a v6.0-rc2 mainline kernel, so it should occur on v6.0-rc3 as well.

Updating perf itself to v6.0-rc3 doesn't help either; the same error occurs.

To reproduce this bug, compile and install perf from at least the v5.19 kernel tag, then run `perf top -p` as a user with the PID of some multi-threaded process (web browsers are a good option).

You should get the "Failed to mmap with 22 (Invalid argument)" message on your screen.
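
If a web browser is too noisy a target, a small stand-in process may be easier to work with. The following is only a minimal sketch of such a target, assuming that ordinary Python threads (which are real OS threads on Linux) are enough to count as "multi-threaded" for this bug; I have not verified that it triggers the error. The names `spin`, `start_workers` and `run_target` are hypothetical helpers, not part of perf.

```python
import os
import threading
import time


def spin(stop):
    """Burn CPU until `stop` is set, so the thread has samples to show."""
    while not stop.is_set():
        sum(i * i for i in range(10_000))


def start_workers(count, stop):
    """Start `count` busy threads and return them."""
    workers = [
        threading.Thread(target=spin, args=(stop,), daemon=True)
        for _ in range(count)
    ]
    for worker in workers:
        worker.start()
    return workers


def run_target(seconds, count=4):
    """Keep `count` busy threads alive for `seconds`, printing the PID to profile."""
    stop = threading.Event()
    workers = start_workers(count, stop)
    print(f"PID {os.getpid()} is running {len(workers)} worker threads")
    try:
        time.sleep(seconds)
    finally:
        # Always stop the workers, even if the sleep is interrupted.
        stop.set()
        for worker in workers:
            worker.join()
```

Calling `run_target(60)` from a script keeps the process alive for a minute; the printed PID can then be passed to `perf top -p` from another terminal.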

This issue doesn't happen without the "-p" argument though

A few users reported this issue on both Intel and AMD CPUs (so it's not specific to one CPU vendor).

`perf record` works fine for the problematic processes

I just found out that `perf top -p` only breaks for multi-threaded processes (single-threaded ones are fine), and no, specifying a thread ID makes no difference.
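
To check whether a given target is multi-threaded before testing, one option is to count the entries under `/proc/<pid>/task`, which on Linux holds one directory per thread. This is a minimal sketch assuming a mounted procfs; `count_threads` is a hypothetical helper name, and it returns `None` when the process or `/proc` is not available.

```python
import os
from typing import Optional


def count_threads(pid: int) -> Optional[int]:
    """Return the number of threads of `pid`, or None if it cannot be read."""
    task_dir = f"/proc/{pid}/task"
    try:
        # Each thread of the process appears as one subdirectory.
        return len(os.listdir(task_dir))
    except OSError:
        # No such process, no procfs, or no permission to read it.
        return None
```

A result greater than 1 means the process has more than one thread, which is the case that fails here.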

As for my system, I currently use a Ryzen 5 4600H CPU on Arch Linux with kernel version 5.19.5.arch1.
Comment 1 Adrian Hunter 2022-09-02 11:59:35 UTC
On 2/09/22 12:03, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216441
> 
>             Bug ID: 216441
>            Summary: [BISECTED] perf v5.19+ breaks `perf top -p` for
>                     multi-threaded processes
>            Product: Tracing/Profiling
>            Version: unspecified
>     Kernel Version: 6.0-rc3
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Perf tool
>           Assignee: acme@kernel.org
>           Reporter: aidas957@gmail.com
>                 CC: adrian.hunter@intel.com, jolsa@kernel.org
>         Regression: Yes
> 
> Hello,
> 
> I was trying to figure out why "perf top -p" wasn't working on my Arch Linux
> system (returning the error "Failed to mmap with 22 (Invalid argument)") and I
> eventually found out that downgrading to perf v5.18 made "perf top -p" work
> again
> 
> Later I did some bisecting and found out
> ae4f8ae16a07896403c90305d4b9be27f657c1fc is the problematic commit

commit ae4f8ae16a07896403c90305d4b9be27f657c1fc
Author: Adrian Hunter <adrian.hunter@intel.com>
Date:   Tue May 24 10:54:31 2022 +0300

    libperf evlist: Allow mixing per-thread and per-cpu mmaps
    
    mmap_per_evsel() will skip events that do not match the CPU, so all CPUs
    can be iterated in any case.
    
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

> 
> I confirmed this issue happens on v5.18.16 and v5.19.4/v5.19.5 Arch kernels (I
> even tried a v6.0-rc2 mainline kernel and the same error occurs so it should
> occur on v6.0-rc3 as well)
> 
> I even tried updating to perf v6.0-rc3 and the same error occurs
> 
> To reproduce this bug, you should compile and install perf with at least the
> v5.19 kernel tag and run `perf top -p` as an user with the PID of some
> multi-threaded process (web browsers are a good option)
> 
> You should get the "Failed to mmap with 22 (Invalid argument)" message on your
> screen
> 
> This issue doesn't happen without the "-p" argument though
> 
> A few users reported this issue on both Intel and AMD CPUs (so it's not some
> CPU vendor-only bug)
> 
> `perf record` works fine for the problematic processes

In fact it does *not* work for multi-threaded targets with:

	perf record --per-thread -p

> 
> I just found out that "perf top -p" only breaks for multi-threaded processes
> (single-threaded ones are fine) and no, specifying a thread ID doesn't have any
> effect

I will see how best to fix it.
Comment 2 Arnaldo Carvalho de Melo 2022-09-02 19:19:56 UTC
If I do:

⬢[acme@toolbox perf-urgent]$ git log -2
commit dfeb0bc60782471c293938e71b1a1117cfac2cb3 (HEAD -> perf/urgent)
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Fri Sep 2 16:15:39 2022 -0300

    Revert "libperf evlist: Check nr_mmaps is correct"

    This reverts commit 4ce47d842d4c16c07b135b8a7975b8f0672bcc0e.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

commit 78cd283f6b8ab701cb35eafd5af8140560a88f16
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Fri Sep 2 16:13:41 2022 -0300

    Revert "libperf evlist: Allow mixing per-thread and per-cpu mmaps"

    This reverts commit ae4f8ae16a07896403c90305d4b9be27f657c1fc.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
⬢[acme@toolbox perf-urgent]$

It works again. Can the reporter please try this?
Comment 3 Echo J. 2022-09-02 23:24:04 UTC
(In reply to Arnaldo Carvalho de Melo from comment #2)
> If I do:
> 
> ... 
> 
> It works again, can the reporter please try this?

As expected, the issue is gone after reverting those two commits (the former revert is only needed to fix a build error, though).
Comment 4 Adrian Hunter 2022-09-03 09:50:15 UTC
Created attachment 301736
Proposed fix

This is the fix I have so far.
Comment 5 Echo J. 2022-09-03 10:04:05 UTC
(In reply to Adrian Hunter from comment #4)
> Created attachment 301736
> Proposed fix
> 
> This is the fix I have so far.

I just applied your fix from the LKML and it seems to work.

The LKML version was missing a commit hash, though, so I had to generate one from some random data.
Comment 6 Sahan Fernando 2022-10-12 08:51:30 UTC
(In reply to Adrian Hunter from comment #4)
> Created attachment 301736
> Proposed fix
> 
> This is the fix I have so far.

I can also confirm that your patch works.
