Bug 203811 - perf record fails with "Invalid argument for event (cycles:u)"
Summary: perf record fails with "Invalid argument for event (cycles:u)"
Status: RESOLVED CODE_FIX
Alias: None
Product: Tracing/Profiling
Classification: Unclassified
Component: Perf tool (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: Arnaldo Carvalho de Melo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-06-05 00:01 UTC by Konstantin Kharlamov
Modified: 2019-07-13 12:21 UTC (History)
4 users (show)

See Also:
Kernel Version: 5.1.6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg 5.1.6 (88.44 KB, text/plain)
2019-06-05 00:01 UTC, Konstantin Kharlamov
Details
perf record -vv ls (7.29 KB, text/plain)
2019-06-20 19:50 UTC, Konstantin Kharlamov
Details
strace perf record ls (1.80 MB, text/plain)
2019-06-20 19:51 UTC, Konstantin Kharlamov
Details

Description Konstantin Kharlamov 2019-06-05 00:01:04 UTC
Created attachment 283107 [details]
dmesg 5.1.6

Since recently `perf record` stopped working for me, it fails with:

    $ perf record -g ls
    Error:
    The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles:u).
    /bin/dmesg | grep -i perf may provide additional information.

Dmesg prints nothing for running the command. To track down the cause I tried:

    1. Disabling "teo governor" I'm using
    2. Installing and running linux-lts 4.19.47
    3. Disabling cpu-microcode update by executing `sudo ln -s /dev/null /etc/tmpfiles.d/linux-firmware.conf`
    4. Installing an older 5.0.2 kernel (in case it's caused by a "minor" kernel update)
    5. Running perf inside Fedora 30 Live USB (it has 5.0.9 kernel)

…to no avail, perf always fails same way.

As I exhausted all simple explanations I could think of, I'm reporting it here.

# Additional information

Kernel: 5.1.6
Distro: Archlinux
Forum thread: https://bbs.archlinux.org/viewtopic.php?pid=1848603
Dmesg attached.
/proc/cpuinfo (the first paragraph as there's a bunch of text, and I can't seem to add multiple attachments simultaneously):

    processor       : 0
    vendor_id       : AuthenticAMD
    cpu family      : 18
    model           : 1
    model name      : AMD A8-3530MX APU with Radeon(tm) HD Graphics
    stepping        : 0
    microcode       : 0x3000027
    cpu MHz         : 884.768
    cache size      : 1024 KB
    physical id     : 0
    siblings        : 4
    core id         : 0
    cpu cores       : 4
    apicid          : 0
    initial apicid  : 0
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 6
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate vmmcall arat npt lbrv svm_lock nrip_save pausefilter
    bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2
    bogomips        : 3794.76
    TLB size        : 1536 4K pages
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 40 bits physical, 48 bits virtual
    power management: ts ttp tm stc 100mhzsteps hwpstate cpb
Comment 1 Quentin Monnet 2019-06-20 10:27:40 UTC
Hi, I experience a similar error.

In my case, it is a perf bug, and I could trace it to
[commit 4e8a5c155137](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4e8a5c155137) ("perf evsel: Fix max perf_event_attr.precise_ip detection"). 
It looks like `precise_ip` should be decremented when I try to run `perf record`, but it is not because `errno` is set to `EINVAL` instead of `ENOTSUP` and we break from the loop in `perf_event_open()`.

I can fix it with the following change to perf sources:

```
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a6f572a40deb..725c5c548504 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1793,7 +1793,7 @@ static int perf_event_open(struct perf_evsel *evsel,
                 *  - sys_perf_event_open failed with ENOTSUP error,
                 *    which is associated with wrong precise_ip
                 */
-               if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP))
+               if (!precise_ip || !evsel->precise_max)
                        break;
 
                /*
```

But as mentioned in the comment visible in that patch, `ENOTSUP` does seem to be related to a wrong value for `precise_ip` in most cases. I have not been able to figure out why we get `EINVAL` in this case.
Comment 2 Jiri Olsa 2019-06-20 18:41:05 UTC
I'm checking on that, could you attach output from:

  perf record -vv ...
  strace perf record ...

would be helpful,
thanks,
jirka
Comment 3 Konstantin Kharlamov 2019-06-20 19:50:48 UTC
Created attachment 283323 [details]
perf record -vv ls
Comment 4 Konstantin Kharlamov 2019-06-20 19:51:52 UTC
Created attachment 283325 [details]
strace perf record ls

(In reply to Jiri Olsa from comment #2)
> I'm checking on that, could you attach output from:
> 
>   perf record -vv ...
>   strace perf record ...
> 
> would be helpful,
> thanks,
> jirka

Sure, attached. Thanks!
Comment 5 Jiri Olsa 2019-06-23 22:16:27 UTC
> Since recently `perf record` stopped working for me, it fails with:

which recent kernel version worked for you?

the dmesg output suggests there's X86_FEATURE_PERFCTR_CORE disabled,
I'm not sure, if AMD pmu hadles that correctly.. also your description
suggest there was microcode update, I wonder that could break it :-\

I'll ask amd guys about that

jirka
Comment 6 Konstantin Kharlamov 2019-06-24 21:01:50 UTC
(In reply to Jiri Olsa from comment #5)
> > Since recently `perf record` stopped working for me, it fails with:
> 
> which recent kernel version worked for you?
> 
> the dmesg output suggests there's X86_FEATURE_PERFCTR_CORE disabled,
> I'm not sure, if AMD pmu hadles that correctly.. also your description
> suggest there was microcode update, I wonder that could break it :-\
> 
> I'll ask amd guys about that
> 
> jirka

Thanks. It was 5.0.x, but as I mentioned I tried installing 5.0.2 as well as 4.19.47; and I tried disabling microcode update — to no avail.
Comment 7 Kim Phillips 2019-06-26 17:12:22 UTC
(In reply to Jiri Olsa from comment #5)
> > Since recently `perf record` stopped working for me, it fails with:
> 
> the dmesg output suggests there's X86_FEATURE_PERFCTR_CORE disabled,
> I'm not sure, if AMD pmu hadles that correctly.. also your description
> suggest there was microcode update, I wonder that could break it :-\

Hi,

I don't have access to a family 12h (18 decimal) machine, but think I have reproduced the issue on a f17h machine by returning early in amd_core_pmu_init() as if the !boot_cpu_has(X86_FEATURE_PERFCTR_CORE) test succeeded.

I bisected the perf tool (always running on a v5.0+ kernel) to the problem and it indeed points to commit 4e8a5c155137 "perf evsel: Fix max perf_event_attr.precise_ip detection", like Quentin Monnet says.

From my analysis, one of the sources of the -EINVAL is the code added by commit cc6795aeffea "perf/core: Add PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs", and enabled by commit 2ff40250691e "perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs".

Based on the cover letter for that series:

https://patchwork.kernel.org/cover/10755815/

"Drivers that do not support any exclusion that previously reported -EPERM or -EOPNOTSUPP will now report -EINVAL - this is consistent with the majority and results in userspace perf retrying without exclusion."

Yet 4e8a5c155137 "perf evsel: Fix max perf_event_attr.precise_ip detection" paid no heed to that.

Changing it to also now ignore EINVAL, like so:

-	if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP))
+	if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP && errno != EINVAL))

makes my family 17h machine with its f12h-hack seemingly successfully complete the 'perf record' command, but it fails to respect the precision level coming from perf_ibs_precise_event(), which looks like it should normally accept a precise_ip of 2...
Comment 8 Jiri Olsa 2019-06-27 09:56:46 UTC
I'm still confused about the following perf record -vv output:

  ------------------------------------------------------------
  perf_event_attr:
    ...
    precise_ip                       3
    ...
  ------------------------------------------------------------
  sys_perf_event_open: pid 26057  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -95
  decreasing precise_ip by one (2)
  ------------------------------------------------------------
  perf_event_attr:
    ...
    precise_ip                       2
    ...
  ------------------------------------------------------------
  sys_perf_event_open: pid 26057  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off bpf_event
  ...

where kernel returns -EINVAL just because we decreased precise_ip by 1,

I can't see it being related to PERF_PMU_CAP_NO_EXCLUDE

I'll check on the AMD server with the hack you did to get similar
setup for family12 and let you know
Comment 9 Kim Phillips 2019-06-27 14:19:20 UTC
(In reply to Jiri Olsa from comment #8)
> I'm still confused about the following perf record -vv output:
> where kernel returns -EINVAL just because we decreased precise_ip by 1,
> I can't see it being related to PERF_PMU_CAP_NO_EXCLUDE

The only other place I saw return -EINVAL was in find_get_context, where tas_ctx_nr gets assigned perf_invalid_context - which is -1 - and so the ctxn < 0 succeeds, thereby returning the -EINVAL.  It's not clear to me what perf_invalid_context is supposed to be doing - I see some comments saying 'system-wide', and related to uncore PMCs, but making perf_ibs_{fetch,op} use perf_hw_context instead didn't make the problem go away.  I am aware that for some reason, since IBS support was added into the kernel, it instructed to use perf record with -a, which also makes this source of the -EINVAL go away.
Comment 10 Jiri Olsa 2019-07-01 07:27:21 UTC
(In reply to Kim Phillips from comment #9)
> (In reply to Jiri Olsa from comment #8)
> > I'm still confused about the following perf record -vv output:
> > where kernel returns -EINVAL just because we decreased precise_ip by 1,
> > I can't see it being related to PERF_PMU_CAP_NO_EXCLUDE
> 
> The only other place I saw return -EINVAL was in find_get_context, where
> tas_ctx_nr gets assigned perf_invalid_context - which is -1 - and so the
> ctxn < 0 succeeds, thereby returning the -EINVAL.  It's not clear to me what
> perf_invalid_context is supposed to be doing - I see some comments saying
> 'system-wide', and related to uncore PMCs, but making perf_ibs_{fetch,op}
> use perf_hw_context instead didn't make the problem go away.  I am aware
> that for some reason, since IBS support was added into the kernel, it
> instructed to use perf record with -a, which also makes this source of the
> -EINVAL go away.

yea, it's where it fails on my test system, the perf_ibs takes over
and event_init succeeds, and later it fails on context attach because
it's system wide only

perhaps we could skip the selective errno checking and try maximum
precise in any case.. it's separated fallback code, so should do
no harm, like:

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 04c4ed1573cb..5350db6ea1d4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1799,14 +1799,8 @@ static int perf_event_open(struct perf_evsel *evsel,
                if (fd >= 0)
                        break;
 
-               /*
-                * Do quick precise_ip fallback if:
-                *  - there is precise_ip set in perf_event_attr
-                *  - maximum precise is requested
-                *  - sys_perf_event_open failed with ENOTSUP error,
-                *    which is associated with wrong precise_ip
-                */
-               if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP))
+               /* Do not try less precise if not requested. */
+               if (!evsel->precise_max)
                        break;
 
                /*

it's in my perf/fixes branch in:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
Comment 11 Kim Phillips 2019-07-01 23:17:49 UTC
That does indeed get my family 12h hacked machine past the "sys_perf_event_open() syscall returned with 22" error...
Comment 12 Konstantin Kharlamov 2019-07-07 13:53:24 UTC
Thanks everyone, and sorry for not answering. I'm still on my way to test it. I've been trying to build Archlinux kernel and apply the patch, which took some effort, lots of time, a few full rebuilds (no idea why), and now I finally tested it, that didn't seem to work, but then I figured that Archlinux separated "perf" to another package, and now I will need somehow to merge source dir from one package to another package, etc… *sigh*

Anyway, just to let you know I'm still here :)
Comment 13 Konstantin Kharlamov 2019-07-13 12:20:35 UTC
Thanks you very much, it's fixed! I had to pass -Wno-error to build since otherwise 5.1 usbip package fails to build (apparently this patch from 2018 has to fix it, but never made to the kernel. I wonder though how the distro can build it)¹, but in the end it all works now.

1: https://patchwork.kernel.org/patch/10538575/

Note You need to log in before you can comment on or make changes to this bug.