Bug 217340 - "print" event is not considered an event by trace-cmd
Summary: "print" event is not considered an event by trace-cmd
Status: RESOLVED CODE_FIX
Alias: None
Product: Tools
Classification: Unclassified
Component: Trace-cmd/Kernelshark (show other bugs)
Hardware: All Linux
: P3 normal
Assignee: Default virtual assignee for Trace-cmd and kernelshark
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-04-14 18:30 UTC by Douglas RAILLARD
Modified: 2023-06-07 19:37 UTC (History)
1 user (show)

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Douglas RAILLARD 2023-04-14 18:30:31 UTC
"print" event is not considered an event by trace-cmd start, so it cannot be enabled or disabled. Combined with instances (-b instance_name), it breaks the instance abstraction as the top-level buffer will always receive the prints.

This leads to two issues:
* The buffer size is set only for the instance, and not for the top-level buffer. Yet, print events do fill the top-level buffer leading to missed events. This can be worked around by manually writing to the buffer_size_kb file but trace-cmd should allow setting the top-level buffer size as well, even when creating an instance.

* trace-cmd extract does not follow the documented behavior:

    *-B* 'buffer-name'::
      If the kernel supports multiple buffers, this will extract the trace for
      only the given buffer. It does not affect any other buffer. This may be
      used multiple times to specify different buffers. When this option is
      used, the top level instance will not be extracted unless *-t* is given.

Instead, it extracts the top-level buffer as well. I'm not sure if it's a deliberate decision given that some events always end up in top-level buffer or if it's an implementation bug but either way we have an issue:

* If the documented behavior was implemented, there would be no way of getting print events in the instance buffer.

* If trace-cmd extract also gets the top-level buffer, it breaks event isolation. The reason I was trying to use instance buffer was to have our tool collect only its own trace while unaware tools would use the top-level buffer. But this breaks since trace-cmd extract will consume all the top-level events intended for another tool.
Comment 1 Steven Rostedt 2023-04-14 19:32:50 UTC
Really the bug here is that trace-cmd extract -B touches the top level buffer (which is a bug). But I'm assuming that the "print" event is trace_printk events? (events/ftrace/print).

All events in events/ftrace/* are not true events. Hence, you will notice that they do not (and will never) have an "enable" file in their directories. That's because they are not created by the trace event infrastructure in the kernel, but instead are hardwired ring buffer events for other aspects of the kernel (like function and function_graph tracing). The "print" events are created by trace_printk(), which is an internal tracing function that does not take an instance descriptor. And note, trace_printk() is not to be in the kernel at all on any production system. They are for debugging purposes only.

But that said, there is also in the kernel trace_array_printk() that is allowed to be in a production kernel and acts just like trace_printk() but also takes an instance descriptor, which will end up in the instance and not the top level trace array.

The trace-cmd extract is a bug (so I'll leave this open), but the print event description and the title of this bugzilla is really "NOT A BUG".
Comment 2 Douglas RAILLARD 2023-04-14 19:56:22 UTC
I see, in that case I suppose that fixing trace-cmd extract will solve all issues:
* better isolation for concurrent tools
* no more trace with lost events. Even if the events are actually lost in the top-level buffer, it will not matter as they will not appear in the trace.

That does mean trace_printk() output will become harder to get, but I think our tooling can simply add the "-t" in case some pseudo-events are asked for, and hope that no other tool tries to use ftrace in that scenario. I'll checkout that trace_array_printk() but otherwise it should be relatively easy to make a "plain event" version of trace_printk() with a __data_loc field or something like that.

The issue we hit was from the simpleperf tool from Android that was dumping a bunch of info (~700 events) in trace_marker (so "print", not trace_printk()'s "bprint"). Unfortunately, the top-level buffer default size was ridiculously small for whatever reason, leading to lost events. However, these "print" events were not really asked for by the user and cannot be disabled (neither in the tool apparently nor on kernel side). So we end up with traces stating that some events were lost, and there is no robust way of telling if it's useful data (e.g. sched_switch) or the simpleperf dump. 

Given the risk of processing a trace with lost events (silent wrong result), LISA plainly reject such traces, so the user can reconfigure appropriate buffer size and re-run the experiment.
Comment 3 Steven Rostedt 2023-04-14 20:12:37 UTC
Note, if you can get the android apps to write into the instances trace_marker files, then those prints will be in those instance buffers.
Comment 4 Douglas RAILLARD 2023-04-14 20:16:31 UTC
Ah that's interesting, I didn't notice buffers each had their trace_marker thanks. We probably cannot easily change that on simpleperf but there is another case were we do that ourselves where it would help, thanks.
Comment 5 Douglas RAILLARD 2023-04-17 12:19:49 UTC
I just tried the following to silence simpleperf:

* symlink/hardlink the top-level trace_marker to the instance's trace_marker: impossible since the top-level trace_marker file cannot be removed

* Remove write permission to the file: you can set chmod 000, but that is completely ignored by ftrace. The file is still in practice writeable and the data appear in the trace.
Comment 6 Steven Rostedt 2023-06-01 04:01:48 UTC
I now see the other issue you reported here, which is a bug in trace-cmd. I was too distracted by the discussion around the "print" event.

  trace-cmd extract -B foo 

Should NOT affect the top level trace buffer, but I now see that it does.

That needs to be fixed.

Thanks!
Comment 7 Douglas RAILLARD 2023-06-01 09:26:29 UTC
While at it I realized the other day there is another implication of "print" not being a normal event: `trace-cmd record` cannot just collect "print", as it will complain no event is specified, but will also complain about "print" being specified if you try to ask for it.

It's probably just a matter of either accepting "print" and ignoring it later on or removing the check for no event being asked for. It's also fine that you cannot ask for it in isolation of e.g. bprint.

I'm happy to open another issue if you prefer to track it as a separate item.
Comment 8 Steven Rostedt 2023-06-07 19:37:34 UTC
Fixed by https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/commit/?id=9c9d5ed97fcd7e5d1308c4f589d01949c8a1b007

("trace-cmd extract: Do not extract top level unless told to")

Note You need to log in before you can comment on or make changes to this bug.