Bug 216577 - trace-cmd stream dies with "Creating stream for 0" but record+report works
Summary: trace-cmd stream dies with "Creating stream for 0" but record+report works
Status: RESOLVED CODE_FIX
Alias: None
Product: Tools
Classification: Unclassified
Component: Trace-cmd/Kernelshark (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Default virtual assignee for Trace-cmd and kernelshark
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-13 08:16 UTC by Charlemagne Lasse
Modified: 2023-06-02 08:04 UTC (History)
1 user (show)

See Also:
Kernel Version: 5.18.16-1~bpo11+1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
trace-cmd stream: Close temp trace file to create options (624 bytes, patch)
2022-10-13 13:36 UTC, Steven Rostedt
Details | Diff

Description Charlemagne Lasse 2022-10-13 08:16:40 UTC
I have tried to use `trace-cmd stream -e qdisc:qdisc_dequeue` on OpenWrt 22.03 ipq40xx (kernel 5.10.138). But after the startup, It just died with the error "Creating stream for 0". So I've tried to switch to OpenWrt master with trace-cmd 3.1.3 - OpenWrt 22.03 still ships with  2.9.1. But it didn't change the behavior.


So I thought that it is a bug somewhere in OpenWrt and switched to my Debian amd64 system (bullseye with backported kernel 5.18.16-1~bpo11+1 but also trace-cmd 2.9.1). But the behavior was exactly the same. So I've build the current master version of trace-cmd and its dependencies. But it also didn't change the behavior.


So I was rather puzzled because this command is mentioned in various places and I am rather sure that it had it working (maybe 4 years ago or so). So I tried the non-"realtime" version by running:

trace-cmd record -e qdisc:qdisc_dequeue
trace-cmd report

And this seems to work fine on all systems I've tested.
Comment 1 Charlemagne Lasse 2022-10-13 09:21:39 UTC
I've tested with net:net_dev_xmit to have the same events which I've tested in the past. But this didn't change the behavior


After that, I've tested it on Debian stretch with kernel 4.9.228 and trace-cmd 2.6.0.1 and it worked fine. I have then installed the newest version of trace-cmd from git. And it immediately showed the same behavior. So I've started to bisect it. I've started with 2.9.1 because it didn't had external tracefs dependencies. It (of course) failed. Then I went back to 2.6.0.1 and tested if it is a good starting point - but there was not tag for it in the repository. So I've just used the commit 021710e1073. And it also worked fine. So a simple git bisect was good enough.

Unfortunately, I had a  lot of segfaults when i've tried it. If I use the segfaults as bad commit, then I end up with:

# bad: [530b1a0caef39466e16bbd49de5afef89656f03f] trace-cmd: Version 2.9.1
# good: [021710e1073fe203341b427cd1a4bac577ec899c] trace-cmd-v2.6.1
git bisect start 'trace-cmd-v2.9.1' 'HEAD'
# bad: [212ea2ae49058af11686f6ffd7738737a137a664] kernel-shark-qt: Do marker-deselect by right mouse click of the button
git bisect bad 212ea2ae49058af11686f6ffd7738737a137a664
# bad: [46112ea684c1948738a3382069d8ea2a998cf5a8] tools lib traceevent, perf tools: Rename traceevent_plugin_* APIs
git bisect bad 46112ea684c1948738a3382069d8ea2a998cf5a8
# good: [9cffb6796e379c38221545b9a15c5d031c909e0d] trace-cmd: Add option CPUCOUNT to buffer instance options
git bisect good 9cffb6796e379c38221545b9a15c5d031c909e0d
# good: [fc36abc4259749e17ba93395e8823cd9a950fb6c] trace-filter: Change the naming convention used in trace-filter-hash
git bisect good fc36abc4259749e17ba93395e8823cd9a950fb6c
# bad: [606a42ac4ae9645c0f41d3949e643c89c75ae864] kernel-shark-qt: Deal with the "no tasks" case in kshark_get_task_pids()
git bisect bad 606a42ac4ae9645c0f41d3949e643c89c75ae864
# bad: [5c561fd9f058439817176ea795fc570453f32a8e] kernel-shark-qt: Add an example showing how to load trace data
git bisect bad 5c561fd9f058439817176ea795fc570453f32a8e
# good: [ff551288621dd07e9a9da433cb5b9f539c025cb6] Change to SPDX License format
git bisect good ff551288621dd07e9a9da433cb5b9f539c025cb6
# bad: [d65db079eb3d0b528902bb16825dbe894c5807da] kernel-shark-qt: Add Cmake build system for the Qt based KernelShark
git bisect bad d65db079eb3d0b528902bb16825dbe894c5807da
# bad: [0ed686ef0e5b7106cfe3924007c06852c03bb78c] kernelshark: Have kernelshark depend on libtrace*.a files
git bisect bad 0ed686ef0e5b7106cfe3924007c06852c03bb78c
# bad: [62e82cc6cdc92234c861a5a73d96d126b61b797d] trace-cmd: Use lookup table instead of link list for pages
git bisect bad 62e82cc6cdc92234c861a5a73d96d126b61b797d
# first bad commit: [62e82cc6cdc92234c861a5a73d96d126b61b797d] trace-cmd: Use lookup table instead of link list for pages

But I doubt that the segfault is related. So I started again. If I count the segfaults as good, then I end up with:

# bad: [530b1a0caef39466e16bbd49de5afef89656f03f] trace-cmd: Version 2.9.1
# good: [62e82cc6cdc92234c861a5a73d96d126b61b797d] trace-cmd: Use lookup table instead of link list for pages
git bisect start 'trace-cmd-v2.9.1' '62e82cc6cdc92234c861a5a73d96d126b61b797d'
# good: [a9de6fa3e218ac1c0dd60c60e0f822cba5b5bcde] kernel-shark: Remove the definition of KS_CONF_DIR
git bisect good a9de6fa3e218ac1c0dd60c60e0f822cba5b5bcde
# good: [8790da96c26d2f06c73c8a54866ab44da0f2b26e] trace-cmd: Add VM tracing protocol messages
git bisect good 8790da96c26d2f06c73c8a54866ab44da0f2b26e
# good: [08b9d5076455c93c997376c8089ee12b9071785b] trace-cmd: Basic infrastructure for host - guest timestamp synchronization
git bisect good 08b9d5076455c93c997376c8089ee12b9071785b
# bad: [b3363aa572c02a371440072a8ff4586b2e926ab5] trace-cmd: Stop recording when processes traced with -P exit
git bisect bad b3363aa572c02a371440072a8ff4586b2e926ab5
# bad: [2119c1135ec105ab2d59fec8fc4624f7c7b8c6e4] trace-cmd: Fix setting triggers to all events from given system
git bisect bad 2119c1135ec105ab2d59fec8fc4624f7c7b8c6e4
# bad: [37c5dd6960b3a016d50b681c4d8f68b13f9be9a2] trace-cmd: Validate input parameters of tracecmd_get_guest_cpumap() API
git bisect bad 37c5dd6960b3a016d50b681c4d8f68b13f9be9a2
# good: [4908d03e6f24b552f8e4b966c94b7c26a4ed0d35] trace-cmd: Fix possible usage of not null-terminated string
git bisect good 4908d03e6f24b552f8e4b966c94b7c26a4ed0d35
# bad: [5d4d7ec3025595260e08f3c07b2cfb04acd23023] trace-cmd: Move reading of trace.dat options to tracecmd_read_headers()
git bisect bad 5d4d7ec3025595260e08f3c07b2cfb04acd23023
# good: [aef089e50c1f3197588c95c8c9597165d3a64bf5] trace-cmd: Fix reading of the traceid option from trace.dat file
git bisect good aef089e50c1f3197588c95c8c9597165d3a64bf5
# first bad commit: [5d4d7ec3025595260e08f3c07b2cfb04acd23023] trace-cmd: Move reading of trace.dat options to tracecmd_read_headers()


5d4d7ec3025595260e08f3c07b2cfb04acd23023 is the first bad commit
commit 5d4d7ec3025595260e08f3c07b2cfb04acd23023
Author: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Date:   Thu Apr 9 16:28:24 2020 +0300

    trace-cmd: Move reading of trace.dat options to tracecmd_read_headers()
    
    The options section from trace.dat file is logically part of trace headers.
    In some use cases it is useful first to read all headers from the file,
    analyze the information and do some processing, before reading the tracing
    data. In the current implementation, reading of options is just before
    reading the tracing data.
    Moved the reading of options and a CPU count from read_cpu_data() to
    tracecmd_read_headers(). This allows to implement APIs for trace.dat
    file reading on stages - headers stage and tracing data stage.
    
    Link: http://lore.kernel.org/linux-trace-devel/20200409132825.79475-2-tz.stoyanov@gmail.com
    
    Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

:040000 040000 279c8d17d7d4d59aef0e4664692b3b406e982871 a0ef9a49402afc1135fec25d8fa431c4d87cfa59 M      include
:040000 040000 06902a59ed34210cd292c918e1c421fdc2a06c67 9de6a3694cc2548a77ab27b295e8641c55a92f60 M      lib



And this commit makes a lot more sense as source of the problem which we are seeing with the stream command.
Comment 2 Charlemagne Lasse 2022-10-13 09:24:52 UTC
If I use version 2.9.1 and just revert the mentioned commit 5d4d7ec3025595260e08f3c07b2cfb04acd23023 then the stream command is working again
Comment 3 Steven Rostedt 2022-10-13 13:10:57 UTC
Thanks for reporting this. I'm able to reproduce it too. I see where the problem, is but it's not going to be easy to fix as it's caused by a slight design change, which means the fix is going to be another design change.

I'll see what I can do to fix it.
Comment 4 Steven Rostedt 2022-10-13 13:36:32 UTC
Created attachment 302991 [details]
trace-cmd stream: Close temp trace file to create options

Can you try this patch to see if it fixes the issue for you?
Comment 5 Charlemagne Lasse 2022-10-13 13:48:19 UTC
Yes, seems to work for me
Comment 6 Steven Rostedt 2023-06-02 08:04:28 UTC
Fixed by: 422d37425a086 ("trace-cmd stream: Close temp trace file to create")

Note You need to log in before you can comment on or make changes to this bug.