Bug 217038 - Latest trace-cmd-report seems to ignores filter (-F) parameter
Summary: Latest trace-cmd-report seems to ignores filter (-F) parameter
Status: RESOLVED CODE_FIX
Alias: None
Product: Tools
Classification: Unclassified
Component: Trace-cmd/Kernelshark (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Default virtual assignee for Trace-cmd and kernelshark
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-14 22:09 UTC by Gabriel Krisman Bertazi
Modified: 2023-05-31 12:45 UTC (History)
1 user (show)

See Also:
Kernel Version: 5.14.21-150400.24.38-default
Subsystem:
Regression: No
Bisected commit-id:


Attachments
patch on the ML. (1.78 KB, application/mbox)
2023-03-17 21:26 UTC, Gabriel Krisman Bertazi
Details

Description Gabriel Krisman Bertazi 2023-02-14 22:09:46 UTC
I was investigating why two analysis of the same trace data yields very different results and realized it came down to the version of trace-cmd in opensuse leap (2.6.1) and Tumbleweed (3.x.x).

The issue came down to the filter function (-F parameter) apparently not filtering correctly in 3.x.x.

I first recorded a trace with 2.6.1:

# trace-cmd record -T -e 'kmem:mm_page_alloc'

Then, with trace-cmd 3.1.6:

# trace-cmd report -i trace.dat_2.6.1 -F mm_page_alloc -v -F kernel_stack | head
cpus=16
       trace-cmd-13273 [010] 320791.423556: mm_page_alloc:        page=0xffffffffad8d6ba0 pfn=0x2bf090 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-13273 [010] 320791.423565: kernel_stack:               => trace_event_raw_event_mm_page_alloc
        => __alloc_pages
        => alloc_pages_vma
        => wp_page_copy
        => __handle_mm_fault
        => handle_mm_fault
        => do_user_addr_fault
        => exc_page_fault
       trace-cmd-13273 [010] 320791.423580: mm_page_alloc:        page=0xffffffffad73e270 pfn=0x126760 order=2 migratetype=0 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO
       trace-cmd-13273 [010] 320791.423587: kernel_stack:               => trace_event_raw_event_mm_page_alloc
        => __alloc_pages
        => kmalloc_large_node
        => __kmalloc_node
        => kvmalloc_node
        => tracing_buffers_open
        => do_dentry_open
        => path_openat
       trace-cmd-13274 [004] 320791.423628: mm_page_alloc:        page=0xffffffffad71b9d8 pfn=0x103ec8 order=2 migratetype=0 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO


In  2.6.1:

# trace-cmd report -i trace.dat_2.6.1 -F mm_page_alloc -v -F kernel_stack | head
  Error: expected type 4 but read 5
cpus=16
       trace-cmd-13273 [010] 320791.423556: mm_page_alloc:        page=0x2bf090 pfn=0x2bf090 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-13273 [010] 320791.423580: mm_page_alloc:        page=0x126760 pfn=0x126760 order=2 migratetype=0 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO
       trace-cmd-13274 [004] 320791.423628: mm_page_alloc:        page=0x103ec8 pfn=0x103ec8 order=2 migratetype=0 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO
           tmux:-3067  [011] 320791.424302: mm_page_alloc:        page=0x1357e8 pfn=0x1357e8 order=0 migratetype=0 gfp_flags=GFP_KERNEL
           tmux:-3067  [011] 320791.424360: mm_page_alloc:        page=0x1357e8 pfn=0x1357e8 order=0 migratetype=0 gfp_flags=GFP_KERNEL
           tmux:-3067  [011] 320791.424419: mm_page_alloc:        page=0x1357e8 pfn=0x1357e8 order=0 migratetype=0 gfp_flags=GFP_KERNEL
     gnome-shell-1972  [008] 320791.426669: mm_page_alloc:        page=0x105fb0


Notice how the kernel_stack event gets filtered out correctly in 2.6.1, but not in 3.1.6.

I reproduced it with the latest libtraceevent, libtracefs and trace-cmd from source:

 libtraceevent: 4e51588b1f4d ("libtraceevent: version 1.7.1")
    libtracefs: af49d832f466 ("libtracefs: version 1.6.4")
     trace-cmd: 0a68daed9f58 ("trace-cmd: Version 3.1.6")


I thought it might be due to the version of the recorded file, but it
doesn't seem to matter.  Even if I record with 3.1.6, I can never filter
events with a newer trace-cmd.  In this case, I can't get any
events reported with 2.6.1, but I guess that is expected, as the format
changed.
Comment 1 Steven Rostedt 2023-02-15 00:15:57 UTC
Strange? I just tried it out (with the latest of all the libraries too)

# trace-cmd report -F mm_page_alloc -v -F kernel_stack
cpus=4
       trace-cmd-2383  [000]  7192.837686: mm_page_alloc:        page=0xffffffff936e3208 pfn=0x25658 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837698: mm_page_alloc:        page=0xffffffff936ec67f pfn=0x2eacf order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837735: mm_page_alloc:        page=0xffffffff936ecb10 pfn=0x2ef60 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837754: mm_page_alloc:        page=0xffffffff936ecc05 pfn=0x2f055 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837759: mm_page_alloc:        page=0xffffffff936ecb2e pfn=0x2ef7e order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837763: mm_page_alloc:        page=0xffffffff936ecb2f pfn=0x2ef7f order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837768: mm_page_alloc:        page=0xffffffff936ec723 pfn=0x2eb73 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837782: mm_page_alloc:        page=0xffffffff936c3110 pfn=0x5560 order=2 migratetype=0 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO
       trace-cmd-2386  [002]  7192.837794: mm_page_alloc:        page=0xffffffff936e1f2a pfn=0x2437a order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837804: mm_page_alloc:        page=0xffffffff936ecc6a pfn=0x2f0ba order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2387  [001]  7192.837816: mm_page_alloc:        page=0xffffffff936f894c pfn=0x3ad9c order=2 migratetype=0 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO
       trace-cmd-2387  [001]  7192.837847: mm_page_alloc:        page=0xffffffff936ecfc3 pfn=0x2f413 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837848: mm_page_alloc:        page=0xffffffff936ec5ca pfn=0x2ea1a order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2387  [001]  7192.837859: mm_page_alloc:        page=0xffffffff936e00f6 pfn=0x22546 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837863: mm_page_alloc:        page=0xffffffff936e77a8 pfn=0x29bf8 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2387  [001]  7192.837904: mm_page_alloc:        page=0xffffffff936e3629 pfn=0x25a79 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
          <idle>-0     [001]  7193.634294: mm_page_alloc:        page=0xffffffff936dfa18 pfn=0x21e68 order=3 migratetype=0 gfp_flags=GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP
     kworker/3:2-2327  [003]  7194.749516: mm_page_alloc:        page=0xffffffff936c3a06 pfn=0x5e56 order=0 migratetype=0 gfp_flags=GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY
          <idle>-0     [001]  7195.914313: mm_page_alloc:        page=0xffffffff936c0ba0 pfn=0x2ff0 order=3 migratetype=0 gfp_flags=GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP
          <idle>-0     [001]  7195.914326: mm_page_alloc:        page=0xffffffff936f8b78 pfn=0x3afc8 order=3 migratetype=0 gfp_flags=GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP
[root@fedora-vmware trace-cmd.git]# trace-cmd -v

trace-cmd version 3.1.6 (0a68daed9f58e4429c0f1e7818f7cc0634873112)

# trace-cmd report -F mm_page_alloc | head
cpus=4
       trace-cmd-2383  [000]  7192.837686: mm_page_alloc:        page=0xffffffff936e3208 pfn=0x25658 order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837698: mm_page_alloc:        page=0xffffffff936ec67f pfn=0x2eacf order=0 migratetype=1 gfp_flags=GFP_HIGHUSER_MOVABLE
       trace-cmd-2386  [002]  7192.837726: kernel_stack:         <stack trace >
=> trace_event_raw_event_mm_page_alloc (ffffffff922ada56)
=> __alloc_pages (ffffffff922e61ad)
=> alloc_pages_vma (ffffffff92304730)
=> wp_page_copy (ffffffff922bc5e5)
=> __handle_mm_fault (ffffffff922c3c03)
=> handle_mm_fault (ffffffff922c442f)

I can't reproduce the issue.
Comment 2 Gabriel Krisman Bertazi 2023-02-15 13:42:37 UTC
Thanks for the prompt response, Steven!

> I can't reproduce the issue.

Indeed, quite strange.  This was reported to me on Tumbleweed and I could reproduce it consistently with the builds I mentioned above. 

I reported to check with you first if it rings a bell.   But let me fire up GDB in the next days and see if I collect more information or figure out what is wrong in the affected systems.
Comment 3 Gabriel Krisman Bertazi 2023-03-17 21:25:39 UTC
(In reply to Steven Rostedt from comment #1)
> Strange? I just tried it out (with the latest of all the libraries too)
> 
> # trace-cmd report -F mm_page_alloc -v -F kernel_stack

Hey Steven,

Sorry, I could only get back to this today.

I just found the problem.  It is actually a bit silly:

This works fine:

  tracecmd/trace-cmd report -v -F kernel_stack -i trace.dat 

This doesn't:

  tracecmd/trace-cmd report -i trace.dat -v -F kernel_stack 

Note how the filter needs to come before the input file. You couldn't reproduce because you didn't pass a data file with -i. 

The argument ordering wasn't required on previous versions. I bisected it to:

955d05fc7a ("trace-cmd report: Make filter arguments match their files")

Which make a lot of sense. The commit message suggests this is unintended behavior.

I've sent you a patch on the ML.
Comment 4 Gabriel Krisman Bertazi 2023-03-17 21:26:40 UTC
Created attachment 303977 [details]
patch on the ML.

for completeness, the patch i sent.

Note You need to log in before you can comment on or make changes to this bug.