Hello,

I've been trying to use trace-cmd to trace some KVM functions. Unfortunately, I often end up with the following problem:

# trace-cmd record -e 'kvmmmu/*' -e 'kvm/kvm_exit' -f 'exit_reason == 48' -e 'kvm/kvm_page_fault' -e 'kvm/kvm_async_pf*' -e 'kvm/kvm_hv_flush*' -e 'kvm/kvm_emulate_insn' -e 'kvm/kvm_nested*' -e 'kvm/kvm_invlpga' -e 'kvm/kvm_mmio' -e 'kvm/kvm_page_fault' -e 'kvm/kvm_pv_tlb_flush' -e 'kvm/kvm_try_async_get_page' -e 'kvm/kvm_unmap_hva_range' --compression zstd
Hit Ctrl^C to stop recording
^CCPU 2: 254611 events lost
CPU 4: 296664 events lost
CPU 6: 237804 events lost
CPU 7: 65205 events lost
CPU 8: 304900 events lost
CPU 10: 174622 events lost
CPU 12: 360047 events lost
CPU 14: 176447 events lost
CPU 16: 221807 events lost
CPU 18: 171316 events lost
CPU 20: 51030 events lost
CPU 22: 131355 events lost
CPU 24: 249424 events lost
CPU 26: 116708 events lost
CPU 28: 181103 events lost
CPU0 data recorded at offset=0x179000
libtracecmd: Invalid argument
did not match size of 2951544832 to -1343422464

Here the capture ran for about 9 hours and the output files (all .cpuX files) totaled about 86 GB. After I stopped the record, it printed the logs above and I ended up with a 185 MB trace.dat file. trace-cmd report only printed "cpus=128".

I can easily reproduce the problem with an active VM (running stress-ng --fault 4 --sysbadaddr 2 --memcpy 2 --timeout 0) and trace-cmd record -e 'kvmmmu' -e 'kvm' -p function_graph. This generates enough trace data to make the problem appear if I let the record run for one minute.

I also hit the same issue when sending the output over the network (using -N for the record). When the record is stopped, the server also has trouble processing the traces.

Versions:
- libtracefs: 1.8.1
- libtraceevent: 1.8.3
- trace-cmd: 3.3

Is this expected when events are lost? For my usage, I don't really mind losing some events (I'm tracking an issue that ends up generating millions of traces in a short amount of time); I'm mostly interested in the traces before the "flood" happens.

Let me know if there is anything I can provide or do to help debug this. I'm not sure I'll have the time to debug the code myself, but I wanted to report the issue nonetheless.
On Wed, 11 Sep 2024 09:36:36 +0000 bugzilla-daemon@kernel.org wrote:

> I've been trying to use trace-cmd to trace some KVM functions. Unfortunately,
> I often end up with the following problem:
>
> # trace-cmd record -e 'kvmmmu/*' -e 'kvm/kvm_exit' -f 'exit_reason == 48' -e
> 'kvm/kvm_page_fault' -e 'kvm/kvm_async_pf*' -e 'kvm/kvm_hv_flush*' -e
> 'kvm/kvm_emulate_insn' -e 'kvm/kvm_nested*' -e 'kvm/kvm_invlpga' -e
> 'kvm/kvm_mmio' -e 'kvm/kvm_page_fault' -e 'kvm/kvm_pv_tlb_flush' -e
> 'kvm/kvm_try_async_get_page' -e 'kvm/kvm_unmap_hva_range' --compression zstd
> Hit Ctrl^C to stop recording
> ^CCPU 2: 254611 events lost
> CPU 4: 296664 events lost
> CPU 6: 237804 events lost
> CPU 7: 65205 events lost
> CPU 8: 304900 events lost
> CPU 10: 174622 events lost
> CPU 12: 360047 events lost
> CPU 14: 176447 events lost
> CPU 16: 221807 events lost
> CPU 18: 171316 events lost
> CPU 20: 51030 events lost
> CPU 22: 131355 events lost
> CPU 24: 249424 events lost
> CPU 26: 116708 events lost
> CPU 28: 181103 events lost
> CPU0 data recorded at offset=0x179000
> libtracecmd: Invalid argument
> did not match size of 2951544832 to -1343422464

Thanks for reporting. That looks like something that should be 64 bit but is only 32 bit, as -1343422464 = 0xffffffffafed0000 and 0xafed0000 = 2951544832.

Yep, found it. It's a bug in trace-output.c:

	tracecmd_warning("did not match size of %lld to %lld",
			 read_size, data_files[i].file_size);

read_size and data_files[i].file_size are both 64 bit. But earlier in that function we have:

	data_files[i].file_size = data[i].size;

where data[i].size is 32 bit and needs to be fixed.

> Here the capture ran for about 9 hours and the output files (all .cpuX files)
> totaled about 86 GB. After I stopped the record, it printed the logs above and
> I ended up with a 185 MB trace.dat file. trace-cmd report only printed
> "cpus=128".

BTW, you can rebuild that via:

	trace-cmd restore -i trace.dat -o trace-fix.dat trace.dat.cpu*

although you may want to list the files in their actual numeric order, otherwise cpu10 will come before cpu2.

> I can easily reproduce the problem with an active VM (running stress-ng
> --fault 4 --sysbadaddr 2 --memcpy 2 --timeout 0) and trace-cmd record -e
> 'kvmmmu' -e 'kvm' -p function_graph. This generates enough trace data to make
> the problem appear if I let the record run for one minute.
>
> I also hit the same issue when sending the output over the network (using -N
> for the record). When the record is stopped, the server also has trouble
> processing the traces.
>
> Versions:
> - libtracefs: 1.8.1
> - libtraceevent: 1.8.3
> - trace-cmd: 3.3
>
> Is this expected when events are lost? For my usage, I don't really mind
> losing some events (I'm tracking an issue that ends up generating millions of
> traces in a short amount of time); I'm mostly interested in the traces before
> the "flood" happens.

Note you can also use the -m option of trace-cmd record to keep the size down, if you only care about recording near where the issue happens and can stop the trace there.

> Let me know if there is anything I can provide or do to help debug this. I'm
> not sure I'll have the time to debug the code myself, but I wanted to report
> the issue nonetheless.

I'll send a patch shortly.

Thanks!
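To make the truncation concrete, here is a minimal standalone C sketch. The struct and field names are made up (they are not the real trace-output.c definitions); only the numbers come from the log above. Storing a ~2.75 GB size in a 32-bit "int" wraps it (implementation-defined narrowing, a two's-complement wrap on common compilers), and reading it back into a 64-bit variable sign-extends the wrapped value:

```c
/*
 * Standalone sketch -- hypothetical names, real numbers from the log.
 */
#include <stdio.h>

struct cpu_data_file {
	int size;			/* 32 bits: too small for large per-CPU files */
};

int main(void)
{
	unsigned long long real_size = 2951544832ULL;	/* ~2.75 GB, from the log */
	struct cpu_data_file f;
	long long file_size;

	f.size = real_size;		/* silently wraps to a negative 32-bit value */
	file_size = f.size;		/* sign-extended back to 64 bits */

	printf("did not match size of %llu to %lld\n", real_size, file_size);
	/* prints: did not match size of 2951544832 to -1343422464 */
	return 0;
}
```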
Created attachment 306862 [details]
CPU data file sizes can be more than 4 GB, use long long for size

The cpu_data file sizes can be more than 4 GB, but the internal structure for them uses "int", which can overflow.
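As a rough illustration of the change described (the struct and field names below are hypothetical, not the actual trace-cmd definitions), the fix amounts to widening the per-CPU size field from a 32-bit int to a 64-bit type:

```c
/* Hypothetical names -- only the width change reflects the described fix. */
struct data_file_desc {
	/* before: int size;  -- wraps for per-CPU files larger than 2 GB */
	unsigned long long size;	/* after: holds the full 64-bit file size */
};
```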
Can you apply the attached patch and verify that it fixes the issue for you?
Hello,

Thanks a lot for your help and the patch! Unfortunately, I'm still having the same problem with the patch applied, although the error is a bit different now:

# timeout 60 ./usr/bin/trace-cmd record -e 'kvmmmu' -e 'kvm' -p function_graph -o /root/trace.dat
  plugin 'function_graph'
Hit Ctrl^C to stop recording
CPU 0: 238040269 events lost
CPU 1: 252419501 events lost
CPU 2: 226593801 events lost
CPU 3: 219970081 events lost
CPU 4: 257495700 events lost
CPU 5: 233699630 events lost
CPU 6: 260549107 events lost
CPU 7: 207014252 events lost
CPU0 data recorded at offset=0x175000
    470040327 bytes in size (2361450496 uncompressed)
CPU1 data recorded at offset=0x1c1b9000
    457582922 bytes in size (2332516352 uncompressed)
CPU2 data recorded at offset=0x3761c000
    458088453 bytes in size (2277244928 uncompressed)
CPU3 data recorded at offset=0x52afb000
    517014227 bytes in size (2573250560 uncompressed)
CPU4 data recorded at offset=0x7180c000
    433025685 bytes in size (2183168000 uncompressed)
CPU5 data recorded at offset=0x8b504000
    544574757 bytes in size (2715869184 uncompressed)
CPU6 data recorded at offset=0xabc5d000
libtracecmd: Invalid argument
did not match size of 0 to 2353856512

For my last few tries, the number is no longer negative, but I always get "did not match size of 0 to XXXXX".

Thanks for your tips regarding the use of the tool, I'll look into it :)
Created attachment 306871 [details]
Use size_t when dealing with file offsets

OK, the compression logic doesn't take into account that the file being compressed could be greater than 2 GB, and it blows up all over the place. I haven't tested this against large files. Could you try it? Remove the old patch first.
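For reference, here is a minimal sketch of the kind of offset bookkeeping the patch title describes, assuming nothing about the real trace-cmd compression path (the function name, chunk size, and I/O calls are illustrative only). The point is that every offset and length stays in off_t/size_t, so a 3 GB per-CPU file never passes through a 32-bit int on its way to the output file:

```c
/* Illustrative sketch, not trace-cmd code. */
#include <sys/types.h>
#include <unistd.h>

#define CHUNK	(4u * 1024 * 1024)

/* Copy "size" bytes from fd_in to fd_out in CHUNK-sized pieces. */
int copy_large(int fd_in, int fd_out, off_t size)
{
	static char buf[CHUNK];
	off_t done = 0;			/* off_t, not int: safe past 2 GB */

	while (done < size) {
		size_t want = CHUNK;
		ssize_t r, w;

		if (size - done < (off_t)want)
			want = size - done;

		r = read(fd_in, buf, want);
		if (r <= 0)
			return -1;
		w = write(fd_out, buf, r);	/* stands in for the compression/write step */
		if (w != r)
			return -1;
		done += r;
	}
	return 0;
}
```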
Hello,

I've just tried the new patch and things seem to work smoothly now :) I've successfully recorded multiple traces with the reproducer above and was able to use the report command as well!

I'll try to run trace-cmd on a production workload next week with the patched code and see how it goes, but so far I'd say we are good.

And indeed, all the trace files exceed 2 GB with the reproducer above.

```
# ls -lah
-rw-r--r-- 1 root root 2.9G Sep 13 16:39 trace.dat.cpu0
-rw-r--r-- 1 root root 2.9G Sep 13 16:39 trace.dat.cpu1
-rw-r--r-- 1 root root 3.1G Sep 13 16:39 trace.dat.cpu2
-rw-r--r-- 1 root root 3.0G Sep 13 16:39 trace.dat.cpu3
-rw-r--r-- 1 root root 3.1G Sep 13 16:39 trace.dat.cpu4
-rw-r--r-- 1 root root 3.0G Sep 13 16:39 trace.dat.cpu5
-rw-r--r-- 1 root root 3.0G Sep 13 16:39 trace.dat.cpu6
-rw-r--r-- 1 root root 3.2G Sep 13 16:39 trace.dat.cpu7

# timeout 60 ./usr/bin/trace-cmd record -e 'kvmmmu' -e 'kvm' -p function_graph -o /root/trace.dat
  plugin 'function_graph'
Hit Ctrl^C to stop recording
CPU 0: 301537762 events lost
CPU 1: 302041204 events lost
CPU 2: 306999292 events lost
CPU 3: 297890042 events lost
CPU 4: 301408377 events lost
CPU 5: 294563685 events lost
CPU 6: 305865885 events lost
CPU 7: 299648659 events lost
CPU0 data recorded at offset=0x16f000
    577844671 bytes in size (3065196544 uncompressed)
CPU1 data recorded at offset=0x22883000
    590766299 bytes in size (3079741440 uncompressed)
CPU2 data recorded at offset=0x45bea000
    630574282 bytes in size (3313508352 uncompressed)
CPU3 data recorded at offset=0x6b547000
    606190914 bytes in size (3168149504 uncompressed)
CPU4 data recorded at offset=0x8f763000
    603860979 bytes in size (3227459584 uncompressed)
CPU5 data recorded at offset=0xb3746000
    619928908 bytes in size (3208388608 uncompressed)
CPU6 data recorded at offset=0xd867c000
    598434638 bytes in size (3124633600 uncompressed)
CPU7 data recorded at offset=0xfc133000
    639318332 bytes in size (3334938624 uncompressed)
```

Thanks a lot for your time fixing this issue!