I observed KernelShark aborting on a Raspberry Pi 3 Model B+ with the following error when I clicked on the second trace record displayed:

    kernelshark: libkshark-model.c:242: ksmodel_set_upper_edge: Assertion 'row != BSEARCH_ALL_GREATER' failed.
    Aborted

For more information, please see minute 0:57 of the video I posted on my YouTube channel, which shows the abort as it happened:
https://www.youtube.com/watch?v=pWRuNQTo9-8

Regards,
Alan Mikhak
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Raspbian
    Description:    Raspbian GNU/Linux 9.9 (stretch)
    Release:        9.9
    Codename:       stretch

    $ uname -a
    Linux rpi7 4.19.42-v7+ #1219 SMP Tue May 14 21:20:58 BST 2019 armv7l GNU/Linux
Hi Alan, Please attach the trace.dat file that makes KernelShark crash. Thanks! Yordan
Created attachment 283219 [details] trace.dat file for this issue Hi Yordan, Please see the attached trace.dat file which KernelShark was processing when it raised this assertion. Regards, Alan
Hi Alan,

From what I see in the video, my guess is that something is going wrong in static void ksmodel_set_in_range_bining():
https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/tree/kernel-shark/src/libkshark-model.c#n96

In this function we once again mix up size_t and uint64_t. Maybe "histo->min" and "histo->max" are getting nonsensical values, which would later cause the binary search to fail.

Unfortunately, it will be very hard for me to get a setup on which I can reproduce this problem, so I was thinking that maybe you could do some debugging on your system. We would greatly appreciate your help!

Cheers,
Yordan
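To illustrate how a nonsensical "histo->max" could trip that assertion, here is a minimal sketch of a timestamp binary search that returns a sentinel when every entry is later than the requested time. The function name sketch_find_by_time and the SKETCH_* constants are hypothetical stand-ins and are not the KernelShark API; only the BSEARCH_ALL_GREATER name in the assertion message comes from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinels, modelled loosely on the name seen in the
 * assertion message. The real values in libkshark may differ. */
enum {
	SKETCH_ALL_GREATER = -1,	/* every entry is later than 'time' */
	SKETCH_ALL_SMALLER = -2		/* every entry is earlier than 'time' */
};

/*
 * Return the index of the first entry whose timestamp is >= 'time',
 * or a sentinel when 'time' lies outside the sorted array 'ts'.
 * An empty array is reported as SKETCH_ALL_GREATER for simplicity.
 */
long sketch_find_by_time(const uint64_t *ts, size_t n, uint64_t time)
{
	size_t lo, hi;

	assert(ts != NULL || n == 0);

	if (n == 0 || ts[0] > time)
		return SKETCH_ALL_GREATER;
	if (ts[n - 1] < time)
		return SKETCH_ALL_SMALLER;

	/* Invariant: the answer lies in [lo, hi]. */
	lo = 0;
	hi = n - 1;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ts[mid] < time)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (long)lo;
}
```

With entries at 5000000000, 5000000100 and 5000000200 ns, a search for 5000000100 returns index 1. If that value has lost its upper 32 bits and arrives as 705032804, every entry is later than it and the sketch returns SKETCH_ALL_GREATER, which is the kind of result the failing assertion guards against.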
Hi Alan, Please try applying the two patches here https://patchwork.kernel.org/project/linux-trace-devel/list/?series=131053 and check if this has any effect on the problem that you see on your Raspberry Pi. Thanks! Yordan
Hi Yordan,

I built trace-cmd and KernelShark from a fresh clone of the official git repo with both patches applied. I then recorded a fresh trace.dat file with the same command as before:

    $ sudo trace-cmd record -e sched ls -ltr /usr > /dev/null

KernelShark still aborted on my Raspberry Pi 3 Model B+ when I clicked on the second entry. However, it produced the following new error message before asserting:

    qt.qpa.xcb: QXcbConnection: XCB error: 148 (Unknown), sequence: 182, resource id: 0, major code: 140 (Unknown), minor code: 20
    kernelshark: /home/pi/src/kernel.org/trace-cmd/kernel-shark/src/libkshark-model.c:242: ksmodel_set_upper_edge: Assertion `row != BSEARCH_ALL_GREATER' failed.
    Aborted

Regards,
Alan
Hi Yordan, I installed trace-cmd and kernelshark on my Raspberry Pi 3 model B+ using the following command: $ sudo apt install trace-cmd kernelshark The version of kernelshark installed by 'apt install' runs smoothly on my Raspberry Pi 3 model B+ and doesn't exhibit the abort issue I reported here about the version I built from sources. Regards, Alan
The Ubuntu package ships the 10-year-old GTK version of KernelShark. The new version (the one that crashes on your Raspberry Pi 3) was recently developed from scratch and is based on Qt.

BTW, I believe I found the problem responsible for the crash. It is here:
https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/tree/kernel-shark/src/libkshark-model.c#n587

    void ksmodel_jump_to(struct kshark_trace_histo *histo, size_t ts)
    {
            size_t min, max, range_min;
            ...

This must be changed to:

    void ksmodel_jump_to(struct kshark_trace_histo *histo, uint64_t ts)
    {
            uint64_t min, max, range_min;

Please try this fix, keeping the previous two patches, and tell me if it works this time.

Thanks a lot!
Yordan
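The effect of this signature change can be reproduced off-target with a short sketch. It assumes that size_t is 32 bits wide on armv7l and that timestamps are 64-bit nanosecond counts; size32_t, jump_target_narrow, jump_target_wide and report are hypothetical names used for illustration and are not part of libkshark.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for size_t on a 32-bit target (assumption). Using uint32_t
 * makes the narrowing visible on a 64-bit host as well. */
typedef uint32_t size32_t;

/* Hypothetical "before" signature: the timestamp is narrowed on entry. */
uint64_t jump_target_narrow(size32_t ts)
{
	return ts;
}

/* Hypothetical "after" signature: the full 64-bit value survives. */
uint64_t jump_target_wide(uint64_t ts)
{
	return ts;
}

/* Print both results for one timestamp; return 1 if narrowing lost bits. */
int report(uint64_t ts)
{
	uint64_t narrow = jump_target_narrow((size32_t)ts);
	uint64_t wide = jump_target_wide(ts);

	assert(wide == ts);	/* the 64-bit path must be lossless */
	printf("ts=%" PRIu64 " narrow=%" PRIu64 " wide=%" PRIu64 "\n",
	       ts, narrow, wide);
	return narrow != wide;
}
```

Calling report(5000000000) shows the narrowed value as 705032704, because 5000000000 minus 2^32 is 705032704. Any nanosecond timestamp above roughly 4.29 seconds no longer fits in 32 bits, which would explain why the problem appeared only on the 32-bit boards.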
Hi Yordan, I had to manually apply your two patches from 2019-06-14 on top of your first patch from 2019-06-12. I was able to build KernelShark on my Raspberry Pi 3 Model B+ and observed that your combined changes resolved the abort issue as well as the compiler warnings. I also observed the same good results on my ODROID-XU3 from hardkernel.com, which also runs a 32-bit armv7l kernel, and on my 96Boards ROCK960 Model C, which runs a 64-bit aarch64 kernel. Please see the attached patch file, which shows the combined changes I manually applied as your patches intended. Regards, Alan
Created attachment 283275 [details] Combined patch for manual application of Yordan patches Combined patch for manual application of patches from Yordan.
Hi Alan, Thank you very much for helping us make KernelShark better. I hope you will stay active, reporting bugs or, why not, even sending patches to the mailing list. Cheers, Yordan
Hi Yordan, Glad I could do something useful. I hope to stay active. Regards, Alan
Hi Yordan, I observed this abort issue again today when I cloned a fresh copy of the trace-cmd git repo (git://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git) to compile on a brand new Raspberry Pi 4 Model B. I also observed the same compiler warnings that you had fixed in your patches. Reopening the ticket to track this until your patches go live. Regards, Alan
For reference: https://lore.kernel.org/linux-trace-devel/20190614135045.17223-1-ykaradzhov@vmware.com/T/#t
Hi Alan, It was my fault to close the issue before the fix was actually pushed upstream. It is upstream now, so please try again and tell us whether it works. You can look here https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/log/ to see all patches that made it to the master branch of the repo. Thanks a lot! Y.
Hi Yordan,

I verified that KernelShark built successfully from the latest git repo without compiler warnings and didn't exhibit this abort issue on the following platforms:

Raspberry Pi 3 Model B+:

    $ uname -a
    Linux rpi6 4.19.57-v7+ #1244 SMP Thu Jul 4 18:45:25 BST 2019 armv7l GNU/Linux
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Raspbian
    Description:    Raspbian GNU/Linux 9.9 (stretch)
    Release:        9.9
    Codename:       stretch

Raspberry Pi 4 Model B:

    $ uname -a
    Linux rpi8 4.19.57-v7l+ #1244 SMP Thu Jul 4 18:48:07 BST 2019 armv7l GNU/Linux
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Raspbian
    Description:    Raspbian GNU/Linux 10 (buster)
    Release:        10
    Codename:       buster

Nano Pi M4:

    $ uname -a
    Linux nanopi1 4.4.167 #1 SMP Sun Jun 2 18:38:17 CST 2019 aarch64 aarch64 aarch64 GNU/Linux
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 18.04.2 LTS
    Release:        18.04
    Codename:       bionic

All of the above platforms required building Qt 5 from sources to resolve the per-CPU visualization issue, as documented in a separate ticket.

I observed the following trace-cmd compiler warning on the Raspberry Pi platforms. I was able to resolve the warning by installing the 'libaudit-dev' package as opposed to the 'libaudit-devel' package suggested by the warning:

    $ make
    :::
    COMPILE trace-snapshot.o
    COMPILE trace-stat.o
    COMPILE trace-profile.o
    trace-profile.c:23:3: warning: #warning "lib audit not found, using raw syscalls " "(install libaudit-devel and try again)" [-Wcpp]
     # warning "lib audit not found, using raw syscalls " \
       ^~~~~~~
    COMPILE trace-stream.o
    COMPILE trace-restore.o
    COMPILE trace-check-events.o
    :::

    $ sudo apt install libaudit-devel
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    E: Unable to locate package libaudit-devel

    $ sudo apt install libaudit-dev

Regards,
Alan
RESOLVED as CODE_FIX