Bug 220096
Summary: | perf top command probability coredump | ||
---|---|---|---|
Product: | Linux | Reporter: | Fei Lang (601376534) |
Component: | Kernel | Assignee: | Tools.Other (tools_other) |
Status: | NEW --- | ||
Severity: | high | CC: | 601376534, irogers |
Priority: | P3 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | No | Bisected commit-id: | |
Attachments: | Compile the perf binary with libasan based on the 5.10 kernel. |
Description
Fei Lang
2025-05-07 02:12:14 UTC
Created attachment 308147 [details]
Compile the perf binary with libasan based on the 5.10 kernel.
I cann't reproduce in the master branch,but the patch you sent is more suitable for the master branch. So i need more time to adapt to the stable-5.10 branch. Also, I went through the code today, but I still haven't confirmed how two threads avoid the problem of concurrent reads and writes, so is the problem logically still there? Look forward to your reply. Thank you. My patch just adds a missing read lock: https://lore.kernel.org/lkml/20250519224645.1810891-1-irogers@google.com/ My understanding is that the struct comm is associated/owned by a struct thread. There may be multiple comm events and those are held in a link list on the thread. Each struct comm is mainly holding a string, comm_str, which is a reference counted string done to de-duplicate comm strings - the memory saving of this may not be worth the energy. Generally the comm_str doesn't ever change for a struct comm, the exception is when a place holder value is first set. The comm_set bool on thread is tracking this - not my code and I'm not quite following its logic. Changes to the list of struct comm in thread (comm_list) must be done under the comm_lock, my changes help enforce this at compile time with -Wthread-safety. The only call to change the comm_str in comm (comm__override) is made in ____thread__set_comm and this is done with thread__comm_lock exclusively held. There could be a use of a struct comm and comm__override be called for the same struct comm, the lock is guarding the list and not the list members. Things that can be better. The thread owns struct comm but as the comm is passed around without the thread, this connection is lost and there is a possible use after free - address sanitizer (-fsanitize=address) should help catch this. The histogram code has had poor memory safety discipline [1] and this appears to be another crash caused by it. Longer term I think this kind of stuff probably belongs being written in python. The C UI stuff is massively clunky and improvements to the python API mean the code can be clean and free from memory misuse [2]. The crash certainly looks like a use-after-free so please pursue -fsanitize=address which should be fast enough for the bug to reproduce and to gather the necessary data. [1] https://lore.kernel.org/r/20250307061250.320849-2-namhyung@kernel.org [2] https://lore.kernel.org/lkml/20250519195148.1708988-8-irogers@google.com/ |