Bug 217768
Summary: | Build error | ||
---|---|---|---|
Product: | Tools | Reporter: | Douglas RAILLARD (douglas.raillard) |
Component: | Trace-cmd/Kernelshark | Assignee: | Default virtual assignee for Trace-cmd and kernelshark (tools_tracecmd_kernelshark) |
Status: | NEW --- | ||
Severity: | normal | CC: | metin.kaya, rostedt |
Priority: | P3 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | No | Bisected commit-id: |
Description
Douglas RAILLARD
2023-08-07 10:53:45 UTC
After some digging strstrip is defined both in trace-cmd: ./tracecmd/trace-stat.c:40:char *strstrip(char *str) and in libtracefs: ./src/tracefs-utils.c:323:__hidden char *strstrip(char *str) In libtracefs.a the symbol is defined as such by readelf: 44: 0000000000000990 159 FUNC GLOBAL HIDDEN 1 strstrip "__hidden" in libtracefs would make the symbol only available to the linked binary including the tracefs-utils.o object file. This probably works fine for dynamic linking, since that binary is libtracefs.so and nothing else in it defines strstrip(). However during static linking, tracefs-utils.o ends up more or less being dumped wholesale in the binary linking to it, at which point it conflicts with the definition in trace-stat.c. The fact that one definition is STV_HIDDEN does not change the fact that both are STB_GLOBAL and as such conflict. There are two paths to fix the issue: * The use of __hidden to control scoping is broken and a static function should be used instead (or rename it). This will ensure an STB_LOCAL symbol and the behavior will be identical between static and dynamic linking. * The objects file ending up in the static archive need a post-processing pass to remove any STV_HIDDEN symbol (or turn them from STB_GLOBAL into STB_LOCAL). This can be achieved with "objcopy --localize-hidden libtracefs.a". The symbol is turned to STB_LOCAL and as such will not conflict with anything when linking. It will also not be available to other compilation units in libtracefs.a itself which makes that equivalent to using "static" in C: 21: 0000000000000990 159 FUNC LOCAL HIDDEN 1 strstrip And indeed, compiling with such option leads to: /tmp/tmpfek7roba/x86_64/source/libtracefs/src/tracefs-dynevents.c:778: undefined reference to `get_tep_event' With: ./src/tracefs-events.c:971:__hidden struct tep_event *get_tep_event(struct tep_handle *tep, ./src/tracefs-dynevents.c:778: return get_tep_event(tep, dynevent->system, dynevent->event); This is because __hidden get_tep_event() (STB_GLOBAL + STV_HIDDEN) got converted to (STB_LOCAL + STV_HIDDEN), which essentially turns it to "static get_tep_event()" and as such is not available to any compilation unit other than tracefs-events.o So since there is no way to get a visibility-based namespace in C that works for libs with more than one ELF file in it (i.e. static lib with more than one .o), the options are: 1. Change the name of one of the definitions. 2. Move that function to a 3rd library and make the other 2 depend on it. 3. pre-link all the object files in libtracefs.a into a single .o This patch implements 3. for libtracefs and fixes static build of trace-cmd. A similar patch would be needed for libtraceevent and libtracecmd even though there is currently no symbol conflict (by chance): -- >8 -- From a0deaedd52dc49548d6e0ccce262fbc3a29fa267 Mon Sep 17 00:00:00 2001 From: Douglas Raillard <douglas.raillard@arm.com> Date: Wed, 9 Aug 2023 11:44:47 +0100 Subject: [PATCH] libtracefs: Fix ELF symbol binding in static lib libtracefs currently uses __attribute__((visibility ("hidden"))) in order to make some functions globally available inside libtracefs, but hidden to the outside world. This gives such function the ELF binding STB_GLOBAL and visibility STV_HIDDEN. This works for shared objects, as all the object files are linked into a single .so and therefore the only consumers of symbols will be other components linking to libtracefs.so (e.g an executable or another .so). However, this is problematic when libtracefs is used as a static archive: the archive is made of a collection of .o, and the consumers of such __hidden function are other object files. There is unfortunately no distinction made by the linkers between other object files in the same archive and object files from another component. This leads to STV_HIDDEN being essentially useless and it results in linking errors if the symbol happens to be redefined elsewhere with STB_GLOBAL (which does happen with libtracecmd). In order to fix that, we can pre-link the object files in the static archive to resolve all references to such __hidden function, and then turn STB_GLOBAL + STV_HIDDEN into STB_LOCAL + STV_HIDDEN. This way, the symbol becomes local and will be ignored when linking to another component, restoring the same behavior as when a dynamic lib is used. --- Makefile | 3 +++ scripts/utils.mk | 4 +++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 9f377e9..74898bf 100644 --- a/Makefile +++ b/Makefile @@ -29,10 +29,13 @@ endef # Allow setting CC and AR, or setting CROSS_COMPILE as a prefix. $(call allow-override,CC,$(CROSS_COMPILE)gcc) $(call allow-override,AR,$(CROSS_COMPILE)ar) +$(call allow-override,OBJCOPY,$(CROSS_COMPILE)objcopy) $(call allow-override,PKG_CONFIG,pkg-config) $(call allow-override,LD_SO_CONF_PATH,/etc/ld.so.conf.d/) $(call allow-override,LDCONFIG,ldconfig) +export AR CC OBJCOPY PKG_CONFIG LD_SO_CONF_PATH LDCONFIG + EXT = -std=gnu99 INSTALL = install diff --git a/scripts/utils.mk b/scripts/utils.mk index 4d0f8bc..c725339 100644 --- a/scripts/utils.mk +++ b/scripts/utils.mk @@ -69,7 +69,9 @@ do_build_static_lib = \ if [ -f $@ ]; then \ mv $@ $@.rm; $(RM) $@.rm; \ fi; \ - $(AR) rcs $@ $^) + $(LD) -r $^ -o prelink.o; \ + $(AR) rcs $@ prelink.o; \ + $(OBJCOPY) --localize-hidden $@) do_compile_shared_library = \ ($(print_shared_lib_compile) \ -- 2.25.1 On Wed, 09 Aug 2023 11:54:15 +0000 bugzilla-daemon@kernel.org wrote: https://bugzilla.kernel.org/show_bug.cgi?id=217768 --- Comment #1 from Douglas RAILLARD (douglas.raillard@arm.com) --- > So since there is no way to get a visibility-based namespace in C that works > for libs with more than one ELF file in it (i.e. static lib with more than > one > .o), the options are: Hi Douglas, Thanks for looking into this and debugging it. Yeah, I figured that __hidden was going to bite me sooner or later. :-/ > > 1. Change the name of one of the definitions. > 2. Move that function to a 3rd library and make the other 2 depend on it. > 3. pre-link all the object files in libtracefs.a into a single .o > > > This patch implements 3. for libtracefs and fixes static build of trace-cmd. > A > similar patch would be needed for libtraceevent and libtracecmd even though > there is currently no symbol conflict (by chance): Actually, I prefer #1. #2 is out of the question. But trace-cmd actually depends on hidden functions being accessible during linking. That is, the trace-cmd binary uses __hidden functions in libtracecmd. The __hidden() just means that they should not be relied on. Perhaps we need to make all __hidden functions have a prefix. Hmm, perhaps just add: "_tracecmd_" or "_tracefs_" to the front of it? That is, for strstrip() in trace-cmd, rename it to: _tracecmd_strstrip() and for libtracefs: _libtracefs_strstrip() by convention, anything that starts with "tracecmd_" or "tracefs_" should be exposed and considered API. We can have that anything with "_tracecmd_" or "_tracefs_" is to be hidden and not part of the API. -- Steve On Wed, 9 Aug 2023 19:46:02 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > Perhaps we need to make all __hidden functions have a prefix. Hmm, perhaps > just add: > > "_tracecmd_" or "_tracefs_" to the front of it? > > That is, for strstrip() in trace-cmd, rename it to: _tracecmd_strstrip() > > and for libtracefs: _libtracefs_strstrip() That should have been: _tracefs_strstrip() Hi Steven,
> That is, the trace-cmd binary uses __hidden functions in libtracecmd. The
> __hidden() just means that they should not be relied on.
How does it currently work ? If the symbol is STV_HIDDEN in a DSO no linker or loader will use the definition in libtracecmd to provide a value for an undefined symbol in e.g. an executable.
Similarly, anything hidden in e.g. libtracefs will never be accessible from the outside world (unless linked statically as I experienced, but that breaks the build)
Other than that it would definitely simplify building, as the visibility shenanigans can be made to work with a Makefile as everything is "manual", but might be more tricky to apply in other build systems.
There is another hybrid option that might be overkill:
* Define functions using a macro "void HIDDEN(myfunc)(char *param)"
* For static builds, the macro expands to:
#define HIDDEN(func) _thehiddenprefix_##func
* For DSO builds, it expands to:
#define HIDDEN(func) __attribute__((visbility("hidden"))) func
So that ensures that no-one can depend on anything private in the case where it's possible to swap a DSO for another one for a patch/minor upgrade, and for the static link updating would require a rebuild anyway so it's less important to have an airtight ABI. From a build system point of view nothing needs to be changed.
Douglas
Note, trace-cmd only links to the static libtracecmd.a. As for libtracefs, nothing (but the testing and sample code) links to it statically and needs access to these functions. How do other C libraries use functions between files within the library without exposing them to the rest of the world? > How do other C libraries use functions between files within the library > without exposing them to the rest of the world? I guess they are just publicly exposed in the ABI for basic projects (and just not documented). Otherwise they are probably playing tricks like I've shown, or they use static inline functions for small utilities. AFAIR you can also filter the public ABI with symbol versioning system, but that only applies to DSO. Having a look at e.g. zlib: * The zmemcpy function has both ZLIB_INTERNAL marker and a lib prefix ("z"): void ZLIB_INTERNAL zmemcpy(dest, source, len) https://github.com/madler/zlib/blob/04f42ceca40f73e2978b50e93806c2a18c1281fc/zutil.c#L151 * But this function has no prefix: void ZLIB_INTERNAL inflate_fast(strm, start) https://github.com/madler/zlib/blob/04f42ceca40f73e2978b50e93806c2a18c1281fc/inffast.c#L50 * Then they also use the visibility trick when the platform supports it: #ifdef HAVE_HIDDEN # define ZLIB_INTERNAL __attribute__((visibility ("hidden"))) #else # define ZLIB_INTERNAL #endif On Ubuntu 20.04, I have: $ readelf -a /usr/lib/x86_64-linux-gnu/libz.a 30: 0000000000000000 0 NOTYPE GLOBAL HIDDEN UND inflate_fast And the symbol does not exist in the DSO at all, so it basically works like the current state of trace-cmd, which is, it doesn't and will blow up if someone tries to inflate_fast(). Otherwise in other languages the problem is solved for good with a module/namespace system. Hi Steven, Is there any update on this one ? I checked the git history and there seems to be support for meson, but that probably does not fix the issue. So far we haven't needed to update our binary in the last year but if we do we will be in a pickle. Do we have a path out of the current situation ? Cheers, Douglas > Is there any update on this one ?
>
> I checked the git history and there seems to be support for meson, but that
> probably does not fix the issue. So far we haven't needed to update our
> binary
> in the last year but if we do we will be in a pickle. Do we have a path out
> of
> the current situation ?
Would the renaming of the problem functions work for you?
I believe that was the only acceptable solution you proposed.
-- Steve
After re-reading the discussion, it looks like it would work indeed. I also noticed this: > But trace-cmd actually > depends on hidden functions being accessible during linking. That is, the > trace-cmd binary uses __hidden functions in libtracecmd. The __hidden() > just means that they should not be relied on. and > Note, trace-cmd only links to the static libtracecmd.a. > As for libtracefs, nothing (but the testing and sample code) links to it > statically and needs access to these functions. So if I understand correctly, only libtracecmd.a is using __hidden in a way that allows another component (trace-cmd binary) to use its hidden symbols. Other uses such as in libtracefs should use a local symbol, either by using a static function in header or by using __hidden and then used "objcopy -localize-hidden libtracefs.a" in static builds (this will happen by default for a DSO build according to [1]). So that would mean that: * libtracecmd.a needs to get its symbols renamed, since they are part of the (private) API of the component and used by other components. Only a naming convention can allow that at this level. * libtracefs and others could retain the use of __hidden if they use objcopy for static builds. However, static functions or a naming convention is the more typical way to do that. [1] https://unix.stackexchange.com/questions/701556/elf-symbol-globalhidden Douglas Published patches below to fix static build issue by renaming __hidden functions: 1. https://lore.kernel.org/linux-trace-devel/20241028122105.1594296-1-metin.kaya@arm.com/T/#u 2. https://lore.kernel.org/linux-trace-devel/20241028122143.1594614-1-metin.kaya@arm.com/T/#u 3. https://lore.kernel.org/linux-trace-devel/20241028122247.1594967-1-metin.kaya@arm.com/T/#u |