Bug 214023 - tests for libcap 2.52 testing sensitive to C and LD flags
Summary: tests for libcap 2.52 testing sensitive to C and LD flags
Status: RESOLVED IMPLEMENTED
Alias: None
Product: Tools
Classification: Unclassified
Component: libcap (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Andrew G. Morgan
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-10 17:02 UTC by David Runge
Modified: 2021-08-24 03:16 UTC (History)
2 users (show)

See Also:
Kernel Version: 5.12.19
Subsystem:
Regression: No
Bisected commit-id:


Attachments
libcap 2.52 build log (31.61 KB, application/octet-stream)
2021-08-10 17:02 UTC, David Runge
Details
libcap 2.52 build log (without distribution LDFLAGS) (29.80 KB, text/plain)
2021-08-11 06:32 UTC, David Runge
Details
Patch to link pam_cap.so against libpam.so (1.15 KB, patch)
2021-08-11 17:23 UTC, David Runge
Details | Diff

Description David Runge 2021-08-10 17:02:25 UTC
Created attachment 298259 [details]
libcap 2.52 build log

Hi!

I'm packaging libcap for Arch Linux.
To ensure system integration I usually run test suites if upstreams provide them.


With libcap 2.52 the test suite fails on pam_cap due to undefined symbol errors (full build log in attachment):


```
make[2]: Entering directory '/build/libcap/src/libcap-2.52/pam_cap'
make[2]: 'testlink' is up to date.
make[2]: Leaving directory '/build/libcap/src/libcap-2.52/pam_cap'
./test_pam_cap
test_pam_cap: OK! (Skipping privileged tests (uid!=0))
LD_LIBRARY_PATH=../libcap ./pam_cap.so
./pam_cap.so: symbol lookup error: ./pam_cap.so: undefined symbol: pam_get_item
make[1]: *** [Makefile:42: test] Error 127
make[1]: Leaving directory '/build/libcap/src/libcap-2.52/pam_cap'
```



This seems to have been introduced by changes that rely on unreleased changes to linux-pam, as mentioned in the changelog [1]:

"At this time, it relies on features only present in this version of libcap and HEAD of the Linux-PAM sources for the pam_unix.so module."

The changes in question seem to be available in a pull request [2]. However, given that linux-pam has not seen an update in months it is unlikely that we will add these changes to it as it either might not apply cleanly or might introduce side-effects that may not be desirable.

I am not 100% sure whether the above commit to linux-pam is related, but as it touches on the code base and is mentioned in the release notes I figured it is worth mentioning.

In the meantime it would be great if the tests passed for environments in which versions of software are installed that have a stable release (as we usually do not package pre-releases or beta/alpha releases), such as linux-pam 1.5.1.

[1] https://sites.google.com/site/fullycapable/release-notes-for-libcap#h.6pacgt7ee0ky
[2] https://github.com/linux-pam/linux-pam/pull/373
Comment 1 Andrew G. Morgan 2021-08-10 19:51:33 UTC
pam_get_item is not a new API for libpam.

This looks like a bug worth fixing. What is new is we make pam_cap.so executable.

Can you provide the full log for your build?
Comment 2 Andrew G. Morgan 2021-08-10 19:52:17 UTC
Sigh, I see the attachment now.
Comment 3 Andrew G. Morgan 2021-08-10 19:55:01 UTC
Is there somewhere I can view all the patches you are applying?
Comment 5 Andrew G. Morgan 2021-08-11 01:16:00 UTC
From your build log, I see (I've broken the lines manually to make comparison easier):

gcc -Wl,-x -shared -o pam_cap.so pam_cap.o execable.o \
  -L/build/libcap/src/libcap-2.52/pam_cap/../libcap -lcap \
  -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now \
  -L/build/libcap/src/libcap-2.52/libcap \
  -L/build/libcap/src/libcap2.52/pam_cap/../libcap \
  --entry=__so_start

On my system I see:

gcc -Wl,-x -shared -o pam_cap.so pam_cap.o execable.o \
  -L/home/morgan/gits/libcap/pam_cap/../libcap -lcap \
  -L/home/morgan/gits/libcap/pam_cap/../libcap \
  --entry=__so_start

It looks like the this is coming from somewhere in your build process:

  -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now

Given this, it must be one of these options that is generating the problem...
Manually using trial and error I find it is '-z,now' that is causing the problem.

Try building without that.
Comment 6 Andrew G. Morgan 2021-08-11 01:20:59 UTC
Browsing the Make.Rules file, I'm guessing this is coming from a LDFLAGS environment variable. As a quick hack try:

  unset LDFLAGS

inside your build() and check() etc. scripting.
Comment 7 David Runge 2021-08-11 06:27:41 UTC
> Try building without that.

I don't believe that is feasible, as those are our distribution-wide flags:

https://github.com/archlinux/svntogit-packages/blob/c9a2a242aa5174c4da92f7e69568b71aac918f77/trunk/makepkg.conf#L45
Comment 8 David Runge 2021-08-11 06:31:22 UTC
FWIW, in attachment a the build log of a build without our LDFLAGS (it fails in the same way).
Comment 9 David Runge 2021-08-11 06:32:25 UTC
Created attachment 298283 [details]
libcap 2.52 build log (without distribution LDFLAGS)
Comment 10 Andrew G. Morgan 2021-08-11 13:51:30 UTC
https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html suggests that -fno-plt might also be part of the problem.

This is sort of messy and one that seems deeply embedded in your compilation environment. You could work around it by disabling the test.

Another approach might be to -lpam on the line that links pam_cap.so . This forces pam_cap.so to link against the system library and prevents some testing methodologies. However, there are none of those in use in the build tree at present so it might work.
Comment 11 David Runge 2021-08-11 17:15:11 UTC
Okay, the suggestion in regards to explicitly linking against system pam worked.

I'll attach a simple patch for posterity.
Comment 12 David Runge 2021-08-11 17:23:39 UTC
Created attachment 298289 [details]
Patch to link pam_cap.so against libpam.so
Comment 13 Andrew G. Morgan 2021-08-12 03:29:24 UTC
I've added a comment to the Makefile to point back here to this bug for the detailed information about why we found your setup needed -lpam. Hopefully, that will help others.

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=6dea1813f269f9c03cea226fffdd75670c70ea01
Comment 14 David Runge 2021-08-13 07:25:12 UTC
If explicitly linking against libpam.so fixes this though and has no side-effects otherwise, why not leave it in?

I would rather not carry a compatibility patch to be able to run tests because of building with fairly common optimization flags and I am sure other distributions would agree here (in the case that they are actually running tests).

The distribution flags have not changed on our side between 2.51 and 2.52 (in fact have not changed since May, libcap 2.51 is from end of June): https://github.com/archlinux/svntogit-packages/commits/packages/pacman/trunk/makepkg.conf
Our gcc in use has also not changed since May FWIW: https://github.com/archlinux/svntogit-packages/commits/packages/gcc/trunks

This is a strong indicator for this being an issue introduced on libcap's side, not on our side, which is why I don't see the reason for downstreams to fix this out-of-tree.
Comment 15 Andrew G. Morgan 2021-08-14 03:54:30 UTC
We added a feature in 2.52 to make pam_cap.so executable as a stand alone binary. 

Yes, this is new. libc.so.2 has been able to do this forever but few
other libraries bother to support this. However, there are many other parts of the libcap build that appear to fail with these relatively new build options (the default .git repo build for example) so I'm not all that persuaded that this is all to be heaped on recent libcap changes.

There are a number of issues forcing linkage against other things. Mostly along the lines of the snowballing effect of link x, which links to y, which links to z and before you know it there are an alphabet of linkages which may or may not have undesirable symbol name conflicts in some future version that I don't ever want to be on the hook for debugging...

By default the minimal build yields a quite respectable collection of linkages:

$ ldd pam_cap.so 
        linux-vdso.so.1 (0x00007ffef59ef000)
        libcap.so.2 => /lib/x86_64-linux-gnu/libcap.so.2 (0x00007fa476c54000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa476a93000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fa476c6f000)

The way libpam actually loads a module, it exports all of its symbols to the module, so pam_cap.so works just fine as a module with this minimal set of linkages.

If we link against -lpam we suddenly have:

$ ldd pam_cap.so 
        linux-vdso.so.1 (0x00007ffc8ebed000)
        libcap.so.2 => /lib/x86_64-linux-gnu/libcap.so.2 (0x00007ce21d877000)
        libpam.so.0 => /lib/x86_64-linux-gnu/libpam.so.0 (0x00007ce21d866000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ce21d6a5000)
        libaudit.so.1 => /lib/x86_64-linux-gnu/libaudit.so.1 (0x00007ce21d67a000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ce21d675000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ce21d892000)
        libcap-ng.so.0 => /lib/x86_64-linux-gnu/libcap-ng.so.0 (0x00007ce21d66d000)

Most of which are, frankly, bizarre to me.

I could simply disable ./pam_cap.so being runnable on such systems. Or, not test it. After all most .so files when you try to execute them segfault. So no one would likely care.

However, I guess you have a point that it can work in the spirit of whatever distribution build environment mandates, so why not support that? As such:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=d5daba542ae15cf47752ab5430ded4cd0d0a7ce3
Comment 16 Arnout Vandecappelle 2021-08-23 23:18:44 UTC
Commit d5daba542ae15cf47752ab5430ded4cd0d0a7ce3 breaks things for the Buildroot project (and probably others as well).

First of all, executing ./lazylink.so is problematic when doing cross-compilation. In practice it will usually just fail, which causes -lpam to be linked in, which is fine. The problem is that whether it fails or not depends on the build environment: if binfmt-misc is enabled to launch qemu to run the cross-compiled binary, it will not fail. This is not great for reproducibility.

More importantly though: the commit also removes LDFLAGS from the link commands of progs and tests. We use LDFLAGS to set the RPATH correctly to point to the eventual install location, so it's pretty essential that we can set it.

I think that hardering with -z now is pretty much a given nowadays on most distros - at least Fedora, RHEL, Debian and Ubuntu (and, obviously, Arch) set it. So I wonder if it's really worth the trouble of doing all that lazylink stuff rather than simply linking with pam.

If you really want to load pam as a module, why not dlopen() it?
Comment 17 Andrew G. Morgan 2021-08-24 03:16:06 UTC
I don't understand why "hardening" applies to PAM modules. The dlopen command in libpam explicitly provides linkage for the loaded module to itself. That is how it was designed. (Just to be clear, I wrote that part of libpam.)

If you "harden" things to force link against some other libpam, you will potentially end up with duplicate symbols and a whole mess of chaos. In practice folk don't have such production systems, but think about some stress/regression/debugging contexts where dlopen()ing from some fake libpam is desired.

I've looked back at the LDFLAGS -> LDSTATIC change, and I've found a way to achieve what I wanted in a more targeted way:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=5647374b3319796957edfb25400bf4164efca6c2

Note You need to log in before you can comment on or make changes to this bug.