Bug 214203

Summary: libcap HEAD failing github build server test
Product: Tools Reporter: Andrew G. Morgan (morgan)
Component: libcap Assignee: Andrew G. Morgan (morgan)
Status: RESOLVED CODE_FIX    
Severity: blocking    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: n/a Subsystem:
Regression: No Bisected commit-id:

Description Andrew G. Morgan 2021-08-27 20:23:34 UTC
Problem #1: I didn't get an email alerting me to this failure.
Problem #2: we got these failures:

Wed Aug 25 19:11:22 2021 -0700 7a75dbc: PASS
Wed Aug 25 19:38:13 2021 -0700 04f903f: PASS
Wed Aug 25 19:48:12 2021 -0700 935ab8f: PASS
Wed Aug 25 19:50:46 2021 -0700 a0aaea6: PASS
Wed Aug 25 21:09:19 2021 -0700 07cdff9: PASS
Thu Aug 26 20:24:47 2021 -0700 c90b5de: PASS
Thu Aug 26 21:45:27 2021 -0700 a56162c: FAIL
Thu Aug 26 22:26:56 2021 -0700 386af0e: FAIL
Fri Aug 27 10:26:59 2021 -0700 552db8f: FAIL
Fri Aug 27 10:27:04 2021 -0700 b56400f: FAIL

It started failing with this commit:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=a56162c6900d203c5ac63a2b41b46cb0c45c645f

Curiously, the crash is not consistent [386af0e]:

LD_LIBRARY_PATH=../libcap ./compare-cap
*** buffer overflow detected ***: terminated
SIGABRT: abort
PC=0x7fae3e2b218b m=0 sigcode=18446744073709551610

[the next time (386af0e) it looked like this]:

LD_LIBRARY_PATH=../libcap ./compare-cap
*** buffer overflow detected ***: terminated
SIGABRT: abort
PC=0x7fea1463e18b m=0 sigcode=18446744073709551610
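
A note on reading the two traces above: the sigcode field is printed as an unsigned 64-bit number. Reinterpreted as signed, 18446744073709551610 is -6, which matches the Linux SI_TKILL value and is consistent with abort() raising SIGABRT in the calling thread. A minimal sketch of that conversion follows; the helper name, the DEMO_SI_TKILL constant, and the DEMO_MAIN switch are made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Linux defines SI_TKILL as -6; it is named locally here so the sketch
 * does not depend on which feature macros expose it from <signal.h>. */
#define DEMO_SI_TKILL (-6)

/* Hypothetical helper: reinterpret the logged unsigned value as signed.
 * The cast wraps modulo 2^64 on gcc and clang (two's complement). */
int64_t sigcode_as_signed(uint64_t raw)
{
    return (int64_t)raw;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to get a standalone program. */
int main(void)
{
    int64_t code = sigcode_as_signed(18446744073709551610ULL);

    printf("sigcode=%" PRId64 " (SI_TKILL: %s)\n", code,
           code == DEMO_SI_TKILL ? "yes" : "no");
    return 0;
}
#endif
```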

[the next commit: 552db8f]

gcc -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Dlinux -Wall -Wwrite-strings -Wpointer-arith -Wcast-qual -Wcast-align -Wstrict-prototypes -Wmissing-prototypes -Wnested-externs -Winline -Wshadow -g  -fPIC -I/home/runner/work/libcap-testing/libcap-testing/clone/libcap/libcap/../libcap/include/uapi -I/home/runner/work/libcap-testing/libcap-testing/clone/libcap/libcap/../libcap/include cap_test.c cap_alloc.o cap_proc.o cap_extint.o cap_flag.o cap_text.o cap_file.o -o cap_test
./cap_test
*** buffer overflow detected ***: terminated
make[1]: *** [Makefile:146: test] Aborted (core dumped)
make[1]: Leaving directory '/home/runner/work/libcap-testing/libcap-testing/clone/libcap/libcap'
make: *** [Makefile:45: test] Error 2

[the last commit: b56400f]

make -C libcap test
make[1]: Entering directory '/home/runner/work/libcap-testing/libcap-testing/clone/libcap/libcap'
gcc -O2 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Dlinux -Wall -Wwrite-strings -Wpointer-arith -Wcast-qual -Wcast-align -Wstrict-prototypes -Wmissing-prototypes -Wnested-externs -Winline -Wshadow -g  -fPIC -I/home/runner/work/libcap-testing/libcap-testing/clone/libcap/libcap/../libcap/include/uapi -I/home/runner/work/libcap-testing/libcap-testing/clone/libcap/libcap/../libcap/include cap_test.c cap_alloc.o cap_proc.o cap_extint.o cap_flag.o cap_text.o cap_file.o -o cap_test
./cap_test
*** buffer overflow detected ***: terminated
make[1]: *** [Makefile:146: test] Aborted (core dumped)

Comment 1 Andrew G. Morgan 2021-08-27 20:26:53 UTC
I think the first two failing commits failed because of a bug I introduced, one that I fixed in 552db8f. I also added some new test code in 552db8f, and that appears to be where the current failure is occurring.

Since I can't reproduce this failure on my personal systems, I may have to resort to some debugging commits to track down the problem(s).

Comment 2 Andrew G. Morgan 2021-08-27 20:39:02 UTC
OK, so I figured out why I wasn't getting email on the build failure.

Comment 3 Andrew G. Morgan 2021-08-27 20:58:55 UTC
A fix (not sure if it is "the" fix), plus some more debugging info:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=de1130dbfe6d4ce99422b11cac147d39448bcd40

Comment 4 Andrew G. Morgan 2021-08-28 18:21:56 UTC
Thu Aug 26 20:24:47 2021 -0700 c90b5de: PASS

Bugs introduced as a side effect of "fixing" a clang build warning:

Thu Aug 26 21:45:27 2021 -0700 a56162c: FAIL
Thu Aug 26 22:26:56 2021 -0700 386af0e: FAIL

Bug modified after a wave of static analysis fixes, one of which was a partial fix for the bug(s) introduced above:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=552db8f4116df3fad4e4ebf90a9a05a77b9486fd

Fri Aug 27 10:26:59 2021 -0700 552db8f: FAIL
Fri Aug 27 10:27:04 2021 -0700 b56400f: FAIL
Fri Aug 27 13:55:11 2021 -0700 de1130d: FAIL

Happened to notice the build failures, and figured out how to fix "the build server" to notify me of failures once more...

Instrumentation added [sadly no real insight gained]:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=43365cf01c64b530e7a3d62214247e1aa042414d

Fri Aug 27 21:01:46 2021 -0700 43365cf: FAIL

Figured out the fix [learned how to turn on glibc's automated buffer overflow detection code, and then debugged the failing binary under gdb]:

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=6c38eb78d96a60a9503dc5c89ade67b65778fed9

Sat Aug 28 09:43:51 2021 -0700 6c38eb7: PASS
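
For reference, a minimal sketch of the kind of check that produces the "*** buffer overflow detected ***" abort. This is not the libcap bug itself; the function names, the 8-byte buffer, the DEMO_MAIN switch, and the build flags in the comment are assumptions for illustration. On glibc, compiling with optimization and -D_FORTIFY_SOURCE=2 replaces calls such as strcpy() with checked variants when the compiler knows the destination size, and some distribution toolchains turn this on by default.

```c
/* Assumed build line (not taken from the libcap Makefile):
 *   gcc -O2 -D_FORTIFY_SOURCE=2 -DDEMO_MAIN fortify_demo.c -o fortify_demo
 */
#include <stdio.h>
#include <string.h>

/* Reports the fortify level glibc selected for this translation unit.
 * Returns 0 when fortification is off or the libc is not glibc. */
int fortify_level(void)
{
#ifdef __USE_FORTIFY_LEVEL
    return __USE_FORTIFY_LEVEL;
#else
    return 0;
#endif
}

/* Hypothetical example of an unchecked copy into a fixed-size buffer.
 * With fortification on, the compiler can route strcpy() through a
 * checked variant that knows the destination is 8 bytes and aborts
 * the process if src (plus its NUL) does not fit. */
int copy_into_small(const char *src)
{
    char small[8];

    strcpy(small, src);
    puts(small);
    return (int)strlen(small);
}

#ifdef DEMO_MAIN
int main(int argc, char **argv)
{
    printf("fortify level: %d\n", fortify_level());
    /* A short argument is copied and echoed; an argument of 8 or more
     * characters overflows the buffer and should trip the check. */
    copy_into_small(argc > 1 ? argv[1] : "ok");
    return 0;
}
#endif
```

Running a binary built this way under gdb and asking for a backtrace at the abort (run, then bt) should show which call tripped the check.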

All good.