This bug was noted while resolving a different bug: https://bugzilla.kernel.org/show_bug.cgi?id=219880 .

The issue seems to be due to the PSX mechanism failing to work in this environment.

root@7964c4e70be8:/sources/libcap2-2.75/tests# make test
/usr/bin/make run_psx_test run_libcap_psx_test
make[1]: Entering directory '/sources/libcap2-2.75/tests'
gcc -O2 -Wall -Wwrite-strings -Wpointer-arith -Wcast-qual -Wcast-align -Wstrict-prototypes -Wmissing-prototypes -Wnested-externs -Winline -Wshadow -Wunreachable-code -Dlinux -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -I/sources/libcap2-2.75/tests/../libcap/include/uapi -I/sources/libcap2-2.75/tests/../libcap/include psx_test.c -o psx_test -Wl,-rpath,../libcap -L/sources/libcap2-2.75/tests/../libcap -Wl,--no-as-needed -Wl,--whole-archive -lpsx -Wl,--no-whole-archive -Wl,--as-needed -lpthread
./psx_test
iteration [1244]: 0
xxx
^Cmake[1]: *** [Makefile:72: run_psx_test] Interrupt
make: *** [Makefile:57: test] Interrupt

root@7964c4e70be8:/sources/libcap2-2.75/tests# ldd psx_test
    libpsx.so.2 => ../libcap/libpsx.so.2 (0x00007fea0a3e0000)
    libc.so.6 => /lib/mips64el-linux-gnuabi64/libc.so.6 (0x00007fea0a1b0000)
    /lib64/ld.so.1 (0x00007fea0b8b7000)

I placed a printf("xxx\n") and a printf("yyy\n") before and after the psx_syscall function call:

    pthread_mutex_unlock(&mu);
    printf("xxx\n");
    psx_syscall(SYS_prctl, PR_SET_KEEPCAPS, global_kept);
    printf("yyy\n");
    pthread_mutex_lock(&mu);

As can be seen above, the code appears to hang somewhere inside the psx_syscall() call.
This does not fail on a mips64el system image running under QEMU. It fails only when run in a container.
On the surface it looks as if the threads are not working with consistent signal blocking. In the output below, the first process entry (5713) is the thread that is performing the psx_syscall(), and the second (5715) is one of the other threads that should be receiving signal 33. What is odd is that it is actually receiving (and blocking) a different signal. That is:

(gdb) p/x 1ULL <<(33-1)
$4 = 0x100000000

but the pending signal mask is 0x800000000:

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/5713/status
SigQ: 3/255093
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000400000000
SigCgt: 0000000b000004dc

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/5715/status
SigQ: 3/255093
SigPnd: 0000000800000000
SigBlk: fffffffe7ffbfa77
SigIgn: 0000000400000000
SigCgt: 0000000b000004dc
Curious. If I change the line in psx.c that defines the signal number from 33 to 30, I get:

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/7012/status
SigQ: 3/255093
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000c00000000
SigCgt: 00000003008004dc

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/7014/status
SigQ: 3/255093
SigPnd: 0000000000800000
SigBlk: fffffffe7ffbfa77
SigIgn: 0000000c00000000
SigCgt: 00000003008004dc

and changing it to 31 I get:

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/7376/status
SigQ: 3/255093
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000c00000000
SigCgt: 00000003010004dc

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/7378/status
SigQ: 3/255093
SigPnd: 0000000001000000
SigBlk: fffffffe7ffbfa77
SigIgn: 0000000c00000000
SigCgt: 00000003010004dc

Changing it to 32, the mask jumps noticeably:

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/7740/status
SigQ: 6/255093
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000800000000
SigCgt: 00000007000004dc

root@7964c4e70be8:/sources/libcap2-2.75/tests# grep Sig /proc/7742/status
SigQ: 6/255093
SigPnd: 0000000400000000
SigBlk: fffffffe7ffbfa77
SigIgn: 0000000800000000
SigCgt: 00000007000004dc

In all of these cases, the signal that shows up as pending is blocked by the receiving thread's SigBlk mask.
Christian observes that "the problem does not manifest when host and guest architecture match", which suggests that there is something odd with QEMU emulating the target architecture as opposed to running "native".