Bug 210533 - Calling psx with a Go signal handler installed leads to deadlock
Status: RESOLVED CODE_FIX
Alias: None
Product: Tools
Classification: Unclassified
Component: libcap
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew G. Morgan
Reported: 2020-12-07 16:27 UTC by lmb
Modified: 2020-12-09 10:31 UTC

Kernel Version: 5.4
Regression: No

Attachments
reproducer (297 bytes, text/plain)
2020-12-07 16:27 UTC, lmb
Candidate patch for this issue (4.01 KB, patch)
2020-12-08 07:24 UTC, Andrew G. Morgan

Description lmb 2020-12-07 16:27:52 UTC
Created attachment 294019
reproducer

It seems like the call to psx_syscall blocks forever if the Go runtime has installed a signal handler:

goroutine 1 [syscall, locked to thread]:
kernel.org/pub/linux/libs/security/libcap/psx._Cfunc_psx_syscall3(0x7e, 0xc000053e18, 0xc00001a3e0, 0x0, 0x0)
	_cgo_gotypes.go:58 +0x4e
kernel.org/pub/linux/libs/security/libcap/psx.Syscall3(0x7e, 0xc000053e18, 0xc00001a3e0, 0x0, 0x0, 0x0, 0x0)
	/home/lorenz/go/src/kernel.org/pub/linux/libs/security/libcap/psx/psx.go:92 +0x92
kernel.org/pub/linux/libs/security/libcap/cap.(*syscaller).capwcall(0x554da0, 0x7e, 0xc000053e18, 0xc00001a3e0, 0x2, 0x2, 0xffffffffffffffff, 0xffffffffffffffff)
	/home/lorenz/go/src/kernel.org/pub/linux/libs/security/libcap/cap/cap.go:196 +0x68
kernel.org/pub/linux/libs/security/libcap/cap.(*syscaller).setProc(...)
	/home/lorenz/go/src/kernel.org/pub/linux/libs/security/libcap/cap/cap.go:337
kernel.org/pub/linux/libs/security/libcap/cap.(*syscaller).setMode.func1(0xc000026080, 0x554da0)
	/home/lorenz/go/src/kernel.org/pub/linux/libs/security/libcap/cap/convenience.go:131 +0xb0
kernel.org/pub/linux/libs/security/libcap/cap.(*syscaller).setMode(0x554da0, 0x1, 0x4e9d60, 0x5675c8)
	/home/lorenz/go/src/kernel.org/pub/linux/libs/security/libcap/cap/convenience.go:138 +0x412
kernel.org/pub/linux/libs/security/libcap/cap.Mode.Set(0x1, 0x0, 0x0)
	/home/lorenz/go/src/kernel.org/pub/linux/libs/security/libcap/cap/convenience.go:182 +0x86
main.main()
	/home/lorenz/dev/cap.go:15 +0x92

My Go version: go version go1.15.5 linux/amd64

Commenting out the signal.Notify line in the reproducer will make Set() return.
Comment 1 Andrew G. Morgan 2020-12-07 20:33:15 UTC
Is this resolved with the following commit?

https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=7cfe15ee579ea83a7780c6190576fdcab3e2faac

I've tagged that module release as psx/v0.2.46-rc1.

Thanks!

If not, do you have a test case I can debug against?
Comment 2 Andrew G. Morgan 2020-12-07 20:39:19 UTC
I see the attachment now. Thanks!
Comment 3 Andrew G. Morgan 2020-12-08 06:20:53 UTC
Thanks for that test case. Yes, I have reproduced this issue. I'll figure out what I need to do to fix it.
Comment 4 Andrew G. Morgan 2020-12-08 07:24:27 UTC
Created attachment 294033
Candidate patch for this issue

This does a couple of things. First, it is more aggressive about being the first handler in line for the psx interrupt. Second, it chooses a different interrupt number for the psx signal, one that Go does not block.
Comment 5 Andrew G. Morgan 2020-12-08 07:25:29 UTC
I've attached a patch to this bug. I want to think some more about this, but thought you might want to give it a try in the meantime.
Comment 6 Andrew G. Morgan 2020-12-09 06:45:16 UTC
This is fixed at HEAD. I've tagged the Go package module psx/v0.2.46-rc2 which should appear here:

  https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx?tab=versions

Please give it a try. Reopen this bug if it continues to fail.

Commit (which also includes a test inspired by your example - thanks!):

  https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=2b75e6c316d8f1a8b8549bc352858b0232f40a58
Comment 7 lmb 2020-12-09 10:31:48 UTC
Yes, the fix works. Thanks for the quick turnaround!