Bug 9247 - Potential regression in -git15: can't resume stopped root shell?
Summary: Potential regression in -git15: can't resume stopped root shell?
Alias: None
Product: Process Management
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Serge Hallyn
Depends on:
Blocks: 9243
  Show dependency tree
Reported: 2007-10-28 13:31 UTC by Rafael J. Wysocki
Modified: 2007-11-16 14:09 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.23-git15
Regression: Yes
Bisected commit-id:

Patch to fix sigcont signal bug introduced with file capabilities. (1.75 KB, patch)
2007-10-28 21:31 UTC, Serge Hallyn
Details | Diff
Second proposed fix (1.44 KB, patch)
2007-10-31 09:57 UTC, Serge Hallyn
Details | Diff
Remove the bad matching uid check in previous patch. (1.29 KB, patch)
2007-11-02 06:27 UTC, Serge Hallyn
Details | Diff

Description Rafael J. Wysocki 2007-10-28 13:31:58 UTC
Subject         : Potential regression in -git15: can't resume stopped root shell?
Submitter       : Theodore Tso <tytso@mit.edu>
References      : http://lkml.org/lkml/2007/10/20/114
Comment 1 Rafael J. Wysocki 2007-10-28 13:34:38 UTC
Caused by:

commit b53767719b6cd8789392ea3e7e2eb7b8906898f0
Author: Serge E. Hallyn <serue@us.ibm.com>
Date:   Tue Oct 16 23:31:36 2007 -0700

    Implement file posix capabilities
    Implement file posix capabilities.  This allows programs to be given a
    subset of root's powers regardless of who runs them, without having to use
    setuid and giving the binary all of root's powers.....
Comment 2 Serge Hallyn 2007-10-28 21:31:58 UTC
Created attachment 13306 [details]
Patch to fix sigcont signal bug introduced with file capabilities.
Comment 3 Serge Hallyn 2007-10-28 21:34:16 UTC
Assuming this fixes your problem (it does on my test machine)
I will send it out for review.
Comment 4 Theodore Tso 2007-10-29 06:59:28 UTC
Yep, this does fix the problem which I reported.  I'm a little nervous about whether or not the logic is right for secid != 0, and whether there might be other ramifications when the original (non-capabilities code) where saved setuid is involved (see check_kill_permission() in kernel/signal.c) such that enabling FILE capabilities ends up changing the observed behavior in some corner cases.  But it's definitely a lot better with the SIGCONT exception!
Comment 5 Theodore Tso 2007-10-29 07:09:23 UTC
As a suggestion, a really good idea would be to run either the NIST Posix Compliance Test Suite or the LSB test suite with and without file capabilities enabled, and see if there are any changes; if there are, there may be some corner cases which aren't getting handled correctly.  
Comment 6 Serge Hallyn 2007-10-29 07:38:18 UTC
Agreed, I was considering first doing a check such that if current and
p (the kill target) are different UIDs (according to the same check used
in check_kill_permission()) then the access is always granted.  The only
hole I needed to fill with the extra check is when the uid is the same but
privilege has increased, so adding that exception is probably good.

I'm at a conference today and traveling tomorrow, but I'll try to find
the testsuites tonight and start a run overnight.  Thanks for the
Comment 7 Theodore Tso 2007-10-29 09:52:14 UTC
Well, I'm currently in Japan at a Linux Foundation meeting/conference, so that's I couldn't test your patch until tday (thanks for the fast turnaround, BTW!)

It's not just cap_task_kill() code which worried, it was all of the other additional LSM hooks that were added.  I realize they were good reasons why some of them need to be added, but I am a bit worried about what it might do to POSIX compliance.  IIRC some changes are mandated by the various draft specifications for the File capabilities, but I don't remember any of them causing an out-and-out incompatibility with POSIX.  Which is to say, it defines "appropriate privileges" (the POSIX keyword for "root") in order to achieve privilege separation, but I don't think any operation which had previously succeed w/o root privs were mandated to fail in the draft posix specifications for capability support.  So I think we're OK modulo implementation bugs, at least at a specification/design level.  (BTW, you do have a copy of the draft 17 and/or draft 16 1003.1e and 1003.2c working documents, right?)
Comment 8 Serge Hallyn 2007-10-31 09:57:08 UTC
Created attachment 13361 [details]
Second proposed fix

This version of the patch also ignores cases where uids are not the same between the two tasks.  If the uids are not the same, then the regular kill permissions will do the right thing.
Comment 9 Serge Hallyn 2007-10-31 09:59:03 UTC
There is now no difference between the NIST posix compliance testsuite
results with and without file capabilities.

(I believe all remaining failures are installation problems, unrelated
to this bug.  I will try to straighten those out as I get time)
Comment 10 Serge Hallyn 2007-11-02 06:27:38 UTC
Created attachment 13375 [details]
Remove the bad matching uid check in previous patch.

This applies on top of the previous patch (plus a style-fix patch by Andrew Morton).

With this patch the reported bug appears fixed, as reported on lkml by Ted.  So I will make this attachment, then close the bug.

Thanks for your help guys.
Comment 11 Rafael J. Wysocki 2007-11-02 06:49:47 UTC
When the fix hits the mainline, please provide the commit ID and subject.

Comment 12 Rafael J. Wysocki 2007-11-16 14:09:10 UTC
Fixed by:

commit 91ad997a34d7abca1f04e819e31eb9f3d4e20585
Author: Serge E. Hallyn <serue@us.ibm.com>
Date:   Wed Nov 14 17:00:34 2007 -0800

    file capabilities: allow sigcont within session


Note You need to log in before you can comment on or make changes to this bug.