Bug 10573 - getgrgid(3) with gid=nobody fails (not matching manpage / passing back correct errno)
Status: REJECTED INVALID
Alias: None
Product: Other
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Michael Kerrisk
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-28 12:38 UTC by Garrett Cooper
Modified: 2008-05-29 01:48 UTC (History)

See Also:
Kernel Version: 2.6.9-42.7.ELsmp x86_64, 2.6.23.17 i686, 2.6.24.3 x86_64
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Test C-file for getgrguid(3). (860 bytes, text/plain)
2008-04-28 12:38 UTC, Garrett Cooper
Wrapper script for C-test file. (491 bytes, application/x-sh)
2008-04-28 12:39 UTC, Garrett Cooper

Description Garrett Cooper 2008-04-28 12:38:04 UTC
Latest working kernel version: ?
Earliest failing kernel version: 2.6.9-42.7 x86, 2.6.9-22.0.2.ELsmp x86_64
Distribution: RHEL-5 / Gentoo

I'm running into a weird set of issues when dealing with getgrgid(3) on Linux.

It appears that there was a bug with the values returned by getgrgid(3) between kernel version 2.6.9 and 2.6.23.

The first issue is the fact that particular kernel versions (2.6.9 in this example) were looking up rguid's incorrectly.

The second issue is the fact that contemporary kernel versions do not set an appropriate errno value when an error occurs. Not being able to find a user entry should not return SUCCESS(errno=0).

The below transcript provides you with an idea of what occurred, but if I omitted anything please let me know.

This set of testing was prompted by incorrect exit codes returned by id(1).

-----------

[root@nova-infra-test1 ~]# id nobody; echo $?
uid=99(nobody) gid=99 groups=25(eng),15045(enged),99
1

After looking at id.c for coreutils-5.21 I tracked the issue down to getgrgid(3). The errno makes absolutely no sense though, and it seems to have been "fixed" between various kernel versions:

[root@nova-infra-test1 ~]# ./getgr_t_wrapper.sh
42 ($i) not found in /etc/group
99 ($i) not found in /etc/group
Error encountered with getgrgid(99): No such file or directory
[root@nova-infra-test1 ~]# uname -a
Linux nova-infra-test1 2.6.9-22.0.2.ELsmp #1 SMP Thu Jan 5 17:13:01 EST 2006 i686 i686 i386 GNU/Linux

headless-horseman src # ~gcooper/getgr_t_wrapper.sh
42 ($i) not found in /etc/group
Error encountered with getgrgid(42): 0, Success
99 ($i) not found in /etc/group
Error encountered with getgrgid(99): 0, Success
headless-horseman src # uname -a
Linux headless-horseman 2.6.23.17 #3 SMP Mon Mar 24 04:34:56 PDT 2008 i686 Intel(R) Xeon(R) CPU 5140 @ 2.33GHz GenuineIntel GNU/Linux

hh-internal ~ # ./getgr_t_wrapper.sh ; uname -a
42 ($i) not found in /etc/group
Error encountered with getgrgid(42): 0, Success
99 ($i) not found in /etc/group
Error encountered with getgrgid(99): 0, Success
Linux hh-internal 2.6.24.3 #1 SMP Sat Apr 5 18:49:16 GMT 2008 x86_64 Intel(R) Xeon(R) CPU 5140 @ 2.33GHz GenuineIntel GNU/Linux

"nova-infra-test1" is on an NIS domain whereas "headless-horseman" and "hh-internal" are not. The former two kernel sets are patched and the latter is not.
Comment 1 Garrett Cooper 2008-04-28 12:38:46 UTC
Created attachment 15960 [details]
Test C-file for getgrguid(3).
Comment 2 Garrett Cooper 2008-04-28 12:39:55 UTC
Created attachment 15961 [details]
Wrapper script for C-test file.

Test cases with i=42 and i=99 *should* fail (well, they did on my system ;)..) unless /etc/group has those GID entries.
Comment 3 Garrett Cooper 2008-04-28 13:28:16 UTC
The documentation (provided by debian?) available in Gentoo is also out of date and doesn't reflect this change in the code between 2.6.9 and 2.6.24.
Comment 4 Adrian Bunk 2008-04-29 12:14:52 UTC
Michael, can you look at this bug?
Comment 5 Michael Kerrisk 2008-05-20 03:02:21 UTC
Garrett,

I'm having a little trouble understanding your bug report.  The problem is, I don't see a simple statement of what you get, and what you expect.

> It appears that there was a bug with the values returned by 
> getgrgid(3) between kernel version 2.6.9 and 2.6.23.

getgrgid(3) is not a kernel interface.  It's a glibc interface.  What version of glibc are you using?

> The first issue is the fact that particular kernel versions 
> (2.6.9 in this example) were looking up rguid's incorrectly.

Can you please explain this?  Which part of the kernel (i.e., what system call) is looking up rguids incorrectly?  What do you mean by "incorrect"?

> The second issue is the fact that contemporary kernel versions
> do not set an appropriate errno value when an error occurs. Not
> being able to find a user entry should not return SUCCESS(errno=0).

Does the following glibc bug report have relevance here?

http://sources.redhat.com/bugzilla/show_bug.cgi?id=3195

In the comments of your C program you write:

 * It's been proven that this test fails on some kernel versions
 * (2.6.9 with RHEL-5 for instance), in particular when
 * getgrguid(3) != getgeguid(3).

But there is no such API as getgeguid().
Comment 6 Garrett Cooper 2008-05-28 10:48:36 UTC
You're right. getgeguid doesn't exist. I was thinking (r = real, e = executing).

I think it's 2.3.5, but I don't have root access on the Redhat machine so I can't tell via rpm or yum.

The RHEL version is RHEL-5 I think (RHEL-4, Nahant update 4), correct?
Comment 7 Michael Kerrisk 2008-05-28 12:20:03 UTC
(In reply to comment #6)
> You're right. getgeguid doesn't exist. I was thinking (r = real, e =
> executing).
> I think it's 2.3.5, 

What is "it"?  glibc?

You left many of my other questions unanswered, which makes it hard to provide further input...

> but I don't have root access on the Redhat machine so I
> can't tell via rpm or yum.

If you are trying to determine the glibc version, then just execute the libc file, something like:

$( ldd /bin/ls | grep libc.so )

> The RHEL version is RHEL-5 I think (RHEL-4, Nahant update 4), correct?

I don't know what you are referring to here.
Comment 8 Garrett Cooper 2008-05-29 01:32:43 UTC
Regardless of the comments, the bug appears to be invalid from the kernel's end, since the issue appears to be with whatever library / access method is used to look up the user entry... the issue is no doubt from nscd, or some other associated means. I say this because if you spot the actual "glibc" function reference, it's an extern to another library (just do "grep 'getgrgid(' /usr/include/grp.h").

ldd only reveals the libc.so revision (which is .6, but many versions of libc.so in the 2.3.x ~ 2.5.x series are .6 I think).
Comment 9 Michael Kerrisk 2008-05-29 01:48:17 UTC
(In reply to comment #8)

> ldd only reveals the libc.so revision (which is .6, but many versions of
> libc.so in the 2.3.x ~ 2.5.x series are .6 I think).

Hmmm -- I missed a piece in my commands:

   $( ldd /bin/ls | grep libc.so | awk '{ print $3}') 

Note that the initial $ is *not* the shell prompt -- it's part of "$("
