Bug 19812

Summary: two issues in unix(7)
Product: Documentation
Reporter: Lennart Poettering (mzxreary)
Component: man-pages
Assignee: Michael Kerrisk (mtk.manpages)
Status: RESOLVED DOCUMENTED
Severity: normal
CC: halfline, mtk.manpages
Priority: P1
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Lennart Poettering 2010-10-06 21:11:02 UTC
There are two smaller issues in the unix(7) man page:

1) The part about "pathname" sockets suggests usage of sizeof(sa_family_t) + strlen(sun_path) + 1 for calculating the sockaddr size. Due to alignment/padding this is probably not a good idea. Instead, one should use offsetof(struct sockaddr_un, sun_path) + strlen() + 1 or something like that.

2) The part about "abstract" sockets is misleading, as it suggests that the sockaddr returned by getsockname() necessarily has the size sizeof(struct sockaddr). That is not the case: getsockname() returns exactly the sockaddr size that was passed in on bind(). In particular, two sockets that are bound to the same sockaddr contents but with different sizes are completely independent.
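
The offsetof-based length from point 1 could be computed along these lines. This is only a minimal sketch: the helper name make_pathname_addr is hypothetical (it is not from the man page), and it assumes a NUL-terminated path that fits in sun_path.

```c
#include <assert.h>
#include <stddef.h>      /* offsetof */
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *addr for a pathname socket and return the
 * length to pass to bind(2) or connect(2), or 0 if the path is empty or
 * does not fit into sun_path together with its terminating null byte. */
static socklen_t make_pathname_addr(struct sockaddr_un *addr, const char *path)
{
    size_t n;

    assert(addr != NULL && path != NULL);

    n = strlen(path);
    if (n == 0 || n >= sizeof(addr->sun_path))
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, n + 1);   /* copy including the '\0' */

    /* Offset of sun_path, plus the path, plus its terminating null byte. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + n + 1);
}
```

A caller would pass the returned length as the addrlen argument of bind(2), treating a return value of 0 as "path not usable".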
Comment 1 Michael Kerrisk 2010-10-10 04:51:31 UTC
Hi Lennart,

A hint for the future. The two issues here are closely related, but distinct enough that it would have been easier if they were filed as two bug reports. But, I will deal with them both here.

Thanks,

Michael
Comment 2 Michael Kerrisk 2010-10-10 05:10:37 UTC
(In reply to comment #0)

Regarding your point 1:
> There are two smaller issues in the unix(7) man page:
> 
> 1) The part about "pathname" sockets suggests usage of sizeof(sa_family_t) +
> strlen(sun_path) + 1 for calculating the sockaddr size. Due to
> alignment/padding this is probably not a good idea. Instead, one should use
> offsetof(struct sockaddr_un, sun_path) + strlen() + 1 or something like that.

I agree that when it comes to portability, this is probably a good idea, since some UNIX implementations include additional fields in the structure (e.g., BSD derivatives have sun_len). However, on Linux, I'm not sure that alignment/padding can come into it. Indeed, my reading of the kernel code is that things would break if 

(sizeof(sa_family_t) + strlen(sun_path) + 1) != (offsetof(struct sockaddr_un, sun_path) + strlen(sun_path) + 1)

And I'd imagine that some Linux userspace would break as well.

Nevertheless, I have made the change you suggest.
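
Whether the two formulas agree on a given system can be checked directly. The following is a small sketch; the helper name sun_path_follows_family is made up for illustration. It is expected to return 1 on Linux, while on systems whose structure carries extra fields such as sun_len the result may differ.

```c
#include <assert.h>      /* assert(), for callers as in the usage note */
#include <stddef.h>      /* offsetof */
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: returns 1 if sun_path starts immediately after a
 * leading sa_family_t (no padding and no extra fields before it), so that
 * sizeof(sa_family_t) and offsetof(struct sockaddr_un, sun_path) agree;
 * returns 0 otherwise. */
static int sun_path_follows_family(void)
{
    return offsetof(struct sockaddr_un, sun_path) == sizeof(sa_family_t);
}
```

Code that wants to rely on the Linux-specific form could guard itself with assert(sun_path_follows_family()); portable code can simply use offsetof everywhere.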
Comment 3 Michael Kerrisk 2010-10-10 05:29:56 UTC
(In reply to comment #0)

> 2) The part about "abstract" sockets is misleading as it suggests that the
> sockaddr returned by getsockname() would necessarily have the size of
> sizeof(struct sockaddr), which however is not the case: getsockname() returns
> exactly the sockaddr size that was passed in on bind(). In particular, two
> sockets that are bound to the same sockaddr but different sizes are
> completely independent.

Thanks for reporting this. I wrote the original text in this page, and obviously I misread the details. I changed the text to the following:

       *  abstract: an abstract socket address is distinguished by the
          fact that sun_path[0] is a null byte ('\0'). The socket's
          address in this namespace is given by the additional bytes
          in sun_path that are covered by the specified length of the
          address structure. (Null bytes in the name have no special
          significance.) The name has no connection with file system
          pathnames. When the address of an abstract socket is
          returned by getsockname(2), getpeername(2), and accept(2),
          the returned addrlen is greater than sizeof(sa_family_t)
          (i.e., greater than 2), and the name of the socket is
          contained in the first (addrlen - sizeof(sa_family_t)) bytes
          of sun_path. The abstract socket namespace is a nonportable
          Linux extension.

Look okay? (Note that I consciously wrote "(addrlen - sizeof(sa_family_t))", since this is Linux-specific, and I have already noted that, AFAICT, the kernel source relies on the following holding true:
sizeof(sa_family_t) == offsetof(struct sockaddr_un, sun_path)
E.g., see the code triggering autobind, which generates an abstract socket name if addrlen == sizeof(short).)

The changes have been pushed to git, and will be in man-pages-3.29.

PS You've also inadvertently found the first erratum for my book ;-)
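
The behavior described in the revised text can be observed with a short Linux-only sketch like the one below. The helper name abstract_roundtrip_len is hypothetical; it binds a socket to an abstract name and reports the address length that getsockname(2) hands back, which should equal the length passed to bind(2).

```c
#include <assert.h>
#include <stddef.h>      /* offsetof */
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): bind a fresh socket to the abstract
 * name given by the name_len bytes at name, then return the address
 * length reported by getsockname(2), or -1 on any error. */
static long abstract_roundtrip_len(const char *name, size_t name_len)
{
    struct sockaddr_un addr, out;
    socklen_t len, out_len = sizeof(out);
    long result = -1;
    int fd;

    assert(name != NULL);
    if (name_len > sizeof(addr.sun_path) - 1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* sun_path[0] stays '\0' (abstract namespace); the name follows it. */
    memcpy(addr.sun_path + 1, name, name_len);
    len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&addr, len) == 0 &&
        getsockname(fd, (struct sockaddr *)&out, &out_len) == 0)
        result = (long)out_len;

    close(fd);   /* closing the socket also releases the abstract name */
    return result;
}
```

For a 16-byte name, the returned value would be expected to be offsetof(struct sockaddr_un, sun_path) + 17 rather than sizeof(struct sockaddr_un).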
Comment 4 Michael Kerrisk 2010-10-10 05:50:25 UTC
Thanks for this report, Lennart.

I'll close now, since I think I've covered everything. If you notice anything wrong, please either reopen, or just send me a mail directly.
Comment 5 Lennart Poettering 2010-10-11 14:11:15 UTC
Thanks for fixing this so quickly!

It really puzzles me that alignment/padding doesn't become a problem here. After all, sa_family_t is only 16 bits wide, so it should become a problem even on 32-bit. I guess the only explanation is that char arrays only need to be aligned to 8-bit (single-byte) boundaries, which I guess is understandable though still surprising.

Abstract namespace sockets are originally from Solaris (I couldn't find that in any official docs via Google, but I did find it mentioned in http://www.gossamer-threads.com/lists/perl/porters/241993 ). Hence I think it would make sense to use offsetof in this context too, since you apparently can write software that is portable beyond Linux when using abstract namespace sockets. Oh, and I guess the sentence "The abstract socket namespace is a nonportable Linux extension." could use some updating in this context as well.

BTW, a lot of the more modern packages nowadays use names like "/org/freedesktop/systemd", "/org/kernel/udev/udevd", "/org/freedesktop/hal/udev_event" for their abstract namespace sockets. It might be an idea to suggest this naming scheme in the man pages, so that people follow a similar scheme when introducing a new abstract namespace socket for their software.

Hmm, and while we are at it, it might make sense to mention that using fixed-name abstract namespace sockets is only really safe for system services that start early and stay around for the entire runtime, because there are no access restrictions on creating these sockets: anybody can bind and listen on any name. That means that if a service "foo" that listens on the "/org/foo/foo" socket is started after a user has already logged in, the user might have created their own socket there and thus effectively DoS'ed the foo daemon.
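
The "first binder wins" effect can be illustrated with a small Linux-only sketch. Both helper names (bind_abstract, second_bind_errno) and the example name used in the usage note are hypothetical; the sketch just binds the same abstract name twice and reports the error seen by the latecomer, which would be expected to be EADDRINUSE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>      /* offsetof */
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): bind a new stream socket to the
 * abstract name `name` (without a trailing null byte). Returns the file
 * descriptor, or -1 with errno set. */
static int bind_abstract(const char *name)
{
    struct sockaddr_un addr;
    size_t n;
    int fd, saved;

    assert(name != NULL);
    n = strlen(name);
    if (n > sizeof(addr.sun_path) - 1) {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, n);   /* sun_path[0] stays '\0' */

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&addr,
             (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n)) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Returns the errno seen by a second bind to a name that is already
 * taken, 0 if the second bind unexpectedly succeeded, or -1 if even the
 * first bind failed. */
static int second_bind_errno(const char *name)
{
    int first, second, err;

    first = bind_abstract(name);
    if (first < 0)
        return -1;

    second = bind_abstract(name);
    err = (second < 0) ? errno : 0;
    if (second >= 0)
        close(second);
    close(first);
    return err;
}
```

Calling second_bind_errno("/org/example/demo") plays both roles in one process; in the scenario described above, the first binder would be an unprivileged user and the second the daemon that starts too late.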

Reopening, so that this isn't forgotten...
Comment 6 Michael Kerrisk 2010-10-12 06:29:15 UTC
(In reply to comment #5)

> Abstract namespace sockets are originally from Solaris (Couldn't find that in
> any official docs with google, but I did find it mentioned in
> http://www.gossamer-threads.com/lists/perl/porters/241993 ) Hence I think it
> would make sense to use offsetof in this context too since you apparently can
> write software that is portable beyond Linux when using abstract namespace
> sockets. Oh, and I guess the sentence "The abstract socket namespace is a
> nonportable Linux extension." could use some updating in this context as
> well.

Hi Lennart,

Do you know this only from the thread you referred to, or firsthand? I tried my abstract socket test code on Solaris. It doesn't work there, but maybe there are platform variations in the coding details. However, a quick grep of some OpenSolaris source I had lying about didn't immediately reveal support for abstract sockets.

Cheers,

Michael
Comment 7 Lennart Poettering 2010-10-14 15:21:18 UTC
This was mentioned a couple of times in discussions, but Google only revealed this one mention. Could it be that it is just a myth that has been perpetuated by people who only know it because it was repeated by other people?

I unfortunately don't know anybody from the Solaris camp who could answer this question definitively. But if you grepped through the sources then I'd believe you more than some stuff on the internet...

Anyway, feel free to close this.