Bug 57181 - man pages for C structs
Summary: man pages for C structs
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: Documentation
Classification: Unclassified
Component: man-pages
Hardware: All Linux
Importance: P1 enhancement
Assignee: documentation_man-pages@kernel-bugs.osdl.org
Reported: 2013-04-27 11:19 UTC by Serge van den Boom
Modified: 2015-05-05 09:16 UTC

Regression: No

Description Serge van den Boom 2013-04-27 11:19:48 UTC
It would be useful to be able to search for the name of a C struct to get its definition with an explanation of its fields.

If having separate pages for each struct is undesirable, these could simply be troff includes ('.so') of a page which actually describes the struct.
For example, "man timeval" (or maybe "man 'struct timeval'") could bring up the page for 'gettimeofday'.
Comment 1 Michael Kerrisk 2014-03-15 08:22:37 UTC
Serge,

I'm undecided about this idea. I'd like to see it fleshed out a little. For example, list a few more structs and (existing) pages that could be linked with .so.

Thanks,

Michael
Comment 2 Serge van den Boom 2014-03-15 11:35:18 UTC
Ok, how is this for a start:

timeval -> ctime(3)
timespec -> nanosleep(2)
sockaddr -> bind(2)
sockaddr_in -> ip(7)
in_addr -> inet(3)
in_addr -> ip(7)  
in_addr_t -> inet(3)
sockaddr_in6 -> ipv6(7)
in6_addr -> ipv6(7)
addrinfo -> getaddrinfo(3)
iovec -> recv(2)
msghdr -> send(2) / recv(2)
cmsghdr -> cmsg(3)
dirent -> readdir(3)
pollfd -> poll(2)
epoll_event -> epoll_ctl(2)
rlimit -> getrlimit(2)

To further illustrate why this would be useful, consider the following situation: you are coding in C and want to use sendmsg(). Before you write that line, you would define a struct msghdr, and fill its fields. But you don't recall what exactly the fields of struct msghdr should contain.
Right now, you'd go through the following steps:
1. Where would I find the definition of msghdr? I want to use it in 'sendmsg', so that's probably where it is.
2. 'man sendmsg'
3. search for msghdr
With separate pages per struct, you would be able to skip steps 1 and 2. In fact, if you use Vim, you would just enter 'K' when the cursor is on 'msghdr'.
With redirects, you'd still have step 3, but you wouldn't have the (distracting) step of deciding where you'd expect to find it.

(Also, 'struct msghdr' includes a field of type 'struct iovec', which is not defined in send(2). This is probably a separate issue though.)
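A minimal sketch of the sendmsg() situation described above, assuming a connected socket and a single buffer. The helper name send_buffer is hypothetical and only serves as an illustration; it shows the fields of struct msghdr and struct iovec that have to be looked up before the call can be written.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper (not from any man page): send one buffer over a
 * connected socket with sendmsg(). Returns the number of bytes sent,
 * or -1 on error (errno is set by sendmsg()). */
ssize_t send_buffer(int fd, const void *buf, size_t len)
{
    struct iovec iov;
    struct msghdr msg;

    assert(buf != NULL);

    /* struct iovec (described in readv(2)): one contiguous buffer. */
    iov.iov_base = (void *)buf;  /* sendmsg() only reads from the buffer */
    iov.iov_len = len;

    /* Zero the whole structure first so that unused fields are NULL/0. */
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = NULL;         /* no destination address: socket is connected */
    msg.msg_namelen = 0;
    msg.msg_iov = &iov;          /* array of buffers to send */
    msg.msg_iovlen = 1;          /* number of elements in msg_iov */
    msg.msg_control = NULL;      /* no ancillary data, see cmsg(3) */
    msg.msg_controllen = 0;

    return sendmsg(fd, &msg, 0);
}
```

Every field name used here has to come from somewhere, which is the lookup that a page (or redirect) named after the structure would shorten.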

For structures like sockaddr_in, step 1 is even harder, as there is no clue to where to find a page on its definition. You would have to know somehow that it can be found in ip(7).
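As a sketch of the sockaddr_in case, assuming IPv4 and a dotted-quad address string: the helper name fill_sockaddr_in is hypothetical, and the field names are the ones currently documented in ip(7).

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *addr for the given dotted-quad IPv4 address
 * and port (host byte order). Returns 0 on success, -1 if the address
 * string cannot be parsed. */
int fill_sockaddr_in(struct sockaddr_in *addr, const char *ip,
                     unsigned short port)
{
    assert(addr != NULL && ip != NULL);

    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;      /* always AF_INET for this structure */
    addr->sin_port = htons(port);    /* port in network byte order */

    /* sin_addr is a struct in_addr; inet_pton() stores the address
     * in network byte order. */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

Nothing in the name of the structure, or in the names of the functions it is later passed to (bind(), connect()), points at ip(7) as the place where these fields are explained.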
Comment 3 Michael Kerrisk 2014-03-15 13:28:12 UTC
(In reply to Serge van den Boom from comment #2)
> Ok, how is this for a start:
> 
> timeval -> ctime(3)
> timespec -> nanosleep(2)

So here's an example of a problem. 'struct timespec' is in 15 pages. Why point to this one in particular? I can see it's not a bad page to point to. But if you are using one of the other APIs, then landing on nanosleep(2) would be confusing.


> sockaddr -> bind(2)
> sockaddr_in -> ip(7)
> in_addr -> inet(3)
> in_addr -> ip(7)  
> in_addr_t -> inet(3)
> sockaddr_in6 -> ipv6(7)
> in6_addr -> ipv6(7)
> addrinfo -> getaddrinfo(3)
> iovec -> recv(2)
> msghdr -> send(2) / recv(2)
> cmsghdr -> cmsg(3)
> dirent -> readdir(3)
> pollfd -> poll(2)
> epoll_event -> epoll_ctl(2)
> rlimit -> getrlimit(2)
> 
> To further illustrate why this would be useful, consider the following
> situation: you are coding in C and want to use sendmsg(). Before you write
> that line, you would define a struct msghdr, and fill its fields. But you
> don't recall what exactly the fields of struct msghdr should contain.
> Right now, you'd go through the following steps:
> 1. Where would I find the definition of msghdr? I want to use it in
> 'sendmsg', so that's probably where it is.
> 2. 'man sendmsg'
> 3. search for msghdr
> With separate pages per struct, you would be able to skip steps 1 and 2. In
> fact, if you use Vim, you would just enter 'K' when the cursor is on
> 'msghdr'.
> With redirects, you'd still have step 3, but you wouldn't have the
> (distracting) step of deciding where you'd expect to find it.
> 
> (Also, 'struct msghdr' includes a field of type 'struct iovec', which is
> not defined in send(2). This is probably a separate issue though.)
> 
> For structures like sockaddr_in, step 1 is even harder, as there is no clue
> to where to find a page on its definition. You would have to know somehow
> that it can be found in ip(7).

I don't really buy this reasoning. If I'm using sendmsg(), then I'm probably on that page already before step 1.

So, I don't think this approach (links as described above) is really helpful.

Now, whether there should be separate pages for a few structures is something to think about more, but I'd need to see a good reason in each case. (sigevent(7) had a good reason, for example.)
Comment 4 Serge van den Boom 2014-03-15 16:25:34 UTC
> So here's an example of a problem. 'struct timespec' is in 15 pages. Why point
> to this one in particular? I can see it's not a bad page to point to. But if
> you are using one of the other APIs, then landing on nanosleep(2) would be
> confusing.

I am not singling out 'struct timespec'. In my opinion, we would benefit from being able to use *any* structure name as a keyword to man. My list was intended to show the wide variety of non-trivial structures which one might want to look up.

You are right that if you are using one of the other functions, landing on the page of a function other than the one you are actually using might sometimes be confusing. In fact, I would prefer to have a separate page for each structure. However, even if you sometimes end up on the page of another function which uses the same structure, this is better than having no match at all. At the very least, it gives you a starting point for subsequent searches. And often, you *will* end up at the right page. Even if it is the wrong page, the structure description will still be usable.

For me, being able to press 'K' in Vim on some structure name to find out what fields it has, is reason enough to want the structure name to lead somewhere.

> > [...]
> 
> I don't really buy this reasoning. If I'm using sendmsg(), then I'm probably
> on that page already before step 1.

Ok, this may not be the most realistic example, but the idea holds for any of the structures. Maybe sockaddr_in works better for this example.

Also, having structures as man keywords doesn't just help with writing code, but also with reading someone else's code. If you read code in the order in which it occurs, you'll see the structure being filled before it is passed to some syscall or C library function. In fact, the actual call may be in another (user) function altogether. If you want to know more about what exactly the values being assigned to the various fields mean, you just want to look up the structure name, instead of first having to figure out in which function it might be used.

And I sometimes use these standard structures in my own code (why reinvent the wheel?). In one real-life example, I needed to have some structure to store a time period in, and I know that various standard types exist (struct timeval, struct timespec, time_t), but I couldn't recall the details of each type. So to find out which would be the most suitable, I would have had to think of various standard functions which would use these types. (Though I might have ended up grepping in /usr/include instead; I don't recall exactly. Maybe both even.)
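To illustrate why the details matter when choosing between these types, here is a small sketch. The function name timespec_to_timeval is hypothetical. It relies only on the standard field names: struct timespec has tv_sec and tv_nsec (nanoseconds), struct timeval has tv_sec and tv_usec (microseconds), and time_t alone holds whole seconds.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: convert a struct timespec (seconds + nanoseconds)
 * to a struct timeval (seconds + microseconds). Sub-microsecond precision
 * is truncated. */
struct timeval timespec_to_timeval(struct timespec ts)
{
    struct timeval tv;

    /* A normalized timespec keeps tv_nsec in [0, 999999999]. */
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);

    tv.tv_sec = ts.tv_sec;           /* both use time_t for whole seconds */
    tv.tv_usec = ts.tv_nsec / 1000;  /* nanoseconds -> microseconds */
    return tv;
}
```

Which of the three is the best fit for storing a time period depends on the resolution needed, and that is exactly the information one wants to find by looking up the type name.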

But even though I'm now trying to think up all sorts of past and hypothetical situations where it would be useful to be able to look up a structure by its name, the reason why I originally took the time to write the bug report (and the follow-ups now, a year later) is that I have in fact (on multiple occasions) felt the need to perform such lookups. And if I have, so will others.

> Now, whether there should be separate pages for a few structures is something
> to think about more, but I'd need to see a good reason in each case.
> (sigevent(7) had a good reason, for example.)

I personally would like to be able to search for *every* standard structure, and for typedefs too actually. Even if the resulting pages just contain a reference to the pages of functions which actually use the type, this would be of value. But ideally, the specifics of the fields would be explained there as well.

I understand that this might be too much work to actually implement, but I do think that the value is real. Maybe I'll even submit a patch myself one day, if this is the situation.

Regards,

Serge
Comment 5 Michael Kerrisk 2014-03-17 07:39:56 UTC
So, here's an idea...

A man page, called (say) structures.7, with suitable .so links to it, that lists the structures, optionally explains them, and has links to the APIs that use the structures. Doesn't need to be done for all structures to start with, but could be implemented incrementally over time. Do you want to start writing such a page, Serge? As a proof-of-concept, I'd suggest starting with just one or two structures on the page.
Comment 6 Mike Frysinger 2014-03-19 07:20:28 UTC
is there a reason `man -K 'struct rlimit'` isn't acceptable ?  seems a lot easier than trying to hand maintain a page of links.  unless you wrote a script that grepped all the current man pages and generated the main content on the fly.
Comment 7 Michael Kerrisk 2015-05-05 09:16:05 UTC
No suitable solution proposed here, so I'm closing this.
