Bug 213013 - getopt.3: further clarify behaviour
Status: RESOLVED DOCUMENTED
Alias: None
Product: Documentation
Classification: Unclassified
Component: man-pages
Hardware: All Linux
Importance: P1 normal
Assignee: documentation_man-pages@kernel-bugs.osdl.org
Reported: 2021-05-10 09:13 UTC by James Hunt
Modified: 2021-07-29 15:32 UTC

Regression: No

Attachments
getopt.3: further clarify behaviour (1.61 KB, application/mbox)
2021-05-10 09:13 UTC, James Hunt
getopt.3 test program (1.43 KB, text/x-csrc)
2021-05-10 09:14 UTC, James Hunt
Updated patch (2021-05-30) (1.64 KB, patch)
2021-05-30 07:28 UTC, James Hunt
Updated patch (2021-07-14) (1.64 KB, patch)
2021-07-14 18:37 UTC, James Hunt

Description James Hunt 2021-05-10 09:13:25 UTC
Created attachment 296703
getopt.3: further clarify behaviour

Further to https://bugzilla.kernel.org/show_bug.cgi?id=212887, the `getopt.3` man page still requires some clarification.

There is in fact one extra character that cannot be used in `optstring`: a semi-colon. But also, using `+` needs some clarification. I've attached a patch for these points.

You can confirm the behaviour wrt `;` in two ways:

1) By checking the code

This line shows that colon and semi-colon are disallowed:

https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/getopt.c;hb=HEAD#l617

```
617     if (temp == NULL || c == ':' || c == ';')
618       {
619         if (print_errors)
620           fprintf (stderr, _("%s: invalid option -- '%c'\n"), argv[0], c);
621         d->optopt = c;
622         return '?';
623       }
```


2) By testing the behaviour

Running the attached C program and feeding it every printable ASCII character shows that the semi-colon is not allowed. Here, I use the utfout utility (https://github.com/jamesodhunt/utfout) to generate the ASCII characters:

```bash
$ (utfout -a "\n" "\{\x21..\x7e}"; echo) | while read char; do ./test-getopt "x${char}" -"$char" | grep option; done | wc -l
./test-getopt: invalid option -- ':'
./test-getopt: invalid option -- ';'
92
```

The use of `+` as an option character can also be shown using the test program. The behaviour when it is in the first position in `optstring` is already described, so this just clarifies the behaviour further:

- Error if `+` is the first byte in optstring and user attempts to specify `-+`:

```bash
$ ./test-getopt "+" -+
./test-getopt: invalid option -- '+'
option: '?' (optarg: '', optind: 2, opterr: 1, optopt: 43)
argv[0]: './test-getopt'
argv[1]: '-+'
```

- Success if `+` is not the first byte and user specifies `-+`:

```bash
$ ./test-getopt "x+" -+
option: '+' (optarg: '', optind: 2, opterr: 1, optopt: 0)
argv[0]: './test-getopt'
argv[1]: '-+'
```

Testing the posixly correct behaviour:

This shows the use of a valid `-+` option (note that argv is permuted such that `first` and `second` appear at the end of `argv` in the program output):

```bash
$ ./test-getopt "x+" first -+ second
option: '+' (optarg: '', optind: 3, opterr: 1, optopt: 0)
argv[0]: './test-getopt'
argv[1]: '-+'
argv[2]: 'first'
argv[3]: 'second'
```

Specifying a plus symbol as a posixly correct option (note that the argv arguments are correctly not permuted now):

```bash
$ ./test-getopt "++" first -+ second
argv[0]: './test-getopt'
argv[1]: 'first'
argv[2]: '-+'
argv[3]: 'second'
```
Comment 1 James Hunt 2021-05-10 09:14:11 UTC
Created attachment 296705
getopt.3 test program

Build as:

```bash
gcc -o test-getopt test-getopt.c
```
Comment 2 Alejandro Colomar 2021-05-12 11:47:36 UTC
Could you please send the patches to the mailing list?  There they will get much more review (I think I'm the only one reading this bugzilla).

You can CC <bugzilla-daemon@bugzilla.kernel.org> with '[BUG 213013]' as a subject prefix so that a copy ends up here.


+Note that the plus symbol may also be used as an option character if it
+does not appear as the first character in
+.IR optstring .

I'd change the wording there to be more correct (it doesn't matter if it _also_ appears as the first character), and also to use syntax similar to the rest of the page (using the characters in single quotes instead of their names):

Note that \(aq+\(aq may be used as an option character
if it is in a position other than the first character in
.IR optstring .

+If
+.B POSIXLY_CORRECT
+behaviour is required in this case
+.I optstring
+will contain two plus symbols.

I don't think this is necessary.  Not sure.
Comment 3 James Hunt 2021-05-30 07:28:52 UTC
Created attachment 297043
Updated patch (2021-05-30)
Comment 4 James Hunt 2021-05-30 07:37:11 UTC
Hi Alex,

Thanks again for reviewing. I've updated the patch but I still think it needs to be "spelt out" that two plus symbols may be required since that avoids any confusion.

Also, I'm afraid I'm using a webmailer so submitting patches to a mailing list is going to be painful for me, whereas this bugzilla is quick and easy (and doesn't require me to register, format mails in particular ways, plus of course to manage a ton more emails ;) But if the real action is happening on the ML, I wonder if in the future it might be possible to forward all bugzilla activity automatically to the ML rather than _vice versa_?

Thanks again for your help.
Comment 5 Alejandro Colomar 2021-06-28 19:30:20 UTC
Hi James,

Sorry for the delay!

(In reply to James Hunt from comment #4)
> Thanks again for reviewing. I've updated the patch but I still think it
> needs to be "spelt out" that two plus symbols may be required since that
> avoids any confusion.

Okay.

> Also, I'm afraid I'm using a webmailer so submitting patches to a mailing
> list is going to be painful for me, whereas this bugzilla is quick and easy
> (and doesn't require me to register,

BTW, you don't need to subscribe to <linux-man@vger.kernel.org> to post there :-)
But, I understand the rest.

> format mails in particular ways, plus
> of course to manage a ton more emails ;) But if the real action is happening
> on the ML, I wonder if in the future it might be possible to forward all
> bugzilla activity automatically to the ML rather than _vice versa_?

Yes.  After you said this, I remembered that I asked about that some time ago.  The ML was subscribed again to this bugzilla activity :)

Now the review:

Could you please use '.IR ...' instead of '\fI...\fP...'?

See this extract from man-pages(7):

[[
       The preferred way to write this in the source file is:

           .BR fcntl ()

       (Using this format, rather than the use of  "\fB...\fP()"
       makes it easier to write tools that parse man page source
       files.)
]]


> Thanks again for your help.

Thank you!

Alex
Comment 6 James Hunt 2021-07-14 18:37:16 UTC
Created attachment 297861
Updated patch (2021-07-14)

Thanks Alex,

Updated patch attached.

Kind regards,

James
Comment 7 Alejandro Colomar 2021-07-29 15:32:10 UTC
Hi James,

Patch applied.  Thanks,

Alex
