Bug 42042 - strchr(3) and memchr(3) should explain behaviour when character 'c' is '\0'.
Status: RESOLVED CODE_FIX
Alias: None
Product: Documentation
Classification: Unclassified
Component: man-pages
Hardware: All Linux
Importance: P1 normal
Assignee: documentation_man-pages@kernel-bugs.osdl.org
Reported: 2011-08-30 13:17 UTC by James Hunt
Modified: 2012-04-23 21:19 UTC

Regression: No


Attachments
update to strchr.3 and memchr.3 (1.69 KB, patch)
2011-08-31 09:04 UTC, James Hunt
test program showing behaviour of strchr, strrchr, memchr, strchrnul, and strstr. (4.50 KB, text/x-csrc)
2011-08-31 09:05 UTC, James Hunt

Description James Hunt 2011-08-30 13:17:37 UTC
PROBLEM
-------

strchr(3) and memchr(3) do not explain the behaviour if the character to search for is specified as a null character ('\0'). According to my copy of Harbison and Steele, since the terminator is considered part of the string, a call such as:

  strchr("hello", '\0')

... will return the address of the terminating null in the specified string.

RATIONALE
---------

strchr(3) and memchr(3) are inconsistent with index(3) which states:

  "The terminating NULL character is considered to be a part of the strings."

Adding such a note to strchr(3) and memchr(3) is also important, since it is not unreasonable to assume that strchr() will return NULL in this scenario. That assumption leads to code like the following, which is guaranteed to misbehave should get_a_char() return '\0': strchr() then returns a non-NULL pointer to the terminator, so the error message is never printed:


  char string[] = "hello, world";
  int c = get_a_char();

  if (! strchr(string, c))
    fprintf(stderr, "failed to find character in string\n");


SUGGESTED UPDATE
----------------

I'd suggest adding something like the following to strchr(3) and memchr(3):

  If 'c' is '\0' (the terminating null character), strchr() will return the
  address of the terminating null character in 's'.
Comment 1 Andrew Morton 2011-08-30 19:36:38 UTC
Send us a patch ;)  

Documentation/SubmittingPatches has some guidelines.
Comment 2 James Hunt 2011-08-31 09:04:06 UTC
Created attachment 70952
update to strchr.3 and memchr.3
Comment 3 James Hunt 2011-08-31 09:05:48 UTC
Created attachment 70962
test program showing behaviour of strchr, strrchr, memchr, strchrnul, and strstr.
Comment 4 James Hunt 2011-08-31 09:15:31 UTC
Test program run on:

- Ubuntu Natty (11.04) system with libc6 version 2.13-0ubuntu13 (egcs).
- Fedora 15 system with glibc version 2.13.90-9.
Comment 5 James Hunt 2011-09-01 08:04:46 UTC
The BSD folk already have this behaviour documented in their man pages:

http://www.freebsd.org/cgi/man.cgi?query=strchr&apropos=0&sektion=0&manpath=FreeBSD+8.2-RELEASE&arch=default&format=html
Comment 6 Michael Kerrisk 2012-04-23 20:32:31 UTC
James,

Thanks for the detailed report, and sorry I've been slow to respond.

For strchr(3), I applied a variation of what you proposed:

--- a/man3/strchr.3
+++ b/man3/strchr.3
@@ -28,7 +28,7 @@
 .\" 2006-05-19, Justin Pryzby <pryzbyj@justinpryzby.com>
 .\"    Document strchrnul(3).
 .\"
-.TH STRCHR 3  2010-09-20 "GNU" "Linux Programmer's Manual"
+.TH STRCHR 3  2012-04-24 "GNU" "Linux Programmer's Manual"
 .SH NAME
 strchr, strrchr, strchrnul \- locate character in string
 .SH SYNOPSIS
@@ -72,6 +72,11 @@ and
 .BR strrchr ()
 functions return a pointer to
 the matched character or NULL if the character is not found.
+The terminating null byte is considered part of the string,
+so that if
+.I c
+is specified as \(aq\\0\(aq,
+these functions return a pointer to the terminator.
 
 The
 .BR strchrnul ()
Comment 7 Michael Kerrisk 2012-04-23 20:36:56 UTC
James,

Regarding memchr(), I'm not sure a change is warranted. memchr() is defined in terms of 'bytes' and 'memory areas', rather than strings, so I'd have said it would take an obtuse reading to consider that '\0' is interpreted specially (whereas, as you say, one could be left doubtful about what strchr() does when c is '\0'). 

I'd guess that this is also why POSIX and the FreeBSD man page make no statement on this point.

So, I'm inclined not to make a change.

Thanks,

Michael
Comment 8 Michael Kerrisk 2012-04-23 20:46:33 UTC
One further note about memchr(3) though. It does talk about the argument c as a *character*, and rawmemchr() also talks about strings, which clouds the issue somewhat. I'll do some rewrites there to refer rather to "bytes" and "memory areas".
Comment 9 Michael Kerrisk 2012-04-23 21:19:03 UTC
James, all changes are pushed to kernel.org git now.

I'll close this now. Please reopen if you think there's more to be said about memchr().
