PROBLEM
-------

strchr(3) and memchr(3) do not explain the behaviour if the character to search for is specified as a null character ('\0').

According to my copy of Harbison and Steele, since the terminator is considered part of the string, a call such as:

    strchr("hello", '\0')

... will return the address of the terminating null in the specified string.

RATIONALE
---------

strchr(3) and memchr(3) are inconsistent with index(3) which states:

    "The terminating NULL character is considered to be a part of the strings."

Adding such a note to strchr(3) and memchr(3) is also important since it is not unreasonable to assume that strchr() will return NULL in this scenario. This leads to code like the following which is guaranteed to fail should get_a_char() return '\0':

    char string[] = "hello, world";
    int c = get_a_char();

    if (! strchr(string, c))
        fprintf(stderr, "failed to find character in string\n");

SUGGESTED UPDATE
----------------

I'd suggest adding something like the following to strchr(3) and memchr(3):

    If 'c' is '\0' (the terminating null character), strchr() will return the address of the terminating null character in 's'.
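For illustration, here is a minimal sketch of the behaviour described above. It is not the attached test program, and the helper names offset_of() and contains_char() are hypothetical, introduced only for this example. The first helper reports where strchr() lands, and the second shows one way a caller could guard against '\0' if the terminator should not count as a match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: offset of the first match of c in s,
 * or -1 if strchr() returns NULL.
 */
static long offset_of(const char *s, int c)
{
    const char *p = strchr(s, c);

    return p ? (long)(p - s) : -1;
}

/*
 * Hypothetical helper: true only if c is a non-null character that
 * occurs in s. The explicit test for '\0' is needed because strchr()
 * treats the terminator as part of the string and so never returns
 * NULL when c is '\0'.
 */
static bool contains_char(const char *s, int c)
{
    return c != '\0' && strchr(s, c) != NULL;
}
```

With these helpers, offset_of("hello", '\0') yields 5 (the index of the terminator) rather than -1, while contains_char("hello, world", '\0') is false.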
Send us a patch ;) Documentation/SubmittingPatches has some guidelines.
Created attachment 70952: update to strchr.3 and memchr.3
Created attachment 70962: test program showing behaviour of strchr, strrchr, memchr, strchrnul, and strstr.
Test program run on:

- Ubuntu Natty (11.04) system with libc6 version 2.13-0ubuntu13 (egcs).
- Fedora 15 system with glibc version 2.13.90-9.
The BSD folk already have this behaviour documented in their man pages: http://www.freebsd.org/cgi/man.cgi?query=strchr&apropos=0&sektion=0&manpath=FreeBSD+8.2-RELEASE&arch=default&format=html
James,

Thanks for the detailed report, and sorry I've been slow to respond. For strchr(3), I applied a variation of what you proposed:

--- a/man3/strchr.3
+++ b/man3/strchr.3
@@ -28,7 +28,7 @@
 .\" 2006-05-19, Justin Pryzby <pryzbyj@justinpryzby.com>
 .\" Document strchrnul(3).
 .\"
-.TH STRCHR 3 2010-09-20 "GNU" "Linux Programmer's Manual"
+.TH STRCHR 3 2012-04-24 "GNU" "Linux Programmer's Manual"
 .SH NAME
 strchr, strrchr, strchrnul \- locate character in string
 .SH SYNOPSIS
@@ -72,6 +72,11 @@ and
 .BR strrchr ()
 functions return a pointer to the matched character or NULL if the character is not found.
+The terminating null byte is considered part of the string,
+so that if
+.I c
+is specified as \(aq\\0\(aq,
+these functions return a pointer to the terminator.
 The
 .BR strchrnul ()
James,

Regarding memchr(), I'm not sure a change is warranted. memchr() is defined in terms of 'bytes' and 'memory areas', rather than strings, so I'd have said it would take an obtuse reading to consider that '\0' is interpreted specially (whereas, as you say, one could be left doubtful about what strchr() does when c is '\0'). I'd guess that this is also why POSIX and the FreeBSD man page make no statement on this point. So, I'm inclined not to make a change.

Thanks,

Michael
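To make the distinction concrete, here is a small sketch assuming nothing beyond the standard memchr() contract. The helper name byte_offset() is hypothetical. It shows that memchr() treats '\0' like any other byte value, and that the result depends only on whether such a byte lies within the first n bytes of the area.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: offset of the first byte equal to c within the
 * first n bytes of buf, or -1 if memchr() returns NULL.
 */
static long byte_offset(const void *buf, int c, size_t n)
{
    const unsigned char *base = buf;
    const unsigned char *p = memchr(buf, c, n);

    return p ? (long)(p - base) : -1;
}
```

For the six-byte array "hello" (five letters plus the terminator), byte_offset("hello", '\0', 5) is -1 because the terminator lies outside the searched area, whereas byte_offset("hello", '\0', 6) is 5. An embedded null byte is found in the same way as any other byte.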
One further note about memchr(3) though. It does talk about the argument c as a *character*, and rawmemchr() also talks about strings, which clouds the issue somewhat. I'll do some rewrites there to refer rather to "bytes" and "memory areas".
James, all changes are now pushed to kernel.org git, so I'll close this. Please reopen if you think there's more to be said about memchr().