----- Original Message -----From:"Doug McIlroy" <doug@cs.dartmouth.edu>To:<tuhs@minnie.tuhs.org>Cc:Sent:Sat, 12 Aug 2017 21:58:19 -0400Subject:[TUHS] origin of string.h and ctype.h
Jon Steinhart <jon@fourwinds.com> asked this question (not on tuhs).
It highlights some less well known contributors to research Unix.
> I'm trying to find out who came up with strcmp and the idea of
> returning -1,0,1 for a string comparison. I can see that it's not in
> my V6 manual but is in V7. Don't see anything in Algol, PL/I, BCPL, or B
The -1,0,1 return from comparisons stems from the interface of qsort,
which was written by Lee McMahon. As far as I know, the interface for
the comparison-function parameter originated with him, but conceivably
he borrowed it from some other sort utility. The negative-zero-positive
convention for the return value encouraged (and perhaps was motivated by)
this trivial comparison function for integers
int compar(a,b) { return(a-b); }
This screws up on overflow, so cautious folks would write it with
comparisons. And -1,0,1 were the easiest conventional values to return:
int compar(a,b) {
if(a<b) return(-1);
if(a>b) return(1);
return(0);
}
qsort was in v2. In v3 a string-comparison routine called "compar"
appeared, with a man page titled "string comparison for sort". So the
convention was established early on.
Compar provided the model for strcmp, one of a package of basic string
operations that came much later, in v7, under the banner of string.h
and ctype.h.
These packages were introduced at the urging of Nils-Peter Nelson, a
good friend of the Unix lab, who was in the Bell Labs comp center.
Here's the story in his own words.
I wrote a memo to dmr with some suggestions for additions to C. I asked
for the str... because the mainframes had single instructions to implement
them. I know for sure I had a blindingly fast implementation of isupper,
ispunct, etc. I had a table of length 128 integers for the ascii character
set; I assigned bits for upper, lower, numeric, punct, control, etc. So
ispunct(c) became
#define PUNCT 0400
return(qtable[c]&PUNCT)
instead of
if(c==':' || c ==';' || ...
[or
switch(c) {
default:
return 0;
case ':':
case ';':
...
return 1;
}
MDM]
dmr argued people could easily write their own but when I showed
him my qtable was 20 times faster he gave in. I also asked for type
logical which dmr implemented as unsigned, which was especially useful
when bitfields were implemented (a 2 bit int would have values -2, -1,
0, 1 instead of 0, 1, 2, 3). I requested a way to interject assembler,
which became asm() (yes, a bad idea).