Don't have much to add except to note that early FORTRANs had a
version of IF that took three statement numbers and did a (gasp) GOTO
to the first if the expression in the IF was negative, to the second
if it was 0, and to the third if it was positive. And some
mainframes had an instruction that did exactly that as well...
Steve
----- Original Message -----
From: "Doug McIlroy" <doug(a)cs.dartmouth.edu>
To: <tuhs@minnie.tuhs.org>
Cc:
Sent: Sat, 12 Aug 2017 21:58:19 -0400
Subject: [TUHS] origin of string.h and ctype.h
Jon Steinhart <jon(a)fourwinds.com> asked this question (not on tuhs).
It highlights some less well known contributors to research Unix.
I'm trying to find out who came up with strcmp and the idea of
returning -1,0,1 for a string comparison. I can see that it's not in
my V6 manual but is in V7. Don't see anything in Algol, PL/I, BCPL,
or B
The -1,0,1 return from comparisons stems from the interface of qsort,
which was written by Lee McMahon. As far as I know, the interface for
the comparison-function parameter originated with him, but conceivably
he borrowed it from some other sort utility. The negative-zero-positive
convention for the return value encouraged (and perhaps was motivated
by) this trivial comparison function for integers:
int compar(a,b) { return(a-b); }
This screws up on overflow, so cautious folks would write it with
comparisons. And -1,0,1 were the easiest conventional values to return:
int compar(a,b) {
    if(a<b) return(-1);
    if(a>b) return(1);
    return(0);
}
qsort was in v2. In v3 a string-comparison routine called "compar"
appeared, with a man page titled "string comparison for sort". So the
convention was established early on.
Compar provided the model for strcmp, one of a package of basic string
operations that came much later, in v7, under the banner of string.h
and ctype.h.
These packages were introduced at the urging of Nils-Peter Nelson, a
good friend of the Unix lab, who was in the Bell Labs comp center.
Here's the story in his own words.
I wrote a memo to dmr with some suggestions for additions to C. I asked
for the str... because the mainframes had single instructions to implement
them. I know for sure I had a blindingly fast implementation of isupper,
ispunct, etc. I had a table of length 128 integers for the ascii character
set; I assigned bits for upper, lower, numeric, punct, control, etc. So
ispunct(c) became
#define PUNCT 0400
return(qtable[c]&PUNCT)
instead of
if(c==':' || c ==';' || ...
[or
switch(c) {
default:
    return 0;
case ':':
case ';':
...
    return 1;
}
MDM]
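A minimal sketch of the table-lookup idea in C, for illustration only. The 128-entry table, the name qtable, and the PUNCT value 0400 come from the description above. The other bit values, the init_qtable helper, the is_punct and is_upper names, and the exact set of characters treated as punctuation are assumptions, not Nelson's actual code.

```c
#include <string.h>

/* PUNCT is from the text; the remaining bit assignments are assumed. */
#define UPPER 0001
#define LOWER 0002
#define DIGIT 0004
#define PUNCT 0400

static int qtable[128];   /* one entry per ASCII code */

/* Hypothetical helper: fill the table once at startup (assumes ASCII). */
static void init_qtable(void)
{
    const char *punct = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";
    int c;

    memset(qtable, 0, sizeof qtable);
    for (c = 'A'; c <= 'Z'; c++) qtable[c] |= UPPER;
    for (c = 'a'; c <= 'z'; c++) qtable[c] |= LOWER;
    for (c = '0'; c <= '9'; c++) qtable[c] |= DIGIT;
    for (; *punct != '\0'; punct++)
        qtable[(unsigned char)*punct] |= PUNCT;
}

/* One index and one mask, however many characters are in the class.
   The range check guards against codes outside the 128-entry table. */
static int is_punct(int c)
{
    return c >= 0 && c < 128 && (qtable[c] & PUNCT) != 0;
}

static int is_upper(int c)
{
    return c >= 0 && c < 128 && (qtable[c] & UPPER) != 0;
}
```

The cost of a lookup does not grow with the size of the character class, which is where the chain of comparisons or the switch loses out.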
dmr argued people could easily write their own but when I showed
him my qtable was 20 times faster he gave in. I also asked for type
logical which dmr implemented as unsigned, which was especially useful
when bitfields were implemented (a 2 bit int would have values -2, -1,
0, 1 instead of 0, 1, 2, 3). I requested a way to interject assembler,
which became asm() (yes, a bad idea).
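The remark about 2-bit fields can be checked with a small sketch in modern C. The struct and helper names below are hypothetical, and the signed range of -2 to 1 assumes a two's-complement machine. The field is declared as signed int explicitly because the signedness of a plain int bit-field is implementation-defined.

```c
/* Hypothetical struct: two 2-bit fields, one signed and one unsigned. */
struct flags {
    signed int   s : 2;   /* holds -2, -1, 0, 1 on two's-complement machines */
    unsigned int u : 2;   /* holds 0, 1, 2, 3 */
};

/* Hypothetical helper: store v in the unsigned field and read it back.
   Unsigned bit-fields wrap modulo 4, so 5 comes back as 1. */
static unsigned store_unsigned(unsigned v)
{
    struct flags f = { 0, 0 };
    f.u = v;
    return f.u;
}

/* Hypothetical helper: store v in the signed field and read it back.
   Only values in -2..1 are guaranteed to round-trip. */
static int store_signed(int v)
{
    struct flags f = { 0, 0 };
    f.s = v;
    return f.s;
}
```

For a field meant to hold flags or small counts, the unsigned version gives the four values 0 to 3 that one would normally want.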