On Sun, May 17, 2020 at 12:38 PM Paul Winalski <paul.winalski(a)gmail.com>
wrote:
> Well, the function in question is called getchar(). And although
> these days "byte" is synonymous with "8 bits", historically it meant
> "the number of bits needed to store a single character".
Yep, I think that is the real crux of the issue. If you grew up with
systems that used a 5-, 6-, or even a 7-bit byte, you have an appreciation of
the difference. Remember, B, like BCPL and BLISS, only has a 'word' as
the storage unit. But by the late 1960s, a byte had been declared (thanks
to Fred Brooks shutting down Gene Amdahl's desires) to be 8 bits, at least at
IBM.** Of course, the issue was that ASCII was using only 7 bits to store
a character.
DEC was still sort of transitioning from word-oriented hardware (a lesson,
Paul, you and I lived through being forgotten a few years later with
Alpha); but the PDP-11, unlike the 18-, 36-, or 12-bit systems, followed IBM's
lead and used the 8-bit byte and byte addressing. But that nasty 7-bit
ASCII thing messed it up a little bit. When C was created (for the 8-bit,
byte-addressed PDP-11), unlike B, Dennis introduced distinct types. As he
says, "C is quirky," and one of those quirks is that he created a "char"
type, which was thus naturally 8 bits on the PDP-11, but was used to store
7-bit ASCII data, with a bit left over.
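To make that "bit left over" concrete, here is a minimal C sketch (mine, not
anything from the thread) that you can run on any byte-addressed machine today:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* a char is 8 bits on the PDP-11 and essentially everything since */
        printf("bits in a char: %d\n", CHAR_BIT);

        /* printable ASCII needs only the low 7 bits; the top bit of
           'A' (0x41) goes unused */
        unsigned char c = 'A';
        printf("'%c' = 0x%02x, high bit = %u\n", c, (unsigned)c, (c >> 7) & 1u);
        return 0;
    }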
As previously said in this discussion, to me the issue is that it was called a
*char*, not a *byte*. But I wonder, had Dennis and team had that
foresight, whether it would in practice have made that much difference. It took
many years, many lines of code, and attempts to encode the glyphs of many
different natural languages to get to ideas like UTF.
As someone else pointed out, one of the other quirks of C was trying to
encode the return value of a function into a single 'word.' But like many
things in the world, we have to build it first and let it succeed before we
can find the real flaws. C was incredibly successful and, as I said before,
I'll not trade it for any other language yet, given what it has allowed me and
my peers to do over the years. I am humbled by what Dennis did; I doubt many
of us would have done as well. That doesn't make C perfect, or mean that we
cannot strive to do better, and maybe time will show Rust or Go to be that.
But I suspect that may still be a long time in the future.
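Since getchar() is the function that started this thread, a small sketch of
where that quirk shows up: the single returned 'word' has to carry every
possible character value plus the end-of-file indication, which is exactly why
the result must be kept in an int, not a char:

    #include <stdio.h>

    int main(void)
    {
        int c;  /* int, not char: the return word holds 0..255 and EOF (-1) */

        while ((c = getchar()) != EOF)   /* copy stdin to stdout */
            putchar(c);
        return 0;
    }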
All my CMU professors in the 1970s said Fortran was dead then. However,
remember that it still pays my salary, and my company makes a ton of money
building hardware that runs Fortran codes - it's not even a close race when
you look at what is number one [check out the application usage on one of the
bigger HPC sites in Europe -- I offer it because it's easy to find the data
and the graphics make it obvious what is happening:
https://www.archer.ac.uk/status/codes/ - other sites have similar stats,
but finding them is harder].
Clem
** As my friend Russ Robeolen (who was the chief designer of the S/360
Model 50) tells the story, Amdahl was madder than a hornet about
it, but Brooks pulled rank and kicked him out of his office. The S/360 was
supposed to be an ASCII machine - Amdahl thought the extra bit for a byte was
a waste -- Brooks told him if it wasn't a power of 2, don't come back --
that is, "if a byte was not a power of two, he did not know how to program
for it efficiently, and SW being efficient was more important than Amdahl's
HW implementation!" (imagine that). Amdahl did get a 24-bit word
type, but Brooks made him define it so that 32 bits stored
everything, which again Amdahl thought was a waste of HW. Bell would later
note that it was the single greatest design choice in the computer industry.
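One way to see the flavor of Brooks's efficiency argument in that footnote:
with 8-bit bytes packed four to a 32-bit word, turning a byte address into a
word index plus a byte offset is just a shift and a mask, with no division
anywhere. A sketch under those assumptions (not anything from the S/360
itself):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t byte_addr    = 1037;            /* an arbitrary byte address */
        uint32_t word_index   = byte_addr >> 2;  /* divide by 4 bytes/word */
        uint32_t byte_in_word = byte_addr & 3;   /* remainder via mask     */

        printf("byte %u is in word %u at position %u\n",
               (unsigned)byte_addr, (unsigned)word_index,
               (unsigned)byte_in_word);
        return 0;
    }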