On Sat, 6 Jul 2024, sjenkin(a)canb.auug.org.au wrote:
> C wasn't the first standardised coding language; FORTRAN and COBOL at
> least came before it, so there were multi-platform source libraries and
> shared source, though often single-platform.
> From what I know, vendor extensions of FORTRAN, optimised for their
> hardware, were common, making high-performance, portable source difficult
> or impossible. 6-bit and 8-bit chars were the least of it.
Even without vendor extensions, writing portable Fortran code was hard.
Different floating-point formats give you different results, and
architectural differences can bite you. One famous example is that the
709x required word alignment, but S/360 had 4-byte-aligned floats and
8-byte-aligned doubles, so this:
      REAL R(100)
      DOUBLE PRECISION D(10)
      EQUIVALENCE (R(2), D(1))
would work fine on a 7090 but crash on a 360. That was painful enough
that one of the first things they changed on S/370 was to allow misaligned
data.
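A rough C analogue of that overlay, as a sketch (the pointer cast and the
misaligned store are my illustration, not anything from the original
FORTRAN):

    #include <stdio.h>

    int main(void) {
        float r[100];                  /* like REAL R(100) */
        /* Like EQUIVALENCE (R(2), D(1)): point a double at storage that
         * starts 4 bytes into the float array, so it is 4-byte but not
         * necessarily 8-byte aligned. */
        double *d = (double *)&r[1];
        *d = 1.0;      /* misaligned store: undefined behaviour in C; on
                          a machine that requires 8-byte alignment for
                          doubles this is where the program traps */
        printf("%f\n", *d);
        return 0;
    }

On a 36-bit word machine like the 7090 the question never arises; on S/360
the store traps; on today's x86 it merely runs, which is exactly the sort
of difference that hid until the code moved.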
I never wrote much COBOL, but it had structured data (the ancestor of C
structs) and "redefines" for overlaying structures, which could bite you
when different machines had different sizes or alignments. There were also
a lot of different character sets, which led to bugs when code had implicit
assumptions about collating sequences, e.g., do numbers come before
letters as in ASCII, or after as in EBCDIC.
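A tiny C sketch of that collating-sequence trap (my illustration, not from
the mail; the character codes in the comments are the real ASCII and
EBCDIC values):

    #include <stdio.h>

    int main(void) {
        /* ASCII: '0' is 0x30 and 'A' is 0x41, so digits collate before
         * letters.  EBCDIC: 'A' is 0xC1 and '0' is 0xF0, so letters
         * collate before digits.  Code that sorted mixed keys by
         * comparing raw character codes got a different order on each
         * kind of machine. */
        if ('0' < 'A')
            printf("digits collate before letters (ASCII-style)\n");
        else
            printf("letters collate before digits (EBCDIC-style)\n");

        /* Another implicit assumption: that the letters are contiguous.
         * True in ASCII, false in EBCDIC, where 'I' is 0xC9 but 'J' is
         * 0xD1. */
        printf("'I' + 1 == 'J'? %s\n", ('I' + 1 == 'J') ? "yes" : "no");
        return 0;
    }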
The fact that everything now has 8-bit byte-addressed memory with
power-of-two data sizes and everything is ASCII makes all these problems
go away.
> Is this right:
> C was the first 'systems tool' language + libraries available across many
> platforms.
> Notionally, source code could be ported with zero or minimal change.
> It made possible portable languages like PERL, PHP, Python.
I think so. There were previous system languages like a PL/I subset on
Multics, PL/S on IBM, or PL/M on micros, but I don't think any of them had
multiple targets.
> Secondly, portable systems tool languages with a common 2-part design of
> parser/front-end providing an abstract syntax tree to multiple back-ends
> with platform-specific code-generators.
> Are these back-ends where most of the assembler, memory model and
> instruction optimisation take place now?
That's the standard way to build a compiler. Back in the late 1950s
someone had the bright idea to invent a common intermediate language they
called UNCOL, so all of the front ends could produce UNCOL and all of the
back ends could translate from UNCOL, thereby reducing the NxM compiler
problem to N+M. It never worked, both because the semantic differences
between source languages are larger than they look and because the machine
architectures of the era were wildly different.
Now we have GCC and LLVM, which are sort of UNCOL-ish, but mostly because
the back ends are all so similar. The instruction sets may be different,
but the data formats are all the same.
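A toy sketch of the N+M idea (entirely my own illustration; the structure
and function names are invented): one common intermediate form, one back
end per target, so a new language or a new machine costs one component
rather than a whole row or column of compilers.

    #include <stdio.h>

    /* The would-be UNCOL: a tiny three-address intermediate instruction
     * that every front end emits and every back end consumes. */
    typedef struct { const char *dst, *src1, *src2; } AddInsn;

    /* One code generator per target, both reading the same IR. */
    static void emit_target_a(const AddInsn *p, int n) {
        for (int i = 0; i < n; i++)          /* accumulator-style machine */
            printf("  load  %s\n  add   %s\n  store %s\n",
                   p[i].src1, p[i].src2, p[i].dst);
    }

    static void emit_target_b(const AddInsn *p, int n) {
        for (int i = 0; i < n; i++)          /* three-operand machine */
            printf("  add %s, %s, %s\n", p[i].dst, p[i].src1, p[i].src2);
    }

    int main(void) {
        /* Any of N front ends would hand this to any of M back ends. */
        AddInsn prog[] = { { "x", "a", "b" } };

        puts("accumulator-style target:");
        emit_target_a(prog, 1);
        puts("three-operand target:");
        emit_target_b(prog, 1);
        return 0;
    }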
R's,
John