Thanks everybody for the feedback and pointers, much appreciated!
The main point is clear: the premise that the DMR C compiler had unique (native, small
machine) code generation during most of the 70’s does not hold up.
Clem Cole is correct in observing that (certainly for the 70’s) I’m skewed towards material
from academia, with a blind spot for the commercial compilers of that era.
Doug McIlroy’s remarks on Digitek were most helpful and I’ll expand a bit on that below.
I was aware of the Digitek / Ryan-McFarland compilers before, but I had assumed that they
compiled to a virtual machine, partly because I misunderstood a description of “programmed
operators” and partly because their compilers for microcomputers did so in the 80’s.
Digging into this more led me to a 1970 report, “Programming Languages and their
Compilers, Preliminary Notes” by John Cocke and J.T. Schwartz:
https://www.softwarepreservation.org/projects/FORTRAN/paper/Bright-FORTRANC…
It is a nearly 800-page review of then-current languages and compilers. It includes
some discussion of the Digitek compilers as the state of the art for small machines and
some further description of how they worked (pp. 233-237, 749). It also mentions their
PL/1 for Multics fiasco (for background, see
https://www.multicians.org/pl1.html)
- The Digitek compilers were indeed small enough to run on PDP-11 class machines and even
smaller, and they produced quite reasonable native code. In this sense, they were in the
same spot as the DMR C compiler which was hence not unique in this regard -- as Doug
points out.
- They consisted of two parts: a front end coded in “Programmed Operators” (POPS)
generating an intermediate language, and a custom-coded back end that converted the IL to
native code.
- POPS were in effect a VM for compiler construction (although expressed as assembler
operations). To move a compiler to a new machine only the POPS VM had to be recoded, which
was a very manageable job. From the description in the above book it sounds very similar
to the META 3 compiler generator setup, but expressed in a different form.
- Unfortunately, I have not been able to find a description of the POPS IL.
- The smaller Digitek compilers performed only limited optimisation, carried out in the
code generation phase. The optimisations described sound quite similar to what the DMR C
compiler did in its c1 phase (special casing +1 and -1, combining constants, mul/div to
shift, etc.)
- Code generation seems to have worked by emitting a code snippet for each IL operation,
selecting a variant for one of 3 addressing modes (register, memory and indexed), although
the text isn’t quite clear on this. It sounds reasonable for small machines in the 60’s.
- The later Ryan-McFarland microcomputer compilers seem to have used the same POPS based
front-end technology, but with an interpreter executing the IL directly.
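
To make the optimisation and code generation points above a bit more concrete, here is a
rough sketch in Python. Everything in it is an assumption made for illustration: the IL
tuples, the operand tags, the scratch register T and the assembler mnemonics are all
invented, since I have no description of the real POPS IL, and this is not how either the
Digitek or the DMR compiler was actually coded. It only shows the general shape: combine
constants, special case +1 and -1, turn mul/div by a power of two into a shift, then pick
a snippet per IL operation based on the addressing mode of the destination.

```python
# Hypothetical IL: (op, dst, src) tuples meaning "dst = dst op src".
# Operands: ("reg", name), ("mem", symbol), ("idx", offset, reg), ("const", n).

def fold_constants(il):
    """Combine adjacent constant add/sub operations on the same destination."""
    out = []
    for op, dst, src in il:
        if op in ("add", "sub") and src[0] == "const":
            delta = src[1] if op == "add" else -src[1]
            prev = out[-1] if out else None
            if prev and prev[0] == "add" and prev[1] == dst and prev[2][0] == "const":
                delta += out.pop()[2][1]
            out.append(("add", dst, ("const", delta)))
        else:
            out.append((op, dst, src))
    return out

def strength_reduce(il):
    """Special-case +1/-1 and turn mul/div by a power of two into a shift."""
    out = []
    for op, dst, src in il:
        if src is not None and src[0] == "const":
            n = src[1]
            if (op == "add" and n == 0) or (op in ("mul", "div") and n == 1):
                continue  # operation has no effect, drop it
            if op == "add" and n in (1, -1):
                out.append(("inc" if n == 1 else "dec", dst, None))
                continue
            if op in ("mul", "div") and n > 1 and n & (n - 1) == 0:
                # Note: div -> right shift is only safe for non-negative values.
                shift = ("const", n.bit_length() - 1)
                out.append(("shl" if op == "mul" else "shr", dst, shift))
                continue
        out.append((op, dst, src))
    return out

def fmt(operand):
    """Render an operand in the made-up assembler syntax."""
    kind = operand[0]
    if kind in ("reg", "mem"):
        return operand[1]
    if kind == "idx":
        return f"{operand[1]}({operand[2]})"
    if kind == "const":
        return f"#{operand[1]}"
    raise ValueError(f"unknown operand kind: {kind}")

def same(text):
    """Use one snippet for all three destination addressing modes."""
    return {mode: [text] for mode in ("reg", "mem", "idx")}

def via_temp(text):
    """Work in scratch register T when the destination is not a register."""
    wrapped = ["LOAD {dst},T", text.replace("{dst}", "T"), "STORE T,{dst}"]
    return {"reg": [text], "mem": wrapped, "idx": wrapped}

# Snippet table: IL operation -> destination addressing mode -> code lines.
# Deliberately incomplete; general mul/div are left out of this sketch.
SNIPPETS = {
    "inc": same("INC {dst}"),
    "dec": same("DEC {dst}"),
    "add": same("ADD {src},{dst}"),
    "sub": same("SUB {src},{dst}"),
    "shl": via_temp("SHL {src},{dst}"),
    "shr": via_temp("SHR {src},{dst}"),
}

def generate(il):
    """Emit the snippet for each IL operation, chosen by destination mode."""
    lines = []
    for op, dst, src in il:
        for template in SNIPPETS[op][dst[0]]:
            lines.append(template.format(dst=fmt(dst), src=fmt(src) if src else ""))
    return lines

def compile_il(il):
    return generate(strength_reduce(fold_constants(il)))
```

For example, feeding compile_il() the sequence "r0 += 2; r0 -= 1; x *= 8" collapses the
first two operations into a single INC r0 and turns the multiply into a load, shift by 3
and store through T. A real back end of that era would of course have had to deal with
register allocation and many more cases than this.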
Interestingly, the above book has a final chapter about “the self-compiling compiler”. To
quote: “The scheme to be described is one which has often been considered, and in some
cases even implemented. It involves the use of a compiler written in its own language, and
capable therefore of compiling itself over to a new machine.” It proceeds to describe such
a compiler in quite some detail, including using a table driven code generator.
Seen through this lens, the DMR C compiler could be viewed as a re-imagining of the
Digitek small-system compilers, using a self-compiling lexer/parser instead of POPS (or TMG
or META) and an (also self-compiling) code generator evolved to handle the richer PDP-11
addressing modes. The concept seems to have been in the air at that time.
Now I am left wondering why IL-to-native back ends were not used more widely in academic
small machine compilers in the 70’s -- but this too may be the result of a skewed view on
my part.