I believe Ken Thompson might have been a referee for that paper. At least,
he once mentioned to me that he had reviewed a paper about the threading in
the DEC Fortran compiler.
-rob
On Wed, May 8, 2024 at 7:35 PM Paul Ruizendaal <pnr(a)planet.nl> wrote:
Thanks for pointing that out. Here's an interesting paper on the DEC
PDP-11 Fortran compilers:
http://forum.6502.org/download/file.php?id=1724&sid=f6a721f3e05774cff07…
Before 1975 these compilers used direct threading; thereafter there was a
native compiler for the higher-end models. I think this one may have
required split I/D, but that is not entirely clear from the text.
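For anyone who has not run into it, here is a minimal C sketch of what
direct-threaded code amounts to (the routine names are invented for
illustration; a real PDP-11 implementation has no central loop, because
each routine ends in a JMP @(R4)+ through the next address in the thread):

    /* Hypothetical sketch, not DEC's code: the "object program" for
       2 + 3 is just a table of routine addresses that a tiny
       dispatcher walks. */
    #include <stdio.h>

    typedef void (*word)(void);          /* one cell of threaded code */

    static int stk[16], sp;              /* small operand stack */

    static void lit2(void)  { stk[sp++] = 2; }
    static void lit3(void)  { stk[sp++] = 3; }
    static void add(void)   { sp--; stk[sp-1] += stk[sp]; }
    static void prnt(void)  { printf("%d\n", stk[--sp]); }

    /* The compiled "code" for: print(2 + 3) */
    static word thread[] = { lit2, lit3, add, prnt, 0 };

    int main(void)
    {
        for (word *ip = thread; *ip; ip++)   /* the dispatch loop */
            (*ip)();
        return 0;
    }

The attraction was density -- roughly one address word per operation in
the object program -- at the price of an extra jump per operation.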
I think the same pattern holds for BCPL on the PDP-11: compiling to OCODE
or INTCODE in the early 70s, and native thereafter -- I still have to find
the source for the latter.
Still, I should have asked first: does anyone have pointers to
small-machine native compilers from the 1970s that produced efficient
assembler output?
I am already aware of the 1978 Whitesmiths C compiler.
7 May 2024 23:07:58 Rob Pike <robpike(a)gmail.com>:
I'm not sure I accept your starting position. There were several compilers
for RT-11 and RSX-11M. RSX (and perhaps RT) Fortran were threaded code,
but I don't believe they all were. And of course there was BCPL, which was
- and is - tiny; was it on the 11?
And there were other small machines from other manufacturers, all of which
had some form of Fortran and other bespoke things, such as RPG on the small
IBMs. I think the uniqueness was in the set of conditions more than in the
Unix C compiler itself.
But you may be right.
-rob
On Wed, May 8, 2024 at 6:59 AM Paul Ruizendaal <pnr(a)planet.nl> wrote:
Over the last few months, I've spent a little time curating John Walker's
Unix clone and software stack, including an emulator to run it:
https://gitlab.com/marinchip
After creating a basic tool chain (edit, asm, link and a simple
executive), John set out to find a compiler. Among the first programs were
a port of the META 3 compiler-generator (similar to TMG on early Unix) and
a port of Brinch Hansen’s Pascal compiler. META was used to create a
compiler that generated threaded code. He found neither compiler good
enough for his goals and settled on writing his Unix-like OS in assembler.
The 9900 architecture withered after 1980, which sealed the fate of the OS
early on -- had he found a good compiler, the code might have competed
alongside Coherent, Idris, and Minix during the 1980s.
This made me realise once more how unique the Ritchie C compiler was. In
my view its uniqueness combines three aspects:
1. The C language itself
2. The ability to run natively on small hardware (even an LSI-11 system)
3. Generating code with modest overhead versus handwritten assembler (say
30%)
As has been observed before, working at a higher abstraction level makes
it easier to work on algorithms and on refactoring, often earning back the
efficiency loss. John Walker's work may be a case in point: I estimate that
his hand-coded kernel is 10% larger than an equivalent V6 Unix kernel (as
compiled for the 9900 architecture).
There are three papers on DMR’s website about the history of the compiler
and a compare-and-contrast with other compilers of the era:
https://www.bell-labs.com/usr/dmr/www/primevalC.html
https://www.bell-labs.com/usr/dmr/www/chist.html
https://www.bell-labs.com/usr/dmr/www/hopl.html
It seems to me that these papers rather understate the importance of
generating good-quality code. As far as I can tell, BCPL and BLISS came
close, but were too large to run on a PDP-11 and existed only as
cross-compilers. PL/M was also a cross-compiler and generated poorer code.
Pascal on small machines compiled to a virtual machine. During most of the
70s there seems to have been no other compiler that generated good-quality
code and ran natively on a small (i.e. PDP-11-class) machine.
As far as I can tell, the uniqueness was mostly in the “c1” phase of the
compiler. The front end of the “c0” phase seems to use much the same
techniques as many contemporary compilers. The “c1” phase seems to have
been unique in that it managed to do register allocation and instruction
selection with a pattern matcher and associated code tables squeezed into a
small address space. On a small machine, other native compilers of the era
typically generated threaded code, code for a virtual machine, or
poor-quality native code that evaluated expressions with stack operations
rather than registers.
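To make that contrast concrete, here is a rough sketch of what
table-driven instruction selection looks like. This is my own
illustration, not DMR's actual c1 code or tables: the shapes, templates
and function names are invented, and only one trivial case is handled.
The idea is that a code table maps an operator plus the "shape" of its
operands to an assembler template, so that a + 5 with a already in a
register becomes a single add instruction rather than a sequence of
pushes and pops on a simulated stack.

    /* Illustrative toy only, in the spirit of a pattern-matching code
       generator with a code table; Unix-style PDP-11 assembler output
       is assumed. */
    #include <stdio.h>
    #include <string.h>

    enum shape { CON, MEM, REG };        /* how an operand is addressed */

    struct pat {
        const char *op;                  /* operator being matched  */
        enum shape  l, r;                /* required operand shapes */
        const char *tmpl;                /* code template to emit   */
    };

    static const struct pat table[] = {
        { "+", REG, CON, "add $%s,r%d\n" },   /* reg += constant */
        { "+", REG, MEM, "add %s,r%d\n"  },   /* reg += memory   */
        { "-", REG, CON, "sub $%s,r%d\n" },   /* reg -= constant */
    };

    /* Emit code for "op" applied to a value already in register reg
       (left operand) and a second operand named by "name" with shape r. */
    static void gen(const char *op, enum shape r, const char *name, int reg)
    {
        for (unsigned i = 0; i < sizeof table / sizeof table[0]; i++)
            if (strcmp(table[i].op, op) == 0 &&
                table[i].l == REG && table[i].r == r) {
                printf(table[i].tmpl, name, reg);
                return;
            }
        printf("/ no pattern matched: fall back to the general case\n");
    }

    int main(void)
    {
        printf("mov a,r0\n");            /* load a into r0 */
        gen("+", CON, "5", 0);           /* a + 5  ->  add $5,r0 */
        return 0;
    }

The matching costs a table lookup at compile time, but common expression
shapes then come out as one or two instructions instead of a stream of
stack pushes and pops.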
I am not sure why DMR's approach was not more widely used in the 1970s.
The algorithms he used do not seem to have been new and appear to have
their roots in other (larger) compilers of the 1960s. The basic design
seems to have been in place from the very first iterations of his compiler
in 1972 (see the V2 tree on TUHS), and he does not mention these algorithms
as being special or innovative in his later papers.
Any observations / opinions on why DMR's approach was not more widely used
in the 1970s?