Thank you for sharing, very interesting context info!
It would seem to me that there was only limited work on multi-processor Unix prior to
1985, with the work by Bach/Buroff/Kelleman around 1983 being the defining effort.
When I was doing my thesis with Prof. Van de Goor (who had earlier worked at DEC with the
PDP-8 and PDP-11 design teams), another student down the corridor was working on
multi-processor Unix ca. 1984-1985. There is a short paper about that work here:
https://dl.acm.org/doi/pdf/10.1145/6592.6598
From today’s perspective the paper’s conclusions seem odd. It describes going through the
source code end-to-end as a huge job, but the core SysIII kernel is only some 7,000 sloc,
with maybe two dozen shared data structures. On the other hand, with the tooling of the
early 1980s, debugging this stuff must have been very hard. By contrast, on my porting
projects today it is very easy to generate a kernel trace of millions of instructions in
a few seconds; even billions are doable. The edit/compile/test cycle for a kernel with
some debug test code compiled in is similarly a matter of seconds. A single test must have
taken an hour or more back in the day, assuming one had exclusive access to the
machine.
Also, its observation that the Unix kernel is “not highly structured” seems unfair. I find
the 1980-era Unix kernel rather well structured, with the exception of the memory
management code, which is indeed spread out over multiple files, making it not so easy to
fully grasp. Maybe this was what my fellow student was referring to in his MSc thesis.
Also note that a Dutch MSc thesis only took 6-12 months.
On 6 Aug 2023, at 06:00, scj(a)yaccman.com wrote:
When I left Bell Labs in 1986, I joined Ardent Computer in California. We built a
multiprocessor Unix system with up to four processors based on ECL technology (faster than
the computer chips of the time). The CPU had a standard set of registers, plus four
vector registers that would hold 1000 floating-point numbers, and vector instructions to
do arithmetic on them.
So that meant that we had compiler work to do. Luckily, Randy Allen had just graduated
and signed on to do the parallelizing C and FORTRAN compilers, and I did the lower-level
stuff: assembler, loader, designed the a.out format, and even wrote a bug tracking
system. Randy's compiler
was excellent, but there were other problems. The Sun workstations had some quirks: from
time to time they would page in a page of all zeros due to a timing problem. Unhappily,
the zero was the halt operation! We addressed that by adding code to the kernel to
verify that no code page was all zeros before executing it. AT&T and Sun and MIPS
and all the hardware makers had problems like this with early chips. One thing I had
told the team from the beginning was that we were going to have to patch hardware problems
in the early versions.
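In outline the check itself was trivial, something like the sketch below (the names here
are invented for illustration, not our actual kernel code):

#include <stddef.h>

#define NBPG 4096                       /* assumed page size, bytes */

extern void repage(unsigned int *pg);   /* hypothetical: redo the disk read */

/* Return 1 if the page contains nothing but zero words. */
static int
all_zeros(const unsigned int *pg)
{
        size_t i;

        for (i = 0; i < NBPG / sizeof(unsigned int); i++)
                if (pg[i] != 0)
                        return 0;
        return 1;
}

/* Called after paging in a text page, before letting it execute. */
void
textpage_check(unsigned int *pg)
{
        while (all_zeros(pg))   /* an all-zero text page can only be the bug */
                repage(pg);     /* so fetch it from disk again */
}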
The most serious early hardware bug in our machine was that when the MIPS chip had a page
fault, the CPU started executing the new page before it was all present. It only missed
the first two or three instructions. We settled on a strategy of generating the a.out file
so that the first four instructions of each code page were all NOPs. This solved the MIPS problem.
Now we faced the problem of how to take a standard a.out file and redo it so that
the first four instructions in each code page are NOPs. We built an "editor"
for a.out files that would read the file in, respond to a series of requests, relocate the
instructions correctly, and then branch to the line of code that it had been about to
execute. One good thing about this was that when the chip got fixed we would not have to
change any code -- it would just work.
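To make the address arithmetic concrete, here is a rough sketch of the core of it
(invented names, and a much-simplified view of the relocation; real branch displacements,
jump targets, and the entry point all get rewritten through the same mapping):

#include <stddef.h>

#define NBPG    4096                    /* assumed page size, bytes */
#define WORD    4                       /* MIPS instruction size */
#define NOPS    4                       /* reserved words per page */
#define PAYLOAD (NBPG / WORD - NOPS)    /* real instructions per page */
#define NOP     0x00000000u             /* MIPS nop: sll $0,$0,0 */

/* Map an instruction index in the original text to its padded index. */
static size_t
new_index(size_t i)
{
        return (i / PAYLOAD) * (NBPG / WORD) + NOPS + i % PAYLOAD;
}

/*
 * Copy n instructions from the original text segment into a padded
 * one of outwords words whose first four words per page stay NOPs.
 */
void
pad_text(const unsigned int *in, size_t n, unsigned int *out, size_t outwords)
{
        size_t i;

        for (i = 0; i < outwords; i++)
                out[i] = NOP;           /* page heads remain NOPs */
        for (i = 0; i < n; i++)
                out[new_index(i)] = in[i];
}

Keeping the old-to-new mapping in one function is what makes the rest mechanical: every
address-bearing field is simply pushed through it.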
And then we got creative. We could use the "editor" to find the basic blocks
in the code, introduce counting instructions at the head of each block, and produce a
profiler without recompiling. We probably found about 20 things we could do with this
mechanism, including optimization after loading, timing the code without having to
recompile everything, collecting parallelism statistics, etc.
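The block-finding pass is the classic one: a new basic block starts at every branch
target and at every instruction following a branch. A sketch, with is_branch() and
branch_target() left as invented placeholders for the machine-specific decoding (and
ignoring MIPS branch delay slots for brevity):

#include <stddef.h>
#include <string.h>

extern int    is_branch(unsigned int insn);                /* is this a branch/jump? */
extern size_t branch_target(size_t i, unsigned int insn);  /* target's insn index */

/* Mark block leaders in leader[0..n-1]; return the number of blocks (n > 0). */
size_t
find_leaders(const unsigned int *text, size_t n, unsigned char *leader)
{
        size_t i, t, nblocks = 0;

        memset(leader, 0, n);
        leader[0] = 1;                          /* the entry starts a block */
        for (i = 0; i < n; i++) {
                if (!is_branch(text[i]))
                        continue;
                t = branch_target(i, text[i]);
                if (t < n)
                        leader[t] = 1;          /* the target starts a block */
                if (i + 1 < n)
                        leader[i + 1] = 1;      /* so does the fall-through */
        }
        for (i = 0; i < n; i++)
                nblocks += leader[i];
        return nblocks;                         /* one counter per block */
}

The editor then splices a counter-increment stub in front of each marked leader,
relocating everything else exactly as for the page-head NOPs.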
---
On 2022-11-28 05:24, Paul Ruizendaal wrote:
> The discussion about the 3B2 triggered another question in my head:
> what were the earliest multi-processor versions of Unix and how did
> they relate?
> My current understanding is that the earliest one is a dual-CPU VAX
> system with a modified 4BSD done at Purdue. This would have been late
> 1981, early 1982. I think one CPU was acting as master and had
> exclusive kernel access, the other CPU would only run user mode code.
> Then I understand that Keith Kelleman spent a lot of effort to make
> Unix run on the 3B2 in a SMP setup, essentially going through the
> source and finding all critical sections and surrounding those with
> spinlocks. This would be around 1983, and became part of SVr3. I
> suppose that the “spl()” calls only protected critical sections that
> were shared between the main thread and interrupt sequences, so that a
> manual review was necessary to consider each kernel data structure for
> parallel access issues in the case of two CPUs (see the sketch below).
> Any other notable work in this area prior to 1985?
> How was the SMP implementation in SVr3 judged back in its day?
> Paul
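On the spl()-versus-spinlock distinction in my question above: spl() only raises the
interrupt priority of the current processor, so it keeps the kernel's main path and its
own interrupt handlers apart, but does nothing against a second CPU; hence the manual
hunt for every shared structure. A rough sketch of the two styles, with invented
stand-ins for the usual primitives:

typedef volatile int lock_t;

extern int  splhi(void);                /* mask interrupts on THIS cpu only */
extern void splx(int s);                /* restore the previous level */
extern int  test_and_set(lock_t *l);    /* atomic; returns the old value */

/* Uniprocessor style: spl() keeps this cpu's interrupt handlers out. */
void
up_critical(void)
{
        int s = splhi();
        /* ... touch shared kernel data ... */
        splx(s);
}

/* SMP style: a spinlock additionally keeps the other cpu out. */
void
smp_critical(lock_t *l)
{
        int s = splhi();                /* still needed against interrupts */
        while (test_and_set(l))
                ;                       /* spin until the lock is free */
        /* ... touch shared kernel data ... */
        *l = 0;                         /* release the lock */
        splx(s);
}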