Currently reading through IPC in CB Unix 3.0, Vr1-vax and SVr1-pdp11.
The IPC primitives in CB Unix (maus, event and msg) are quite different from those in SVr1 (although maus survives in SVr1-pdp11).
It made me wonder: is it still known who designed the IPC primitives for SysV?
> Sat Aug 31 06:21:47 AEST 2019
> John Reiser did do his own paging system for UNIX 32/V.
> I heard about it from friends at Bell Labs ca. 1982-83,
> when I was still running systems for physicists at Caltech.
> It sounded very interesting, and I would love to have had
> my hands on it--page cache unified with buffer cache,
> copy-on-write from the start.
>
> [...]
>
> It is in any case a shame that jfr's code never saw the light
> of day. I really hope someone can find it on an old tape
> somewhere and we can get it into the archive, if only because
> I'd love to look at it.
>
> Norman Wilson
> Toronto ON
Norman,
I am getting more optimistic that much of this version of 32V can be ‘triangulated’ from the surviving sources. For convenience I call this version “32V r3” (with the first swapping version being "32V r1" and the scatter loading version being "32V r2").
I’ve been reading my way through the surviving 32/V r2, SysIII-Vax, SVr1-Vax and SVr2-Vax sources. There seems to be a lot of continuous progression in these versions. From a communication with Keith Kelleman I thought that VM in SVr2 was unrelated, but that appears to have been a misunderstanding. In short, it seems that the basic paging algorithms and data structures in SVr2 (other than the region abstraction) come from 32V r3.
The strongest clue for the source link is in the SVr2 “page.h” file. In it, the union describing a page table entry matches with JFR’s description and is otherwise (partially) unused in SVr2. There are other, similar clues in the other source trees.
To explain more clearly, have a look at this code section: https://github.com/ryanwoodsmall/oldsysv/blob/master/sysvr2-vax/src/uts/vax…
In this union, the 2nd struct “pgd” is never used, nor is the bit pg_disk in the “pgm” struct ever used. It matches with JFR’s recollection:
<quote>
My internal design strategy was to use the hardware page table entry
(4 bytes per page, with "page not present" being one bit which freed
the other 31 bits for use by software) as the anchor to track everything
about a page. This resulted in a complicated union of bitfield structs,
which became a headache. When other departments took over the work,
the first thing they did was use 8 bytes per page, which meant shuffling
between the software and the hardware descriptors: its own mess.
<unquote>
In the pte as given, a pte can be in two states: (i) the pg_disk bit is reset in which case it is in “pgm” format - this format is recognized by the mmu hardware; (ii) the pg_disk bit is set in which case it is in “pgd” format. When a page is paged in, the disk form of the pte was I think saved in the page frame descriptor (the “pfdat" table) and the pte converted to the memory form. Upon page-out the reverse happened. In the SVr2 version of this code the “pgd” form is abandoned and replaced by the separate disk descriptors (see https://github.com/ryanwoodsmall/oldsysv/blob/master/sysvr2-vax/src/uts/vax…)
The “pgd” structure is a bit puzzling. There is a 19 bit device block number, which is less than the 24 bits allowed by the filesystem code and also less than the 20 bits that would fit in the pte. Maybe this is because the RP04/RP05 disks of the early 80’s worked with 19 bits. I am not sure what the “pg_iord” field is about. The comment “inode index” may be misleading. My hypothesis that it is a short form of the device code, for instance an index into the mount table; magic values could have been used to identify the swap device, “demand zero”, etc.
It seems probable to me that the paging algorithm in SVr2 was derived from the 32/V r3 algorithm. John's recollection:
<quote>
Our machine started with 512KB of RAM, but within a few months was upgraded
to 4 MB with boards that used a new generation of RAM chips.
The hardware page size was 512 bytes, which is quite small. Strict LRU
on 8,192 pages, plus Copy-On-Write, made the second reference to a page
"blindingly fast".
</unquote>
In a follow up mail conversation with JFR we (I?) arrived at the conclusion that literal “strict LRU” is not likely on VAX hardware, but that an algorithm that maintains a small working set combined with a large second chance list amounts to about the same. It seems to me that this description also applies to the surviving SVr2 implementation: every 4 seconds sweep through the page tables of all processes and pages that were not referenced are moved to the free/2nd chance list.
To implement this it seems likely to me that 32V r3 used a structure similar to this: https://github.com/ryanwoodsmall/oldsysv/blob/master/sysvr2-vax/src/uts/vax… Such a structure is also in line with viewing core as a cache for disk, similar to TENEX that JFR had in mind.
The big change from SysIII to SVr1 in kernel page table management is the introduction of the “sptmap” (https://chiselapp.com/user/pnr/repository/paging/file?name=os/machdep.c&ci=…) In 32V r2 and SysIII the user page tables are swapped in and out of kernel space along with the _u area. This makes it impractical to do the working set sweep every few seconds. The “sptmap” innovation effectively creates a global page table space that fits with the needs of the working set sweep. In SVr1 it seems to have no real benefit, and it seems likely to me that it came from 32V r3.
In general it seems plausible to me that SVr1 derives from 32V r3, but was regressed to SysIII code where necessary to keep the code PDP-11 compatible. Another clue for this is in the buffer code of SVr1: https://chiselapp.com/user/pnr/repository/paging/file?name=os/main.c&ci=fba…
Here, the disk buffers are allocated as virtual memory pages, not as an array. This too is otherwise unused in SVr1, but makes perfect sense in the context of 32V r3.
So, in summary, it would seem to me that the 32V r3 virtual memory system:
- used sptmap code similar to SVr1/r2 to manage page tables in the kernel
- used the pte structure from SVr2 and a pfdat table similar to SVr2 to manage mappings
- used page stealer and fault code similar to SVr2
Phrased the other way around: SVr2-vax seems to use the 32V r3 virtual memory code with the region abstraction added on top, and the unified buffer removed.
At the moment I have not found clear clues for the unified buffer algorithms or mmap implementation. Perhaps careful reading of the IPC shared memory code in SVr1 will yield some.
To be continued …
In 1992, 386BSD is released by Lynne and William Jolitz, starting the open
source operating system movement (Linux didn't come along under later).
-- Dave
Unlike most here, I always pronounced Mt Xinu with an
eye, not an eee. I don't know where I got that, though.
I did know Ed Gould via USENIX and DECUS, but that doesn't
make my pronunciation correct.
As an aside, anyone know where Ed is these days or how he's
doing? I last saw him at a USENIX conference, probably in
San Jose in 2013 but I'm not sure. He showed up just for the
reception; he'd retired, and had cut away most of his famous
beard because he was spending a lot of time sailing and it
got in the way.
Norman Wilson
Toronto ON
Nelson H. F. Beebe:
P.S. Jay was the first to get Steve Johnson's Portable C Compiler,
pcc, to run on the 36-bit PDP-10, and once we had pcc, we began the
move from writing utilities in Pascal and PDP-10 assembly language to
doing them in C.
======
How did that C implementation handle ASCII text on the DEC-10?
Were it a from-scratch UNIX port it might make sense to store
four eight- or nine-bit bytes to a word, but if (as I sense it
was) it was C running on TOPS-10 or TOPS-20, it would have had
to work comfortably with DEC's convention of five 7-bit characters
(plus a spare bit used by some programs as a flag).
Norman Wilson
Toronto ON
More from Yost below.
My purpose in relating this was to point out that the original unix
implementation choices were mostly fine; they just had to be tweaked a
bit. Clearly an independent implementation such as in Linux would veer
off in a different direction, done in a different era and with different prior
experience. I was a bit surprised that Bruce didn't make this same
tweak to cblock size but no way of knowing his reasons now.
> Begin forwarded message:
>
> From: Dave Yost
> Subject: Re: [TUHS] 386BSD released
> Date: July 16, 2021 at 9:21:53 AM PDT
> To: Bakul Shah
>
> Plz forward this
> thanks
>
> This was in early 1983 or late 1982.
>
> We got the serial driver to go 19200 out and 9600 in.
>
> I did 2 things in the Fortune Systems 68k serial driver:
> • hand-coded asm pseudo-DMA, suggested by Robert P Warnock III
> • cblock size 128 bytes instead of 8, count ’em, 8.
>
> From Lyons,
> https://cs3210.cc.gatech.edu/r/unix6.pdf <https://cs3210.cc.gatech.edu/r/unix6.pdf>
> the unix v6 serial driver used a clist of cblocks, like this:
>
>
> The pseudo-DMA interrupt handler was a function made up of a few hand-coded 68k instructions, entered into C code as hex data. That code transferred one byte into or out of a cblock, and at the end of the cblock it grabbed the next cblock from a queue and rang the “doorbell” hardware interrupt, which caused a “software interrupt” at lower priority for further processing. Rob put the doorbell into the architecture with a couple of gates on the board because he was well aware of this software interrupt trick, which was already used in bsd. For some reason I didn’t look at the bsd code, probably because Rob’s explanation was lucid and sufficient.
>
> I once had occasion to mention this, and specifically the relaxing of the draconian 8 byte cblock size, to Dennis Ritchie. He said, sure, why not, the 8 byte cblock size was just a neglected holdover from early days.
>
> This approach was just an interrupt version of what I had proposed to Rick Kiessig as a first project at Fortune Systems: to get a 30x speed up when writing to the Fortune Systems memory-mapped character display hardware. I had done the same thing a few years earlier in Z80 in C code in a serial CRT terminal. It’s simple and obvious: make the inner loop do as little as possible. The most primitive operation needs to be a block operation, not a byte-at-a-time operation.
Doug McIlroy asks about the Rosetta Stone table relating TOPS-20
commands to Unix command in my ``Unix for TOPS-20 Users'' document:
>> I was puzzled, though, by the Unix command "leave", which is
>> not in the manuals I edited, nor is it in Linux. What does
>> (or did) it do?
I reread that 1987 document this morning, and found a few small
mistakes, but on the whole, I still agree with what I wrote 34 years
ago, and I'm pleased that almost everything there about Unix still
applies today.
I confess that I had forgotten about the TOPS-20 ALERT command and its
Unix equivalent, leave. As Doug noted, leave is not in Linux systems,
but it still exists in the BSD world, in DragonFlyBSD, FreeBSD,
NetBSD, OpenBSD, and their many derivatives. From a bleeding-edge
FreeBSD 14 system, I find
% man leave
LEAVE(1) FreeBSD General Commands Manual LEAVE(1)
NAME
leave – remind you when you have to leave
SYNOPSIS
leave [[+]hhmm]
DESCRIPTION
The leave utility waits until the specified time, then reminds you that
you have to leave. You are reminded 5 minutes and 1 minute before the
actual time, at the time, and every minute thereafter. When you log off,
leave exits just before it would have printed the next message.
...
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------