> Date: Sat, 11 Apr 2020 08:44:28 -0700
> From: Larry McVoy
>
> On Sat, Apr 11, 2020 at 11:38:44AM -0400, Norman Wilson wrote:
>> -- Stream I/O system added; all communication-device
>> drivers (serial ports, Ethernet, Datakit) changed to
>> work with streams. Pipes were streams.
>
> How was performance? Was this Dennis' streams, not Sys V STREAMS?
It was streams, not STREAMS.
> I ported Lachmans/Convergents STREAMS based TCP/IP stack to the
> ETA 10 Unix and SCO Unix and performance just sucked. Ditto for
> the Solaris port (which I did not do, I don't think it made any
> difference who did the port though).
STREAMS is outside the limited scope I try to confine myself to, but I'm intrigued: what, in the above case, caused the poor performance?
There was a debate on the LKML in the late 1990s, when Caldera wanted STREAMS support in Linux. To the extent that the arguments were technical *), my understanding is that the main show-stopper was that STREAMS would have made 'zero copy' networking impossible. If so, that is a comment more about the underlying buffer strategy than about STREAMS itself.
Did STREAMS also perform poorly in the 1986 context they were developed in?
Paul
*) Other arguments pro and con included forward maintenance and market need, but I'm not so interested in those aspects of the debate.
> Indeed the Unix manuals were available as printed books. Volume One was
> the manual pages and Volume Two the articles from /usr/doc. I remember
> seeing soft-cover bound copies of the 7th Edition manuals, ...
> I think the next time this happened in the exact same way was with the
> "Unix Research System Tenth Edition" books published by Saunders College
> Publishing in 1990.
Those were the only two that were published as trade books. I still use
the 10th Ed regularly. The 7th Ed was a debacle. The publisher didn't
bother to send us galleys because they had printed straight from troff.
It turned out they did not have the full troff character set, and put
an @ sign in place of each missing character. The whole print run was
done before we saw a copy. Not knowing whether they ever fixed it, I'd
be interested to hear whether or not the botch made it to bookstores.
Doug
Many thanks for the below notes!
Some comments in line below:
> The initial user-mode environment was a mix of 32V,
> subsequent work within 1127, and imports from 4.1BSD.
> I don't know the exact heritage: whether it was 1127's
> work with 4.1BSD stuff added or vice-versa.
Looking at the organisation of the source tree, I'd say it is more likely that the base was 32V with bits of 4.1BSD imported than the other way around. Had it been the other way around, somebody would have spent considerable time reorganising the source tree back into a form consistent with 32V, and I think such an effort would have been remembered even 40 years later.
> The kernel was a clean break, however: 4.1xBSD for some
> value of x (probably 4.1a but I don't remember which)
> with Research changes.
I don’t mean disrespect, but I think the surviving sources support Rob’s recollection that it was a gradual, ongoing effort.
As a first approximation, looking at the top comment of a file gives its origin: the BSD-derived files still have an SCCS-type marker. For example the file https://www.tuhs.org/cgi-bin/utree.pl?file=V8/usr/sys/sys/vmmem.c still has the top comment "/* vmmem.c 4.7 81/07/09 */", even though it was last touched in 1985. (By the way, who knows which tool generated these comments? Is it early SCCS?)
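For what it is worth, the format is consistent with SCCS keyword expansion: a line of the form

    /* %M% %I% %E% */

in the checked-in file expands, on extraction, to

    /* vmmem.c 4.7 81/07/09 */

where %M% is the module name, %I% the SID (release.level) and %E% the date (YY/MM/DD) of the newest applied delta. So early SCCS does seem a plausible origin, though the V8 tree alone can't prove it.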
For the VM code, the BSD version stamp comment strings are consistent with the 4.1BSD release. For the TCP/IP stack they are consistent with 4.2BSD; it would seem probable to me that this code was imported multiple times during the development of 8th Edition.
As far as I can tell 4.1aBSD was released in March or April 1982. Unfortunately no source code tape of it has surfaced, and SCCS coverage at this point is still very partial. I think 4.1b, with the initial FFS implementation, followed in late summer 1982; I don't have a more precise date (yet).
> -- Berkeley FFS replaced by Weinberger's bitmapped
> file system: essentially the V7 file system except
> the free list was a bitmap and the blocksize was 4KiB.
Thank you for pointing this out. With my focus on networking I had completely missed that.
> Hacky implementation, depending on a flag bit in the
> minor device number; didn't use the file system switch.
> Old 512-byte-block file systems had to be supported
> partly to ease the changeover, partly because the first
> version had a limited bitmap size so file systems larger
> than about 120MiB wouldn't work.
For those interested, some of the relevant files are:
https://www.tuhs.org/cgi-bin/utree.pl?file=V8/usr/sys/h/param.h (middle bit)
https://www.tuhs.org/cgi-bin/utree.pl?file=V8/usr/sys/h/filsys.h (note the union)
https://www.tuhs.org/cgi-bin/utree.pl?file=V8/usr/sys/sys/alloc.c (note 'if(BITFS(dev))')
And indeed the bitmap was fitted inside the 4KB superblock, 961 longs.
961 x 32 bits x 4KB = 120MB
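As a quick sanity check of that arithmetic (a throwaway program of mine, not V8 code):

    #include <stdio.h>

    int main(void)
    {
            long maplongs = 961;            /* longs of bitmap in the 4KB superblock */
            long blocks = maplongs * 32;    /* one bit per 4KB block -> 30752 blocks */
            long bytes = blocks * 4096;

            printf("%ld blocks, %ld bytes (~%ld MB)\n",
                blocks, bytes, bytes / (1024L * 1024L));
            return 0;
    }

which prints 30752 blocks, 125960192 bytes (~120 MB), matching the limit Norman mentions.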
I’m not sure I understand the link between cluster and page size that is mentioned in param.h
> This limit was removed
> later. (In retrospect I'm surprised I didn't then insist
> on converting any remaining old-format file systems in
> our domain and then removing the old-format code from
> the kernel, since user-mode tools--including a user-mode
> file server!--could be used to access any old disks
> discovered later.)
>
> For the purposes of Paul's note it probably suffices
> just to say that there was a restart with a 4.1-series
> kernel with changes as he describes, except also the
> new file system format.
Warren has been nice enough to put 8th, 9th and 10th edition on the TUHS “Unix Tree” web page.
There is the following question on each entry web page: “Who wants to write something here?”
Below is my suggested draft text for Eighth Edition. All suggestions for improvement are welcome.
===
Shortly after the release of 7th Edition, the VAX became the base machine for further Unix development. The initial code base was the 32V port, enhanced with selected elements from 4.1BSD (such as support for virtual memory) and later from 4.2BSD (the TCP/IP stack). From there the code evolved further: Eighth Edition of Unix was released by Bell Laboratories in February 1985, six years after Seventh Edition.
Key innovations in 8th Edition include ‘streams’ and the 'file system switch’, which allowed the “everything is a file” approach to be extended to new areas. Three notable applications built on these were the ‘/proc’ file system and a new debugger API; a unified approach to networking over Datakit, TCP/IP and phone lines; and a network file system.
Eighth Edition is also at the root of graphical user interfaces on Unix, being the platform used for the development of the ‘Blit’ graphical terminal.
Several of the new ideas from Eighth Edition found their way into Release 3 of System V, although in much modified form.
===
Anybody feel like a text/voice chat on the ClassicCmp Discord server in about 13 hours, say 2200 UTC?
#coff and the General voice channel.
I'll pop on for an hour but start whenever you feel like.
Cheers, Warren
Minor corrections to the material in Paul's text.
This is meant to be a laundry-list of facts, not a
suggested set of words; I'm feeling too prolix this
morning to produce the latter, and figure those on the
list may be interested in the petty details anyway.
The initial user-mode environment was a mix of 32V,
subsequent work within 1127, and imports from 4.1BSD.
I don't know the exact heritage: whether it was 1127's
work with 4.1BSD stuff added or vice-versa.
The kernel was a clean break, however: 4.1xBSD for some
value of x (probably 4.1a but I don't remember which)
with Research changes. By the time of V8, that means:
-- All trace of BSD's original network interfaces removed,
except that select(2) remained in a slightly-different
form.
-- Stream I/O system added; all communication-device
drivers (serial ports, Ethernet, Datakit) changed to
work with streams. Pipes were streams.
-- File system switch added, supporting Killian's /proc
and Weinberger's first-generation (neta) network file
system.
-- Berkeley FFS replaced by Weinberger's bitmapped
file system: essentially the V7 file system except
the free list was a bitmap and the blocksize was 4KiB.
Hacky implementation, depending on a flag bit in the
minor device number; didn't use the file system switch.
Old 512-byte-block file systems had to be supported
partly to ease the changeover, partly because the first
version had a limited bitmap size so file systems larger
than about 120MiB wouldn't work. This limit was removed
later. (In retrospect I'm surprised I didn't then insist
on converting any remaining old-format file systems in
our domain and then removing the old-format code from
the kernel, since user-mode tools--including a user-mode
file server!--could be used to access any old disks
discovered later.)
For the purposes of Paul's note it probably suffices
just to say that there was a restart with a 4.1-series
kernel with changes as he describes, except also the
new file system format.
Norman Wilson
Toronto ON
Doug McIlroy:
The v8 manual was printed in 1985, but the system was
not "released" in the ordinary sense until a couple of
years ago. Some v8 features made it out into the world
via USG; some were described in open literature or
Usenix presentations, but I believe none were formally
shipped out of the company.
I'm surprised; I thought copies of the V8 manual existed
when I arrived at the Labs in mid-1984, but the date on
the title page is indeed February 1985.
There was no general release of V8 like those for earlier
Research systems, but there was a quasi-official V8 tape
sent to a handful of universities under a special letter
agreement. I remember working on that with Dennis,
checking that everything compiled and worked properly
in a chroot environment before the tape was written.
I think that happened in the summer of 1985.
I don't remember our doing that work, to make a single
coherent consistency-checked release tape, for any
subsequent system; just one-off caveat-emptor snapshots.
Norman Wilson
Toronto ON
I ran a search for ‘Datakit’ on the archive of this mailing list and came across the message below from Norman Wilson (Sep 2017). Having spent quite a bit of time recently figuring out Datakit details and the 8th Edition source, I now understand much better what he was saying, or at least I think I do.
It made me take another look at the /dev/pk[0123].c files in the V7 source code. I’d seen them before, but always thought they were UUCP code.
Now I’m wondering. It looks like the UUCP packet "protocol g” may be much the same as the original (“Chesson”) packet algorithm for Datakit, and if so it would be “dual use”. It would seem that in V7, line discipline ‘0’ was normal tty handling, discipline ‘1’ was the PK protocol over a serial line, and discipline ‘2’ was the PK protocol over something with CRC in the driver - whatever that was.
If the above thought is correct, it shines a light on network buffering in V7: it uses buffer space in blocks of n*32 bytes, carved out of a pool of disk buffers (see pk3.c), and it pre-allocates space for one full receive window.
I have not fully figured it out, but at first glance it seems that the PK line discipline was integrated only with the DH-11 driver in the public V7 source. That would make sense in a networking context, as that board offered input buffering and DMA output to reduce the interrupt load. In 1979 Datakit seems to have been connected via a DR-11C board, but there is no driver for that in the V7 source tree.
Am I on the right track?
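For those who want to look, the single line-discipline switch under discussion is roughly the following (paraphrased from V7's usr/sys/h/conf.h; field names approximate):

    struct linesw {
            int     (*l_open)();
            int     (*l_close)();
            int     (*l_read)();
            int     (*l_write)();
            int     (*l_ioctl)();
            int     (*l_rint)();    /* receiver interrupt: one input character */
            int     (*l_rend)();
            int     (*l_meta)();
            int     (*l_start)();
            int     (*l_modem)();
    };

    /* linesw[0] (see usr/sys/conf/c.c) is canonical tty handling;
     * when configured, the PK protocol would occupy further entries,
     * selected per line by an ioctl. Only one entry can be active on
     * a line at a time, which is exactly the limitation that streams
     * later removed. */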
=====
The point of the stream I/O setup with
stackable line disciplines, rather than the old single
line-discipline switch, was specifically to support networking
as well as tty processing.
Serial-device drivers in V7 used a single line-discipline
driver, used variously for canonical-tty handling and for
network protocols. The standard system as used outside
the labs had only one line discipline configured, with
standard tty handling (see usr/sys/conf/c.c). There were
driver source files for what I think were internal-use-only
networks (dev/pk[12].c, perhaps), but I don't think they
were used outside AT&T.
The problem Dennis wanted to solve was that tty handling
and network protocol handling interfered with one another;
you couldn't ask the kernel to do both, because there was
only one line discipline at a time. Hence the stackable
modules. It was possible to duplicate tty handling (probably
by placing calls to the regular tty line discipline's innards)
within the network-protocol code, but that was messy. It also
ran into trouble when people wanted to use the C shell, which
expected its own special `new tty' line discipline, so the
network code would have to know which tty driver to call.
It made more sense to stack the modules instead, so the tty
code was there only if it was needed, and different tty
drivers could exist without the network code knowing or caring.
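Schematically, the resulting structure is something like this (a conceptual sketch with invented names, not the actual V8 source):

    /* A stream is a full-duplex stack of modules between a device
     * driver at the bottom and the user's file descriptor on top;
     * each module has a put routine for each direction. */
    struct module {
            struct module *up, *down;
            void (*input)();        /* data moving up, toward the user */
            void (*output)();       /* data moving down, toward the device */
    };

    /* Pushing a tty module splices it on top of, e.g., a Datakit
     * protocol module; popping it removes tty processing without
     * the modules below knowing or caring. csh could thus swap the
     * tty module for its newtty module at run time. */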
When I arrived at the Labs in 1984, the streams code was in
use daily by most of us in 1127. The terminals on our desks
were plugged into serial ports on Datakit (like what we call
a terminal server now). I would turn on my terminal in the
morning, tell the prompt which system I wanted to connect to,
and so far as I could tell I had a direct serial connection.
But in the remote host, my shell talked to an instance of the
tty line module, which exchanged data with a Datakit protocol
module, which exchanged data with the low-level Datakit driver.
If I switched to the C shell (I didn't but some did), csh would
pop off the tty module and push on the newtty module, and the
network code was none the wiser.
Later there was a TCP/IP that used the stream mechanism. The
first version was shoehorned in by Robert T Morris, who worked
as a summer intern for us; it was later cleaned up considerably
by Paul Glick. It's more complicated because of all the
multiplexers involved (Ethernet packets split up by protocol
number; IP packets divided by their own protocol number;
TCP packets into sessions), but it worked. I still use it at
home. Its major flaw is that details of the original stream
implementation make it messy to handle windows of more than
4096 bytes; there are also some quirks involving data left in
the pipe when a connection closes, something Dennis's code
doesn't handle well.
The much-messier STREAMS that came out of the official System
V people had fixes for some of that, but at the cost of quite
a bit more complexity; it could probably be done rather better.
At one point I wanted to have a go at it, but I've never had
the time, and now I doubt I ever will.
One demonstration of virtue, though: although Datakit was the
workhorse network in Research when I was there (and despite
the common bias against virtual circuits it worked pretty well;
the major drawback was that although the underlying Datakit
fabric could run at multiple megabits per second, we never had
a host interface that could reliably run at even a single megabit),
we did once arrange to run TCP/IP over a Datakit connection.
It was very simple in concept: make a Datakit connection (so the
Datakit protocol module is present); push an IP instance onto
that stream; and off you go.
I did something similar in my home V10 world when quickly writing
my own implementation of PPP from the specs many years ago.
The core of that code is still in use in my home-written PPPoE code.
PPP and PPPoE are handled entirely outside the kernel; the user-mode program
reads and writes the serial device (PPP) or an Ethernet instance
that returns just the desired protocol types (PPPoE), does the
PPP processing, and reads and writes IP packets to a (full-duplex
stream) pipe, on the other end of which the IP module is pushed.
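In outline, such a user-mode pump looks something like the following
(an illustration only, with a hypothetical ppp_decode() standing in
for the real framing logic; the actual code is more involved):

    #include <unistd.h>

    /* hypothetical: strips PPP framing, leaves an IP packet in pkt,
     * returns its length, or 0 if no complete packet yet */
    extern int ppp_decode(unsigned char *buf, int n, unsigned char *pkt);

    void pump(int devfd, int pipefd)
    {
            unsigned char buf[2048], pkt[2048];
            int n, m;

            while ((n = read(devfd, buf, sizeof buf)) > 0) {
                    m = ppp_decode(buf, n, pkt);
                    if (m > 0)
                            write(pipefd, pkt, m);  /* IP module on the far end */
            }
    }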
All this is very different from the socket(2) way of thinking,
and it has its vices, but it also has its virtues.
Norman Wilson
Toronto ON