>> I don't know the answer to Ctrl-D.
The Unix command "man ascii" has the answer:
Oct   Dec   Hex   Char                          Oct   Dec   Hex   Char
------------------------------------------------------------------------
000   0     00    NUL '\0'                      100   64    40    @
001   1     01    SOH (start of heading)        101   65    41    A
002   2     02    STX (start of text)           102   66    42    B
003   3     03    ETX (end of text)             103   67    43    C
004   4     04    EOT (end of transmission)     104   68    44    D
....
Ctrl-D signifies end of transmission. Some other O/Ses have used
Ctrl-Z for that purpose, presumably because Z is the final letter
of numerous alphabets.
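A minimal sketch (mine, not from the original post) of what EOT means in
practice: in canonical mode the Unix tty driver turns a Ctrl-D typed at
the start of a line into a zero-byte read(), which programs treat as
end-of-file.

    #include <unistd.h>

    int main(void)
    {
        char buf[256];
        ssize_t n;

        /* echo input back until read() returns 0 */
        while ((n = read(0, buf, sizeof buf)) > 0)
            write(1, buf, n);
        /* n == 0 here: the tty driver turned Ctrl-D into EOF */
        return 0;
    }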
There is a good book about the history of character sets (pre-Unicode),
described at this URL:
http://www.math.utah.edu/pub/tex/bib/master.html#Mackenzie:1980:CCS
Bob Bemer (1920--2004), known as Dr. ASCII to some of us, was a key
person in the standardization of character sets:
https://en.wikipedia.org/wiki/Bob_Bemer
https://en.wikipedia.org/wiki/ASCII
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
What are the 1970’s & 1980’s Computing / IT skills “our grandkids won’t have”?
Whistling into a telephone while the modem is attached, because your keyboard has a stuck key
- something I absolutely don’t miss.
Having a computer in a grimy warehouse with 400 days of uptime & wondering how a reboot might go?
steve j
=========
9 Skills Our Grandkids Will Never Have
<https://blog.myheritage.com/2022/06/9-skills-our-grandkids-will-never-have/>
1. Using record players, audio cassettes, and VCRs
2. Using analog phones [ or an Analog Clock ]
3. Writing letters by hand and mailing them
4. Reading and writing in cursive
5. Using manual research methods [ this is a Genealogy site ]
6. Preparing food the old-fashioned way
7. Creating and mending clothing
8. Building furniture from scratch
9. Speaking the languages of their ancestors
--
Steve Jenkin, IT Systems and Design
0412 786 915 (+61 412 786 915)
PO Box 38, Kippax ACT 2615, AUSTRALIA
mailto:sjenkin@canb.auug.org.au http://members.tip.net.au/~sjenkin
Warner Losh:
Alcatel-Lucent gave an official grant to V8, V9 and V10. See
https://www.tuhs.org/Archive/Distributions/Research/Dan_Cross_v8/statement_…
====
Quite so. I believe this was announced on a mailing list called TUHS.
Those here who are interested in such things might want to subscribe;
I have and find it quite useful and interesting, with occasional
disappointment.
Norman Wilson
Toronto ON
(typing this on a train in Texas)
> I understand UNIX v7 is under this BSD-style license by Caldera Inc.
> https://www.tuhs.org/Archive/Caldera-license.pdf
The eqn document by Kernighan and Cherry also appears in the v10
manual, copyright by AT&T and published as a trade book. Wouldn't the
recent release of v10 also pertain to the manual?
Doug
Following an insightful post by Norman Wilson (https://minnie.tuhs.org/pipermail/tuhs/2022-June/025929.html) and re-reading a few old papers (https://minnie.tuhs.org/pipermail/tuhs/2022-June/026028.html) I was thinking about similarities and differences between the various Unix networking approaches in the 1975-1985 era and came up with the following observations:
- First something obvious: early Unix was organised around two classes of device: character based and block based. Arguably, it is better to think of these classes conceptually as “transient” and “memoizing”. A difference between the two would be whether or not it makes conceptual sense to do a seek operation on them; pipes and networks are in the transient class.
- On the implementation side, this relates to two early kernel data structures: clists and disk buffers. Clists were designed for slow, low volume traffic, and most early Unix network code creates a third kind: the mbufs of Arpanet Unix, BBN-TCP Unix and BSD, the packets of Chesson's V7 packet driver, Ritchie's streams, etc. These are all the same when seen from afar: higher capacity replacements for clists (a sketch of both structures follows this list).
- Typically devices are accessed via a filter. At an abstract level, there is not much difference between selecting a line discipline, pushing a stream filter or selecting a socket type. At the extreme end one could argue that pushing a TCP stack on a network device is conceptually the same as mounting a file system on a disk device. Arguably, both these operations could be performed through a generalised mount() call.
- Another implementation point is the organisation of the code. Is the network code in the kernel, or in user land? Conceptually connection management is different from stream management when connected (e.g. CMC and URP with Datakit, or RTP and BSP in Xerox Pups). In the BSD lineage all is in the kernel, and in the Research lineage connection management is done in a user space daemon.
Arpanet Unix (originally based on V5) had a curious solution: the network code was organised in a single process, but with code both in kernel mode and in user mode. The user code would make a special system call, and the kernel code would interact with the IMP driver, manage buffers and deliver packets. Only when a state-changing event happened would it return to user mode, and the user code would handle connection management (followed by a new call into kernel mode). Interestingly, this approach mostly hid the IMP connection, and this carried through to the BSDs, where the network devices were also buried in the stack. Arpanet Unix made this choice to conserve kernel address space and to minimize the amount of original kernel code that had to be touched.
- Early Unix has three ways to obtain a file descriptor: open, creat and pipe. Later also fifo. In this context adding more (like socket) does not seem like a mortal sin. Arguably, all could be rolled into one, with open() handling all cases. Some of this was done in 4.2BSD. It is possible to combine socket() & friends into open() with additional flags, much as was done in Arpanet Unix and BBN-TCP Unix.
- Network connections have different meta data than disk files, and in sockets this is handled via specialised calls. This seems a missed opportunity for unified mechanisms. The API used in BBN-TCP handles most of this via ioctl. However, one could (cheekily!) argue that V7 Unix has a somewhat byzantine meta data API, with the functionality split over seek, ioctl, fcntl, stat and fstat. These could all be handled in a generalised ioctl. Conceptually, this could also be replaced by using read/write on a meta data file descriptor, which could for example be the regular descriptor with the high bit set. But this, of course, never existed.
- A pain point in Arpanet Unix was that a listening connection (i.e. a server endpoint) would block until a client arrived and then turn into the connection with the client. This would fork out into a service process and the main server process would open a new listening socket for the next client. In sockets this was improved into a rendez-vous type server connection that would spawn individual client connections via ‘accept’. The V8/V9 IPC library took a similar approach, but also developed the mechanism into a generalized way to (i) create rendez-vous points and (ii) ship descriptors across local connections.
- The strict blocking nature of IO in early Unix was another pain point for writing early network code. The first solution to that was BBN’s await and capac primitives, which worked around the blocking nature. With SysIII, non-blocking file access appeared, and 4.1a BSD saw the arrival of 'select’. Together these offer a much more convenient way to deal with multiple tty or network streams in a single threaded process (although it did modify some of the early Unix philosophy); a sketch combining accept and select follows this list. Non-blocking IO and select() also appeared in the Research lineage with 8th edition.
- The file system switch (FSS) arrived around 1983, during the gestation of 8th edition. This was just 1 or 2 years after the network interfaces for BSD and Datakit got their basic shape. Had the FSS been part of V7 (as it well could have been), probably the networking designs would have been a bit different, using virtual directories for networking connections. The ‘namei hack’ in MIT’s CHAOS network code already points in this direction. A similar approach could have been extended to named pipes (arriving in SysIII), where the fifo endpoint could have been set up through creating a file in a virtual directory, and making connections through a regular open of such a virtual file (and 9th edition appears to implement this.)
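As referenced above, a hedged sketch of the two buffering styles,
paraphrased from memory of V6/V7 and 4.2BSD rather than quoted from
either kernel (field names and sizes are approximate):

    /* V6-era clist cell: a few bytes per cell, cells chained into a
       per-device character queue; later versions enlarged the cell. */
    #define CBSIZE 6
    struct cblock {
        struct cblock *c_next;
        char           c_info[CBSIZE];
    };

    /* 4.2BSD-style mbuf: the same chaining idea, but with a much
       bigger cell, suitable for network packets. */
    #define MLEN 112
    struct mbuf {
        struct mbuf *m_next;    /* chain of buffers forming one packet */
        short        m_off;     /* start of valid data within m_dat */
        short        m_len;     /* bytes of valid data */
        char         m_dat[MLEN];
    };

Seen side by side, the "higher capacity replacement for clists" point is
hard to miss: the structure is the same, only the cell size differs.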
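And a minimal sketch of the rendez-vous and non-blocking points together,
using the 4.2BSD socket and select calls (error handling omitted, the
port number is an arbitrary example):

    #include <sys/socket.h>
    #include <sys/select.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in a;
        fd_set fds, r;
        char buf[512];
        int lfd, fd, maxfd, n;

        lfd = socket(AF_INET, SOCK_STREAM, 0);
        memset(&a, 0, sizeof a);
        a.sin_family = AF_INET;
        a.sin_port = htons(7777);
        bind(lfd, (struct sockaddr *)&a, sizeof a);
        listen(lfd, 5);                 /* rendez-vous endpoint stays open */

        FD_ZERO(&fds);
        FD_SET(lfd, &fds);
        maxfd = lfd;

        for (;;) {
            r = fds;
            select(maxfd + 1, &r, 0, 0, 0); /* wait on all streams at once */
            for (fd = 0; fd <= maxfd; fd++) {
                if (!FD_ISSET(fd, &r))
                    continue;
                if (fd == lfd) {            /* accept spawns a client fd */
                    int cfd = accept(lfd, 0, 0);
                    FD_SET(cfd, &fds);
                    if (cfd > maxfd)
                        maxfd = cfd;
                } else if ((n = read(fd, buf, sizeof buf)) <= 0) {
                    close(fd);              /* client went away */
                    FD_CLR(fd, &fds);
                } else {
                    write(fd, buf, n);      /* echo back */
                }
            }
        }
    }

Contrast with the Arpanet Unix model above: the listening descriptor is
never consumed by an arriving client.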
oOo
To me it seems that the V1-V7 abstractions, the system call API, etc. were created with the experience of CTSS, Multics and others fresh in mind. The issues were understood and it combined the best of the ideas that came before. When it came to networking, Unix did not have this advantage and was necessarily trying to ride a bike whilst inventing it. Maybe in a different time line it would have been possible to pick the best ideas in this area as well and combine these into a coherent framework.
I concur with the observation that this list should be about discussion of what once was and only tangentially about what might have been, so it is only after considerable hesitation that I write the below.
Looking at the compare and contrast above (and having been tainted by what became dominant in later decades), I would say that the most “Unixy” way to add networking to V7/SysIII era Unix would have been something like:
- Network access via open/read/write/close, in the style of BBN-TCP
- Network namespace exposed via a virtual file system, a bit like V9
- Meta data via a generalised ioctl, or via read/write on a meta data descriptor
- Connection rendez-vous via a generalised descriptor shipping mechanism, in the style of V8/V9
- Availability of non-blocking access, together with a waiting primitive (select/poll/etc.), in the style of BSD
- Primary network device visible as any other device, network protocol mounted similar to a file system.
- Both connection management and stream management located in kernel code, in the style of BSD
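To make the combination concrete, here is an entirely hypothetical sketch
of what user code might have looked like under such a scheme; the /net
path, the dial string and the NETGETPEER request code are all invented
for illustration and never existed:

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    #define NETGETPEER 0x4e01   /* invented meta data request code */

    int main(void)
    {
        char peer[64], buf[512];
        ssize_t n;

        /* connect by opening a name in a virtual network directory */
        int fd = open("/net/tcp/mtnview!google!www", O_RDWR);
        if (fd < 0)
            return 1;

        ioctl(fd, NETGETPEER, peer);    /* meta data via generalised ioctl */
        write(fd, "GET /\r\n", 7);      /* then plain read/write */
        while ((n = read(fd, buf, sizeof buf)) > 0)
            write(1, buf, n);
        close(fd);
        return 0;
    }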
I remember a fellow student debugging an LSI-11 kernel using a form of analogue vectorscope.
I think it had a pair of DACs attached to the upper bits of the address bus. It generated a 2D pattern which you could recognise as particular code - interrupts are here, userspace is there, etc.
the brightness of the spot indicated the time spent, so you got a bit of profiling too - and deadlocks became obvious.
Does anyone remember these, and what were they called? I think it was an HP or Tek product.
-Steve
Wanted to post my notes as plain text, but the bullets / sub-bullets get lost.
Here is a 2 page PDF with my notes on Research Datakit:
https://www.jslite.net/notes/rdk.pdf
The main takeaway is that connection build-up and tear-down is considerably more expensive than with TCP. The first cost is in the network, which builds up a dedicated path for each connection. Bandwidth is not allocated/reserved, but a path is and routing information is set up at each hop. The other cost is in the relatively verbose switch-host communication in this phase. This compares to the 3 packets exchanged at the hosts’ driver level to set up a TCP connection, with no permanent resources consumed in the network.
In compensation, the cost to use a connection is considerably lower: the routing is known and the host-host link protocol (“URP”) can be light-weight, as the network guarantees in-order delivery without duplicates but packets may be corrupted or lost (i.e. as if the connection is a phone line with a modem). No need to deal with packet fragmentation, stream reassembly and congestion storms as in the TCP of the early 80’s.
Doing UDP traffic to a fixed remote host is easily mapped to using URP with no error correction and no flow control. Doing UDP where the remote host is different all the time is not practical on a Datakit network (i.e. a virtual circuit would be set up anyway).
A secondary takeaway is that Research Datakit eventually settled on a three-level ascii namespace: “area/trunk/switch”. On each switch, the hosts would be known by name, and each connection request had a service name as parameter. In an alternate reality we would maybe have used “ca/stclara/mtnview!google!www” to do a search.
> From: Rob Pike
> having the switch do some of the call validation and even maybe
> authentication (I'm not sure...) sounds like it takes load off the host.
I don't have enough information to express a judgement in this particular
case, but I can say a few things about how one would go about analyzing
questions of 'where should I put function [X]; in the host, or in the
'network' (which almost inevitably means 'in the switches')'.
It seems to me that one has to examine three points:
- What is the 'cost' to actually _do_ the thing (which might be in
transmission usage, or computing power, or memory, or delay), in each
alternative; these costs obviously generally cannot be amortized across
multiple similar transactions.
- What is the 'cost' of providing the _mechanism_ to do the thing, in each
alternative. This comes in three parts. The first is the engineering cost of
_designing_ the thing, in detail; this obviously is amortized across multiple
instances. The second is _producing_ the mechanism, in the places where it is
needed (for mechanisms in software, this cost is essentially zero, unless it
needs a lot of memory/computes/etc); this is not amortized across many. The
third is harder to measure: it's complexity.
This is probably a book by itself, but it has costs that are hard to
quantify, and are also very disparate: e.g. more complex designs are more
likely to have unforeseen bugs, which is very different from the 'cost' that
more complex designs are probably harder to evolve for new uses.
So far I haven't said anything that isn't applicable across a broad range of
information systems. The last influence on where one puts functions is much
more common in communication systems: the Saltzer/Clark/Reed 'End-to-end
Arguments in System Design' questions. If one _has_ to put a function in the
host to get 'acceptable' performance of that function, the
operation/implementation/design cost implications are irrelevant: one has to
grit one's teeth and bear them.
This may then feed back to design questions in the other areas. E.g. the
Version 2 ring at MIT deliberately left out hardware packet checksums -
because it was mostly intended for use with TCP/IP traffic, which provided a
pseudo-End-to-End checksum, so the per-unit hardware costs didn't buy enough
to be worth the costs of a hardware CRC. (Which was the right call; I don't
recall the lack of a hardware checksum ever causing a problem.)
And then there's the 'technology is a moving target' point: something that
might be unacceptably expensive (in computing cost) in year X might be fine
in year X+10, when we're lighting our cigars with unneeded computing power.
So when one is designing a communication system with a likely lifetime in
many decades, one tends to bias one's judgement toward things like End-to-End
analysis - because those factors will be forever.
Sorry if I haven't offered any answer to your initial query: "having the
switch do some of the call validation ... sounds like it takes load off the
host", but as I have tried to explain, these 'where should one do [X]'
questions are very complicated, and one would need a lot more detail before
one could give a good answer.
But, in general, "tak[ing] load off the host" doesn't seem to rate
highly as a goal these days... :-) :-(
Noel
> From: Paul Ruizendaal
> Will read those RFC's, though -- thank you for pointing them out.
Oh, I wouldn't bother - unless you are really into routing (i.e. path
selection).
RFC-1992 in particular; it's got my name on it, but it was mostly written by
Martha and Isidro, and I'm not entirely happy with it. E.g. CSC mode and CSS
mode (roughly, strict source route and loose source route); I wasn't really
sold on them, but I was too tired to argue about it. Nimrod was complicated
enough without adding extra bells and whistles - and indeed, LSR and SSR are
basically unused to this day in the Internet (at least, at the internet
layer; MPLS does provide the ability to specify paths, which I gather is used
to some degree). I guess it's an OK overview of the architecture, though.
RFC-1753 is not the best overview, but it has interesting bits. E.g. 2.2
Packet Format Fields, Option 2: "The packet contains a stack of flow-ids,
with the current one on the top." If this reminds you of MPLS, it should!
(One can think of MPLS as Nimrod's packet-carrying subsystem, in some ways.)
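For the curious, a hedged sketch (mine, not from the RFCs) of the 32-bit
MPLS label-stack entry that the flow-id stack resembles; per RFC 3032
each entry packs a 20-bit label, 3 traffic-class bits, a bottom-of-stack
bit and a TTL:

    #include <stdint.h>

    /* One MPLS label-stack entry: label:20 | tc:3 | s:1 | ttl:8 */
    struct mpls_entry {
        uint32_t word;
    };

    static uint32_t mpls_label(struct mpls_entry e) { return e.word >> 12; }
    static int      mpls_bos(struct mpls_entry e)   { return (e.word >> 8) & 1; }
    static uint8_t  mpls_ttl(struct mpls_entry e)   { return (uint8_t)(e.word & 0xff); }

    /* Forwarding looks only at the top entry: "the current one on the
       top", exactly as in the Nimrod option quoted above. */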
I guess I should mention that Nimrod covers more stuff - a lot more - than
just path selection. That's because I felt that the architecture embodied in
IPv4 was missing lots of things which one would need to do the internet layer
'right' in a global-scale Internet (e.g. variable length 'addresses' - for
which we were forced to invent the term 'locator' because many nitwits in the
IETF couldn't wrap their minds around 'addresses' which weren't in every
packet header). And separation of location and identity; and the introduction
of traffic aggregates as first-class objects at the internet layer. Etc, etc,
etc.
Nimrod's main focus was really on i) providing a path-selection system which
allowed things like letting users have more input to selecting the path their
traffic took (just as when one gets into a car, one gets to pick the path
one's going to use), and ii) controlling the overhead of the routing.
Of course, on the latter point, in the real world, people just threw
resources (memory, computing power, bandwidth) at the problem. I'm kind of
blown away that there are almost 1 million routes in the DFZ these days.
Boiling frogs...
Noel
I thought this comment was very good.
I went looking for “Clem’s Law” (presumably Clem Cole) and struck out.
Any hints anyone can suggest or history on the comment?
steve j
==========
Larry McVoy wrote Fri Sep 17 10:44:25 AEST 2021
<https://minnie.tuhs.org/pipermail/tuhs/2021-September/024424.html>
Plan 9 is very cool but I am channeling my inner Clem,
Plan 9 didn't meet Clem's law.
It was never compelling enough to make the masses love it.
Linux was good enough.
==========
--
Steve Jenkin, IT Systems and Design
0412 786 915 (+61 412 786 915)
PO Box 38, Kippax ACT 2615, AUSTRALIA
mailto:sjenkin@canb.auug.org.au http://members.tip.net.au/~sjenkin