TUHS November 2017

tuhs@tuhs.org

82 participants
124 discussions

by Paul Ruizendaal

I'm trying to figure out how TCP/IP networking worked in 8th Edition Unix. I'm starting from dmr's paper about streams (http://cm.bell-labs.co/who/dmr/st.html), the V8 man pages (http://man.cat-v.org/unix_8th/3/), and browsing the source code (tarball here: http://www.tuhs.org/Archive/Distributions/Research/Dan_Cross_v8/).

In the below I use 'socket' to mean a file descriptor bound to a network connection. My current understanding is like this:

- The hardware interface is exposed as a character device; for TCP/IP only ethernet is supported. Directly accessing this device reads/writes ethernet frames.

- One can push an 'ip' module (only) onto an ethernet device; this module also handles ARP. Once this is done, IP messages are queued to the virtual ip devices, /dev/ipXX. The device minor number XX equals the protocol number, i.e. the IP packets are demultiplexed into separate virtual devices. IP packets from multiple ethernet cards all end up on the same virtual ip devices. I'm not sure if one can still read/write to the ethernet device after pushing the ip module, but I think you can, creating a raw IP interface, so to speak.

- On /dev/ip6 one can push a TCP module. The TCP module handles the TCP protocol and demultiplexes incoming traffic to the virtual /dev/tcpXX devices. On /dev/ip17 one can push a UDP module. The UDP module handles the UDP protocol and demultiplexes incoming traffic to the virtual /dev/udpXX devices. Not sure whether the ip6 and ip17 devices can still be read/written after pushing these disciplines.

- There are 100 udp devices, /dev/udpXX. To open a UDP socket, one opens an unused udp device (through linear search). This socket accepts binary commands ('struct udpuser') through the read()/write() system calls. There is a command to set the local port (effectively 'bind') and a command to also set the foreign address and port (effectively 'bind+connect'). As long as the socket is not connected, actual datagrams are preceded by a command header with the address/port information (effectively 'sendto'/'recvfrom'). Once the socket is connected, it is no longer possible to send further commands, but each write/read is a datagram. For udp sockets it is not possible to specify the local address: it is chosen by the system to match the given foreign address.

- There are 100 tcp devices, /dev/tcpXX. Initial connection is always over an odd numbered device. To open a TCP socket, one opens an unused tcp device (through linear search). This socket accepts binary commands ('struct tcpuser') through the read()/write() system calls. There is a command to actively connect (effectively 'connect' with optional 'bind'), and a command to passively listen (effectively 'bind'+'listen'). If the connect command is sent, one can read one more response block and then the socket becomes a regular tcp socket. If the listen command is sent, one can read multiple response blocks, one for each new client (effectively 'accept'). Those response blocks contain a device number for the new client connection, i.e. one has to subsequently open device /dev/tcpXY to talk to the client. This number is always even, i.e. locally initiated tcp connections are over odd numbered tcp devices, and remotely initiated connections are over even numbered tcp devices. I'm not sure what the significance of this is.

- The above seems to be modeled on the Datakit setup, where the network is exposed as 520 virtual devices, one for each channel, as /dev/dk/dkXXX. These channels then also seem to accept binary command blocks through the read()/write() interface, with a 'connect' type command changing the connection into a data-only channel.

Anybody on the list with 8th Edition experience who can confirm that the above understanding is about correct?

Paul
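To make the "open an unused device through linear search" step concrete, here is a minimal Python sketch. It is not taken from the V8 sources: the function names, the zero-padded two-digit device names, and the idea that a busy device simply fails to open are all assumptions made for illustration. Only the odd-numbered devices are tried, following the observation above that locally initiated TCP connections use odd minors.

```python
import os


def tcp_device_names(count=100, prefix="/dev/tcp"):
    """Yield candidate device names with odd minor numbers only.

    Assumption: names are the prefix plus a zero-padded two-digit minor.
    """
    for minor in range(1, count, 2):
        yield f"{prefix}{minor:02d}"


def open_unused(names, opener=lambda path: os.open(path, os.O_RDWR)):
    """Linear search: return (name, handle) for the first device that opens.

    Assumption: a device that is already in use (or missing) raises OSError.
    """
    for name in names:
        try:
            return name, opener(name)
        except OSError:
            continue  # busy or absent, so try the next candidate
    raise OSError("no free device found")


if __name__ == "__main__":
    # Stand-in opener so the sketch can run without any real devices.
    busy = {"/dev/tcp01", "/dev/tcp03"}

    def fake_opener(path):
        if path in busy:
            raise OSError("device busy")
        return path  # a real opener would return a file descriptor

    print(open_unused(tcp_device_names(), fake_opener))
```

The opener is passed in as a parameter so the search logic can be exercised without touching /dev; on a real system the returned descriptor would then be used for the binary command blocks described above.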

7 years, 7 months

Re: [TUHS] Harvard and Von Neumann Architectures and Unix

by jnc＠mercury.lcs.mit.edu

> From: Will Senn <will.senn(a)gmail.com>

> I am curious about how the Harvard Architecture relates to Unix,
> historically. If the Harvard Architecture is predicated on the
> separation of code from data in order to prevent self-modifying code (my
> interpretation)

That's not the 'dictionary' definition, which is 'separate paths for instructions and data'. But let's go with the 'no self-modifying code' one for the moment.

The thing is that self-modifying code is pretty much an artifact of the dawn of computers, before the economics of gates moved from that of tubes, to transistors, and also before people understood how important good support for subroutines was. (This latter is a reference to how Whirlwind did subroutines, with self-modifying code.)

Once people had index registers, and lots of registers in general, self-modifying code (except for a few small, special hacks like bootstraps which had to fit in tiny spaces) became as dead as the dodo. It's just a Bad Idea.

> then it would seem to me to be somewhat at odds with a Unix philosophy
> of extreme abstraction (code, data, it's all 0's and 1's, after all).

The people who built Unix were fundamentally very practical. Self-modifying code is not 'practical'. (And note that Unix from V4:

http://minnie.tuhs.org/cgi-bin/utree.pl?file=V4/nsys/ken/text.c

onward has support for pure text - for practical reasons).

> the PDP-11 itself, with the Unibus and apparently agnostic ISA seem to
> summarily reject the Harvard Architecture...

You could say that of a zillion computers. The only recent computer I can think of offhand with separate instruction and data paths was the AMD 29K (nice chip, I used it in a product we built at Proteon). They had separate ports for instructions and data purely for performance reasons. (Our card had a pathway which allowed the CPU to write the instruction memory, needed during booting, obviously; the details as to how we did it escape me now.)

> From: Jon Steinhart

> For all intents and purposes instructions were separate from data from
> the PDP 11/70 on.

s/70/45/. And the other -11 memory management (as on the /40, /23, etc) does allow for execute-only 'segments' (they call them 'pages' in the later versions of the manual, but they're not) - again, separating code from data. Unix used this for shared pure texts.

And note that those machines with separate I+D space don't meet the dictionary definition either, because they only have one bus from the CPU to memory, shared between data and instruction fetches.

Noel

7 years, 7 months

Re: [TUHS] Harvard and Von Neumann Architectures and Unix

by jnc＠mercury.lcs.mit.edu

> From: Doug McIlroy

> Optimal code for bitblt (raster block transfers) in the Blit

Interesting case. I'm not familiar with BitBLT codes, do they actually modify the existing program, or rather do they build small custom ones? Only the former is what I was thinking of.

Noel

7 years, 7 months

Re: [TUHS] Harvard and Von Neumann Architectures and Unix

by Doug McIlroy

From the discussion of self-modifying code:

>> Optimal code for bitblt (raster block transfers) in the Blit
>
> Interesting case. I'm not familiar with BitBLT codes, do they actually modify
> the existing program, or rather do they build small custom ones? Only the
> former is what I was thinking of.

It built small custom fragments of code. But if that had been in D space, it couldn't have been executed.

>> Surely JIT compiling must count as self-modifying code.
>
> If it does, then my computer just runs one program from when I turn it
> on. It switches memory formats and then is forever extending itself and
> throwing chunks away.

Exactly. That is the essence of stored-program computers. The exec system call is self-modification with a vengeance. Fill memory-and-execute is the grandest coercion I know. What is data one instant is code the next.

It's all a matter of viewpoint and scale. Where is the boundary between changing one instruction and changing them all? Or is this boundary a figment of imagination?

Doug
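The "small custom fragments of code" idea, and the point that data one instant is code the next, can be illustrated with a loose Python analogue. This is only a sketch under stated assumptions: it has nothing to do with how the Blit actually generated its bitblt code, and the function name make_fill and the row-filling task are invented for the example. The generated fragment is an unrolled loop with its constants baked in, built as a string and then compiled.

```python
def make_fill(width, value):
    """Build a row-filling function specialised for one width and value."""
    lines = ["def fill(row, start):"]
    # Unroll the loop: one store per element, with the constants baked in.
    for offset in range(width):
        lines.append(f"    row[start + {offset}] = {value!r}")
    lines.append("    return row")
    source = "\n".join(lines)  # at this point the fragment is plain data

    namespace = {}
    # compile() and exec() turn that data into executable code.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["fill"]


if __name__ == "__main__":
    fill3 = make_fill(3, 1)
    print(fill3([0] * 6, 2))
```

Nothing in the existing program is patched here; a new fragment is built and then run, which is the second of the two cases distinguished in the quoted question.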

7 years, 7 months

Re: [TUHS] TTY8

by jnc＠mercury.lcs.mit.edu

> From: "Ron Natalie"

> Every PDP-11 UNIX I ever used had the console KL-11 as /dev/tty8.
> The question is why.

Blast! I have this memory of reading an explanation for that somewhere - but I cannot remember what it was, or where! I've done a grep through my hoard of Unix documents, looking for "tty8", but no hits.

Noel

7 years, 7 months

Re: [TUHS] Harvard and Von Neumann Architectures and Unix

by Doug McIlroy

> The thing is that self-modifying code is pretty much an artifact of the dawn
> of computers, [...]
>
> It's just a Bad Idea.

Surely JIT compiling must count as self-modifying code.

Optimal code for bitblt (raster block transfers) in the Blit

7 years, 7 months

Re: [TUHS] Spell - was tmac: Move macro diagnostics away from `quotes'

by Doug McIlroy

Repeat, slightly modified, of a previous post that got shunted to the attachment heap.

> I am curious if anyone on the list remembers much
> about the development of the first spell checkers in Unix?

Yes, intimately. They had no relationship to the PDP 10.

The first one was a fantastic tour de force by Bob Morris, called "typo". Aside from the file "eign" of the very most common English words, it had no vocabulary. Instead it evaluated the likelihood that any particular word came from a source with the same letter-trigram frequencies as the document as a whole. The words were then printed in increasing order of likelihood. Typos tended to come early in the list.

Typo, introduced in v3, was very popular until Steve Johnson wrote "spell", a remarkably short shell script that (efficiently) looks up a document's words in the wordlist of Webster's Collegiate Dictionary, which we had on line. The only "real" coding he did was to write a simple affix-stripping program to make it possible to look up plurals, past tenses, etc. If memory serves, Steve's program is described in Kernighan and Pike. It appeared in v5.

Steve's program was good, but the dictionary isn't an ideal source for real text, which abounds in proper names and terms of art. It also has a lot of rare words that don't pull their weight in a spell checker, and some attractive nuisances, especially obscure short words from Scots, botany, etc, which are more likely to arise in everyday text as typos than by intent.

Given the basic success of Steve's program, I undertook to make a more useful spelling list, along with more vigorous affix stripping (and a stop list to avert associated traps, e.g. "presenation" = pre+senate+ion). That has been described in Bentley's "Programming Pearls" and in http://www.cs.dartmouth.edu/~doug/spell.pdf.

Morris's program and mine labored under space constraints, so have some pretty ingenious coding tricks. In fact Morris has a patent on the way he counted frequencies of the 26^3 trigrams in 26^3 bytes, even though the counts could exceed 255. I did some heroic (and probabilistic) encoding to squeeze a 30,000 word dictionary into a 64K data space, without severely affecting lookup time.

Doug
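The trigram idea behind "typo" can be sketched in a few lines of Python. This is a minimal illustration, not Morris's program: the scoring rule (mean log relative frequency of a word's trigrams), the padding of each word with spaces, and the names trigrams and rank_by_oddness are all assumptions, and the "eign" list of common words is left out. It only shows the general shape: count trigrams over the whole document, score each word against those counts, and list the least likely words first.

```python
import math
import re
from collections import Counter


def trigrams(word):
    """Return the letter trigrams of a word, padded with a space at each end."""
    padded = f" {word} "  # assumption: word boundaries count as context
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def rank_by_oddness(text):
    """List the distinct words of text, least likely first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for w in words for t in trigrams(w))
    total = sum(counts.values())

    def score(word):
        # Mean log relative frequency of the word's trigrams in this document.
        tris = trigrams(word)
        return sum(math.log(counts[t] / total) for t in tris) / len(tris)

    # Ties are broken alphabetically so the order is deterministic.
    return sorted(set(words), key=lambda w: (score(w), w))


if __name__ == "__main__":
    sample = "the cat and the hat and the bat and the qzx"
    for word in rank_by_oddness(sample):
        print(word)
```

Because every trigram of a word was itself counted from the same document, no count is ever zero, so the logarithm is always defined. A word made up entirely of trigrams that occur once in the document gets the lowest possible score and is listed first.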

7 years, 7 months

Re: [TUHS] Spell - was tmac: Move macro diagnostics away from `quotes'

by Nelson H. F. Beebe

BibTeX entries for the complete contents of the Bell System Technical Journal family are in the TeX User Group archives at

http://www.math.utah.edu/pub/tex/bib/bstj1970.bib

[change 1970 to other decades, and .bib to .html for live hyperlinks].

The PDF URLs for bstj.bell-labs.com no longer work, and the ones for www.alcatel-lucent.com, such as

http://www.alcatel-lucent.com/bstj/vol57-1978/articles/bstj57-6-2155.pdf

now redirect to an HTML page.

Otherwise, articles are available from the Wiley site at

http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1538-7305/issues/

but are behind a paywall.

There are also copies in the IEEE Xplore database at

http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?reload=true&punumber=6731002

I tried to find the URLs at https://web.archive.org/, but it does not appear to have them.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe(a)math.utah.edu  -
- 155 S 1400 E RM 233                       beebe(a)acm.org  beebe(a)computer.org  -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------

7 years, 7 months

Re: [TUHS] Spell - was tmac: Move macro diagnostics away from `quotes'

by jnc＠mercury.lcs.mit.edu

> From: "Nelson H. F. Beebe"

> The PDF URLs for bstj.bell-labs.com no longer work, and the ones for
> www.alcatel-lucent.com ... now redirect to an HTML page.

With any luck, someone scraped them before they went. I've gotten in the habit of scraping all the Web content I look at, since it has (as above) a distressing tendency to vapourize.

Noel

7 years, 7 months

Re: [TUHS] A man easter-egg (gimme gimme gimme)

by Nemo

On 24 November 2017 at 10:11, Nemo <cym224(a)gmail.com> wrote:
> On 22 November 2017 at 03:48, <arnold(a)skeeve.com> wrote (in part):
>>> As a former developer and manager, I would be really pissed off if my
>>> programmers wasted their time on writing useless frippery instead of
>>> quality code, and I would certainly have a little chat with them...
>>
>> I think that this is totally appropriate for code being developed
>> for a paid product.
>
> I would say this is context-sensitive (industry, customers, ...). One
> version of MS Word had an animation of a cartoon monster crushing "WP"
> (somewhere in the credits, I recall).
>
> N.

I really must be more careful with replies. The above was meant for TUHS, not just Arnold.

N.

7 years, 7 months

