Hi all, to quickly answer one recent question. If you want to upload
something Unix-related for me to archive, you can upload via anonymous FTP to
ftp://minnie.tuhs.org/incoming/
Nobody can list the directory contents, so it's good for sensitive files.
If you upload something called xyz, please also add an xyz_Readme describing
e.g. what the thing is, where it came from, the file format (e.g.
floppy images), how to install it, and any other useful information.
If you think it can be added to the public Unix Archive at
http://www.tuhs.org/Archive/, or if the file definitely can't be added
and I should move it to the hidden archive, also say so. Also feel free
not to disclose your identity.
Cheers, Warren
P.S. Work has become busy this year. I might call for people to help
out with the curation. Any volunteers? Discretion is a prerequisite.
> From: Atindra Chaturvedi
> including the Mt. Xinu Mach 386 distro. I still have it and will happily
> send it to the archives
Oh, that's fantastic. It's so important that everyone who has these chunks of
computing history makes sure they make it into repositories!
> I have my books and the software for all the cool stuff as it came out
> in those days - some day I will compile it and send it to where it can
> be better used or archived as history.
Please do! And everyone else, please emulate! (I'm already doing my bit! :-)
Noel
> OK, we're starting to get through all the clearances needed to release
> the non-MIT Unix systems
We have now completed (as best we can) the OK's for the 'BBN TCP/IP V6 Unix',
and I finally bestirred myself to add in the documentation I found for it,
and crank out a tarball, available here:
http://ana-3.lcs.mit.edu/~jnc/tech/pdp11/tmp/bbn.tar
It includes all the documentation files I found for the Rand and BBN code (in
the ./doc directory); included are the original NROFF source to the two Rand
publications about ports, and several BBN reports.
This is an early TCP/IP Unix system written at BBN. It was not the first
TCP/IP Unix; that was one done at BBN in MACRO-11, based on a TCP done in
MACRO-11 by Jim Mathis at SRI for the TIU (Terminal Interface Unit).
This networking code is divided into three main groups. First, there is
code for the kernel, which includes IPC enhancements to Unix - Rand
ports, plus the further extensions to them done at BBN for the
earlier TCP: the capac() and await() calls. It also includes an IMP
interface driver (the code only interfaced to the ARPANET at this point in
time). Next, TCP is implemented as a daemon: a single process which
handles all the connections. Finally, other programs implement
applications; TELNET is the only one provided at this point in time.
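By way of illustration, user code would have driven a connection roughly as
below. This is only a sketch: the device name and the exact signatures of
await() and capac() are my guesses from the description above, not taken
from the BBN source.

    char buf[512];
    int port, n;

    port = open("/dev/tcpport", 0);        /* hypothetical port device */
    for (;;) {
        await(port);                       /* sleep until the port has activity */
        n = capac(port);                   /* bytes transferable without blocking */
        if (n > 0)
            read(port, buf, n < 512 ? n : 512);
    }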
The original port code was written by Steven Zucker at Rand; the extensions
done at BBN were by Jack Haverty. The TCP was mostly written by Mike
Wingfield, apparently with some assistance by Jon Dreyer. Dan Franklin
apparently wrote the TELNET.
Next, I'll be working on the MIT-CSR machine. That's going to take quite a
while - it's a whole system, with a lot of applications. It does include FTP,
SMTP, etc, though, so it will be a good system for anyone who wants to run V6
with TCP on a /23. We'll have to write device drivers for whatever networking
cards are out there, though.
Noel
> From: Larry McVoy
> Are you sure? Someone else said moshi was hi and mushi was bug. Does
> mushi have two meanings?
Yes:
http://www.nihongodict.com/?s=mushi
Actually, more than two! Japanese is chock-a-block with homonyms. Any
given Japanese word will probably have more than one meaning.
There's some story I don't quite recall about a recent Prime Minister who
made a mistake of this sort - although now that I think about it, it was
probably the _other_ kind of replication, which is that a given set of kanji
(ideograms) usually has more than one pronunciation. (I won't go into why,
see here:
http://mercury.lcs.mit.edu/~jnc/prints/glossary.html#Reading
for more.) So he was reading a speech, and gave the wrong reading for a word.
There is apparently a book (or more) in Japanese, for the Japanese, that lists
the common ones that cause confusion.
A very complicated language! The written form is equally complicated; there
are two syllabaries ('hiragana' and 'katakana'), and for the kanji, there are
several completely different written forms!
Noel
Follow-up to Larry's "Mushi! Mushi!" story
(http://minnie.tuhs.org/pipermail/tuhs/2017-February/008149.html)
I showed this to a Japanese acquaintance, who found it hilarious for a
different reason. He told me that a s/w bug is "bagu" -- a
semi-transliteration -- and "mushi" is "I ignore you". So corporate
called, asked for status, and the technical guy said "I am going to
ignore you!" and then hung up.
N.
I have found a video by Sandy Fraser from 1994 which discusses the Spider network (but not the related Unix software). The first 30 min or so are about Spider and the ideas behind it, then it moves on to Datakit and ATM:
https://www.youtube.com/watch?v=ojRtJ1U6Qzw
Although the thinking behind them is very different, the "switch" on the Spider network seems to have been somewhat similar to an Arpanet IMP.
Paul
==
On page 3 of the Research Unix reader (http://www.cs.dartmouth.edu/~doug/reader.pdf)
"Sandy (A. G.) Fraser devised the Spider local-area ring (v6) and the Datakit switch (v7) that have served in the lab for over a decade. Special services on Spider included a central network file store, nfs, and a communication package, ufs."
I do not recall ever seeing any SPIDER-related code in the public V6 source tree. Was it ever released outside Bell Labs?
From a bit of Googling I understand that SPIDER was an ATDM ring network with a supervisor allocating virtual circuits. Apparently there was only ever one SPIDER loop with 11 hosts connected, although Fraser reportedly intended to create multiple connected loops as part of his research.
The papers that Fraser wrote are hard to find: lots of citations, but no copies, not even behind pay walls. The base report seems to be:
A. G. Fraser, "SPIDER - a data communication experiment", Tech Report 23, Bell Labs, 1974.
Is that tech report available online somewhere?
Thanks!
Paul
> From: Random832
> You could return the address of the last character read, and let the
> user code do the math.
Yes, but that's still 'design the system call to work with interrupted and
re-started system calls'.
> If the terminal is in raw/cbreak mode, the user code must handle a
> "partial" read anyway, so returning five bytes is fine.
As in, if a software interrupt happens after 5 characters are read in, just
terminate the read() call and have it return 5? Yeah, I suppose that would
work.
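That is essentially the convention that later systems settled on. In user
code the handling looks something like this (modern C, purely illustrative):

    #include <errno.h>
    #include <unistd.h>

    /* Keep read()ing until 'want' bytes arrive, retrying after an
       interrupted or partial read; returns bytes read, or -1 on error. */
    ssize_t read_all(int fd, char *buf, size_t want)
    {
        size_t got = 0;
        while (got < want) {
            ssize_t n = read(fd, buf + got, want - got);
            if (n < 0) {
                if (errno == EINTR)
                    continue;              /* interrupted: just retry */
                return -1;                 /* real error */
            }
            if (n == 0)
                break;                     /* EOF */
            got += n;
        }
        return got;
    }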
> If it's in canonical mode, the system call does not copy characters into
> the user buffer until they have pressed enter.
I didn't remember that; that TTY code makes my head hurt! I've had to read it
(to add 8-bit input and output), but I can't remember all the complicated
details unless I'm looking at it!
> Maybe there's some other case other than reading from a terminal that it
> makes sense for, but I couldn't think of any while writing this post.
As the Bawden paper points out, probably a better example is _output_ to a
slow device, such as a console. If the thing has already printed 5 characters,
you can't ask for them back! :-)
So one can neither i) roll the system call back to make it look like it hasn't
started yet (as one could do, with input, by stuffing the characters back into
the input buffer with kernel ungetc()), nor ii) wait for it to complete (since
that will delay delivery of the software interrupt). One can only interrupt
the call (and show that it didn't complete, i.e. an error), or have
re-startability (i.e. argument modification).
Noel
> From: Paul Ruizendaal
> There's an odd comment in V6, in tty.c, just above ttread():
> ...
> That comment is strange, because it does not describe what the code
> does.
I can't actually find anyplace where the PC is backed up (except on a
segmentation fault, when extending the stack)?
So I suspect that the comment is a tombstone; it refers to what the code did
at one point, but no longer does.
> The comment isn't there in V5 or V7.
Which is consistent with it documenting a temporary state of affairs...
> I wonder if there is a link to the famous Gabriel paper
I suspect so. Perhaps they tried backing up the PC (in the case where a system
call is interrupted by a software interrupt in the user's process), and
decided it was too much work to do it 'right' in all instances, and punted.
The whole question of how to handle software interrupts while a process is
waiting on some event, while in the kernel, is non-trivial, especially in
systems which use the now-universal approach of i) writing in a higher-level
stack oriented language, and ii) 'suspending' with a sub-routine call chain on
the kernel stack.
Unix (at least, in V6 - I'm not familiar with the others) just trashes the
whole call stack (via the qsav thing), and uses the intflg mechanism to notify
the user that a system call was aborted. But on systems with e.g. locks, it
can get pretty complicated (try Googling Multics crawl-out). Many PhD theses
have looked at these issues...
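For concreteness, the V6 mechanics are roughly as below - lightly
paraphrased from memory of ken/trap.c and ken/slp.c, so close to, but not
exactly, the real source:

    /* trap.c: system calls are dispatched through trap1() */
    trap1(f)
    int (*f)();
    {
        u.u_intflg = 1;
        savu(u.u_qsav);        /* record a resume point */
        (*f)();                /* run the system call */
        u.u_intflg = 0;        /* completed normally */
    }

    /* slp.c, in sleep(): a signal arriving at an interruptible priority
       abandons the kernel call stack and jumps back to the savu() point;
       trap() then sees u_intflg still set and flags the call as interrupted. */
    if (issig())
        aretu(u.u_qsav);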
> Actually, research Unix does save the complete state of a process and
> could back up the PC. The reason that it doesn't work is in the syscall
> API design, using registers to pass values etc. If all values were
> passed on the stack it would work.
Sorry, I don't follow this?
The problem with 'backing up the PC' is that you 'sort of' have to restore the
arguments to the state they were in at the time the system call was first
made. This is actually easier if the arguments are in registers.
I said 'sort of' because the hard issue is that there are system calls (like
terminal I/O) where the system call is potentially already partially executed
(e.g. a read asking for 10 characters from the user's console may have
already gotten 5, and stored them in the user's buffer), so you can't just
simply completely 'back out' the call (i.e. restore the arguments to what they
were, and expect the system call to execute 'correctly' if retried - in the
example, those 5 characters would be lost).
Instead, you have to modify the arguments so that the re-tried call takes up
where it left off (in the example above, it tries to read 5 more characters,
starting 5 bytes into the buffer). The hard part is that the return value (of the
number of characters actually read) has to count the 5 already read! Without
the proper design of the system call interface, this can be hard - how does
the system distinguish between the _first_ attempt at a system call (in which
the 'already done' count is 0), and a _later_ attempt? If the user passes in
the 'already done' count, it's pretty straightforward - otherwise, not so
much!
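To make the bookkeeping concrete, a retry-friendly interface might look like
the sketch below (names and shape invented; no real system call works
exactly like this):

    /* The caller owns the 'done' count, so a retried call resumes 'done'
       bytes into the buffer, and the final return value naturally counts
       the bytes read before the interrupt. */
    int read_resumable(int fd, char *buf, int want, int *done)
    {
        while (*done < want) {
            int n = read(fd, buf + *done, want - *done);
            if (n <= 0)
                return n;      /* error or EOF; *done is preserved */
            *done += n;
        }
        return *done;          /* total, including earlier attempts */
    }

A first call passes done = 0; after an interrupt the caller simply re-enters
with 'done' left intact - which is exactly the 'argument modification'
described above.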
Alan Bawden wrote a good paper about PCLSR'ing which explores some of these
issues.
Noel
There's an odd comment in V6, in tty.c, just above ttread():
/*
* Called from device's read routine after it has
* calculated the tty-structure given as argument.
* The pc is backed up for the duration of this call.
* In case of a caught interrupt, an RTI will re-execute.
*/
That comment is strange, because it does not describe what the code does. The comment isn't there in V5 or V7.
I wonder if there is a link to the famous Gabriel paper about "worse is better" (http://dreamsongs.com/RiseOfWorseIsBetter.html) In arguing its points, the paper includes this story:
---
Two famous people, one from MIT and another from Berkeley (but working on Unix) once met to discuss operating system issues. The person from MIT was knowledgeable about ITS (the MIT AI Lab operating system) and had been reading the Unix sources. He was interested in how Unix solved the PC loser-ing problem. The PC loser-ing problem occurs when a user program invokes a system routine to perform a lengthy operation that might have significant state, such as IO buffers. If an interrupt occurs during the operation, the state of the user program must be saved. Because the invocation of the system routine is usually a single instruction, the PC of the user program does not adequately capture the state of the process. The system routine must either back out or press forward. The right thing is to back out and restore the user program PC to the instruction that invoked the system routine so that resumption of the user program after the interrupt, for example, re-enters the system routine. It is called PC loser-ing because the PC is being coerced into loser mode, where loser is the affectionate name for user at MIT.
The MIT guy did not see any code that handled this case and asked the New Jersey guy how the problem was handled. The New Jersey guy said that the Unix folks were aware of the problem, but the solution was for the system routine to always finish, but sometimes an error code would be returned that signaled that the system routine had failed to complete its action. A correct user program, then, had to check the error code to determine whether to simply try the system routine again. The MIT guy did not like this solution because it was not the right thing.
The New Jersey guy said that the Unix solution was right because the design philosophy of Unix was simplicity and that the right thing was too complex. Besides, programmers could easily insert this extra test and loop. The MIT guy pointed out that the implementation was simple but the interface to the functionality was complex. The New Jersey guy said that the right tradeoff has been selected in Unix -- namely, implementation simplicity was more important than interface simplicity.
---
Actually, research Unix does save the complete state of a process and could back up the PC. The reason that it doesn't work is in the syscall API design, using registers to pass values etc. If all values were passed on the stack it would work. As to whether it is the right thing to be stuck in a read() call waiting for terminal input after a signal was received...
I always thought that this story was entirely fictional, but now I wonder. The Unix guru referred to could be Ken Thompson (note how he is first referred to as "from Berkeley but working on Unix" and then as "the New Jersey guy").
Who can tell me more about this? Any of the old hands?
Paul
> From: Lars Brinkhoff
> Nick Downing <downing.nick(a)gmail.com> writes:
>> By contrast the MIT guy probably was working with a much smaller/more
>> economical system that didn't maintain a kernel stack per process.
I'm not sure I'd call ITS 'smaller'... :-)
> PCLSRing is a feature of MIT's ITS operating system, and it does have a
> separate stack for the kernel.
I wasn't sure if there was a separate kernel stack for each process; I checked
the ITS source, and there is indeed a separate stack per process. There are
also three other stacks in the kernel that are used from time to time (look
for 'MOVE P,' for places where the SP is loaded).
Oddly enough, it doesn't seem to ever _save_ the SP - there are no 'MOVEM P,'
instructions that I could find!
Noel
On page 3 of the Research Unix reader (http://www.cs.dartmouth.edu/~doug/reader.pdf)
"Sandy (A. G.) Fraser devised the Spider local-area ring (v6) and the Datakit switch (v7) that have served in the lab for over a decade. Special services on Spider included a central network file store, nfs, and a communication package, ufs."
I do not recall ever seeing any SPIDER-related code in the public V6 source tree. Was it ever released outside Bell Labs?
From a bit of Googling I understand that SPIDER was an ATDM ring network with a supervisor allocating virtual circuits. Apparently there was only ever one SPIDER loop with 11 hosts connected, although Fraser reportedly intended to create multiple connected loops as part of his research.
The papers that Fraser wrote are hard to find: lots of citations, but no copies, not even behind pay walls. The base report seems to be:
A. G. Fraser, "SPIDER - a data communication experiment", Tech Report 23, Bell Labs, 1974.
Is that tech report available online somewhere?
Thanks!
Paul
> we just read the second tape, which read without error. ... at this
> point we have access to everything that was on that machine.
OK, we're starting to get through all the clearances needed to release the
non-MIT Unix systems on the machine. (The MIT one is going to take more
work - I have to curate out all the personal files.)
We have now completed the OK's for the 'Network Unix' (the one done at the
University of Illinois for use on the ARPANET, with NCP). A tarball is
available here:
http://ana-3.lcs.mit.edu/~jnc/tech/pdp11/tmp/nosc.tar
(It's called 'nosc.tar' because it came through NOSC, and then SRI,
on the way to MIT.)
In addition to all the UIllinois code, it also contains early versions of the
MH mail reader (from Rand) and the MMDF mailer (from UDel).
Enjoy!
Noel
With no offense intended, I can't help noting the irony of the
following paragraph appearing in a message in the company of
others that address Unix "bloat".
>'\cX' A mechanism that allows usage of the non-printable
> (ASCII and compatible) control codes 0 to 31: to cre-
> ate the printable representation of a control code the
> numeric value 64 is added, and the resulting ASCII
> character set code point is then printed, e.g., BEL is
> '7 + 64 = 71 = G'. Whereas historically circumflex
> notation has often been used for visualization pur-
> poses of control codes, e.g., '^G', the reverse
> solidus notation has been standardized: '\cG'. Some
> control codes also have standardized (ISO 10646, ISO
> C) alias representations, as shown above (e.g., '\a',
> '\n', '\t'): whenever such an alias exists S-nail will
> use it for display purposes. The control code NUL
> ('\c@') ends argument processing without producing
> further output.
Except for the ISO citations, this paragraph says the same
thing more succinctly.
'\cX' represents a nonprintable character Y in terms of the
printable character X whose binary code is obtained
by adding 0x40 (decimal 64) to that for Y. (In some
historical contexts, '^' plays the role of '\c'.)
Alternative standard representations for certain
nonprinting characters, e.g. '\a', '\n', '\t' above,
are preferred by S-nail. '\c@' (NUL) serves as a
string terminator regardless of following characters.
And this version, 1/3 the length of the original, tells all
one really needs to know.
'\cX' represents a nonprintable character Y in terms of the
printable character X whose binary code is obtained
by adding 0x40 (decimal 64) to that for Y. '\c@'
(NUL) serves as a string terminator regardless of
following characters.
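The arithmetic, in a few lines of C (assuming ASCII):

    #include <stdio.h>

    /* Show a control code 0-31 in '\cX' form by adding 64 (0100):
       BEL is 7, and 7 + 64 = 71 = 'G', so it prints as \cG; NUL as \c@. */
    void show(int c)
    {
        if (c >= 0 && c < 32)
            printf("\\c%c", c + 64);
    }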
Doug
On 2017-02-09 20:55, corey(a)lod.com (Corey Lindsly) wrote:
>
>> In spite of that, I'm typing away to you all, I'm 3ms away from 8.8.8.8
>> (Google's dns server). Go wireless. It's pretty remarkable to be here
>> and have decent net connectivity.
>>
>> I do not yearn for the days of SLIP.
>> --
>> ---
>> Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm
> 3ms? Really? I'm impressed, and I'd like to see your traceroute. We peer
> directly with Google and I get 4-5ms. Do share.
Meh. From Uppsala in Sweden I seem to have about 2ms ping time to 8.8.8.8...
Psilocybe:update/bqt> ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_req=1 ttl=56 time=2.10 ms
64 bytes from 8.8.8.8: icmp_req=2 ttl=56 time=1.93 ms
64 bytes from 8.8.8.8: icmp_req=3 ttl=56 time=2.05 ms
64 bytes from 8.8.8.8: icmp_req=4 ttl=56 time=1.89 ms
64 bytes from 8.8.8.8: icmp_req=5 ttl=56 time=2.02 ms
64 bytes from 8.8.8.8: icmp_req=6 ttl=56 time=2.05 ms
64 bytes from 8.8.8.8: icmp_req=7 ttl=56 time=2.00 ms
64 bytes from 8.8.8.8: icmp_req=8 ttl=56 time=1.97 ms
64 bytes from 8.8.8.8: icmp_req=9 ttl=56 time=2.03 ms
64 bytes from 8.8.8.8: icmp_req=10 ttl=56 time=2.10 ms
^C
--- 8.8.8.8 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9011ms
rtt min/avg/max/mdev = 1.894/2.020/2.108/0.067 ms
Psilocybe:update/bqt> traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
1 r1.n.it.uu.se (130.238.19.254) 1.986 ms 2.324 ms 2.717 ms
2 l-uu-1-b1.uu.se (130.238.6.251) 0.288 ms 0.680 ms 0.646 ms
3 uu-r1.sunet.se (130.242.6.148) 0.686 ms 0.685 ms 0.673 ms
4 uppsala-upa-r1.sunet.se (130.242.4.138) 0.672 ms 0.661 ms 0.657 ms
5 stockholm-fre-r1.sunet.se (130.242.4.26) 3.503 ms 3.468 ms 3.483 ms
6 se-fre.nordu.net (109.105.102.9) 24.456 ms 24.532 ms 24.153 ms
7 se-kst2.nordu.net (109.105.97.27) 1.934 ms 1.902 ms 1.891 ms
8  as15169-te-tc1.sthix.net (192.121.80.47) 2.204 ms 2.189 ms 72.14.196.42 (72.14.196.42) 1.872 ms
9  216.239.40.29 (216.239.40.29) 1.862 ms 1.941 ms 216.239.40.27 (216.239.40.27) 1.995 ms
10  209.85.251.233 (209.85.251.233) 2.398 ms 209.85.245.61 (209.85.245.61) 2.778 ms 72.14.234.85 (72.14.234.85) 2.385 ms
11 google-public-dns-a.google.com (8.8.8.8) 2.372 ms 2.366 ms 2.337 ms
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt(a)softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
> Lots of commands are now little shells
...
> Linux today is much more like the systems
> Unix displaced than it is like Unix
So depressingly true!
Doug
Thanks a lot for the tip, Paul. It's great that others are working in
this area. Though I must say that, as a kind of "historian", I try
to go to primary sources where possible. Although I had already
converted a fair bit of code in the manner you describe, I am actually
re-converting much of it, since I now have a semi-automated
system for doing so; that way I get pretty consistent results that
aren't reliant on ad-hoc decisions made during porting. Well, good
judgement is still needed, but I have a set of mental algorithms for
fixing exactly the kinds of questionable constructs you describe,
which lead to pretty consistent results. Using my scripts I converted
bin, usr.bin and lib of 4.3BSD in a few weeks, although a fair bit of
that time was spent on "bin/as" and "bin/sh" and "bin/csh" and other
pathological cases. When I have time I will proceed to ucb. I did all
subdirectories of bin (things like sed which are multi-module
programs) but not usr.bin yet.
So what I'll probably do when I get to looking at LSX is to re-convert
and then compare against your work, since either of us could quite
well have found questionable constructs missed by the other. Also,
earlier today I was looking at Noel's page about improving V6:
http://mercury.lcs.mit.edu/~jnc/tech/ImprovingV6.html
Anyway, I'm much more of a V7 guy and I think I would find V6 strange
and compromised, so I am thinking I will definitely have to apply some
of these patches, or at least check how much they increase the code
size by. At the very least, lseek() and mdate() have to go in, I'm not
sure about stdio since having a suite of the standard commands that
don't use stdio and hence are smaller/slower might be OK. But probably
my preferred approach is to calculate a patch V6 -> Mini Unix or V6 ->
LSX and then try to apply that on top of V7. Hmm.
As to moving to a V7 kernel and then adding TCP/IP, I'm not sure if
this is advisable; as I was saying earlier, I think it might be best
to keep that functionality outboard from the kernel. The question in
my mind is (1) does the Mini Unix / LSX system have to be a fully
participating node on the network or can it be a leaf node without any
routing, and (2) does it have to respond to ping or incoming
connections at any time. Since my scenario is a simple SLIP link to my
home server, (1)=No for me. As to (2), I see two scenarios, (a) the
machine is used as a development machine, where I run "ed" and "cc"
and so on, and occasionally "ftp" or "rcp" as a client only, or (b)
the machine is used as a remote node for something like say data
logging or web serving, where it runs the same application all the
time, and I connect to it to retrieve results and/or download software
updates. In case (a) there are only outgoing connections. In case (b)
there are incoming connections, but the machine runs the same
application all the time, so there's no disadvantage to having TCP in
userspace. I don't envisage a more complicated scenario where it runs
inetd in the background and a console in the foreground, due to RAM
limits.
cheers, Nick
On Thu, Feb 9, 2017 at 12:56 AM, Paul Ruizendaal <pnr(a)planet.nl> wrote:
> Nick,
>
> If you want to work with LSX, you might have a look at the LSX port I did for the TI990 mini computer: http://1587660.websites.xs4all.nl/cgi-bin/9995/dir?ci=1c38b1fc8792c80b&name…
> It is a further development from the work that was done for BKUNIX by Leonid Broukhis (https://sourceforge.net/p/bkunix/code/HEAD/tree/)
>
> You get stuff converted to a dialect of C acceptable by modern compilers, and some kludges like using 'char*' for 'unsigned' and 'int a[2]' for 'long a' are cleaned up.
>
> The repository also has a V6 kernel with similar clean up and some V7 compatibility ('lseek' instead of 'seek', etc.). My next step would be to move to a V7 kernel and add TCP/IP capability. This is how I got interested in the history of sockets and TCP/IP. I have found that the BSD stack (as found in e.g. ULTRIX-11) will not fit in 64KB (note: just the network stack). The BBN stack appears to fit in 56KB, with 15KB of buffers available; I think it could be integrated with a V7 kernel as a second kernel process.
>
> Paul
>
> On 8 Feb 2017, at 12:21 , Nick Downing wrote:
>
>> Yes, NetBSD and 386BSD are interesting. They could well form a good
>> basis for a minimal but fully functional OS for a modern platform. One
>> reservation I have though, is as follows: When 386BSD and its
>> derivatives like OpenBSD, NetBSD, FreeBSD came out, Unix was still
>> encumbered and thus they had to be based on 4.4BSD Lite (not even
>> NET/2 was safe). Nobody made an unencumbered version of say 4.3BSD or
>> even NET/2, even though it was theoretically possible, by examining
>> what had to be taken out/added to produce 4.4BSD Lite.
>>
>> Given that Unix is now unencumbered I believe there is no point
>> adopting 4.4, or even 4.3Reno, because the main thing done in those
>> releases as far as I know, is to add partial POSIX compliance. But if
>> you want POSIX compliance you will not achieve minimality. As an
>> example consider the BSD sigvec() routine. POSIX calls this
>> sigaction(), the old SV_ONSTACK flag becomes SA_ONSTACK, the old
>> integer mask becomes a sigset_t and so on... and in any reasonable
>> POSIX compliant BSD the two calls are gonna have to coexist, etc.
>>
>> As to 32V, I appreciate your idea, as I was having some similar
>> thoughts myself. However I personally wouldn't use 32V as a basis for
>> any serious porting work, because I would assume V7 and 4.3 are much
>> more stable and well tested, since they ran in a lot of installations
>> over a long time. That's not to denigrate the huge achievement in
>> porting V7 to the VAX and producing 32V, but it probably has some
>> rough edges that were smoothed out over time to produce the 4BSDs. The
>> situation is a bit different for kernel/toolchain/other userspace.
>>
>> As to the kernel I have been trying to make a detailed comparison
>> between 32V and the BSDs, but have been a bit put off by the fact that
>> all files moved around and may have been renamed, I will figure it all
>> out eventually though. As to the toolchain I have compared it quite
>> carefully with 4.3BSD, and I conclude you should use the later
>> toolchain as there is no disadvantage and some advantages to doing so.
>> As to the rest of the userspace, I BELIEVE that it's stock V7 with any
>> 32-bit issues fixed, but I suspect somewhat hastily and superficially.
>>
>> Producing a 32V-like kernel from 4.3BSD sources would probably be
>> quite difficult, it would be relatively easy to disable added system
>> calls, but harder to disable things like setpgrp() or the associated
>> support in the TTY drivers. More seriously the memory management would
>> have to change, I am planning however to make virtual memory optional
>> in the 4.3BSD kernel, by maybe putting the 32V code back in, protected
>> by #ifndef VM or some such (somewhat like Steven Schultz has done in
>> porting 4.3BSD to PDP-11 to produce 2.11BSD).
>>
>> On the other hand producing a 32V-like userland from the 4.3BSD
>> sources would be pretty easy, I think just delete the sources for any
>> executables that weren't distributed with 32V and possibly, if any of
>> the tools seem particularly bloated, comment out any command line
>> switches or features that weren't in 32V. Going to this level of
>> effort would likely be pointless though. Another option would be
>> taking V7 and re-porting it (except the toolchain) over to a 32-bit
>> environment and kernel. I have developed over the past months, tools
>> that make this relatively straightforward, and in the process would
>> rediscover any 32-bit issues that were fixed in creating 32V. So I
>> wouldn't use 32V.
>>
>> On a slightly different tack, I also have been for some time
>> investigating the idea of an Apple II port of Unix, there are massive
>> technical issues to be solved, but I think I got a bit closer the
>> other night when I decided to collect all information and resources I
>> could find about Mini-Unix and LSX (LSI Unix). Both are
>> self-supporting V6-based environments, the compiler can only compile
>> small programs but it is nonetheless possible for each Unix to
>> recompile itself. LSX I believe could run from floppies (dunno about
>> 140K floppies) in less than 64K.
>>
>> So, you know, true minimality is a relative term. We want something
>> LSX-like for an Apple II, something 2.11BSD-like for an IBM PC/XT or
>> 286 (as Peter Jeremy noted, it's a good fit, and I'd be interested to
>> know more), something 4.3BSD-like for a VAX or 386... something a bit
>> more fully featured for a modern 64-bit multi-gigabyte system... but
>> just not as bloated as what we currently rely on. Hmm well it's hard.
>> What I do know, is that I have a lot of old hardware with <16M RAM and
>> Linux won't run on it anymore, and this annoys me very greatly.
>>
>> In talking about FreeBSD/Linux bloat I forgot to mention the packet
>> filter, iptables (Linux) or pf (FreeBSD). I have a bit of experience
>> with this, since I regularly used to put 2 Ethernet cards in my home
>> server and make it Internet facing through one of them and share the
>> connection using NAT through the other card. But I've come to the
>> conclusion this is stupid, and moreover, that putting a complete
>> mini-language into the kernel just for this purpose is utterly stupid.
>> Programming the thing from userspace is incredibly intricate, and all
>> this complexity serves no purpose.
>>
>> I recently found out about SLIRP (userspace packet rewriting) and I
>> think this would be a good way to implement NAT on servers or routers
>> that actually need to do NAT -- just make a userspace process that
>> runs something SLIRP-like and has a raw socket to the second Ethernet
>> card, and Bob's your uncle.
>>
>> But this got me thinking along pretty productive lines in terms of the
>> tiny Apple II port -- I have been wanting to put the 2.11BSD network
>> stack into an Apple II for a long time, but I've now realized this is
>> not necessary. A much better approach for those Mini-Unix or LSX or
>> even V7 systems, would be to have a userspace library that does SLIP
>> and contains its own TCP, UDP, IP drivers, resolver and so on. Then if
>> you run a userspace program like say, ftp, which is linked to the
>> userspace TCP library, it would basically just open /dev/ttyXX, bring
>> up the SLIP link itself, do any necessary downloads etc, then close
>> the TTY and stop responding to any IP stuff coming over the SLIP link
>> whilst you quit to the prompt, until another program reopens it.
>>
>> cheers, Nick
>>
>> On Wed, Feb 8, 2017 at 2:56 PM, Jason Stevens
>> <jsteve(a)superglobalmegacorp.com> wrote:
>>> What about NetBSD 1.1 or even 386BSD?
>>>
>>> There never was a 4.2 or 4.3 for i386 was there?
>>>
>>> I'd guess the 32v userland could be built on early 4.4BSD Lite/NET2 greatly
>>> reducing its footprint.
>>>
>>>
>>>
>>>
>>> On February 8, 2017 11:47:03 AM GMT+08:00, Nick Downing
>>> <downing.nick(a)gmail.com> wrote:
>>>>
>>>> This is an issue that interests me quite a bit, since I was running
>>>> FreeBSD in an effort to get around Linux bloat problems discussed.
>>>> Well not that I really mind Linux as a user interface / runtime
>>>> environment / main development machine, but I think it probably
>>>> shouldn't be used as a "least common denominator" for development
>>>> since you end up introducing unwanted dependencies on a whole lot of
>>>> stuff.
>>>>
>>>> So I was running FreeBSD as a more minimal *nix. I did quite a lot of
>>>> interesting stuff with FreeBSD such as setting up diskless
>>>> workstations in my home, etc. I spent a lot of time tinkering around
>>>> in the kernel code. I was planning to do some serious development on
>>>> 4.4BSDLite or FreeBSD to create an operating system more to my liking.
>>>> So, I was looking carefully at differences since ancient *nixes.
>>>>
>>>> And, I can say that FreeBSD is pretty bloated. Umm well they've added
>>>> SMP, at the time it was using the Giant Lock although that could be
>>>> fixed by now. They've added VFS and NFS of course. They've added an
>>>> entire subsystem for block devices IIRC that handles partitioning and
>>>> possibly some other sophisticated stuff, which I believe is their own
>>>> design. Umm the kqueues and I believe they have their own
>>>> implementation of kernel threading or lightweight processes including
>>>> some sort of idle daemon. The network stack is heavily upgraded, to
>>>> the extent I looked into it, the added features are things you would
>>>> want (syncookies = DOS protection, etc) but also could not possibly be
>>>> called minimal, and would preclude running it on other than a
>>>> multi-megabyte machine. They have multiple ABIs so the kernel can
>>>> accept Linux or BSD syscalls or whatever else (I used it to run
>>>> Acrobat Reader Linux on my FreeBSD desktop). Umm I am pretty sure they
>>>> have kernel modules ala Linux. Lots and lots and lots of stuff... and
>>>> that's only considering the kernel. If you look in the ports
>>>> collection you see they have incredible amounts of bloat there too...
>>>> for instance GNOME, Libreoffice, LATEX, gcc, python... not that I'm
>>>> denigrating these tools, since they do invaluable work and I use them
>>>> every day, but the point is, you CANNOT call them minimal.
>>>>
>>>> The quest for a clean minimal system goes on ->. FreeBSD is not the
>>>> answer. In fact I believe 4.3BSD-Reno and 4.4 go strongly offtrack.
>>>>
>>>> cheers, Nick
>>>>
>>>> On Wed, Feb 8, 2017 at 1:55 PM, Greg 'groggy' Lehey <grog(a)lemis.com>
>>>> wrote:
>>>>>
>>>>> On Tuesday, 7 February 2017 at 15:38:40 -0800, Steve Johnson wrote:
>>>>>>
>>>>>> Looking back, the social dynamics of the Unix group helped a lot in
>>>>>> keeping the bloat small. The rule was, whoever touches something
>>>>>> last becomes its owner. Of course, we were all free to complain
>>>>>> about things, and did, but the amalgamation of tinkerings that
>>>>>> characterizes most of the Linux commands just didn't happen.
>>>>>
>>>>>
>>>>> Out of interest: where do you (or others) consider that the current
>>>>> BSD projects it in this comparison?
>>>>>
>>>>> Greg
>>>>> --
>>>>> Sent from my desktop computer.
>>>>> Finger grog(a)lemis.com for PGP public key.
>>>>> See complete headers for address and phone numbers.
>>>>> This message is digitally signed. If your Microsoft mail program
>>>>> reports problems, please read http://lemis.com/broken-MUA
>>>
>>>
>>> --
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
DEL, sometimes labeled RUBOUT, has a very important feature: it's all ones. When punching a paper tape, if you make a mistake you can mechanically backspace the tape (there's a button on the punch rather than an actual backspace key), then press RUBOUT, which overpunches the incorrect character. Presumably, whatever system this was designed for disregards those characters when encountered.
Amusingly, we thought the HERE IS key was just to generate leaders, because none of our teletypes had the drum programmed. Later I found out that you could break the tabs on the drum and have HERE IS send a short string of characters. ^E (called ENQ, or sometimes WRU, for "who are you") triggers this to be sent in response.
To get back to the UNIX tie-in: I actually had a Model 37 teletype for years. This was one of the few terminals for which you didn't have to set the nl mode mapping. It had a large key marked NEWLINE where RETURN usually is, which sent ^J (\n) and responded to it the way UNIX expected. In addition it handled all the ESC-8, ESC-9, etc. codes that nroff sent by default, without needing a filter. Mine was an ASR so it had the tape unit. It lacked the "greek box" that the one at JHU had to print greek characters after a ^N (shift out). The thing was amusing as it didn't turn on the motor until the modem came ready, and when carrier detect was asserted a big green PROCEED light lit on the front.
It was quaint, but when I finally got a higher speed modem, I switched back to using a CRT. The Model 37 was a screaming 150 baud.
I finally "donated" it to RS who dumped the thing behind someone's car somewhere.
> From: Michael Kjorling
> That wouldn't have anything to do with how ^@ is a somewhat common
> representation of 000, would it? .. I've always kind of wondered where
> that notation came from.
Well, CTRL-<*> is usually just the <*> character with the high bits cleared.
So, to have a printing representation of NUL, you have two character choices
- SPACE, and '@'. Printing "^ " is not so hot, so "^@" is better.
Also, if you look at an ASCII table, usually people just take the @-_ column,
and use that, since every character in that column has a printing
representation. The ' '-? column is missing the ' ', and `-<DEL> is missing
the DEL. So if you just take a CTRL character and set the 0100 bit, and print
it as "^<char>", you get something readable.
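In code terms (assuming ASCII):

    /* CTRL-<x> clears the 040 and 0100 bits; the "^<x>" display form
       sets 0100 again, so NUL (000) shows as '@' and BEL (007) as 'G'. */
    #define CTRL(x)   ((x) & 037)
    #define SHOWN(x)  ((x) | 0100)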
(Note that CTRL-' ' _is_ usually used when one needs to _input_ a NUL
character.)
Noel
Inspired by:
> Stephen Bourne after some time wrote a cron job that checked whether an
> update in a binary also resulted in an updated man page and otherwise
> removed the binary. This is why these programs have man pages.
I want to tell a story about working at Sun. I feel like I've sent this
but I can't find it in my outbox. If it's a repeat chalk it up to old
age.
I wanted to work there, they were the Bell Labs of the day, or as close
as you could get.
I got hired as a contractor through Lachman (anyone remember them?) to do
POSIX conformance in SunOS (the 4.x stuff, not that Solaris crap that I
hate).
As such, I was frequently the last guy to touch any file in the kernel,
my fingerprints were everywhere. So when there was a panic, it was
frequently laid at my doorstep.
So here is how I got a pager and learned about source management.
Sun had two guys, who will remain nameless, but they were known as
"the SCSI twins". These guys decided, based on feedback that "people
can interrupt sun install", to go into the SCSI tape driver and disable
SIGINT. The kernel model doesn't allow for drivers messing
with your signal mask, so on exit, sometimes, we would get a "panic: psig".
Somehow, I'm sure it was because of the POSIX stuff, I ended up debugging this
panic. It had nothing to do with me, I'm not a driver person (I've written
a few but I pretty much suck at them), but it landed in my lap.
Once I figured it out (which was not easy: you had to hit ^C to trigger it,
and who does that during an install?) I tracked down
the code to the SCSI twins.
No problem, everyone makes mistakes. Oh, wait. Over the next few months
I'm tracking down more problems, that were blamed on me since I'm all over
the kernel, but came from the twins.
Sun's integration machines were argon, radon, and krypton. I wrote
scripts, awk I think, that watched every update to the tree on all
of those machines and if anything came from the SCSI twins the script
paged me.
That way I could go build and test that kernel and get ahead of the bugs.
If I could fix up their bugs before the rest of the team saw it then I
wouldn't get blamed for them.
I wish I could have figured out something like Steve did that would have
made them not screw up so much but this was the next best thing. I actually
got bad reviews because of their crap. My boss at the time, Eli Lamb, just
said "you are in kadb too much".
--lm
> From: Paul Ruizendaal
> The best one seems to have been the 3Com stack, which puts IP in the
> kernel and TCP in a daemon.
I've gotta get the MIT V6 one online.
Incoming demux is in the kernel; all of the TCP, and everything else, is in
processes along with the application - one process per application instance.
It sounds like it might be clunky, but it's not: there are a couple of
different TCP's (a small, low performance one for things like User TELNET,
timer servers, yadda-yadda; a bigger, higher-performance one for things like
FTP), and the application just calls them as subroutine libraries
(effectively). Since there's no IPC overhead, the performance is pretty good.
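So an application looks something like the sketch below - every name in it
is invented to show the shape of the thing, not taken from the MIT code:

    #include <unistd.h>
    #include "tcplib.h"    /* hypothetical: the TCP linked in as a library */

    int main(void)
    {
        struct tcb *c;
        char buf[128];
        int n;

        c = tcp_open("some-host", 23);         /* e.g. user TELNET */
        while ((n = tcp_read(c, buf, sizeof buf)) > 0)
            write(1, buf, n);                  /* no IPC hop: TCP runs in-process */
        tcp_close(c);
        return 0;
    }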
Unfortunately, a lot of the stuff never migrated from personal directories to
the system folder, so I have to go curate out the personal files (or, more
likely, move them all to a system folder) before I can release it all.
> Perhaps economizing on fragmentation and window management is
> better.
Fragmentation, perhaps - but good window management is a must.
> I wonder if just putting the code for this state in the kernel and
> handling only the state changes and other states in a daemon is perhaps
> the best split on constrained hardware.
I don't think that's a good idea; cutting the TCP in two parts, which have to
communicate over an interface, is going to be _really_ ugly: do you have one
copy of the connection state data (in which case one of them has to go through an
interface to get to it), or two (synchronization issues)? If you want a small
kernel footprint, take the MIT approach.
Noel
> I'm fairly certain it was originally in BCPL.
>
> You could just drop a note to Bjarne Stroustrup and ask. :-)
On page 44 of _The Design and Evolution of C++_ (Addison-Wesley, 1994), Stroustrup says:
“However, only C, Simula, Algol68, and in one case BCPL left noticeable traces in C++ as released in 1985. Simula gave classes, Algol68 operator overloading, references, and the ability to declare variables anywhere in a block, and BCPL gave // comments.”
He says a bit more about // comments on page 93, including an example of how they introduced an incompatibility with C.
> From: Nick Downing
> I'm much more of a V7 guy and I think I would find V6 strange and
> compromised
De gustibus. I used it for many years, and am quite at home in it. I think
it's a marvel of functionality/size - at the time it came out, not much bigger
than DEC PDP-11 OS's, but with a 'big system' feel to it (which they
_definitely_ did not) - in fact, _better_ than most big systems of the day.
But I can see it would be rather too simple (and in the kernel inelegant,
code-wise, by today's standards - see below) for many. V7 is not that
different, in terms of user experience, from V6, though.
> I am thinking I will definitely have to apply some of these patches, or
> at least check how much they increase the code size by.
Sorry, that page is kind of a mish-mosh. Most of the stuff that's talked about
there is for user commands, not the kernel.
There are only a few kernel changes (lseek() and mdate(), and param.c so that
the new 'si' command can get things from param.h without having to have it
compiled in), and they are all small.
> But probably my preferred approach is to calculate a patch V6 -> Mini
> Unix or V6 -> LSX and then try to apply that on top of V7.
I'm a little confused as to what your goal is here. Get V6 running on some
other architecture? Upgrade V6 for some goal which I am not aware of? I know
you probably said something in an earlier email, sorry, I don't recall.
Anyway, if you're going to do anything with V6 kernel code, you need to be
aware that it's really idiosyncratic - a lot of its written in a very early
dialect of C, and while things like 'a =+ b' -> 'a += b' and 'int a 1' -> 'int
a = 1' are pretty easy to fix, there are lots of instances of int's being used
as pointers to several different kinds of structures, etc, etc.
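For example (illustrative fragments only):

    int nfile 10;        /* V6 dialect: initializer written without '=' */

    nfile =+ 1;          /* old '=+' compound assignment, now '+=' -
                            partly because 'a=-1' was ambiguous between
                            'a =- 1' and 'a = -1' */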
If you want to move an early, small Unix to something other than a PDP-11, V7
is probably a much better bet.
> As to moving to a V7 kernel and then adding TCP/IP I'm not sure if this
> is adviseable, as I was saying earlier I think it might be best to keep
> that functionality outboard from the kernel.
There are a couple of early TCP/IP's which ran outside the kernel, but I think
the standard Berkeley one might be a handful to move out.
Noel
> From: Charles Anthony
> Sigh. That tops my Multics bug report.
No way! You actually got the fix approved by an MCB! Much cooler! :-)
> BSD4.1 is circa 1890?
Well, it's old, but not _that_ old!! :-)
Noel