OK, here's another one that's good for chest thumping...
I am not a fan of texinfo. It doesn't provide any benefits (to me) over man.
I suppose that it was trailblazing in that it broke manual pages up into
sections that couldn't easily be viewed concurrently long before the www and
web pages that broke things up into multiple pages to make room for more ads.
Any benefits that texinfo might have are completely lost by the introduction
of multiple non-intersecting ways to find documentation.
This is a systemic problem. I have a section in my book-in-progress where I
talk about being a "good programming citizen". One of the things that I say
is:
Often there is a tool that does most of what you need but is lacking
some feature or other. Add that feature to the existing tool;
don't just write a new one. The problem with writing a new one
is that, as a tool user, you end up having to learn a lot of tools
that perform essentially the same function. It's a waste of time
and energy. A good example is the make utility (invented by Stuart
Feldman at Bell Labs in 1976) that is used to build large software
packages. As time went on, new features were needed. Some were
added to make, but many other incompatible utilities were created that
performed similar functions. Don't create burdens for others.
Improve existing tools if possible.
A funny example of this is when I was consulting for Xilinx in the late 80s
on a project that had to run on both Suns and PCs. Naturally, I did the
development on a Sun and ported to the PC later. When it came time to do
the port, a couple of the employees proudly came to me and told me about
this wonderful program that they wrote that was absolutely necessary for
doing the PC build. I was completely puzzled and told them that I already
had the PC build complete. They told me that that couldn't be possible
since I didn't use their wonderful utility. Turns out that their utility
wrote out all of the make dependencies for the PC. I, of course, wrote a
.c.obj rule which was all that it took. They were excessively angry at me
for inadvertently making them look like the fools that they were.
Another example is a more recent web-based project on which I was advising.
I'm a big fan of jQuery; it gets the job done. Someone said "Why are you
using that instead of angular?" I did a bit of research before answering.
Turns out that one of the main reasons given for angular over jQuery was
that "it's fresh". That was a new one for me. Still unclear why freshness
is an attribute that would trump stability.
So, I'm sure that many of you have stories about unnecessary tools and
packages that were created by people unable to RTFM. Would be amused
to hear 'em.
Jon
I started using Unix in ~1977 at UC Santa Barbara. At some point
around then we decided to host a Unix users meeting in the U-Cen
student union building. We asked the facilities people to prepare
a sign pointing to the meeting room.
Imagine my reaction when I walked into the building and saw
the following sign:
"Eunuchs Group Meeting - Room 125"
I don't know if any eunuchs actually showed up.
Jon Forrest
Warner Losh <imp(a)bsdimp.com> kindly corrected my statement that the kcc
compiler on the PDP-10 was done by Ken Harrenstien, pointing out that
it was actually begun by Kok Chen (whence, the name kcc).
I've just dug into the source tree for the compiler, and found this
leading paragraph in kcc5.vmshelp (filesystem date of 3-Sep-1988) that
provides proper credits:
>> ...
>> KCC is a compiler for the C language on the PDP-10. It was
>> originally begun by Kok Chen of Stanford University around 1981 (hence
>> the name "KCC"), improved by a number of people at Stanford and Columbia
>> (primarily David Eppstein, KRONJ), and then adopted by Ken Harrenstien
>> and Ian Macky of SRI International as the starting point for what is now
>> a complete and supported implementation of C. KCC implements C as
>> described by the following references:
>>
>> H&S: Harbison and Steele, "C: A Reference Manual",
>> HS1: (1st edition) Prentice-Hall, 1984, ISBN 0-13-110008-4
>> HS2: (2nd edition) Prentice-Hall, 1987, ISBN 0-13-109802-0
>> K&R: Kernighan and Ritchie, "The C Programming Language",
>> Prentice-Hall, 1978, ISBN 0-13-110163-3
>>
>> Currently KCC is only supported for TOPS-20, although there is
>> no reason it cannot be used for other PDP-10 systems or processors.
>> The remaining discussion assumes you are on a TOPS-20 system.
>> ...
I met Ken only once, in his office at SRI, but back in our TOPS-20
days, we had several e-mail contacts.
----------------------------------------
P.S. In these days of multi-million line compilers, it is interesting
to inspect the kcc source code line count:
% find . -name '*.[ch]' | xargs cat | wc -l
80298
A similar check on a 10-Oct-2016 snapshot of the actively-maintained
pcc compiler for Unix systems found 155896 lines.
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
Gentlemen,
Below some additional thoughts on the various observations posted
about this. Note that I was not a contemporary of these developments,
and I may stand corrected on some views.
> I'm pretty sure the two main System V based TCP/IP stacks were STREAMS
> based: the Lachman one (which I ported to the ETA-10 and to SCO Unix)
> and the Mentat one that was done for Sun. The socket API was sort of
> bolted on top of the STREAMS stuff, you could get to the STREAMS stuff
> directly (I think, it's been a long time).
Yes, that is my understanding too. I think it goes back to the two
roots of networking on Unix: the 1974 Spider network at Murray Hill and
the 1975 Arpanet implementation of the UoI.
It would seem that Spider chose to expose the network as a device, whereas
UoI chose to expose it as a kind of pipe. This seems to have continued in
derivative work (Datakit/streams/STREAMS and BBN/BSD sockets respectively).
When these systems were developed networking was mostly over serial lines,
and to use serial drivers was not illogical (i.e. streams->STREAMS). By 1980
fast local area networks were spreading, and the idea to see the network as
a serial device started to suck.
Much of the initial modification work that Joy did on the BBN code was to
make it perform on early ethernet -- it had been designed for 50 kbps arpanet
links. Some of his speed hacks (such as trailing headers) were later
discarded.
Interestingly, Spider was conceived as a fast network (1.5 Mbps); the local
network at Murray Hill operated at that speed, and things were designed to
work over long distance T1 connections as well. This integrated fast LAN/WAN
idea seems to have been abandoned in Datakit. I have a question out to Sandy
Fraser to ask about the origins of this, but have not yet received a reply.
> The sockets stuff was something Joy created to compete with the CMU Accent
> networking system. [...] CMU was developing Accent on the Triple Drip
> PascAlto (aka the Perq) and had a formal networking model that was very clean
> and sexy. There were a lot of people interested in workstations, the Andrew
> project (MIT is about to start Athena etc). So Bill creates the sockets
> interface, and to show that UNIX could be just as modern as Accent.
I've always thought that the Joy/Leffler API was a gradual development of
the UoI/BBN API. The main conceptual change seems to have been support for
multiple network systems (selectable network stack, expansion
of the address space to 14 bytes).
I don't quite see the link to Accent and Wikipedia offers little help here
https://en.wikipedia.org/wiki/Accent_kernel
Could you elaborate on how Accent networking influenced Joy's sockets?
> * There's no reason for
> a separate listen() call (it takes a "backlog" argument but
> in practice everyone defaults it and the kernel does strange
> manipulations on it.)
Perhaps there is. The UoI/BBN API did not have a listen() call;
instead the open() call - if it was for a listening connection - blocked until
a connection occurred. The server process would then fork off a worker process
and re-issue the listening open() call for the next connection. This left a
time gap where the server would not be 'listening'.
The listen() call would create up to 'backlog' connection blocks in the
network code, so that this many clients could connect simultaneously
without user space intervention. Each accept() would hand over a (now
connected) connection block and add a fresh unconnected one to the backlog
list. I think this idea came from Sam Leffler, but perhaps he was inspired
by something else (Accent?, Chaos?)
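
The queueing effect of the backlog can be observed with today's BSD socket calls. What follows is a minimal sketch, not the UoI/BBN or early Berkeley code: the helper names make_listener and connect_before_accept are made up for illustration, the backlog value of 4 is arbitrary, and a working loopback interface is assumed. The client's connect() completes before the server has called accept(), because the kernel has already queued the connection.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP listener on 127.0.0.1 with a
 * kernel-chosen port. Returns the descriptor, or -1 on error; the
 * address actually bound is written to *addr. */
int make_listener(int backlog, struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;                 /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        listen(fd, backlog) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Connect a client first and call accept() only afterwards.
 * Returns 1 if both steps succeeded, 0 otherwise. */
int connect_before_accept(void)
{
    struct sockaddr_in addr;
    int ok = 0;
    int lfd = make_listener(4, &addr);
    int cfd;

    if (lfd < 0)
        return 0;
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd >= 0 &&
        connect(cfd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        /* The connection is already waiting in the listen queue. */
        int sfd = accept(lfd, NULL, NULL);
        if (sfd >= 0) {
            ok = 1;
            close(sfd);
        }
    }
    if (cfd >= 0)
        close(cfd);
    close(lfd);
    return ok;
}
```

With the blocking open() of the older interface there would be nothing to connect to between one open() returning and the next being issued; here the listening socket stays in place throughout.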
Of course, this can be done with fewer system calls. The UoI/BBN system
used the open() call, with a pointer to a parameter data block as the 2nd
argument. Perhaps Joy/Leffler were of the opinion that parameter data
blocks were not very Unix-y, and hence spread it out over
socket()/connect()/bind()/listen() instead.
The UoI choice to overload the open() call and not create a new call
(analogous to the pipe() call) was entirely pragmatic: they felt this
was easier for keeping up with the updates coming out of Murray Hill
all the time.
> In particular, I have often thought that it would have been a better
> and more consistent with the philosophy to have it implemented as
> open("/dev/tcp") and so on.
I think this is perhaps an orthogonal topic: how does one map network names
to network addresses. The most ambitious was perhaps the "portal()" system
call contemplated by Joy, but soon abandoned. It may have been implemented
in the early 90's in BSD, but I'm not sure this was fully the same idea.
That said, making the name mapping a user concern rather than a kernel
concern is indeed a missed opportunity.
Last and least, when feeling argumentative I would claim that connection
strings like "/dev/tcp/host:port" are simply parameter data blocks encoded
in a string :^)
> This also knocks out the need for
> SO_REUSEADDR, because the kernel can tell at the time of
> the call that you are asking to be a server. Either someone
> else already is (error) or you win (success).
Under TCP/IP I'm not sure you can. The protocol specifies that you must
wait for a certain period of time (120 sec, if memory serves me right)
before reusing an address/port combo, so that all in-flight packets have
disappeared from the network. Only if one is sure that this is not an
issue one can use SO_REUSEADDR.
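
For reference, this is roughly how the option is requested in the present-day sockets API: it is set with setsockopt() before bind(). A minimal sketch, with hypothetical helper names (bind_reusable, reuse_enabled) and the loopback address chosen only to keep the example harmless.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP socket bound to 127.0.0.1:port,
 * asking for SO_REUSEADDR before bind(). Port 0 lets the kernel
 * choose. Returns the descriptor, or -1 on error. */
int bind_reusable(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report whether SO_REUSEADDR is set on fd: 1 yes, 0 no, -1 error. */
int reuse_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof val;

    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

The point of the sketch is only that the caller, not the kernel, decides whether the waiting period may be skipped.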
> Also, the profusion of system calls (send, recv, sendmsg, recvmsg,
> recvfrom) is quite unnecessary: at most, one needs the equivalent
> of sendmsg/recvmsg.
Today that would indeed seem to make sense. Back in 1980 there seems
to have been a lot of confusion over message boundaries, even in
stream connections. My understanding is that originally send() and
recv() were intended to communicate a borderless stream, whereas
sendmsg() and recvmsg() were intended to communicate distinct
messages, even if transmitted over a stream protocol.
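
On a current system the distinction has largely dissolved for stream sockets, which the following sketch tries to show. It assumes a POSIX system with socketpair(); the function name gather_roundtrip and the sample strings are invented for the example. Two buffers go out in a single sendmsg() call and come back through plain recv() as one undifferentiated run of bytes.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send two separate buffers with one sendmsg() over a stream socket
 * pair, then read them back with recv(). Returns the number of bytes
 * stored in out, or -1 on error. */
ssize_t gather_roundtrip(char *out, size_t outlen)
{
    int sv[2];
    char part1[] = "hello, ";
    char part2[] = "world";
    struct iovec iov[2];
    struct msghdr msg;
    size_t want = strlen(part1) + strlen(part2);
    size_t got = 0;

    if (outlen < want || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    iov[0].iov_base = part1;
    iov[0].iov_len = strlen(part1);
    iov[1].iov_base = part2;
    iov[1].iov_len = strlen(part2);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;

    if (sendmsg(sv[0], &msg, 0) != (ssize_t)want) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);       /* writer is done; reader gets EOF after the data */

    /* A stream may deliver the bytes in pieces, so loop until complete. */
    while (got < want) {
        ssize_t n = recv(sv[1], out + got, want - got, 0);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    close(sv[1]);
    return got == want ? (ssize_t)got : -1;
}
```

Nothing on the receiving side reveals that the sender used two buffers or a message-style call.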
Paul
> On Sep 23, 2017, at 3:06 PM, Nelson H. F. Beebe <beebe(a)math.utah.edu> wrote:
>
> Not that version, but I have the 4.4BSD-Lite source tree online with
> these files in the path 4.4BSD-Lite/usr/src/usr.bin/uucp:
Thanks, but I have the 44BSD CDs.
> If they look close enough to what you need, I can put
> a bundle online for you.
I'm looking for the seismo/uunet version that Rick hacked on for so many years. It started off as the 4.3BSD version, but grew to embrace the volume of traffic uunet handled in its heyday. It wasn't your daddy's uucico ;-)
--lyndon
Dario Niedermann <dario(a)darioniedermann.it> wrote on Sat, 23 Sep 2017
11:17:04 +0200:
>> I just can't forgive FreeBSD for abandoning the proc filesystem ...
It can be there, if you wish.
Here are two snippets from a recent log of a "pkg update -f ;
pkg upgrade" run on one of my many *BSD family systems, this one
running FreeBSD 11.1-RELEASE-p1:
Message from openjdk8-8.131.11:
======================================================================
This OpenJDK implementation requires fdescfs(5) mounted on
/dev/fd and procfs(5) mounted on /proc.
If you have not done it yet, please do the following:
mount -t fdescfs fdesc /dev/fd
mount -t procfs proc /proc
To make it permanent, you need the following lines in
/etc/fstab:
fdesc /dev/fd fdescfs rw 0 0
proc /proc procfs rw 0 0
======================================================================
Message from rust-1.18.0:
======================================================================
Printing Rust backtraces requires procfs(5) mounted on /proc .
If you have not already done so, please do the following:
mount -t procfs proc /proc
To make it permanent, you need the following lines in /etc/fstab:
proc /proc procfs rw 0 0
======================================================================
I've seen such messages in many package installations in the *BSD
family, and I generally add the suggested lines to /etc/fstab.
Perhaps others more familiar with BSD internals might comment on
whether it is mainly non-BSD software, like the Java Developer's Kit,
and Mozilla's rust language, that mostly would like /proc support, or
whether there are plenty of native-BSD packages that expect it too.
The second edition of
Marshall Kirk McKusick, George V. Neville-Neil, and Robert N. M. Watson
The Design and Implementation of the FreeBSD Operating System
ISBN 0-201-70245-2 (hardcover), 0-321-96897-2 (hardcover)
has 5 pages with mention of the /proc filesystem, and nothing that
suggests that it is in any way deprecated.
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
Sadly no longer with us (he exited in 2011), he was forked in 1941. Just
think, if it wasn't for him and Ken, we'd all be running Windoze, and
thinking it's wonderful.
A Unix bigot through and through, I remain,
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
Tom Ivar Helbekkmo:
Why should anyone need to? Of all the mailing lists I'm on, this one is
the only one that has this problem.
=====
Beware tunnel vision. Another mailing list I'm on has exactly
the same problem, made worse because it's being run by a central
Big Company Mailing List Provider so the rules keep changing under
foot and it's up to the poor-sod list maintainer (who is not a
programmer) to cope.
To bring the focus back to this mailing list, not every program
runs on a little-endian computer with arbitrary word alignment
and pointers that fit in an int.
Norman Wilson
Toronto ON
Does anyone have a copy of Rick's uunet version of the 4.3BSD UUCP source? The disk I had it on seized up, and I can't figure out a fine-grained-enough set of search keywords to find it through a web search :-(
--lyndon
Lyndon Nerenberg:
I really like mk. 8ed was where it first rolled out? I remember
reading about it in the 10ed books. It's a joy to use in Plan 9.
======
Later than that. I was around when Andrew wrote mk, so it
definitely post-dated the 8/e manual.
mk does a number of things better, but to my mind not quite
enough to justify carrying it around. Just as I decided long
ago (once I'd come out of the ivory hothouse of Murray Hill)
that I was best off writing programs that hewed to the ISO C
and POSIX interfaces (and later put some work into bringing
my private copy of post-V10 nearer to the standards), because
that way I didn't have to think much about porting; so I
decided eventually that it is better just to use make.
As with any other language, of course, it's best to use it
in as simple a way as possible. So I don't care much whether
it's gmake or pmake or qmake as long as it implements more
or less the 7/e core subset without breaking anything.
Larry McVoy:
I do wish that some simple make had stuffed a scripting language in there.
Anything, tcl, lua, even (horrors, can't believe I'm saying this) a little
lisp. Or ideally a built in shell compat language. All those backslashes
to make shell scripts work get old.
======
This is something mk got right, and it's actually very simple to do:
every recipe is a shell script. Not a collection of lines handed
one by one to the shell, but a block of text. No backslashes (or
extra semicolons) required for tests. Each script is run with sh -e,
so by default one failed command will abort the rest, which is
usually what one wants; but if you don't want that you just insert
set +e
(So it's not that I dislike mk. Were it available as an easy
add-on package on all modern systems, rather than something I'd
have to carry around and compile, I'd be happy to use it.)
Norman Wilson
Toronto ON
I tried running my own server on mcvoy.com but eventually gave up, the
spam filtering was a non-ending task.
If someone has a plug and chug setup for MX I'd love to try it.
Thanks,
--lm
This question is motivated by the posters for whom FreeBSD is not Unix
enough :-)
Probably the best known contribution of the Berkeley branch of Unix is
the sockets API for IP networking. But today, if for no other reason
than the X/Open group of standards, sockets are the preferred networking
API everywhere, even on true AT&T derived UNIX variants. So they must
have been merged back at some point, or reimplemented. My question is,
when and how did that happen?
And if there isn't a simple answer because it happened at different
times and in different ways for each variant, all the better :-)
--
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
Do obvious transformation on domain to reply privately _only_ on Usenet.
I run my own mail server, on systems in my basement.
It is a setup that no one in their right mind would
replicate, but the details may actually be proper for
this list.
A firewall/gateway system runs a custom SMTP server,
which can do simple filtering based on the SMTP envelope,
SMTP commands, calling IP address and hostname. It is
also able to call external commands to pass judgement on
a caller or a particular message.
If mail is accepted, it is passed through a simple
MTA and a stupidly-simple queueing setup (the latter
made of shell scripts) to be sent via SMTP to a
different internal system, which uses the same SMTP
server and MTA to deliver to local mailboxes.
Outbound mail is more or less the obvious inverse.
I have put off naming names for dramatic effect. The
two systems in question are MicroVAX IIIs running
my somewhat-hacked-up version of post-10/e Research
UNIX. The MTA is early-1990s-vintage upas. The SMTP
server, SMTP sender, and queuing stuff are my own.
I wrote the SMTP server originally not long after I left
Bell Labs; I was now in a world where sendmail was the
least-troublesome MTA, but in those days every month
brought news of a new sendmail vulnerability, so I wrote
my own simple server to act as a condom. Over time it
grew a bit, as I became interested in problems like
what sorts of breakin attempts are there in real life
(back then one received occasional DEBUG or WIZ commands,
but I haven't seen any since the turn of the century);
what sorts of simple filtering at the SMTP level will
get rid of most junk mail. The code is more complicated
than it used to be, but is still small enough that I am
reasonably confident that it is safe to expose to the
network.
The SMTP sender and the queueing scripts came later,
when I decided to host my own mail. Both were designed
in too much of a hurry.
There is no official spam filtering (no bogofilter or
the like). A few simple rules that really just enforce
aspects of the SMTP standard seem to catch most junk
callers: HELO argument must contain at least one . (standard
says it must be your FQDN) and must not be *.* (I see dozens
of those every day!); sender must not speak until my server
has issued a complete greeting (I follow Wietse Venema in
this: send a line with a continuation marker first, then
sleep five seconds or so, then send a finish). I also
have a very simple, naive greylisting implementation that
wouldn't work well for a site with lots of users, but is
fine for my personal traffic. The greylisting is implemented
with a pair of external shell scripts.
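
The HELO rule is simple enough to sketch in a few lines of C. This is not the code from the server described here; helo_ok is a made-up name, and "*.*" is taken to mean that literal three-character string.

```c
#include <string.h>

/* Hypothetical check of a HELO argument, following the two rules
 * described above. Returns 1 if acceptable, 0 if it should be
 * rejected. */
int helo_ok(const char *arg)
{
    if (arg == NULL || strchr(arg, '.') == NULL)
        return 0;               /* no dot, so it cannot be an FQDN */
    if (strcmp(arg, "*.*") == 0)
        return 0;               /* the literal "*.*" (assumed meaning) */
    return 1;
}
```

For example, helo_ok("mail.example.org") accepts, while helo_ok("localhost") and helo_ok("*.*") both reject.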
I have had it in mind for a long time to consult the Spamhaus
XBL too. It would be easy enough to do with another plug-in
shell script. There are stupid reasons having to do with my
current DNS setup that make that impractical for now.
The mail setup works, but is showing its age, as is the
use of Research UNIX and such old, slow hardware as a network
gateway. One of these years, when I have the time, I'd like
first to redo the mail setup so that mailboxes are stored
on my central file server (a Sun X2200 running Solaris 10,
or perhaps something illumos-based by the time I actually
do all this); then set up a new gateway, probably based on
OpenBSD. Perhaps I should calculate how much hardware I
could buy from the power savings of turning off just one of
the two MicroVAXes for a year.
I have yet to see an MTA that is spare enough for my taste,
but the old upas code just doesn't quite do what I want any
more, and is too messy to port around. (Pursuant to the
conversation earlier here about autoconf: these days I try
to need no configuration magic at all, which works as long
as I stick to ISO C and POSIX and am careful about networking.
upas was written in messier days.) At the moment I'm leaning
toward qmail, just because for other reasons I'm familiar with
it, though for my personal use I will want to make a few changes
here and there. But I'll want to keep my SMTP server because
I am still interested in what goes on there.
Norman Wilson
Toronto ON
> When you say MIT you think about ITS and Lisp. That is why emacs IMHO
> was against UNIX ideals. RMS was thinking in different terms than Bell
> Labs hackers.
Very different. Once, when visiting the Lisp machine, I saw astonishingly
irrelevant things being done as first class emacs commands, and asked
how many commands there were. The instant answer was to have emacs
print the list. Nice, but it scrolled way beyond one screenful. I
persisted: could the machine count them? It took several minutes of
head-scratching and false starts to do a task that was second nature
to Unix hands.
With hindsight, I realize that the thousand emacs commands were but a
foretaste of open-source exuberance--witness this snippet from Linux:
!ls /usr/share/man/man2|wc
468 468 6766
Even a "kernel" is as efflorescent as a tropical rainforest.
On Tue, Sep 19, 2017, at 10:42, Larry McVoy wrote:
> slib.c:1653 (bk-7.3): open failed: permission denied
>
> which is way way way more useful than just permission denied.
Random832 replied:
Well. It's less useful in one way - it doesn't say what file it was
trying to open. You could pass the filename *instead* of "open failed",
but that still omits the issue I had pointed out: what were you trying
to open the file for (at the very least, were you trying to read, write,
or exec it). Ideally the function would have a format and arguments.
====
Exactly.
The string interpretation of errno is just another
item of data that goes in an error message. There is
no fixed place it belongs, and it doesn't always
belong there, because all that is error does not
fail from a syscall (or library routine).
I do often insert a function of the form
void errmsg(char *, ...)
in my C programs. It takes printf-like arguments.
Normally they just get passed to vfprintf(stderr, ...),
though sometimes there is something more esoteric,
and often fprintf(stderr, "%s: ", progname) ends up
in front.
But errmsg never knows anything about errno. Why
should it? It's supposed to send complaints to
a standard place; it's not supposed to invent the
complaints for itself! If an errno is involved,
I write something like
errmsg("%s: cannot open: %s", filename, strerror(errno));
(Oh, yes, errmsg appends a newline too. The idea
is to avoid cluttering code with minutiae of how
errors are reported.)
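
A function of this shape might look like the sketch below. It is a reconstruction from the description, not the original source: the errout hook is an addition so the output can be redirected, progname is assumed to be a global set at startup, and the format argument is declared const, which the prototype above does not do.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumption: the program name lives in a global set at startup. */
const char *progname = "prog";

/* Hypothetical hook: where complaints go. NULL means stderr. */
FILE *errout = NULL;

/* Print "progname: " followed by the printf-style message and a
 * newline. Knows nothing about errno; callers pass strerror(errno)
 * as an ordinary argument when it is relevant. */
void errmsg(const char *fmt, ...)
{
    va_list ap;
    FILE *fp = errout ? errout : stderr;

    fprintf(fp, "%s: ", progname);
    va_start(ap, fmt);
    vfprintf(fp, fmt, ap);
    va_end(ap);
    fputc('\n', fp);
}
```

A caller that has just had open() fail would then write errmsg("%s: cannot open: %s", filename, strerror(errno)), exactly as in the example above.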
I don't print the source code filename or line number
except for `this shouldn't have happened' errors.
For routine events like the user gave the wrong
filename or it had the wrong permissions or his
data are malformed, pointers to the source code are
just unhelpful clutter, like the complicated
%JARGON-OBSCURE-ABBREVIATION prefixes that accompanied
every official error message in VMS.
Of course, if the user's data are malformed, he
should be told which file has the problem and
where in the file. But that's different from
telling him that line 193 of some file he doesn't
have and will probably never see contains the system
call that reported that he typed the wrong filename.
Norman Wilson
Toronto ON
I received a private request for info on my Postfix config. I'm happy to
post to list.
This is the interesting bit:
https://pastebin.com/tNceD6zM
Running under Debian 8, soon to be upgraded to Debian 9.
Postgrey is listening on TCP/10023.
As an aside I just saw this in my mail queue:
# mailq
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
2182087EA 1618 Thu Sep 21 10:41:07 robert(a)timetraveller.org
(host aneurin.horsfall.org[110.141.193.233] said: 550 5.7.1
<dave(a)horsfall.org>... No reporting address for linode.com; see RFC 2142
(in reply to RCPT TO command))
dave(a)horsfall.org
That is aggressive standards compliance ;)
Rob
All, sorry for the test post. Grant Taylor has been helping me resolve
the mail bounces, which we think are due to the mailing list preserving the
existing DKIM information when forwarding to e-mail.
This e-mail is going to a test address which should strip the inbound
DKIM stuff before passing to the TUHS list. Hopefully we can observe
the result and check the logs.
Warren
And ... we now bring the threads on current Unix-like systems and current
mail configuration to a close, and remind the group that the mailing list
is about _old_ things :-)
Mind you, if the list lasts another 25 years, these two threads will
meet that criterion.
Thanks, Warren
I use Exchange 5.5 & MacOS + Outlook... I know very un-unixy but it's more
so a test bed for a highly modified version of Basilisk II (more so to test
appletalk of all things)
I route it through Office 365, since I use that for my company, and they
have a 'connector' to route a domain through their spam filters and then
drop it to my legacy Exchange server. I gave up on the SPAM fight, it
really was far too much of a waste of my time. That and this email address
is in far far too many databases... :|
I'm on the fence if it's really worth the effort though. I wanted to set up
some kind of UUCP / Exchange relay, and maybe go full crazy with X.25 but at
some point I need to maybe let some of this old stuff just die... It's the
same reason I don't run ATM at home.
> ----------
> From: Larry McVoy
> Sent: Thursday, September 21, 2017 12:25 AM
> To: TUHS main list
> Subject: [TUHS] Who is running their own mail server and what do you
> run?
>
> I tried running my own server on mcvoy.com but eventually gave up, the
> spam filtering was a non-ending task.
>
> If someone has a plug and chug setup for MX I'd love to try it.
>
> Thanks,
>
> --lm
>
Maybe I'm the odd one out here, but at home I've only got a Windows/10
notebook :-)
Mind you, at work I play with
. aDEC 400xP, DECpc MTE, Prioris XL server running SCO UNIX 3.2V4.2
. AlphaServer DS10 running Digital Unix 4.0g
. AlphaServer DS15 running Tru64 Unix 5.1B
. HP(E) rx-servers rx1620, rx2620, rx2660 running HP-UX 11.23
. HP(E) rx-servers rx2800 i2/i4 running HP-UX 11.31
. DOS 6.22, Windows/Xp, Windows/7 clients
Maintaining applications which were conceived late 80s is fun :-)
I worked on, and co-managed, TOPS-20 on DECsystem 20/40 and 20/60
systems with the PDP-10 KL-10 CPU from September 1978 to 31 October
1990, when our 20/60 was retired. (A second 20/60 on our campus in
the Department of Computer Science had been retired a year or two
earlier).
There were two C compilers on the system, Ken Harrenstien's kcc, and
Steve Johnson's pcc, the latter ported to TOPS-20 by my late friend
Jay Lepreau (1952--2008).
pcc was a straightforward port intended to make C programming, and
porting of C software, fairly easy on the PDP-10, but without
addressing many of the architectural features of that CPU.
kcc was written by Ken Harrenstien from scratch, and designed
explicitly for support of the PDP-10 architecture. In particular, it
included an O/S system call interface (the JSYS instruction), and
support for pointers to all byte sizes from 1 to 36. Normal
addressing on the PDP-10 is by word, with an 18-bit address space.
Thus, two 18-bit fields fit in a 36-bit word, ideally suited for
Lisp's CAR and CDR (contents of address/decrement register, used for
first and rest addressing of lists). However, PDP-10 byte pointers
encode the byte size and offset in the second half of a word.
Pointer words could contain an indirect bit, which caused the CPU to
automatically load a memory word at that address, and repeat if that
word was found to be an indirect pointer. That processing was handled
by the LOAD instructions, so it worked for all programming languages.
Characters on the ten-or-so different PDP-10 operating systems were
normally 7-bit ASCII, stored left to right in a word, with the
right-most low-order bit set to 0, UNLESS the word was intended to be
a 5-decimal-digit line number, in which case, that bit was set to 1.
Compilers and some other tools ignored line-number words.
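
The layout can be illustrated on a modern machine by letting the low 36 bits of a 64-bit integer stand in for a PDP-10 word. This is only a sketch: the names pack_ascii7 and is_line_number_word are invented, and padding short strings with NUL characters is an assumption.

```c
#include <stdint.h>

/* Pack up to five 7-bit characters, left to right, into the low 36
 * bits of a uint64_t. Counting from the right with the low-order bit
 * as bit 0, the first character lands in bits 35..29 and the fifth in
 * bits 7..1; bit 0 stays 0. Short strings are padded with NULs
 * (assumption). */
uint64_t pack_ascii7(const char *s)
{
    uint64_t word = 0;
    int i;

    for (i = 0; i < 5; i++) {
        uint64_t c = 0;
        if (*s != '\0')
            c = (uint64_t)(*s++ & 0x7f);
        word |= c << (29 - 7 * i);
    }
    return word;
}

/* The low-order bit marks a word as a line number rather than text. */
int is_line_number_word(uint64_t word)
{
    return (int)(word & 1);
}
```

Five characters use 35 of the 36 bits, which is what leaves the low-order bit free to serve as the line-number flag.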
As the need to communicate with other systems with 8-, 16-, and 32-bit
words grew, we had to accommodate files with 8-bit characters, which
could be stored as four left-adjusted characters with 4 rightmost zero
bits, or handled as 9 consecutive 8-bit characters in two adjacent
36-bit words. That was convenient for binary file transfer, but I
don't recall ever seeing 9-bit characters used for text files.
By contrast, on the contemporary 36-bit Univac 11xx systems running
EXEC-8, the O/S was extended from 6 six-bit Fieldata characters per
word to 9-bit extended ASCII (and ISO 8859-n Latin-n) characters: the
reason was that the Univac CPU had quarterword access instructions,
but not arbitrary byte-size instructions like the PDP-10. I don't
think that there ever was a C compiler on those Univac systems.
On the PDP-10, memory locations 0--15 are mapped to machine registers
of those numbers: short loops could be copied into those locations and
would then run about 3x faster, if there weren't too many memory
references. Register 0 was not hardwired to a zero value, so
dereferencing a NULL pointer could return any address, and could even
be legitimate in some code. The kcc documentation reports:
>> ...
>> The "NULL" pointer is represented internally as a zero word,
>> i.e. the same representation as the integer value 0, regardless of
>> the type of the pointer. The PDP-10 address 0 (AC 0) is zeroed and
>> never used by KCC, in order to help catch any use of NULL pointers.
>> ...
In kcc, the C fopen() call second argument was extended with extra
flag letters:
>> ...
>> The user can override either the bytesize or the conversion
>> by adding explicit specification characters, which should come after
>> any regular specification characters:
>> "C" Force LF-conversion.
>> "C-" Force NO LF-conversion.
>> "7" Force 7-bit bytesize.
>> "8" Force 8-bit bytesize.
>> "9" Force 9-bit bytesize.
>> "T" Open for thawed access (TOPS-10/TENEX only)
>>
>> These are KCC-specific however, and are not portable to other
>> systems. Note that the actual LF conversion is done by the USYS (Unix
>> simulation) level calls (read() and write()) rather than STDIO.
>> ...
As the PDP-10 evolved, addressing was extended from 18 bits to 22
bits, and kcc had support for such extended addresses.
Inside the kcc compiler,
>> ...
>> Chars are aligned on 9-bit byte boundaries, shorts on halfword
>> boundaries, and all other data types on word boundaries (with the
>> exception of bitfields and the _KCCtype_charN types). Converting any
>> pointer to a (char *) and back is always possible, as a char is the
>> smallest possible object. If the original object was larger than a
>> char, the char pointer will point to the first byte of the object; this
>> is the leftmost 9-bit byte in a word (if word-aligned) or in the halfword
>> (if a short).
>> ...
That design choice meant that the common assumption from 32-bit systems
that a word holds 4 characters remained true on the 36-bit PDP-10. The
_KCCtype_charN
types could have N from 1 to 36. The case N = 6 was special: it
handled the SIXBIT character representation used by compilers,
linkers, and the O/S to encode external function names mapped to a
6-bit character set unique to the PDP-10, allowing 6-character unique
names for symbols.
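As a rough illustration of why SIXBIT gives exactly six characters per symbol, here is a small C sketch of packing a name into one 36-bit word. It relies on the DEC SIXBIT convention that the code is the ASCII code minus 32 (octal 40), covering ASCII 32 through 95. The function names, the folding of lowercase to uppercase, the space padding, and the silent truncation after six characters are assumptions for illustration; this is not how kcc or the linker is actually coded.

```c
#include <assert.h>
#include <stdint.h>

/* A 36-bit word held in the low 36 bits of a 64-bit integer
   (hypothetical type, for illustration only). */
typedef uint64_t word36;

/* Pack up to six characters of name into one word, first character in
   the most significant 6 bits.  Assumes an ASCII host.  Lowercase is
   folded to uppercase, short names are padded with spaces (code 0),
   and anything past six characters is ignored. */
word36 sixbit_encode(const char *name)
{
    word36 w = 0;
    int i = 0;
    for (; i < 6 && name[i] != '\0'; i++) {
        unsigned c = (unsigned char)name[i];
        if (c >= 'a' && c <= 'z')
            c -= 32;                      /* fold to uppercase */
        assert(c >= 32 && c <= 95);       /* representable in SIXBIT */
        w = (w << 6) | (word36)(c - 32);
    }
    for (; i < 6; i++)
        w <<= 6;                          /* pad with SIXBIT spaces */
    return w;
}

/* Unpack a word into six characters plus a terminating NUL. */
void sixbit_decode(word36 w, char out[7])
{
    for (int i = 5; i >= 0; i--) {
        out[i] = (char)((w & 077) + 32);
        w >>= 6;
    }
    out[6] = '\0';
}
```

Under these assumptions, "printf" and "PRINTF" pack to the same word, and "fprintf" collides with "FPRINT", which is the practical meaning of names being unique only in their first six characters.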
I didn't readily find documentation of kcc features on the Web, so for
those who would like to learn more about support of C and Unix code on
the PDP-10, I created this FTP/Web site today:
http://www.math.utah.edu/pub/kcc
ftp://ftp.math.utah.edu/pub/kcc
It supplies several *.doc files; the user.doc file is likely the one
of most interest for this discussion.
Getting C onto TOPS-20 was hugely important for us, because it gave us
access to many Unix tools (I was the first to port Brian Kernighan's
awk language to the PDP-10, and also to the VAX VMS native C
compiler), and eased the transition from TOPS-20 to Unix that began
for our users about 1984, and continued until our complete move in
summer 1991, when we retired our last VAX VMS systems.
Finally, here is a pointer to a document that I wrote about that
transition:
http://www.math.utah.edu/~beebe/reports/1987/t20unix.pdf
P.S. I'll be happy to entertain further questions about these two C
compilers on the PDP-10, offline if you prefer, or on this list.
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------
All, I just had this question pop into my inbox.
Cheers, Warren
----- Forwarded message from Evan Koblentz <evan(a)snarc.net> -----
Hi Warren. Evan K. here from Vintage Computer Festival, etc.
I'm trying to find out who invented the Chroot command in Version 7 Unix.
Could you help, possibly including a post to TUHS email list on my behalf?
I posted to our local (northeast US) list and also emailed Brian Kernighan and
Bill Cheswick.
Hoping this leads to an answer. I'm looking for a name, not just generalities.
Thanks very much.
----- End forwarded message -----
Random832:
Just out of curiosity, where does perror(filename), quite possibly the
*most* common error message on Unix as a whole, fall on your scale? It
says which of the file location or permissions (or whatever else) it is,
but not whether it was attempting to open it for reading or writing.
=====
I never liked perror much. It's a little too primitive:
you get exactly one non-formatted string; you get only
stderr, so if you're sending messages separately to a log
or writing them to a network connection or the like, you're
out of luck.
strerror(errno) hits the sweet spot for me. Until it
appeared in the standard library (and until said standard
spread enough that one could reasonably expect to find it
anywhere) I kept writing more or less that function into
program after program.
I guess the problem with perror is that it isn't sufficiently
UNIX-like: it bundles three jobs that are really separate
(convert errno to string message, format an error message,
output the message) into one function, inseparably and
inflexibly.
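A minimal sketch of that separation, using only standard C library calls (strerror, snprintf, fprintf, fopen): one function converts and formats, and the caller decides where the text goes. The names format_errmsg and open_or_report are hypothetical, and the mapping from the fopen mode letter to "reading" or "writing" is a simplifying assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Convert an errno value to text and format a full message into the
   caller's buffer.  Returns whatever snprintf returns. */
int format_errmsg(char *buf, size_t size, const char *path,
                  const char *action, int err)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "cannot open %s for %s: %s",
                    path, action, strerror(err));
}

/* Hypothetical wrapper: open a file and, on failure, write a message
   that names the intended access to any stream the caller chooses
   (stderr, a log file, a stream wrapping a network connection). */
FILE *open_or_report(const char *path, const char *mode, FILE *log)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        int saved = errno;   /* set by fopen on POSIX systems; ISO C
                                does not strictly guarantee it */
        char msg[256];
        format_errmsg(msg, sizeof msg, path,
                      mode[0] == 'r' ? "reading" : "writing", saved);
        fprintf(log, "%s\n", msg);
    }
    return fp;
}
```

Because the message is built in a buffer first, it can also say whether the open was for reading or writing, which perror(filename) alone does not.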
Norman Wilson
Toronto ON