A. P. Garcia <a.phillip.garcia(a)gmail.com> wrote:
> Were the original Unix authors annoyed when they learned that
> some irascible young upstart named Richard Stallman was determined to make
> a free Unix clone?
A deeper, more profound question would be: how did these original Unix
authors feel about their employer owning the rights to their creation?
Did they feel any guilt at all for having had to sign over all rights
in exchange for their paychecks?
Did Dennis and/or Ken personally wish their creation were free to the
world, public domain, or were they personally in agreement with the
licensing policies of their employer? I argue that this question is
far more important than how they felt about RMS (if they cared at all).
Ronald Natalie <ron(a)ronnatalie.com> wrote:
> [RMS] If you read his earlier manifesto rants he hated UNIX
> with a passion.
> Holding out the TOPS operating systems as the be-all and end-all of user
> interface.
I wish more people would point out this aspect of RMS and GNU. While
I wholeheartedly agree with Richard on the general philosophy of free
software, i.e., the *ethics* part and the Four Freedoms, when it comes
to GNU as a specific OS, in technical terms, I've always disliked
everything about it. I love UNIX, and as Ron pointed out, like few
people do, GNU was fundamentally born out of hatred for the thing I
love.
SF
So it turns out the 'dcheck' distributed with V6 has two (well, three, but
the third one was only a potential problem for me) bugs in it.
The first was a fence-post error on a table clearing operation; it could
cause the entry for the last inode of the disk in the constructed table of
directory entry counts to start with a non-zero count when a second disk was
scanned. However, it was only triggered in very specific circumstances:
- A larger disk was listed before a smaller one (either in the command line,
or compiled in)
- The inode on the larger disk corresponding to the last inode on the smaller
one was in use
I can understand how they never ran across this one.
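
For readers who want to see the shape of such a fence-post, here is a minimal sketch. It is an assumption, not the V6 code: the message does not show the actual clearing loop, and the names `ecount`, `TABSIZE`, `clear_buggy`, `clear_fixed` and `stale_after_second_clear` are all made up for illustration. It only reproduces the stated trigger conditions: a larger disk scanned first, and the inode matching the smaller disk's last inode in use.

```c
#include <assert.h>

#define TABSIZE 16                 /* hypothetical toy table size */

/* Hypothetical table of directory entry counts, indexed by inode number 1..n. */
static int ecount[TABSIZE + 1];

/* Off-by-one clear: zeroes slots 0..n-1 and so misses slot n (the last inode). */
void clear_buggy(int n)
{
    int i;

    for (i = 0; i < n; i++)
        ecount[i] = 0;
}

/* Corrected clear: zeroes slots 0..n, including the last inode's slot. */
void clear_fixed(int n)
{
    int i;

    for (i = 0; i <= n; i++)
        ecount[i] = 0;
}

/*
 * Models scanning a larger disk (nbig inodes) on which inode number 'nsmall'
 * is in use, then clearing the table for a smaller disk (nsmall inodes).
 * Returns the count left in the smaller disk's last slot after that clear.
 */
int stale_after_second_clear(int nbig, int nsmall, int use_fixed)
{
    assert(nsmall >= 1 && nsmall < nbig && nbig <= TABSIZE);

    clear_fixed(TABSIZE);          /* start from a clean table */
    ecount[nsmall] = 1;            /* larger disk: one entry names inode nsmall */

    if (use_fixed)
        clear_fixed(nsmall);
    else
        clear_buggy(nsmall);

    return ecount[nsmall];
}
```

In this model the buggy clear leaves a stale count in the last slot, while the corrected clear does not. If the disks are listed smaller-first, the larger disk's clear covers the stale slot anyway, which matches the narrow trigger conditions listed above.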
The other one, however, which was an un-initialized variable, should have
bitten them anytime they had more than one disk listed! It caused the
constructed table of directory entry counts to be partially or wholly
(depending on the size of the two disks) blank in all disks after the first
one, causing numerous (bogus) error reports.
(It was also amusing to find an un-used procedure in the source; it looks
like dcheck was written starting with the code for 'icheck' - which explains
the second bug; since the logic in icheck is subtly different, that variable
_is_ set properly in icheck.)
How this bug never bit them I cannot understand - unless they saw it, and
couldn't be bothered to find and fix it!
To me, it's completely amazing to find such a serious bug in such a critical
piece of widely-distributed code! A lesson for archaeologists...
Anyway, a fixed version is here:
http://ana-3.lcs.mit.edu/~jnc/tech/unix/ucmd/dcheck.c
if anyone cares/needs it.
Noel
Larry McVoy scripsit:
> I love Rob Pike, he's spot on on a lot of stuff. I'm a big fan of
> "if you think you need threads then your processes are too fat".
Oh, he's a brilliant fellow. I don't know him personally, but I know
people who do, and I don't think I'd love him if I knew him. Humanity has
always found it useful to keep its (demi)gods at arm's length at least.
--
John Cowan http://www.ccil.org/~cowan cowan(a)ccil.org
Barry thirteen gules and argent on a canton azure fifty mullets of five
points of the second, six, five, six, five, six, five, six, five, and six.
--blazoning the U.S. flag
> From: jnc(a)mercury.lcs.mit.edu (Noel Chiappa)
> the second (the un-initialized variable) should have happened every
> time.
OK, so I was wrong! The variable in question was a global static, 'ino' (the
current inode number), so the answer isn't something simple like 'it was an
auto that happened to be cleared for each disk'. But now that I look closely,
I think I see a way it might have worked.
'dcheck' is a two-pass per disk thing: it begins each disk by clearing its
'inode link count' table; then the first pass does a pass over all the inodes,
and for ones that are directories, increments counts for all the entries; the
second pass re-scans all the inodes, and makes sure that the link count in the
inode itself matches the computed count in the table.
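
The two-pass structure just described can be sketched on a toy in-memory "disk". This is only an illustration under assumptions: the real dcheck reads inodes and directory blocks from the device, and the names `toy_inode`, `toy_dcheck`, `MAXINO` and `MAXENT` are hypothetical.

```c
#include <assert.h>

#define MAXINO 8          /* hypothetical toy limit; inodes are numbered 1..MAXINO */
#define MAXENT 4          /* hypothetical max entries per toy directory */

struct toy_inode {
    int is_dir;           /* non-zero if this inode is a directory */
    int nlink;            /* link count stored in the inode itself */
    int entries[MAXENT];  /* inode numbers named by the entries; 0 = unused slot */
};

/*
 * 'inodes' is indexed by inode number, so it needs nfiles + 1 elements
 * (element 0 is unused). Returns the number of inodes whose stored link
 * count disagrees with the count computed from the directory entries.
 */
int toy_dcheck(const struct toy_inode *inodes, int nfiles)
{
    int ecount[MAXINO + 1];
    int i, j, bad = 0;

    assert(nfiles >= 0 && nfiles <= MAXINO);

    /* Start of each disk: clear the computed-count table. */
    for (i = 0; i <= MAXINO; i++)
        ecount[i] = 0;

    /* Pass 1: for every directory, count each inode its entries name. */
    for (i = 1; i <= nfiles; i++) {
        if (!inodes[i].is_dir)
            continue;
        for (j = 0; j < MAXENT; j++) {
            int target = inodes[i].entries[j];

            if (target >= 1 && target <= nfiles)
                ecount[target]++;
        }
    }

    /* Pass 2: compare each inode's own link count with the computed one. */
    for (i = 1; i <= nfiles; i++)
        if (inodes[i].nlink != ecount[i])
            bad++;

    return bad;
}
```

Note that pass 1 in this sketch, like the one described, only cares about the inode numbers found inside directories; the number of the inode currently being scanned matters only in pass 2.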
'ino' was cleared before the _second_ pass, but not the _first_. So it was
zero for the first pass of the first disk, but non-zero for the first pass on
the second disk.
This looks like the kind of bug that should almost always be fatal, right?
That's what I thought at first... (and I tried the original version on one of
my machines to make sure it did fail). But...
The loop in each pass has two index variables, one of which is 'ino', which it
compares with the maximum inode number for that disk (per the super-block),
and bails if it reaches the max:
for(i=0; ino<nfiles; i =+ NIBLK)
If the first disk is _larger_ than the second, the first pass will never
execute at all for the second disk (producing errors).
However, if the _second_ is larger, then the second disk's first pass will in
fact examine the starting (nfilesSUBsecond - nfilesSUBfirst) inodes of the
second disk to see if they are directories (and if so, count their links).
So if the last nfilesSUBfirst inodes of the second disk are empty (which is
often the case with large drives - I had modified 'df' to count the free
inodes as well as disk blocks, and after doing so I noticed that Unix seems to
be quite generous in its default inode allocations), it will in fact work!
The fact that 'ino' is wrong all throughout the first pass of the second disk
(it counts up from nfilesSUBfirst to nfilesSUBsecond) turns out to be
harmless, because the first pass never uses the current inode number, it only
looks at the inode numbers in the directories.
Note that with two disks of _equal size_, it fails. Only if the second is
larger does it work! (And this generalizes out to N disks - as long as each
one is enough larger than the one before!) So for the config they were
running (rk2, dp0) it probably did in fact work!
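
The three cases above (first disk larger, equal sizes, second disk larger) can be checked with a small model of the loop control. This is a sketch under assumptions, not the V6 source: it keeps only the global 'ino' and the two loops bounded by 'nfiles', and the names `model_reset` and `first_pass_examined` are hypothetical.

```c
#include <assert.h>

/* Toy model of the loop structure described above; NOT the V6 dcheck source. */
static int ino;                 /* global "current inode number" */

/* Hypothetical helper: puts the model in the state of a freshly started dcheck. */
void model_reset(void)
{
    ino = 0;
}

/*
 * Models checking one disk with 'nfiles' inodes and returns how many inodes
 * the FIRST pass examines. With fixed == 0, 'ino' is not cleared before the
 * first pass (the behaviour described); with fixed != 0, it is.
 */
int first_pass_examined(int nfiles, int fixed)
{
    int examined = 0;

    assert(nfiles >= 0);

    if (fixed)
        ino = 0;                /* the missing initialization */

    while (ino < nfiles) {      /* first pass: bails when ino reaches nfiles */
        ino++;
        examined++;
    }

    ino = 0;                    /* 'ino' was cleared before the second pass */
    while (ino < nfiles)        /* second pass leaves ino == nfiles */
        ino++;

    return examined;
}
```

Calling `model_reset()` and then `first_pass_examined()` once per disk, in order, gives the behaviour described: with 1000 then 500 inodes (or 1000 then 1000) the second disk's first pass examines nothing, while with 500 then 1000 it examines 500 inodes, i.e. (nfilesSUBsecond - nfilesSUBfirst). With the initialization added, every disk's first pass covers all of its inodes.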
Noel
Noel Chiappa:
To me, it's completely amazing to find such a serious bug in such a critical
piece of widely-distributed code! A lesson for archaeologists...
======
To me it's not surprising at all.
On one hand, current examples of widely-distributed critical
code containing serious flaws are legion. What, after all,
were the Heartbleed and OS X goto fail; bugs? What is every
version of Internet Explorer?
On the other hand, Ken and Dennis and the other guys behind
the earliest UNIX code were smart guys and good programmers,
but they were far from perfect; and back in those days we
were all a lot sloppier.
So surprising? No. Interesting? Certainly. All bugs are
interesting.
(To me, anyway. Back in the 1980s, when I was at Bell Labs,
SP&E published a paper by Don Knuth discussing all the many
bugs found in TeX, including some statistical analysis. I
thought it fascinating and revealing and think reading it
made me a better programmer. Rob Pike thought it was terribly
boring and shouldn't have been published. Decidedly different
viewpoints.)
Norman Wilson
Toronto ON
> From: Ronald Natalie <ron(a)ronnatalie.com>
> If I understand what you are saying, it only occurs when you run dcheck
> with multiple volumes at one time?
Right, _both_ bugs have that characteristic. But the first one (the
fence-post) only happens in very particular circumstances; the second (the
un-initialized variable) should have happened every time.
> From: norman(a)oclsc.org (Norman Wilson)
> To me it's not surprising at all.
> On one hand, current examples of widely-distributed critical code
> containing serious flaws are legion.
What astonished me was not that there was a bug (which I can easily believe),
but that it was one that would have happened _every time they ran it_.
'dcheck' has this list of disks compiled into it. (Oh, BTW, my fixed version
now reads a file, /etc/disks; I am running a number of simulated machines,
and the compiled-in table was a pain.)
So I would have thought they must have at least tried that mode of operation
once? And running it that way just once should have shown the bug. Or did
they try it, see the bug, and 'deal' with it by just never running it that
way?
Noel
> From: asbesto <asbesto(a)freaknet.org>
> We have about 40 disks, with RT-11 on them
Ah. You should definitely try Unix - a much more pleasant computing/etc
environment!
Although without a video editor... although I hope to have one available
'soon', from the MIT V6+ system (I think I have found some backup tapes from
it).
> This PDP-11/34 was used for a medical CAT equipment
Ah, so it probably has the floating point, then. If so, you should be able to
use the Shoppa V6 Unix disk as it is, then - that has a Unix on it which will
work on an 11/23 (which doesn't have the switch register that V6 normally
requires).
But if not, let me know, and I can provide a V6 Unix for it (I already have
the tweaked version running on a /23 in the simulator).
Noel
PS: For those who downloaded the 'fixed' ctime.c (if anyone :-), it turns out
there was a bug in my fix - in some cases, one variable wasn't initialized
properly. There's a fixed one up there now.
> From: asbesto <asbesto(a)freaknet.org>
> Just in these days we restored a PDP-11/23PLUS here at our Museum! :)
> ...
> CPU is working
That is good to hear! You all seem to have been very resourceful in making
the power supply for it!
> and we're trying to boot from a RL02 unit :)
Are your RL02 drive and RLV11 controller all working? Here are some
interesting pages:
http://www.retrocmp.com/pdp-11/pdp-1144/my-pdp-1144/rl02-disk-trouble
http://www.retrocmp.com/pdp-11/pdp-1144/my-pdp-1144/more-on-rl01rl02
from someone in Germany about getting their RL11 and RL02 to work.
Also, when you say "boot from an RL02", what are you trying to boot? Do you
have an RL02 pack with a working system on it? If so, what kind - a Unix
of some sort, or some DEC operating system?
> From: SPC <spedraja(a)gmail.com>
> I'll keep a reference of this message and try it as soon as possible...
Speaking of getting Unix to run on an 11/23 with an RL02... I just realized
that the hard part of getting a Unix running, for you, will not be getting V6
to run on a machine without a switch register (which is actually pretty easy
- I have worked out a way to do it that involves changing one line in
param.h, and adding two lines of code to main.c).
The hard part is going to be getting the bits onto the disk! If all you have is an
RL02, you are going to have to load bits into the computer over a serial line.
WKT has done this for V7 Unix:
http://www.tuhs.org/Archive/PDP-11/Tools/Tapes/Vtserver/
but V7 really wants a machine with split I/D (which the /23 does not have). I
guess V7 'sort of' works on a machine without I/D, but I'm not a V7 expert,
so I can't say for sure.
It would not be hard to do something similar to the VTServer thing for V6,
though. If you would like to go this way, let me know, I would be very
interested in helping with this.
Also, do you only have one working RL02 drive, or more than one? If you only
have one, you will not be able to do backups (unless you have something else
connected to the machine, e.g. some sort of tape drive, or something).
Noel
> From: SPC <spedraja(a)gmail.com>
> I'll keep a reference of this message and try it as soon as possible...
No rush! Take your time...
> the disruptive fact (in terms of time) here is to put up-to-date both
> the PDP-11/23-PLUS and RL02.
My apologies, I just now noticed that you have an 11/23-PLUS (it is slightly
different from a plain 11/23).
I am not very familiar with the 11/23-PLUS (I never worked with one), but from
documentation I just dug out, it seems that they normally come with the MMU
chip, so we don't need to worry about that. However, the FPP is not standard,
so that is still an issue for bringing up Unix.
In fact, there are two different FPP options for the 11/23-PLUS (and,
actually, for the 11/23 as well): one is the KEF-11AA chip which goes on the
CPU card (on the 11/23-PLUS, in the middle large DIP holder), and the other is
something called the FPF-11 card, which is basically hardware floating point
(the KEF-11A is just microcode), for people who are doing serious number
crunching. It's a quad-size card which has a cable with a DIP header on the
end which plugs into the same DIP holder on the CPU card as the KEF-11A. They
look the same to software; one is just faster than the other.
Anyway, if you don't have either one, we'll have to produce a new Unix
load for you (not a big problem, if it is needed).
Noel
Does anyone know if the source for an early PDP-11 version of MERT is
available anywhere?
(For those who aren't familiar with MERT, it was a micro-kernel [as we would
name it now] which provided message-passing and [potentially shared] memory
segments, intended for real-time applications; it supported several levels of
protection, using the 'Supervisor' mode available in the 11/45 and 11/70. One
set of supervisor processes provided a Unix environment; the combination was
called UNIX/RT - hence my asking about it here.)
Thanks!
Noel