Larry McVoy scripsit:
> I love Rob Pike, he's spot on on a lot of stuff. I'm a big fan of
> "if you think you need threads then your processes are too fat".
Oh, he's a brilliant fellow. I don't know him personally, but I know
people who do, and I don't think I'd love him if I knew him. Humanity has
always found it useful to keep its (demi)gods at arm's length at least.
--
John Cowan http://www.ccil.org/~cowan cowan(a)ccil.org
Barry thirteen gules and argent on a canton azure fifty mullets of five
points of the second, six, five, six, five, six, five, six, five, and six.
--blazoning the U.S. flag
> From: jnc(a)mercury.lcs.mit.edu (Noel Chiappa)
> the second (the un-initialized variable) should have happened every
> time.
OK, so I was wrong! The variable in question was a global static, 'ino' (the
current inode number), so the answer isn't something simple like 'it was an
auto that happened to be cleared for each disk'. But now that I look closely,
I think I see a way it might have worked.
'dcheck' is a two-pass per disk thing: it begins each disk by clearing its
'inode link count' table; then the first pass does a pass over all the inodes,
and for ones that are directories, increments counts for all the entries; the
second pass re-scans all the inodes, and makes sure that the link count in the
inode itself matches the computed count in the table.
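The two-pass scheme just described can be sketched in modern C roughly as
follows. This is only an illustration under stated assumptions, not the V6
'dcheck' source: the in-memory structures (inode_model, dirent_model), the
table size MAXINO, and the function name check_links are all hypothetical,
and the real program reads inodes and directory blocks from the disk.

```c
#include <stddef.h>

#define MAXINO 64   /* hypothetical table size, just for this sketch */

/* Hypothetical in-memory model; NOT the V6 on-disk format. */
struct dirent_model {
    int d_ino;                            /* inode number the entry names */
};

struct inode_model {
    int is_dir;                           /* nonzero if this inode is a directory */
    int nlink;                            /* link count stored in the inode */
    const struct dirent_model *entries;   /* directory entries (if is_dir) */
    size_t nentries;
};

static int link_count[MAXINO + 1];        /* the per-disk 'inode link count' table */

/*
 * Returns the number of inodes whose stored link count disagrees with the
 * count computed from the directories, or -1 if nfiles does not fit the table.
 * Inode numbers run from 1 to nfiles; inodes[k] models inode number k+1.
 */
int check_links(const struct inode_model *inodes, int nfiles)
{
    int ino, mismatches = 0;
    size_t e;

    if (nfiles < 0 || nfiles > MAXINO)
        return -1;

    /* Start of each disk: clear the table. */
    for (ino = 0; ino <= nfiles; ino++)
        link_count[ino] = 0;

    /* Pass 1: for every directory, count each entry against its target. */
    for (ino = 1; ino <= nfiles; ino++) {
        const struct inode_model *ip = &inodes[ino - 1];
        if (!ip->is_dir)
            continue;
        for (e = 0; e < ip->nentries; e++) {
            int target = ip->entries[e].d_ino;
            if (target >= 1 && target <= nfiles)
                link_count[target]++;
        }
    }

    /* Pass 2: compare each inode's own link count with the computed one. */
    for (ino = 1; ino <= nfiles; ino++)
        if (inodes[ino - 1].nlink != link_count[ino])
            mismatches++;

    return mismatches;
}
```

Note that pass 1 in this sketch never uses the loop's own inode number for
anything but walking the table; it only looks at the inode numbers found in
the directory entries.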
'ino' was cleared before the _second_ pass, but not the _first_. So it was
zero for the first pass of the first disk, but non-zero for the first pass on
the second disk.
This looks like the kind of bug that should almost always be fatal, right?
That's what I thought at first... (and I tried the original version on one of
my machines to make sure it did fail). But...
The loop in each pass has two index variables, one of which is 'ino', which it
compares with the maximum inode number for that disk (per the super-block),
and bails if it reaches the max:
for(i=0; ino<nfiles; i =+ NIBLK)
If the first disk is _larger_ than the second, the first pass will never
execute at all for the second disk (producing errors).
However, if the _second_ is larger, then the second disk's first pass will in
fact examine the starting (nfilesSUBsecond - nfilesSUBfirst) inodes of the
second disk to see if they are directories (and if so, count their links).
So if the last nfilesSUBfirst inodes of the second disk are empty (which is
often the case with large drives - I had modified 'df' to count the free
inodes as well as disk blocks, and after doing so I noticed that Unix seems to
be quite generous in its default inode allocations), it will in fact work!
The fact that 'ino' is wrong all throughout the first pass of the second disk
(it counts up from nfilesSUBfirst to nfilesSUBsecond) turns out to be
harmless, because the first pass never uses the current inode number, it only
looks at the inode numbers in the directories.
Note that with two disks of _equal size_, it fails. Only if the second is
larger does it work! (And this generalizes out to N disks - as long as each
one is enough larger than the one before!) So for the config they were
running (rk2, dp0) it probably did in fact work!
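The effect of the stale 'ino' can be modelled with a small sketch in modern
C. It is a simplified model under stated assumptions, not the original code:
NIBLK is assumed to be 16 inodes per block, the inner per-block loop is my
guess at the structure, and the names scan and check_disk are hypothetical.
It only counts how many inodes the first pass would look at on each disk.

```c
#define NIBLK 16    /* assumption: inodes per disk block */

/* Global, like dcheck's 'ino': zero only when the program starts. */
static int ino;

/*
 * One pass over a disk with 'nfiles' inodes, stopping when 'ino' reaches
 * nfiles. Models a loop of the form for(i=0; ino<nfiles; i =+ NIBLK), with
 * an assumed inner loop over the inodes of each block.
 * Returns how many inodes the pass actually examined.
 */
static int scan(int nfiles)
{
    int j, examined = 0;

    while (ino < nfiles) {                      /* one block per iteration */
        for (j = 0; j < NIBLK && ino < nfiles; j++) {
            ino++;
            examined++;
        }
    }
    return examined;
}

/*
 * Checks one disk the way the buggy version did: 'ino' is NOT cleared
 * before the first pass, but IS cleared before the second.
 * Returns the number of inodes the first pass examined.
 */
int check_disk(int nfiles)
{
    int first = scan(nfiles);   /* pass 1: starts from whatever 'ino' was */

    ino = 0;                    /* pass 2: cleared, so it covers the disk */
    scan(nfiles);
    return first;
}
```

In this model, calling check_disk(1000) and then check_disk(4000) gives a
first pass that covers all 1000 inodes of the first disk but only the
starting 3000 of the second; a following disk of equal or smaller size gets
a first pass that examines nothing.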
Noel
Noel Chiappa:
To me, it's completely amazing to find such a serious bug in such a critical
piece of widely-distributed code! A lesson for archaeologists...
======
To me it's not surprising at all.
On one hand, current examples of widely-distributed critical
code containing serious flaws are legion. What, after all,
were the Heartbleed and OS X goto fail; bugs? What is every
version of Internet Explorer?
On the other hand, Ken and Dennis and the other guys behind
the earliest UNIX code were smart guys and good programmers,
but they were far from perfect; and back in those days we
were all a lot sloppier.
So surprising? No. Interesting? Certainly. All bugs are
interesting.
(To me, anyway. Back in the 1980s, when I was at Bell Labs,
SP&E published a paper by Don Knuth discussing all the many
bugs found in TeX, including some statistical analysis. I
thought it fascinating and revealing and think reading it
made me a better programmer. Rob Pike thought it was terribly
boring and shouldn't have been published. Decidedly different
viewpoints.)
Norman Wilson
Toronto ON
> From: Ronald Natalie <ron(a)ronnatalie.com>
> If I understand what you are saying, it only occurs when you run dcheck
> with multiple volumes at one time?
Right, _both_ bugs have that characteristic. But the first one (the
fence-post) only happens in very particular circumstances; the second (the
un-initialized variable) should have happened every time.
> From: norman(a)oclsc.org (Norman Wilson)
> To me it's not surprising at all.
> On one hand, current examples of widely-distributed critical code
> containing serious flaws are legion.
What astonished me was not that there was a bug (which I can easily believe),
but that it was one that would have happened _every time they ran it_.
'dcheck' has this list of disks compiled into it. (Oh, BTW, my fixed version
now reads a file, /etc/disks; I am running a number of simulated machines,
and the compiled-in table was a pain.)
So I would have thought they must have at least tried that mode of operation
once? And running it that way just once should have shown the bug. Or did
they try it, see the bug, and 'deal' with it by just never running it that
way?
Noel
> From: asbesto <asbesto(a)freaknet.org>
> We have about 40 disks, with RT-11 on them
Ah. You should definitely try Unix - a much more pleasant computing/etc
environment!
Although without a video editor... although I hope to have one available
'soon', from the MIT V6+ system (I think I have found some backup tapes from
it).
> This PDP-11/34 was used for a medical CAT equipment
Ah, so it probably has the floating point, then. If so, you should be able to
use the Shoppa V6 Unix disk as it is, then - that has a Unix on it which will
work on an 11/23 (which doesn't have the switch register that V6 normally
requires).
But if not, let me know, and I can provide a V6 Unix for it (I already have
the tweaked version running on a /23 in the simulator).
Noel
PS: For those who downloaded the 'fixed' ctime.c (if anyone :-), it turns out
there was a bug in my fix - in some cases, one variable wasn't initialized
properly. There's a fixed one up there now.
> From: asbesto <asbesto(a)freaknet.org>
> Just in these days we restored a PDP-11/23PLUS here at our Museum! :)
> ...
> CPU is working
That is good to hear! You all seem to have been very resourceful in making
the power supply for it!
> and we're trying to boot from a RL02 unit :)
Are your RL02 drive and RLV11 controller all working? Here are some
interesting pages:
http://www.retrocmp.com/pdp-11/pdp-1144/my-pdp-1144/rl02-disk-trouble
http://www.retrocmp.com/pdp-11/pdp-1144/my-pdp-1144/more-on-rl01rl02
from someone in Germany about getting their RL11 and RL02 to work.
Also, when you say "boot from an RL02", what are you trying to boot? Do you
have an RL02 pack with a working system on it? If so, what kind - a Unix
of some sort, or some DEC operating system?
> From: SPC <spedraja(a)gmail.com>
> I'll keep a reference of this message and try it as soon as possible...
Speaking of getting Unix to run on an 11/23 with an RL02... I just realized
that the hard part of getting a Unix running, for you, will not be getting V6
to run on a machine without a switch register (which is actually pretty easy
- I have worked out a way to do it that involves changing one line in
param.h, and adding two lines of code to main.c).
The hard part is going to be getting the bits onto the disk! If all you have is an
RL02, you are going to have to load bits into the computer over a serial line.
WKT has done this for V7 Unix:
http://www.tuhs.org/Archive/PDP-11/Tools/Tapes/Vtserver/
but V7 really wants a machine with split I/D (which the /23 does not have). I
guess V7 'sort of' works on a machine without I/D, but I'm not a V7 expert,
so I can't say for sure.
It would not be hard to do something similar to the VTServer thing for V6,
though. If you would like to go this way, let me know, I would be very
interested in helping with this.
Also, do you only have one working RL02 drive, or more than one? If you only
have one, you will not be able to do backups (unless you have something else
connected to the machine, e.g. some sort of tape drive, or something).
Noel
> From: SPC <spedraja(a)gmail.com>
> I'll keep a reference of this message and try it as soon as possible...
No rush! Take your time...
> the disruptive fact (in terms of time) here is to put up-to-date both
> the PDP-11/23-PLUS and RL02.
My apologies, I just now noticed that you have an 11/23-PLUS (it is slightly
different from a plain 11/23).
I am not very familiar with the 11/23-PLUS (I never worked with one), but from
documentation I just dug out, it seems that they normally come with the MMU
chip, so we don't need to worry about that. However, the FPP is not standard,
so that is still an issue for bringing up Unix.
In fact, there are two different FPP options for the 11/23-PLUS (and,
actually, for the 11/23 as well): one is the KEF-11AA chip which goes on the
CPU card (on the 11/23-PLUS, in the middle large DIP holder), and the other is
something called the FPF-11 card, which is basically hardware floating point
(the KEF-11A is just microcode), for people who are doing serious number
crunching. It's a quad-size card which has a cable with a DIP header on the
end which plugs into the same DIP holder on the CPU card as the KEF-11A. They
look the same to software; one is just faster than the other.
Anyway, if you don't have either one, we'll have to produce a new Unix
load for you (not a big problem, if it is needed).
Noel
Does anyone know if the source for an early PDP-11 version of MERT is
available anywhere?
(For those who aren't familiar with MERT, it was a micro-kernel [as we would
name it now] which provided message-passing and [potentially shared] memory
segments, intended for real-time applications; it supported several levels of
protection, using the 'Supervisor' mode available in the 11/45 and 11/70. One
set of supervisor processes provided a Unix environment; the combination was
called UNIX/RT - hence my asking about it here.)
Thanks!
Noel
>> I got one PDP-11/23-PLUS without any kind of disk (by now, I got one
>> RL12 board plus one RL02 drive pending of cleaning and arrangement)...
>> I guess if could be possible to run V6 in this machine. There's any
>> kind of adaptation of this Unix version (or whatever) to run under ?
> IIRC the README page for that set of disk images indicates that in fact
> they originally came off an 11/23, so they should run fine on yours.
So I was idly looking through main.c for the Shoppa Unix (because it printed
some unusual messages when it started, and I wanted to see that code), and I
noticed it had some fancy code for dealing with the clock, and that tickled a
very dim memory that LSI-11's had some unusual clock thing. So I decided I
had better check up on that...
I got out an LSI-11 manual, and it looked like the 23 should work, even for
the 'vanilla' V6 from the Bell distro. But I decided I had better check it to
be sure, so I fired up the simulator, mounted a Bell disk, set the cpu type
to '23', and booted 'rkunix'. Which promptly halted!
After a bit of digging, it turned out that the problem is that the 11/23
doesn't have a switch register! It hit a kernel NXM trying to touch it -
and then another trying to read it in the putchar() routine trying to do a
panic(), at which point it died a horrible death.
So I added a SR (you can create all sorts of bizarre hybrids like that with
Ersatz-11, like 11/40's with 11/45 type floating point :-), and then it
booted fine. The clock even worked!
So you will have to use the Shoppa disk to boot (but see below), or we'll
have to spin you a special vanilla V6 Unix that doesn't try to touch the SR -
that shouldn't be much work, I only found two places in the code that touch it.
I did try the Shoppa 'unix', and it booted fine on an 11/23.
Two things to check for, though: first, your 11/23 _has_ to have the MMU chip
(that's the large DIP package with one chip on it nearest the edge of the
card), so if yours looks like this:
http://www.psych.usyd.edu.au/pdp-11/Images/23.jpeg
you're OK. Without the MMU chip, most variants of Unix will not run on the 23
(although there's something called MiniUnix, IIRC, which runs on an LSI-11,
which would probably run on a /23 without an MMU).
Here's the part that might be a problem: To run any of the Unixes on the
Shoppa disk, you also have to have the FPP chip (that's the second large
DIP package with two chips on it - the image above does not include that
chip, so if yours looks like that, you have a minor problem, and I will have
to build you a Unix or something).
All of the Unixes on the Shoppa disk have to have the FPP, except one - and
that one wants an RX floppy as the root/swap device! The others will all
crash (I tried one, to make sure) if you try and boot them on an 11/23
without the FPP.
I could try patching the binary on the one that doesn't expect to use the FPP
to use the RL as the root, or either i) build you a vanilla V6 for a 23
(above), or ii) figure out how to build systems on the Shoppa disk, and build
you a Unix there which a) uses the RL as the root/swap, and b) does not
expect to have the FPP.
But let's first find out exactly what you have...
Noel
> From: SPC <spedraja(a)gmail.com>
> I got one PDP-11/23-PLUS without any kind of disk (by now, I got one
> RL12 board plus one RL02 drive pending of cleaning and arrangement)...
> I guess if could be possible to run V6 in this machine. There's any
> kind of adaptation of this Unix version (or whatever) to run under ?
As I mentioned in a previous message on this thread, when I took that root
pack image from the Shoppa group, I could get it to boot to Unix right off.
All it needs is a single RL02 drive (RL/0) (and the console terminal, of
course).
I looked at the 'unix' on it, and it's for an 11/40 type machine (which
includes 11/23's); IIRC the README page for that set of disk images indicates
that in fact they originally came off an 11/23, so they should run fine on
yours.
That Unix has a couple of other devices built into it (looks like an RX and
some sort of A-D), but as long as you don't try and touch them, they will not
be an issue.
Let me know if you need any help getting it up (once you have a working RL02).
Noel