I got a chance to do some work on the UNIX V1 sources this evening. I
took the output of my OCR software and with a couple of hours of editing,
it successfully assembles with a MACRO11 assembler modified for "as" syntax,
with the only exception being that "fpsym" is undefined. It looks like
the floating point emulation code is missing.
Since this OCR is independent of the other work that has been done, a
diff should provide an opportunity to fix any errors in the comments
that would not have been caught by the assembler.
Is there a place to upload this without a Google account? The assembler
listing is about 416K.
I wrote much of the bootstrap code a few weeks ago, so it ought to be
straightforward to get this up and running under simulation.
James Markevitch
A while ago, I heard someone (I can't remember who) say that he had a
paper listing of (at least part of) PDP-7 Unix. How much is there in the
way of surviving listings of PDP-7 Unix (if any)? With all of the
discussion of OCRing the V1 Unix kernel listing, I was wondering if
something similar could be done with PDP-7 Unix if enough listings have
survived (which is sort of unlikely, but you never know).
> I have dug up another listing of the PDP-11 assembly language
> version, which seems to be about contemporary with the
> one you have. The files mostly bear a copyright date
> of 1972, but like other printouts from the time,
> the datestamps only give month and day, not year.
> They are generally from May. It is post 11/45,
> and has segmentation and floating-point support.
Very cool! (fpsym, presumably)
> I replied and asked if we could get either a scan copy of the "other listing",
> or if he could send a photocopy to Tim.
As usual, the key is a high resolution, high quality scan. There is a huge
difference between 300dpi and 400dpi/600dpi for this old stuff, since the
signal to noise ratio is much better with the better scans.
This sounds like a broken record, but there was a 1200 page listing where
the first 400 pages were at 300dpi and the remaining 800 pages were at
400dpi. When you zoomed in, the differences were astounding and the
OCR results reflected that (the person needed to do a lot of editing on
the first third of the document to get it to compile).
If someone can get me a hardcopy, I'll scan it at 600dpi, as I am sure
Al would, if Tim isn't set up to scan stuff like this.
James Markevitch
Guys, I got this message from Dennis.
Warren
----- Forwarded message from Dennis Ritchie -----
Subject: Re: Trying to restore 1972 UNIX
Date: Thu, 1 May 2008 00:55:35 -0400
About the assembler, I am pretty sure that it's substantially
the same as that on the 5th edition tape, so it's likely
that a modified version, without the syscall definitions,
could be produced.
I have dug up another listing of the PDP-11 assembly language
version, which seems to be about contemporary with the
one you have. The files mostly bear a copyright date
of 1972, but like other printouts from the time,
the datestamps only give month and day, not year.
They are generally from May. It is post 11/45,
and has segmentation and floating-point support.
Incidentally, it doesn't use any of the system call names
as such; 'read' is at sysread: and so on.
About assembling it, I'm pretty sure we just did
'as u?.s' and the a.out was ready. This was before
make, after all.
Dennis
----- End forwarded message -----
I replied and asked if we could get either a scan copy of the "other listing",
or if he could send a photocopy to Tim.
Cheers,
Warren
I went through all the errors on the code checked in so far and made
edits consistent (I hope :-) with the pdf.
I also added the missing KE11A addresses (memory mapped EAE).
The remaining errors seem to be only due to missing pages.
-brad
> Can you show me how you are running it? (and feel free to cc the list)
(I think it's mentioned in an earlier post already). I copy the
files to my 7ed system (make a tar, put it on a tape image, and
attach it in simh, then tar x to get contents). Probably easier
if you're using apout and local filesystem... I'm using the following
script (in my tools but not checked in because I'm using nonstandard
conv2):
tools/rebuild
(cd rebuilt; gtar -O -cf ../u.tar u?.s)
./conv2 -o tape.tm u.tar
cp tape.tm ~/work/simh/unix-v7-4/run/
Anyway to assemble I run:
as - sys.s u0.s u1.s ux.s
btw, I noticed some unicode characters in the files you committed.
I haven't had a chance to spend time editing it yet. The OCR
often uses unicode for things like "-".
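One way to hunt down these stray characters is sketched below. It is only a rough sketch: the function names and the replacement table are assumptions made for illustration, not something that exists in the project's tools directory, and the table covers only the look-alikes an OCR engine is likely to emit for "-" and for quotes.

```python
import unicodedata

# Assumed mapping (not from the project): common OCR look-alikes -> ASCII.
REPLACEMENTS = {
    "\u2010": "-",  # hyphen
    "\u2011": "-",  # non-breaking hyphen
    "\u2012": "-",  # figure dash
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2212": "-",  # minus sign
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
}


def find_non_ascii(text):
    """Return (line, column, char, unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


def to_ascii(text):
    """Replace the known look-alikes; anything else is left for manual review."""
    return text.translate(str.maketrans(REPLACEMENTS))
```

Running find_non_ascii over each u?.s file first gives a list to check against the pdf; to_ascii is only safe for the characters in the table, so anything else it reports still needs a human look.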
> I think there is a binary format. I think I figured it out once and
> wrote something to turn an a.out into it. hmmm. I'll go digging.
a.out is so simple, it wouldn't be hard to reproduce if we had to.
> I checked in the missing pages from e3, e4 and e8. I have not tried
> to assemble them yet, however.
I noticed that. Thank you.
> -brad
Tim Newsham
http://www.thenewsh.com/~newsham/
> I can happily deal with the jsr pc,do type of jsr, but the ones
> involving r5 have me stumped, e.g.:
>
> jsr r5,questf; < nonexistent\n\0>; .even
I have encountered this type of construct a lot when doing disassemblers
over the years. My usual strategy for dealing with this is:
1. If it's quick and dirty and I am not running huge amounts of code,
then the disassembler allows the user to provide a list of "hints" to
it. The hints for this would describe the arguments to each subroutine.
For illustrative purposes, you might have a side file that contains
the following:
subr 002004 questf string
meaning that location 002004 is a subroutine named questf that expects
a null-terminated string as the argument. As an additional benefit,
you get a nice name for the subroutine that the disassembler can put
into the output.
And if a subroutine takes two 16-bit arguments, you might have:
subr 003436 mysub arg16 arg16
If the disassembler identifies each of the targets of the jsr
instructions, then you can usually do a quick look at the code to
see what it expects, then add to the side file, then re-run the
disassembler.
2. If you want to be less quick and dirty, you can have the disassembler
do a partial flow analysis of the code to figure out what is expected
for arguments. This is usually much more involved and you still often
need to add hints for cases where the '60s or '70s programmer did some
kind of "neat trick" when coding.
My philosophy on these is to use tools to get to the 95%+ level of
automation and provide hints to pick up the rest. Using strategy
number 1 above will probably get you a lot of success with a small
amount of coding in your disassembler.
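A minimal sketch of strategy number 1 follows, assuming the side file uses exactly the "subr ADDRESS NAME KIND..." layout shown above with octal addresses. The function names, the in-memory representation (a bytes object indexed by address), and the decision to pad to an even address after a string (mirroring the .even in the quoted example) are assumptions for illustration, not a description of any existing disassembler.

```python
def parse_hints(text):
    """Parse 'subr <octal addr> <name> <kind>...' lines into {addr: (name, kinds)}."""
    hints = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "subr":
            addr = int(fields[1], 8)          # addresses are written in octal
            hints[addr] = (fields[2], fields[3:])
    return hints


def decode_inline_args(mem, pc, kinds):
    """Decode the inline arguments that follow a 'jsr r5,sub' instruction.

    mem   -- bytes object holding the program image
    pc    -- offset of the first byte after the jsr instruction
    kinds -- argument kinds from the hint file ('string' or 'arg16')
    Returns (decoded args, offset where normal disassembly resumes).
    """
    args = []
    for kind in kinds:
        if kind == "string":
            end = mem.index(0, pc)            # find the terminating NUL
            args.append(mem[pc:end].decode("ascii", errors="replace"))
            pc = end + 1
            pc += pc & 1                      # assumed .even padding
        elif kind == "arg16":
            # PDP-11 words are little-endian
            args.append(int.from_bytes(mem[pc:pc + 2], "little"))
            pc += 2
        else:
            raise ValueError("unknown hint kind: " + kind)
    return args, pc
```

When the disassembler sees a jsr r5 whose target is in the hint table, it would call decode_inline_args with the kinds from the table, print the arguments as data, and continue decoding instructions at the returned offset.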
James Markevitch
All, I've just created a mailing list for the people involved in the effort
to reconstruct the Unix kernel from the 1972 assembly listing. I thought
it would be good to keep the mundane details of the work separate from the
TUHS mailing list.
The new list is unix-jun72(a)tuhs.org
I've manually subscribed the e-mail addresses that seem to be interested
in the work. If you want to be removed from the new list, e-mail me. If
you want to subscribe to the list, you can go here to do that:
https://minnie.tuhs.org/mailman/listinfo/unix-jun72
Cheers,
Warren
Hi,
here is my 2p :
http://cyrillelefevre.free.fr/jun72/jun72.zip
which is an archive of automatically extracted tif images from the
original pdf file.
so, no need to print/scan any more...
Regards,
Cyrille Lefevre
--
mailto:Cyrille.Lefevre-lists@laposte.net