Guys, I'm writing a PDP-11 a.out disassembler. I think it will be useful for
a couple of reasons:
- we will be able to convert the extant 1972 binaries back into some form
of source code. It won't be as good as the real thing, but it will be
better than the binary.
- we have some source code in fragmentary form on the s1 tape, see
http://minnie.tuhs.org/UnixTree/1972_stuff/. Some of the fragments
are identifiable, some are not. We might be able to use the
disassembled binaries to identify some of the fragments, and even
reconstruct a hybrid original/disassembled version of the source
for some of the 1972 applications.
Right now, here's what I've got: disassembly of the top of 1972 ls:
sys break: 00
mov $01,044260
mov sp,r5
mov (r5)+,043732
tst (r5)+
dec 043732
mov 043732,043734
bgt 040056
mov $042542,r5
mov (r5)+,r4
cmpb (r4)+,$055
bne 040174
dec 043734
and the top of the frag19 file:
sys break; end+512.
mov $1,obuf
mov sp,r5
mov (r5)+,count
tst (r5)+
dec count
mov count,ocount
bgt loop
mov $dotp,r5
loop:
mov (r5)+,r4
cmpb (r4)+,$'-
bne 1f
dec ocount
At the moment it's a 1-pass disassembler. I want to make it 2-pass: on the
first pass I will try to identify labels for branches, functions, strings and
variable locations (and give them arbitrary names); on the second pass
I'll print out the instructions with reference to the labels.
None of the binaries have symbol tables, unfortunately.
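To make the two-pass idea concrete, here is a minimal sketch in Python. It is not the actual disassembler: it only recognises a handful of PDP-11 branch instructions, prints everything else as a raw word, and treats every word as an instruction (a real tool would have to step over operand words). The names branch_target, disassemble and the L1, L2, ... labels are made up for illustration, and the 040000 load address is an assumption read off the listing above.

```python
# Branch opcodes (high byte of the instruction word) mapped to mnemonics.
# Only a subset of the PDP-11 branch instructions is covered in this sketch.
BRANCHES = {
    0o000400: "br", 0o001000: "bne", 0o001400: "beq",
    0o002000: "bge", 0o002400: "blt", 0o003000: "bgt", 0o003400: "ble",
}


def branch_target(addr, word):
    """Return (mnemonic, target address) if word is a known branch, else None."""
    mnemonic = BRANCHES.get(word & 0o177400)
    if mnemonic is None:
        return None
    offset = word & 0o377
    if offset & 0o200:          # low byte is a signed word offset
        offset -= 0o400
    # The PC has already advanced past the branch when the offset is applied.
    return mnemonic, (addr + 2 + 2 * offset) & 0o177777


def disassemble(words, base=0o40000):
    """Two-pass listing of 16-bit words; base is an assumed load address."""
    # Pass 1: collect every branch target and give it an arbitrary name.
    labels = {}
    for i, word in enumerate(words):
        hit = branch_target(base + 2 * i, word)
        if hit and hit[1] not in labels:
            labels[hit[1]] = "L%d" % (len(labels) + 1)

    # Pass 2: print the instructions, referring to the labels.
    lines = []
    for i, word in enumerate(words):
        addr = base + 2 * i
        if addr in labels:
            lines.append(labels[addr] + ":")
        hit = branch_target(addr, word)
        if hit:
            lines.append("\t%s %s" % (hit[0], labels[hit[1]]))
        else:
            lines.append("\t.word %06o" % word)
    return lines


if __name__ == "__main__":
    # A forward bgt and a backward bne around two filler words.
    print("\n".join(disassemble([0o003002, 0o000240, 0o000240, 0o001375])))
```

The first pass is needed because a forward branch such as the bgt above refers to an address that has not been printed yet; collecting all targets first lets the second pass emit the label line before the instruction it names.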
It's a start, anyway.
Warren
I got a chance to do some work on the UNIX V1 sources this evening. I
took the output of my OCR software and with a couple of hours of editing,
it successfully assembles with a MACRO11 assembler modified for "as" syntax,
with the only exception being that "fpsym" is undefined. It looks like
the floating point emulation code is missing.
Since this OCR is independent of the other work that has been done, a
diff should provide an opportunity to fix any errors in the comments
that would not have been caught by the assembler.
Is there a place to upload this without a Google account? The assembler
listing is about 416K.
I wrote much of the bootstrap code a few weeks ago, so it ought to be
straightforward to get this up and running under simulation.
James Markevitch
A while ago, I heard someone (I can't remember who) say that he had a
paper listing of (at least part of) PDP-7 Unix. How much is there in the
way of surviving listings of PDP-7 Unix (if any)? With all of the
discussion of OCRing the V1 Unix kernel listing, I was wondering if
something similar could be done with PDP-7 Unix if enough listings have
survived (which is sort of unlikely, but you never know).
> I have dug up another listing of the PDP-11 assembly language
> version, which seems to be about contemporary with the
> one you have. The files mostly bear a copyright date
> of 1972, but like other printouts from the time,
> the datestamps only give month and day, not year.
> They are generally from May. It is post 11/45,
> and has segmentation and floating-point support.
Very cool! (fpsym, presumably)
> I replied and asked if we could get either a scanned copy of the "other listing",
> or if he could send a photocopy to Tim.
As usual, the key is a high resolution, high quality scan. There is a huge
difference between 300dpi and 400dpi/600dpi for this old stuff, since the
signal to noise ratio is much better with the better scans.
This sounds like a broken record, but there was a 1200 page listing where
the first 400 pages were at 300dpi and the remaining 800 pages were at
400dpi. When you zoomed in, the differences were astounding and the
OCR results reflected that (the person needed to do a lot of editing on
the first third of the document to get it to compile).
If someone can get me a hardcopy, I'll scan it at 600dpi, as I am sure
Al would, if Tim isn't set up to scan stuff like this.
James Markevitch
Guys, I got this message from Dennis.
Warren
----- Forwarded message from Dennis Ritchie -----
Subject: Re: Trying to restore 1972 UNIX
Date: Thu, 1 May 2008 00:55:35 -0400
About the assembler, I am pretty sure that it's substantially
the same as that on the 5th edition tape, so it's likely
that a modified version, without the syscall definitions,
could be produced.
I have dug up another listing of the PDP-11 assembly language
version, which seems to be about contemporary with the
one you have. The files mostly bear a copyright date
of 1972, but like other printouts from the time,
the datestamps only give month and day, not year.
They are generally from May. It is post 11/45,
and has segmentation and floating-point support.
Incidentally, it doesn't use any of the system call names
as such; 'read' is at sysread: and so on.
About assembling it, I'm pretty sure we just did
'as u?.s' and the a.out was ready. This was before
make, after all.
Dennis
----- End forwarded message -----
I replied and asked if we could get either a scanned copy of the "other listing",
or if he could send a photocopy to Tim.
Cheers,
Warren
I went through all the errors on the code checked in so far and made
edits consistent (I hope :-) with the pdf.
I also added the missing KE11A addresses (memory mapped EAE).
The remaining errors seem to be only due to missing pages.
-brad