> From: Random832
> Interestingly, the SysIII sum.c program, which I assume yields the same
> result for this input, appears to go through the whole input
> accumulating the sum of all the bytes into a long, then adds the two
> halves of the long at the end rather than after every byte.
That's the same hack a lot of TCP/IP checksum routines used on machines with
longer words; add the words, then fold the result into the shorter length at the
end. The one I wrote for the 68K back in '84 did that.
> This suggests that the two programs would give different results for
> very large files that overflow a 32-bit value.
No, I don't think so, depending on the exact details of the implementation. As
long as when folding the two halves together, you add any carry into the sum,
you get the same result as doing it into a 16-bit sum. (If my memory of how
this all works is correct - the neurons aren't what they used to be,
especially late in the day... :-)
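A minimal sketch of that sum-wide-then-fold trick, in modern C rather than
anything from the SysIII or 68K sources (illustrative only):

    /* Accumulate bytes in a wide register, then fold the halves together,
     * adding the carry back in (end-around carry) - a sketch, not the
     * original code. */
    unsigned short
    fold_sum(const unsigned char *buf, long n)
    {
            unsigned long sum = 0;

            while (n-- > 0)
                    sum += *buf++;
            while (sum >> 16)                       /* fold until it fits in 16 bits */
                    sum = (sum & 0xFFFF) + (sum >> 16);
            return (unsigned short)sum;
    }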
> Also, if this sign extends, then its behavior on "negative" (high bit
> set) bytes is likely to be very different from the SysIII one, which
> uses getc.
I have this bit set that in C, 'char' is defined to be signed, and
furthermore that when you assign a shorter int to a longer one, the sign is
extended. So if one has a char holding '0200' octal (i.e. -128), assigning it
to a 16-bit int should result in the latter holding '0177600' (i.e. still
-128). So in fact I think they probably act the same.
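A two-line demonstration of that behaviour (modern C; assumes plain char is
signed, as it is on the PDP-11 compilers under discussion):

    #include <stdio.h>

    int
    main(void)
    {
            char c = 0200;                  /* bit pattern 10000000, i.e. -128 if char is signed */
            int i = c;                      /* assignment sign-extends */
            printf("%o\n", i & 0177777);    /* prints 177600 */
            return 0;
    }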
Noel
> From: Will Senn
> I noticed that the sum utility from v6 reports a different checksum
> than it does using the sum utility from v7 for the same file.
> ... does anyone know what's going on here?
> Why is sum reporting different checksums between v6 and v7?
The two use different algorithms to accumulate the sum (I have added comments
to the relevant portion of the V6 assembler one, to help understand it):
V6:
mov $buf,r2 / Pointer to buffer in R2
2: movb (r2)+,r4 / Get new byte into R4 (sign extends!)
add r4,r5 / Add to running sum
adc r5 / If overflow, add carry into low end of sum
sob r0,2b / If any bytes left, go around again
Read the description of MOVB in the PDP-11 Processor manual.
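In C terms, that loop is roughly the following (my paraphrase, not original
source; it assumes plain char is signed, so the cast mimics MOVB's sign
extension):

    #include <stdio.h>

    unsigned
    v6sum(FILE *f)                  /* paraphrase of the V6 assembler loop */
    {
            unsigned sum = 0;       /* 16-bit running sum, like R5 */
            int c;

            while ((c = getc(f)) != EOF) {
                    sum += ((char)c) & 0177777;     /* MOVB sign-extends the byte */
                    if (sum > 0177777)              /* the ADD carried out of 16 bits... */
                            sum = (sum & 0177777) + 1;      /* ...so ADC folds it back in */
            }
            return sum;
    }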
V7:
    while ((c = getc(f)) != EOF) {
            nbytes++;
            if (sum&01)
                    sum = (sum>>1) + 0x8000;
            else
                    sum >>= 1;
            sum += c;
            sum &= 0xFFFF;
    }
I'm not clear on some of that, so I'll leave its exact workings as an
exercise, but I'm pretty sure it's not an equivalent algorithm (as in,
something that would produce the same results); it's certainly not
identical. (The right shift is basically a rotate, so it's not a straight sum,
it's more like the Fletcher checksum used by XNS, if anyone remembers that.)
Among the parts I don't get, for instance, sum is declared as 'unsigned',
presumably 16 bits, so the last line _should_ be a NOP!? Also, with 'c' being
implicitly declared as an 'int', does the assignment sign extend? I have this
vague memory that it does. And does the right shift if the low bit is one
really do what the code seems to indicate it does? I have this bit that ASR on
the PDP-11 copies the high bit, not shifts in a 0 (check the processor
manual). That is, of course, assuming that the compiler implements the '>>'
with an ASR, not a ROR followed by a clear of the high bit, or something.
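For what it's worth, here is one way to write out what that loop appears to
compute, with the rotate made explicit (my reading, not a claim about what the
PDP-11 compiler actually emitted; in modern C getc() returns 0-255, so the
sign-extension question doesn't arise here):

    #include <stdio.h>

    unsigned
    v7sum(FILE *f)                  /* restatement of the V7 loop above */
    {
            unsigned sum = 0;
            int c;

            while ((c = getc(f)) != EOF) {
                    sum = ((sum >> 1) | (sum << 15)) & 0xFFFF;      /* rotate right by one */
                    sum = (sum + c) & 0xFFFF;                       /* then add the byte */
            }
            return sum;
    }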
Noel
Ok, it definitely sounds like the v6tar source is around somewhere so
if someone could point me in the right direction...
I've only seen the binary, and I can't remember where I got it.
Mark
All,
While working on the latest episode of my saga about moving files
between v6 and v7, I noticed that the sum utility from v6 reports a
different checksum than it does using the sum utility from v7 for the
same file. To confirm, I did the following on both systems:
# echo "Hello, World" > hi.txt
# cat hi.txt
Hello, World
Then on v6:
# sum hi.txt
1106 1
But on v7:
# sum hi.txt
37264 1
There is no man page for the utility on v6, and it's assembler. On v7,
there's a manpage and it's C:
man sum
...
Sum calculates and prints a 16-bit checksum for the named
file, and also prints the number of blocks in the file.
...
A few questions:
1. I'll eventually be able to read assembly and learn what the v6
utility is doing the hard way, but does anyone know what's going on here?
2. Why is sum reporting different checksums between v6 and v7?
3. Do you know of an alternative to check that the bytes were
transferred exactly? I used od and then compared the text representation
of the bytes on the host using diff (other than differences in output
between v6 and v7 related to duplicate lines, it worked ok but is clunky).
Thanks,
Will
All,
In my exploration of v6, I followed the advice in "Setting up Unix -
Seventh Edition" and copied v6tar from v7 to v6. Life is good. However,
tar is using mt1 and it is hard coded into the source, tar.c:
char magtape[] = "/dev/mt1";
As the subject line suggested, I have two questions for those of you who
might know:
1. Why is it hard coded?
2. Why is it the second device and not the first?
Interestingly, it took me a little while to figure out it was doing this
because I didn't actually move files between v6 and v7 until today.
Before this my tests had been limited to separate tests on v6 and v7
along the lines of:
cd /wherever
tar c .
followed by
tar t
list of files
cd /elsewhere
tar x
files extracted and matching
What it was doing was writing to the non-existent /dev/mt1, which it
then created, tarring up stuff, and exiting. Then when I listed the
contents of the tarfile, or extracted the contents, it was successful.
But, when I went to move the tape between v6 and v7, the tape (mt0) was
blank, of course. It was at this point that I followed Noel's advice
and "Used the source", and figured out that it was hard-coded as you see
above.
Thanks,
Will
That's exactly right. ld performs the same task as LOAD did on BESYS,
except it builds the result in the file system rather than user
space. Over time it became clear that "linker" would be a better
term, but that didn't warrant canning the old name. Gresham's law
then came into play and saddled us with the ponderous and
misleading term, "link editor".
Doug
> My understanding, which predates my contact with Unix, is that the
> original toolchains for single-job machines consisted of the assembler
> or compiler, the output of which was loaded directly into core with
> the loader. As things became more complicated (and slow), it made
> sense to store the memory image somewhere on drum, and then load that
> image directly when you wanted to run it. And that in some systems
> the name "loader" stuck, even though it no longer loaded. Something
> like the modern ISP use of the term "modem" to mean "router". But I
> don't have anything to back up this version; comments welcome.
> estabur (who thought these names up, I know 8 characters is limiting,
> but c'mon)
'establish user mode registers'
> the 411 header is read by a loader
Actually, it's read by the exec() system call (in sys1.c).
Noel
> From: Dave Horsfall
> I love those PDP-11 instructions, such as "blos" and "sob" :-)
Yes, but alas, there is no 'jump [on] no carry' instruction! (Yes, yes, I
know about BCC! :-) Although I guess the x86 has one...
Noel
> Yes the V6 kernel runs in split I and D mode, but it doesn't end up
> supporting any more data. I.e. the kernel is still a 407 (or 410) file.
> _etext/_edata/_end are still referencing the same 64K space.
Err, actually, not really.
The thing is that to build the split-I/D kernel, one sets the linker to
produce an output file which still contains the relocation bits. That is then
post-processed by 'sysfix', which does weird magic (moves the text up to
020000, in terms of address space; and puts the data _below_ the text, in the
actual output file). So while the files concerned may have a '407' in their
header, they definitely aren't what one normally finds in a linked 407 or 410
file.
In particular, data addresses start at 0, and can run up to 0140000 (i.e. up
to 56KB), while text addresses start at 020000 and can run up to 0160000. So,
_etext/_edata/_end are not, in fact, in the same 64K space. And the total of
data (initialized and un-initialized) together with the text can be much
larger than 64KB - up to 112KB or so.
Noel
J.F. Ossanna (jfo) was born in 1928; he helped give us Unix, and developed
the ROFF series (which I still use).
And Ada Lovelace, the world's first computer programmer, was born in 1815.
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
> From: Ronald Natalie
> I'm pretty sure the V6 kernel didn't run in split I/D.
Nope. From 'SETTING UP UNIX - Sixth Edition':
"Another difference is that in 11/45 and 11/70 systems the instruction and
data spaces are separated inside UNIX itself."
And if you don't believe that, check out:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/sys/conf/m45.s
the source! ;-)
> It wasn't too involved of a change to make a split I/D kernel.
> Mike Muuss and his crew at JHU did it.
Maybe you're remembering the process on a pre-V6 system?
> We spent more time getting the bootstrap to work than anything else I
> recall.
It's possible you're remembering that, as distributed, V6 didn't support load
images where the text+initialized-data was larger than 24KW-delta; it would
have been pretty easy to up that to 28KW-delta (change a parameter in the
bootstrap source, and re-assemble), but after that, the V6 bootstrap would
have had to have been extensively re-worked.
And there were _also_ a variety of issues with handling maximal large images
in the startup code. Once operating, the kernel has segments KI1-KI7 available
to hold the system's code; however, it's not clear that all of KI1-7 are
really usable, since the system can't 'see' enough code while in the code
relocation phase in the startup to fill them all. E.g. during code relocation,
KI7 is ripped off to hold a pointer to I/O space (since KD7 is set to point to
low memory just after the memory that KD6 points to).
These might have been issues in systems which were ARPANET-connected (i.e.
ran NCP), as that added a very large amount of code to the kernel.
Noel
> From: Will Senn
> my now handy-dandy PDP11/40 processor handbook
That's good for the instruction set, but for the memory management hardware,
etc you'll really want one of the {/44, /45, /70, /73, etc} group, since only
those models support split I+D.
> the 18 bits holding the word 000407
You mean '16 bits', right? :-)
> This means that branches are to 9th, 10th, 11th and 7th words,
> respectively. It'll be a while before I really understand what the
> ramifications are.
Only the '407' is functional. (IIRC, in early UNIX versions, the OS didn't
strip the header on loading a new program into memory, so the 407 was actually
executed.) The others are just magic numbers, inspired by the '407' - the
code always starts at byte 020 in the file.
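For reference, the header is eight 16-bit words; a rough sketch of it (field
names as I remember them from a.out(5), not a verbatim copy):

    struct exec {
            int     a_magic;        /* 0407, 0410 (shared text) or 0411 (split I+D) */
            int     a_text;         /* size of text segment, bytes */
            int     a_data;         /* size of initialized data */
            int     a_bss;          /* size of uninitialized data */
            int     a_syms;         /* size of symbol table */
            int     a_entry;        /* entry point */
            int     a_unused;       /* unused */
            int     a_flag;         /* 1 if relocation info has been stripped */
    };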
> Oh and by the way, jumping between octal and decimal is weird, but
> convenient once you get the hang of it - 512 is 1000, which is nifty
> and makes finding buffer boundaries in an octal dump easy :).
The _real_ reason octal is optimal for PDP-11 is that when looking at core,
most instructions are more easily understood in octal, because the PDP-11 has
8 registers (3 bits), and 3 bits worth of mode modifier, and the fields are
aligned to show up in octal.
I.e. in the double-op instruction '0awxyz', the 'a' octit gives the opcode,
'w' is the mode for the first operand, 'x' is the register for the first
operand, and 'y' and 'z' similarly for the second operand. So '12700' is
'MOV (PC)+, R0' - AKA 'MOV #bbb, R0', where 'bbb' is the contents of the word
after the '12700'.
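A toy illustration of that field split (modern C, purely to show how the octal
digits line up with the instruction fields):

    #include <stdio.h>

    int
    main(void)
    {
            unsigned w = 012700;    /* MOV (PC)+,R0 - i.e. MOV #nnn,R0 */

            printf("op %o, src mode %o reg %o, dst mode %o reg %o\n",
                (w >> 12) & 017,                /* 01 = MOV */
                (w >> 9) & 07, (w >> 6) & 07,   /* mode 2 (autoincrement), reg 7 (PC) */
                (w >> 3) & 07, w & 07);         /* mode 0, reg 0 */
            return 0;
    }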
Noel
> From: Will Senn <will.senn(a)gmail.com>
> The problem is this, when I attempt to execute the v6tar binary on the
> v6 system (it works in v7) it errors out:
> v6tar
> v6tar: too large
That's an error message from the shell; the exec() call on the command
('v6tar') is returning an ENOMEM error. Looking in the kernel, that comes from
estabur() in main.c; there are a number of potential causes, but the most
likely is that 'v6tar' is linked to be split I+D, and your V6 emulation is on
a machine that doesn't have split I+D (e.g. an 11/40). If that's not it,
please dump the a.out header of 'v6tar', so we can work out what's causing the
ENOMEM.
Noel
> From: Will Senn
> Thanks for supplying the logic trail you followed as well!
"Use the source, Luke!" This is particularly true on V6, where it's assumed
that recourse to the source (which is always at hand - long before RMS and
'Free Software', mind) will be an early step.
> when you say dump the a.out header, how do you do that?
On vanilla V6? Hmm. On a system with 'more' (hint, hint ;-), I'd just do 'od
{file} | more', and stop it after the first page. Without 'more', I'd probably
send the 'od' output to a file, and use 'ed' to look at the first couple of
lines.
Back in the day, of course, on a (slow) printing terminal, one could have just
said 'od', and aborted it after the first couple of lines. These days, with
video terminals, 'more' is kind of really necessary. Grab the one off my V6
Unix site, it's V6-ready (should be a compile-and-go).
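Something along these lines, for instance (my commands, not from the thread):

    od v6tar > v6tar.od
    ed v6tar.od
    1,2p
    q

The word at offset 0000000 is the magic number - 000407, 000410 or 000411, the
last meaning split I+D - and the next two words are the text and data sizes.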
Noel
> From: Mark Longridge
> I've never been able to transfer any file larger than 64K to Unix V5 or
> V6.
Huh?
# hrd /windows/desktop/TheMachineStops.htm Mach.htm
Xfer complete: 155+38
# l Mach.htm
154 -rw-rw-r-- 1 root 78251 Oct 25 12:13 Mach.htm
#
'more' shows that the contents are all there, and fine. ('hrd' is a command
in my V6 under Ersatz11 that reads an arbitrary file off the host file
system. Guess I need to set the date on the system!)
V6 definitely supports fairly large files; see the code in bmap() in subr.c,
which shows that the basic structure on disk can describe files of 7*256
(1792) + 256*256 (65536) blocks, or 67328 blocks total (34MB).
(In reality, of course, a file can't reach that limit; first, a disk
partition in V6 is limited to 64K blocks, but from that one has to deduct
blocks for the ilist, etc; further, the argument to bmap() is an int, which
limits the 'block number in file' to 16 bits, and in fact the code returns an
error if the high bit in the 'block number in file' is set.)
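Spelling out that arithmetic (just restating the numbers above):

    #include <stdio.h>

    int
    main(void)
    {
            long blocks = 7L * 256 + 256L * 256;    /* 1792 + 65536 = 67328 blocks */

            printf("%ld blocks, %ld bytes\n", blocks, blocks * 512);    /* ~34MB */
            return 0;
    }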
> I also don't recall seeing any file on V5 or V6 larger than 65536
> bytes
I don't think there is one; the largest are just less than 64KB. I don't
think this is deliberate, other than in the sense that they didn't put any
huge files in the distro so it would fit on a couple of RK packs.
> dd if=/dev/mt0 of=cont.a bs=1 count=90212
> ..
> 24676+0 records in
> 24676+0 records out
> Now, if we take 90212 and subtract 65536 we get 24676 bytes. So there
> definitely seems to be some 64K limit here
Probably 'count' is an 'int' in dd, i.e. limited to 16 bits. No longs in V6 C
(as distributed, although later versions of the C compiler for V6 do support
longs - see my 'bringing up Unix' pages).
Noel
> From: Noel Chiappa
> the most likely is that 'v6tar' is linked to be split I+D, and your V6
> emulation is on a machine that doesn't have split I+D (e.g. an 11/40)
Now that I think about it, the linked systems that are part of the V6 distro
tape are all linked to run on an 11/40. They will boot and run OK on a more
powerful machine (/45 or /70), but they will act like they are on a /40 -
i.e. no split I+D support/use (user or kernel). So to get split I+D support,
you need to build a new Unix binary, with m45.s instead of m40.s. If you
haven't done that, that's probably what the problem is.
Aside: V6 comes in two flavours: no split I+D at all, or split I+D in both
the kernel and user. For some reason that I can't recall, we actually
produced an 'm43.s', BITD at MIT, which ran the kernel in non-split-I-D, but
supported split I-D for the users.
I wish I could remember why we did this - it couldn't have been to save
memory (the machine didn't have a great deal on it when this was done -
although I have this vague memory that that was why we did it), because
running split I+D in the kernel does not, I think, use any more physical
memory (provided you don't fiddle with the parameters like the number of
buffers) than running non-split. Or maybe it does?
One possible reason was that the odd layout of memory with split I+D in the
kernel made debugging kernel code harder (we were doing a lot of kernel
hacking to support early networking work); another was that we were just being
conservative, didn't need the extra space in the kernel that I+D allowed, and
so didn't want to run it.
Noel
All, in the next few days I'm migrating minnie.tuhs.org from one VM to
another, so as to upgrade the OS and clean out the system. I think I've
got the mail subsystem up and running, but as usual there may be bugs.
I'll send out another message when the system is cut over. If things
don't seem to be right, e-mail me at:
wkt at tuhs.org, or
warren.toomey at tafe.qld.edu.au if the tuhs.org one fails.
Cheers all, Warren
On Tue, 8 Dec 2015, Brantley Coile wrote:
> We were indeed lucky that Admiral Hopper was with us. I know people who
> still cherish their "nano" seconds.
Ah yes, the 1ft piece of wire... Got a photo of it?
> By the way, she wouldn't have said she coined the term "debugging". That
> is at least as old as Thomas Edison. She said she was the first to a
> actually find a real bug!
For those who may be new around here:
https://en.wikipedia.org/wiki/Grace_Hopper#/media/File:H96566k.jpg
Yes, that is a real bug, found inside a real computer.
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
All,
According to "Setting Up Unix - Seventh Edition", by Haley and Ritchie:
The best way to convert file systems from 6th edition (V6) to 7th
edition (V7) format is to use tar(1). However, a special version of tar
must be prepared to run on V6.
The document goes on to describe a reasonable method to make v6tar on v7
and copy the binary over to the v6 system. I successfully built the
v6tar binary, which will execute in the v7 environment. I then moved it
over to the v6 system and did a byte compare on the file using od to
dump the octal bytes and then comparing them to the v7 version. The
match was perfect.
The problem is this, when I attempt to execute the v6tar binary on the
v6 system (it works in v7) it errors out:
v6tar
v6tar: too large
on the v7 system, it works:
v6tar
tar: usage tar -{txru}[cvfblm] [tapefile] [blocksize] file1 file2...
I don't think the binary is too large, it is only 18148 bytes.
ls -l v6tar
-rwxrwxrwx 1 root 18148 Oct 10 14:09 v6tar
Help. First, what does too large mean? Second, does this sound familiar
to anyone? etc.
Thanks,
Will
OK, slightly OT...
Rear Admiral Grace ("Amazing") Hopper PhD was given unto us in 1906. She
was famous for coining the term "debugging", whereby a moth was removed
from a relay contact in a *real* computer[*].
However, she must be condemned for giving us COBOL; yes, I know that vile
language, but I carefully leave it off my CV, as it seemed to be designed
for suits (Business Studies of course, but nothing technical) to spy upon
their programmers.
[*]
Defined, of course, where you could open a door and step inside it; I
actually did that once.
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
> It might not be so much a set of macros as just using a
> subset of raw groff.
Yes, there were no macros back then. If you format the
document using raw groff, the odds are that you will be
speaking the same roff that Dennis did.
> Doug having been there, might know/remember the actually lineage.
Aside from some fuzziness about who wrote what and in what
language, here's what happened:
To port Jerry Saltzer's Runoff (presumably written in MAD)
to Multics, either Dennis or Bob Morris or both together
reimplemented it (presumably in PL/I). To coexist with
Saltzer's version on CTSS, the new program needed a
distinct name, hence roff.
The early Multics PL/I compiler was far from a production
tool. Justifiably, the Bell Labs comp center didn't
support it. To get roff into general use at the Labs,
I undertook yet another implementation in BCPL. I added
functionality (number registers, three-part headings, etc)
and kept the new name. Molly Wagner added hyphenation.
Eventually, I added macros that were usable either as
commands or (when parameterless) embedded in text.
Almost as soon as Unix was up on the PDP-11 one of Ken, Dennis
or Ossanna reimplemented a pre-macro version of roff (presumably
in assembler or B). I'm quite sure roff never ran on the PDP-7.
Ossanna had a grander plan and undertook nroff. When he learned
of the availability of the Graphic Systems CAT phototypesetter,
he promptly generalized nroff to handle it. Joe replaced the
CAT's paper tape reader with a direct wire to the computer.
It all worked swimmingly--nothing like the travails when the
CAT was replaced by the more capable Mergenthaler Linotron.
An interesting question of priority is whether nroff or
BCPL roff was first to have a macro capability. Though
I don't remember for sure, the fact that BCPL roff unified
registers, macros, strings and diversions suggests that
I abstracted from nroff facilities.
Doug
All,
In the same vein as my prior note, I have made a note available on the
process of getting up and running on Unix Seventh Edition in a SimH
PDP-11 environment. The text is located at:
http://decuser.blogspot.com/2015/12/installing-and-using-research-unix.html
I welcome comments, suggestions, and even criticisms.
While I have learned a lot since my last blog entry (many thanks to
Hellwig Geisse, Nelson Beebe, Noel Chiappa, Clement Cole and several
others), I am still learning about these environments. I originally
invested time in getting v7 running so that I could more easily work
with v6, after having gone there, I believe that it was time very well
spent. I know a lot more about special devices, tape formats, and so on
than I did before as a result of taking the fork in the road.
Thanks for everyone's help.
Oh, and by the way, there appears to be quite a bit of active interest
in this topic - the blog post has been viewed several thousand times
since I posted it, two weeks ago.
Kind regards,
Will
I have set up v7 following [1] and I would like to better understand the
process of adding a disk to the environment. Here is what I know:
The system has one RP06 with two partitions rp0 and rp3 which correspond
to the two block devices rp0, rp3, and the two character devices rrp0,
and rrp3. The special files look like so:
brw-r--r-- 1 root 6, 0 Dec 31 19:05 /dev/rp0
brw-r--r-- 1 root 6, 7 Dec 31 19:04 /dev/rp3
crw-r--r-- 1 root 14, 0 Dec 31 19:01 /dev/rrp0
crw-r--r-- 1 root 14, 7 Dec 31 19:01 /dev/rrp3
This meshes with the device class switches in c.c:
The block device switch:
struct bdevsw bdevsw[] =
{
...
nulldev, nulldev, hpstrategy, &hptab, /* hp = 6 */
...
}
The character device switch:
struct cdevsw cdevsw[] =
{
...
nulldev, nulldev, hpread, hpwrite, nodev, nulldev, 0, /* hp = 14 */
...
}
I would like to add another RP disk to the environment. After I attach
an RP04/05/06 to the system, what should I use as the major/minor device
numbers? To put it differently, it doesn't seem correct to me to use 6,1
for the block device or 14,1 for the character device on the new drive
as it's a completely different disk from rp0 and rp3 which are just
partitions on the first drive and have 6,0, 6,7, and 14,0, 14,7. If each
RP can have 8 partitions and there can be 8 drives, what is the correct
major, minor numbers to use with v7 for multiple devices?
c.c only lists one vector each for the hp device (one block vector where
hp = 6, and one char vector where hp = 14).
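For what it's worth, my understanding - an assumption, so check the
minor-number decoding in hp.c before trusting it - is that the hp driver
treats the minor number as drive*8 + partition. On that assumption, the nodes
for partition 0 of a second drive would be made something like this (the names
rp4/rrp4 are just illustrative):

    /etc/mknod /dev/rp4 b 6 8
    /etc/mknod /dev/rrp4 c 14 8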
Thanks,
Will
[1] Haley, C. B. & Ritchie, D. M. (1979). Setting Up Unix – Seventh
Edition (pp. 497-505) in UNIX programmer's manual, Vol. 2, Revised and
Expanded Version. Bell Laboratories: NY.