> I got it.
Ta muchly! All seems OK now, after TUHS moved to a new ISP (Linode,
which, ahem, is known for hosting spammers).
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
1 2 3... Is this mic on? Tap tap...
Seriously, my anti-spam defences were having an issue with this list for a
while, so let's see whether it comes back.
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
Could whoever runs this broken mirror please fix the damned mailer so that
it handles my RFC-compliant banner? I do not appreciate retries every
five seconds or so, because Dovecot cannot seem to handle a multi-line
SMTP banner (a great spam defence); I have since firewalled the IP address
of 45.79.103.53 out of self-defence.
Thank you.
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
All,
I'm stuck trying to determine what is going on with v6tar on v6. It
seems to work ok for files, but gets confused with subdirectories. I set
up a test folder structure:
t/dmr/vs.c
t/dmr/vt.c
t/ken/prf.c
then I created a tarball
tar cvf t.tar t
then I tried to extract the tarball. It made a mess:
# tar xvf t.tar
Tar: blocksize = 17
y ?
tar: t/ken/prf.c - cannot create
y ?
y ?
tar: t/dmr/vs.c - cannot create
y ?
y ?
tar: t/dmr/vt.c - cannot create
That was ugly, and that was all of the output. What exactly did I wind up with?
# ls -l
total 19
drwxrwxrwx 2 root 32 Oct 10 12:54 y
-rw-rw-rw- 1 root 8704 Oct 10 12:54 t.tar
Ugh. Probably don't need the y directory...
# rmdir y
y ?
# ls y
y not found
Wow! It appears that I am unable to delete the y directory or list it by
name. That can't be good. Any ideas of how to remove this directory are
welcome.
Not to be deterred by one small failure, I copied the same tarball over
to v7 on the off chance that maybe v6tar isn't really for v6, but more
for moving files (and directories) over to v7 as Haley and Ritchie
describe, and lo and behold tar on v7 is able to extract both files and
directories from the same tarball without any trouble:
# tar xvf t.tar
Tar: blocksize = 17
x t/ken/prf.c, 2301 bytes, 5 tape blocks
x t/dmr/vs.c, 1543 bytes, 4 tape blocks
x t/dmr/vt.c, 834 bytes, 2 tape blocks
# ls -l
total 18
drwxrwxr-x 4 root 64 Dec 31 19:27 t
-rw-rw-r-- 1 root 8704 Dec 31 19:27 t.tar
# ls t
dmr
ken
# ls t/dmr
vs.c
vt.c
# ls t/ken
prf.c
Interesting. After looking at the tar source, the question marks in the
output appear to be coming from somewhere outside of tar (perhaps mkdir
or chown?). Also, the "cannot create" message comes from the following
snippet of the tar source, which looks reasonable:
...
    if ((ofile = creat(dblock.dbuf.name, stbuf.st_mode & 07777)) < 0) {
        fprintf(stderr, "tar: %s - cannot create\n",
            dblock.dbuf.name);
...
I think this error is simply an effect related to the failure to create
the necessary directories properly. The code to do that looks pretty
straightforward:
checkdir(name)
register char *name;
{
    register char *cp;
    int i;
    for (cp = name; *cp; cp++) {
        if (*cp == '/') {
            *cp = '\0';
            if (access(name, 01) < 0) {
                if (fork() == 0) {
                    execl("/bin/mkdir", "mkdir", name, 0);
                    execl("/usr/bin/mkdir", "mkdir", name, 0);
                    fprintf(stderr, "tar: cannot find mkdir!\n");
                    done(0);
                }
                while (wait(&i) >= 0);
                chown(name, stbuf.st_uid, stbuf.st_gid);
            }
            *cp = '/';
        }
    }
}
I speculate that chown is causing the "?" to be displayed. Is it safe
enough for me to add printf statements around this code to see what's
going on, or is there a better approach?
Thanks,
Will
/dev/makefile on the V7 distribution tape (or at least the
unpacked image I have that I believe to be same) says:
ht:
	/etc/mknod mt0 b 7 64
	/etc/mknod mt1 b 7 0
	/etc/mknod rmt0 c 15 64
	/etc/mknod rmt1 c 15 0
	/etc/mknod nrmt0 c 15 192
	/etc/mknod nrmt1 c 15 128
	chmod go+w mt0 mt1 rmt0 rmt1 nrmt0 nrmt1
According to /usr/sys/dev/ht.c, the minor device
number was used as follows:
    minor(dev) & 07      slave unit number
    minor(dev) & 070     controller unit number
    minor(dev) & 0100    tape density: set == 800 bpi, clear == 1600 bpi
    minor(dev) & 0200    no-rewind flag
It takes some digging in the source code (and the PDP-11
Peripherals Handbook) to understand all this numerology.
In most of the code, minor(dev) & 077 is just treated as
a unit number (fair enough). The use of 0200 appears only
as a magic number in htopen; that of 0100 only as a magic
number in htstart, and that only implied: the test is
not minor(dev) & 0100, but
unit = minor(bp->b_dev) & 0177;
if(unit > 077)
Not so bad when the whole driver is only 376 lines of code,
but it wouldn't have hurt to make it 400 lines if that
meant fewer magic numbers.
Anyway, what all this means is that /dev/*mt0 and /dev/*mt1
both actually meant slave 0 on TU16 controller 0, but mt0
was 800 bpi and mt1 1600 bpi. Hence, I would guess, tar's
default to mt1.
My first exposure to the insides of UNIX was in the High
Energy Physics group at Caltech. Some of our systems had
multiple tape drives and every drive supported multiple
densities, so we invented for ourselves a system like that
many other sites invented, with names like /dev/rmt3h to
mean the third tape drive at high density. (Hence the
USG naming scheme of /dev/rmt/3h and the like--not that
we taught it to them, just that many places had the same
idea.)
Our world wasn't nearly as exciting as that of our neighbors,
across the building and three floors down, in the Space
Radiation Laboratory. They had a huge room full of racks
of magtapes full of data from satellites, and many locally-
written tools for extracting the data so researchers could
work on it. The hardware was an 11/70 with eight tape drives,
and at any given time at least half the drives would be
spinning. One of the drives was seven-track rather than
nine-track, because some of the satellite data had been
written in that format.
Fair disclosure: I had a vague memory that the `drive number'
in the device name had been recycled for other purposes,
but couldn't remember whether it was density or something
else. (I'm a little surprised none of the other old-timers
here remembered that, but maybe I worked with tapes more than
them.) But I had to dig into the source code for the details;
I didn't remember all that. And I did have to climb up to the
high shelf in my home office for a Peripherals Handbook to
understand the magic numbers being stuffed into registers!
Norman Wilson
Toronto ON
> I have no memory of why Ken used mt1 not mt0. Doug may know.
I don't know either. Come to think of it, I can't remember ever
using tar without option -f. Direct machine-to-machine transfer,
e.g. by uucp, took a lot of business away from magtape soon
after tar was introduced. Incidentally, I think tar was written
by Chuck Haley or Greg Chesson, not Ken.
Doug
On 2015-12-12 07:16, William Pechter <pechter(a)gmail.com> wrote:
>
> Warren Toomey wrote:
>> >On Sat, Dec 12, 2015 at 03:54:16PM +1100, Peter Jeremy wrote:
>>> >>Also, I've seen suggestions that there's a 2.11BSD patch later than
>>> >>447 but I can't find anything "official" and www.2bsd.com is either
>>> >>down or inaccessible from all the systems I have access to. Does
>>> >>anyone know if 448 or later were released? And given the issues with
>>> >>www.2bsd.com would someone be willing to mirror it (assuming we can
>>> >>get a copy of it)?
>> >[ Back to a real keyboard ]. Yes I'd be very happy to mirror 2bsd.com.
>> >Does anybody know what's happened to Steven Schultz?
>> >
>> >Cheers, Warren
> Last patch is 447 from June 2012.
Uh. No. 447 is from December 31, 2008.
See /VERSION in the patch set, which holds the patch version and date
for the patch.
And I did an unofficial 448 in 2010, which I have tried to spread, and
which I suspect is the patch referred to above...
> I can get to the site just fine... pasted the patch below if it helps
> anyone.
> I haven't heard anything about him. Haven't worked at the same company
> since the early 1990's...
I used to talk with him a lot in the past, but have not been able to
raise him, and haven't seen anything from him in over 5 years... No idea
what he is up to nowadays...
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt(a)softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
> From: Random832
> Interestingly, the SysIII sum.c program, which I assume yields the same
> result for this input, appears to go through the whole input
> accumulating the sum of all the bytes into a long, then adds the two
> halves of the long at the end rather than after every byte.
That's the same hack a lot of TCP/IP checksum routines used on machines with
longer words; add the words, then fold the result in the shorter length at the
end. The one I wrote for the 68K back in '84 did that.
> This suggests that the two programs would give different results for
> very large files that overflow a 32-bit value.
No, I don't think so, depending on the exact details of the implementation. As
long as when folding the two halves together, you add any carry into the sum,
you get the same result as doing it into a 16-bit sum. (If my memory of how
this all works is correct - the neurons aren't what they used to be,
especially late in the day... :-)
> Also, if this sign extends, then its behavior on "negative" (high bit
> set) bytes is likely to be very different from the SysIII one, which
> uses getc.
I have this bit set that in C, 'char' is defined to be signed, and
furthermore that when you assign a shorter int to a longer one, the sign is
extended. So if one has a char holding '0200' octal (i.e. -128), assigning it
to a 16-bit int should result in the latter holding '0177600' (i.e. still
-128). So in fact I think they probably act the same.
Noel
> From: Will Senn
> I noticed that the sum utility from v6 reports a different checksum
> than it does using the sum utility from v7 for the same file.
> ... does anyone know what's going on here?
> Why is sum reporting different checksums between v6 and v7?
The two use different algorithms to accumulate the sum (I have added comments
to the relevant portion of the V6 assembler one, to help understand it):
V6:
mov $buf,r2 / Pointer to buffer in R2
2: movb (r2)+,r4 / Get new byte into R4 (sign extends!)
add r4,r5 / Add to running sum
adc r5 / If overflow, add carry into low end of sum
sob r0,2b / If any bytes left, go around again
Read the description of MOVB in the PDP-11 Processor manual.
V7:
    while ((c = getc(f)) != EOF) {
        nbytes++;
        if (sum&01)
            sum = (sum>>1) + 0x8000;
        else
            sum >>= 1;
        sum += c;
        sum &= 0xFFFF;
    }
I'm not clear on some of that, so I'll leave its exact workings as an
exercise, but I'm pretty sure it's not an equivalent algorithm (as in,
something that would produce the same results); it's certainly not
identical. (The right shift is basically a rotate, so it's not a straight sum,
it's more like the Fletcher checksum used by XNS, if anyone remembers that.)
Among the parts I don't get, for instance, sum is declared as 'unsigned',
presumably 16 bits, so the last line _should_ be a NOP!? Also, with 'c' being
implicitly declared as an 'int', does the assignment sign extend? I have this
vague memory that it does. And does the right shift if the low bit is one
really do what the code seems to indicate it does? I have this bit that ASR on
the PDP-11 copies the high bit, not shifts in a 0 (check the processor
manual). That is, of course, assuming that the compiler implements the '>>'
with an ASR, not a ROR followed by a clear of the high bit, or something.
Noel
Ok, it definitely sounds like the v6tar source is around somewhere so
if someone could point me in the right direction...
I've only seen the binary, and I can't remember where I got it.
Mark
All,
While working on the latest episode of my saga about moving files
between v6 and v7, I noticed that the sum utility from v6 reports a
different checksum than it does using the sum utility from v7 for the
same file. To confirm, I did the following on both systems:
# echo "Hello, World" > hi.txt
# cat hi.txt
Hello, World
Then on v6:
# sum hi.txt
1106 1
But on v7:
# sum hi.txt
37264 1
There is no man page for the utility on v6, and it's assembler. On v7,
there's a manpage and it's C:
man sum
...
Sum calculates and prints a 16-bit checksum for the named
file, and also prints the number of blocks in the file.
...
A few questions:
1. I'll eventually be able to read assembly and learn what the v6
utility is doing the hard way, but does anyone know what's going on here?
2. Why is sum reporting different checksums between v6 and v7?
3. Do you know of an alternative to check that the bytes were
transferred exactly? I used od and then compared the text representation
of the bytes on the host using diff (other than differences in output
between v6 and v7 related to duplicate lines, it worked ok but is clunky).
Thanks,
Will
All,
In my exploration of v6, I followed the advice in "Setting up Unix -
Seventh Edition" and copied v6tar from v7 to v6. Life is good. However,
tar is using mt1 and it is hard coded into the source, tar.c:
char magtape[] = "/dev/mt1";
As the subject line suggested, I have two questions for those of you who
might know:
1. Why is it hard coded?
2. Why is it the second device and not the first?
Interestingly, it took me a little while to figure out it was doing this
because I didn't actually move files between v6 and v7 until today.
Before this my tests had been limited to separate tests on v6 and v7
along the lines of:
cd /wherever
tar c .
followed by
tar t
list of files
cd /elsewhere
tar x
files extracted and matching
What it was doing was writing to the nonexistent /dev/mt1, which it
then created, tarring up stuff, and exiting. Then when I listed the
contents of the tarfile, or extracted the contents, it was successful.
But, when I went to move the tape between v6 and v7, the tape (mt0) was
blank, of course. It was at this point that I followed Noel's advice
and "Used the source", and figured out that it was hard-coded as you see
above.
Thanks,
Will
That's exactly right. ld performs the same task as LOAD did on BESYS,
except it builds the result in the file system rather than user
space. Over time it became clear that "linker" would be a better
term, but that didn't warrant canning the old name. Gresham's law
then came into play and saddled us with the ponderous and
misleading term, "link editor".
Doug
> My understanding, which predates my contact with Unix, is that the
> original toolchains for single-job machines consisted of the assembler
> or compiler, the output of which was loaded directly into core with
> the loader. As things became more complicated (and slow), it made
> sense to store the memory image somewhere on drum, and then load that
> image directly when you wanted to run it. And that in some systems
> the name "loader" stuck, even though it no longer loaded. Something
> like the modern ISP use of the term "modem" to mean "router". But I
> don't have anything to back up this version; comments welcome.
> estabur (who thought these names up, I know 8 characters is limiting,
> but c'mon)
'establish user mode registers'
> the 411 header is read by a loader
Actually, it's read by the exec() system call (in sys1.c).
Noel
> From: Dave Horsfall
> I love those PDP-11 instructions, such as "blos" and "sob" :-)
Yes, but alas, there is no 'jump [on] no carry' instruction! (Yes, yes, I
know about BCC! :-) Although I guess the x86 has one...
Noel
> Yes the V6 kernel runs in split I and D mode, but it doesn't end up
> supporting any more data. I.e. the kernel is still a 407 (or 410) file.
> _etext/_edata/_end are still referencing the same 64K space.
Err, actually, not really.
The thing is that to build the split-I/D kernel, one sets the linker to
produce an output file which still contains the relocation bits. That is then
post-processed by 'sysfix', which does weird magic (moves the text up to
020000, in terms of address space; and puts the data _below_ the text, in the
actual output file). So while the files concerned may have a '407' in their
header, they definitely aren't what one normally finds in a linked 407 or 410
file.
In particular, data addresses start at 0, and can run up to 0140000 (i.e. up
to 56KB), while text addresses start at 020000 and can run up to 0160000. So,
_etext/_edata/_end are not, in fact, in the same 64K space. And the total of
data (initialized and un-initialized) together with the text can be much
larger than 64KB - up to 112KB or so.
Noel
J.F. Ossanna (jfo) was born in 1928; he helped give us Unix, and developed
the ROFF series (which I still use).
And Ada Lovelace, the world's first computer programmer, was coded in 1815.
--
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
> From: Ronald Natalie
> I'm pretty sure the V6 kernel didn't run in split I/D.
Nope. From 'SETTING UP UNIX - Sixth Edition':
"Another difference is that in 11/45 and 11/70 systems the instruction and
data spaces are separated inside UNIX itself."
And if you don't believe that, check out:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/sys/conf/m45.s
the source! ;-)
> It wasn't too involved of a change to make a split I/D kernel.
> Mike Muuss and his crew at JHU did it.
Maybe you're remembering the process on a pre-V6 system?
> We spent more time getting the bootstrap to work than anything else I
> recall.
It's possible you're remembering that, as distributed, V6 didn't support load
images where the text+initialized-data was larger than 24KW-delta; it would
have been pretty easy to up that to 28KW-delta (change a parameter in the
bootstrap source, and re-assemble), but after that, the V6 bootstrap would
have had to have been extensively re-worked.
And there were _also_ a variety of issues with handling maximally large images
in the startup code. Once operating, the kernel has segments KI1-KI7 available
to hold the system's code; however, it's not clear that all of KI1-7 are
really usable, since the system can't 'see' enough code while in the code
relocation phase in the startup to fill them all. E.g. during code relocation,
KI7 is ripped off to hold a pointer to I/O space (since KD7 is set to point to
low memory just after the memory that KD6 points to).
These might have been issues in systems which were ARPANET-connected (i.e.
ran NCP), as that added a very large amount of code to the kernel.
Noel
> From: Will Senn
> my now handy-dandy PDP11/40 processor handbook
That's good for the instruction set, but for the memory management hardware,
etc., you'll really want one of the {/44, /45, /70, /73, etc} group, since only
those models support split I+D.
> the 18 bits holding the word 000407
You mean '16 bits', right? :-)
> This means that branches are to 9th, 10th, 11th and 7th words,
> respectively. It'll be a while before I really understand what the
> ramifications are.
Only the '407' is functional. (IIRC, in early UNIX versions, the OS didn't
strip the header on loading a new program into memory, so the 407 was actually
executed.) The others are just magic numbers, inspired by the '407' - the
code always starts at byte 020 in the file.
> Oh and by the way, jumping between octal and decimal is weird, but
> convenient once you get the hang of it - 512 is 1000, which is nifty
> and makes finding buffer boundaries in an octal dump easy :).
The _real_ reason octal is optimal for PDP-11 is that when looking at core,
most instructions are more easily understood in octal, because the PDP-11 has
8 registers (3 bits), and 3 bits worth of mode modifier, and the fields are
aligned to show up in octal.
I.e. in the double-op instruction '0awxyz', the 'a' octit gives the opcode,
'w' is the mode for the first operand, 'x' is the register for the first
operand, and 'y' and 'z' similarly for the second operand. So '12700' is
'MOV (PC)+, R0' - AKA 'MOV #bbb, R0', where 'bbb' is the contents of the word
after the '12700'.
Noel
> From: Will Senn <will.senn(a)gmail.com>
> The problem is this, when I attempt to execute the v6tar binary on the
> v6 system (it works in v7) it errors out:
> v6tar
> v6tar: too large
That's an error message from the shell; the exec() call on the command
('v6tar') is returning an ENOMEM error. Looking in the kernel, that comes from
estabur() in main.c; there are a number of potential causes, but the most
likely is that 'v6tar' is linked to be split I+D, and your V6 emulation is on
a machine that doesn't have split I+D (e.g. an 11/40). If that's not it,
please dump the a.out header of 'v6tar', so we can work out what's causing the
ENOMEM.
Noel
> From: Will Senn
> Thanks for supplying the logic trail you followed as well!
"Use the source, Luke!" This is particularly true on V6, where it's assumed
that recourse to the source (which is always at hand - long before RMS and
'Free Software', mind) will be an early step.
> when you say dump the a.out header, how do you do that?
On vanilla V6? Hmm. On a system with 'more' (hint, hint ;-), I'd just do 'od
{file} | more', and stop it after the first page. Without 'more', I'd probably
send the 'od' output to a file, and use 'ed' to look at the first couple of
lines.
Back in the day, of course, on a (slow) printing terminal, one could have just
said 'od', and aborted it after the first couple of lines. These days, with
video terminals, 'more' is kind of really necessary. Grab the one off my V6
Unix site, it's V6-ready (should be a compile-and-go).
Noel