On 12/9/15 9:07 AM, Clem Cole wrote:
On Wed, Dec 9, 2015 at 9:56 AM, Clem Cole <clemc(a)ccc.com
<mailto:clemc@ccc.com>> wrote:
check the man page on the a.out format.
Note to folks on the list - please don't give away the answer...
Will
A small piece of homework for the student -- look at 407 the magic
number for the a.out file and see if you can explain why Ken picked
that value. Hint it is particular to the PDP-11 architecture and is
not portable to other systems, although most other systems that UNIX
was "ported" too continued to used the a.out format kept the magic
numbers as 407/411 etc., although I know of some ports (like mine own
for Magix - the Tek Magnolia system of the late 1970s) that changed
the magic number to be correct for that architecture.
Clem
Clem,
Nice!
In full disclosure, I read about magic numbers being PDP architecture
related somewhere in the recent past, but I didn't really understand the
connection. So, when you mentioned it, rather than google it and spoil
the fun, I took the hint and thought a bit about it in the context of
our discussion. As it turns out, according to my now handy-dandy
PDP11/40 processor handbook, the 18 bits holding the word 000407 which
appears as the first word of most of my v6 binaries, 000 000 100 000
111b, the magic number represents the BR instruction base where the last
8 bits is an offset added to the PC after it has read the word. So, the
PDP loads the object, reads the first word, determines it is a branch
instruction:
BR PC+7
and jumps forward to start reading the 9th word of the program. Why the
9th word? According to man 5 a.out, the a.out header always contains 8
words.
So, given a binary, like write:
dd if=write bs=128 count=1|od
1+0 records in
1+0 records out
0000000 000407 001176 000032 000150 000000 000000 000000 000001
0000020 022627 000002 001412 003006 012700 000001 104404 000657
...
The analysis that follows in light of the information discovered so far
would be:
word 1 at 0000000 000407 Magic Number (BR PC+7)
word 2 at 0000002 001176 Size of the program text segment (638 bytes)
word 3 at 0000004 000032 Size of the initialized data segment (26 bytes)
word 4 at 0000006 000150 Size of the uninitialized data segment (104 bytes)
word 5 at 0000010 000000 Size of the symbol table (stripped in this case)
word 6 at 0000012 000000 The entry location 0 at present
word 7 at 0000014 000000 Unused
word 8 at 0000016 000001 Relocation bit suppression flag (I think this
means the file contains absolute memory references)
word 9 at 0000020 022627 The target of the first branch, contains
instructions to MOV (R6)+,(R7)+ (I think)...
According to man 5 a.out on v7, the different magic numbers represent
normal, read-only text, separate I&D, and overlay:
#define A_MAGIC1 0407 /* normal */
#define A_MAGIC2 0410 /* read-only text */
#define A_MAGIC3 0411 /* separated I&D */
#define A_MAGIC4 0405 /* overlay */
This means that branches are to 9th, 10th, 11th and 7th words,
respectively. It'll be a while before I really understand what the
ramifications are.
Oh and by the way, jumping between octal and decimal is weird, but
convenient once you get the hang of it - 512 is 1000, which is nifty and
makes finding buffer boundaries in an octal dump easy :).
Thanks,
Will