Thanks for the insights as always Clem! Mea culpa on not looking at the ar situation a
little more broadly, was pretty hyperfocused on 3B-20 stuff.
As for compatibility between development tools, I was mainly referring to available option
switches (the kind of thing that could potentially trip up someone's scripting). Once
SGS hits, it seems like everything from then on endeavored to keep comparable
command-line switches, but just to further compare:
- PDP as(1) uniquely has the '-' option to treat all undefined labels as globals
(as opposed to the much more common "this arg is stdin" that a lone
'-' would give elsewhere.) VAX as(1) drops this and adds -dN to define the size
reservation for undefined symbols. SGS as(1) keeps neither of these, opting to drop
'-' entirely and implement the -dN functionality as -b, -w, and -l instead.
- Things are a bit better for ld(1). SGS ld(1) drops the -X option of the earlier version
(pertains to cc behavior regarding internal labels, maybe irrelevant with pcc in the
picture). Also presumably the -n and -i options are dropped as their actions are already
default on VAX or otherwise only pertain to PDP. Old ld(1) had a -V option to store a
version string in the resulting object. This becomes -VS in SGS ld(1) to accommodate -V
being a standard "report my version" flag. SGS ld(1) then goes on to add -e to
explicitly denote the entry point and -f to provide a short int fill value for sections
needing it. We also pick up the now common -L for adding library paths. So all in all,
more commonality with pre-SGS ld(1) but still technically some breaking option changes.
- Looks like nm(1) may have some appreciable changes. The -g (only print globals), -p
(print in symbol table order), -r (print in reverse order), and -s (sort by size) options
are removed. A few are replaced by different options: -n originally meant sort
numerically instead of alphabetically (presumably by value rather than name), but in the
SGS version this is reversed, with -n being the print-in-name-order option instead
(alphabetical is the default in old nm(1)). The -o option morphs from including the
name of the source file in the output to printing the symbol value as octal. For SGS nm(1),
we see the addition of -x (print in hex), -h (suppress headers), -v (sort by value,
presumably replaces the old -n meaning), -e (only print statics and externals), -f
("full" output), and -V (version). This presents breaking changes for all but
one of the switches of the earlier version of nm(1). For the record, V7, 32V, and System
III all appear to have a comparable version. This utility is particularly interesting
because a perusal of the current SUS shows a mishmash of options, with -g and -u
surviving all the way from V7, whereas -e, -f, -o, -v,
-x derive from the SGS behavior. Then there are the -A, -P, and -t options which are
explained in the rationale section of the standard; basically these are POSIX additions to
avoid using conflicting option types where possible.
- As for size(1) and strip(1), the SGS versions only add options.
- Finally, a few utilities are added. System III features a dump(1) command that is some
sort of tape dump utility, but this name is repurposed at least as of 4.1 for an object
file section dumper, a role it retains. The list(1) utility is also added in 4.1.
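To make the nm(1) differences easier to eyeball, here is a rough Python sketch that encodes only the switch meanings as summarized in the list above (taken from this message, not re-verified against the manuals) and classifies a switch as removed, changed, or added when going from the old nm(1) to the SGS one. The names OLD_NM, SGS_NM, classify, and audit are made up for illustration.

```python
# Switch meanings as summarized in this message; treat them as notes, not gospel.
OLD_NM = {
    "-g": "only print globals",
    "-p": "print in symbol table order",
    "-r": "print in reverse order",
    "-s": "sort by size",
    "-n": "sort numerically",
    "-o": "include source file name in output",
}

SGS_NM = {
    "-n": "print in name order",
    "-o": "print symbol value in octal",
    "-x": "print in hex",
    "-h": "suppress headers",
    "-v": "sort by value",
    "-e": "only print statics and externals",
    "-f": "full output",
    "-V": "report version",
}


def classify(switch):
    """Say what happens to one switch going from old nm(1) to SGS nm(1)."""
    in_old = switch in OLD_NM
    in_sgs = switch in SGS_NM
    if in_old and not in_sgs:
        return "removed"
    if in_old and in_sgs:
        return "unchanged" if OLD_NM[switch] == SGS_NM[switch] else "changed"
    if in_sgs:
        return "added"
    return "unknown"


def audit(argv):
    """Classify every argument that looks like a switch (starts with '-')."""
    return [(arg, classify(arg)) for arg in argv if arg.startswith("-")]
```

Something like audit(["nm", "-n", "a.out"]) would flag -n as changed, which is the sort of thing that could quietly trip up a script. Bundled switches such as -gn are not split apart here.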
So in detailed review, as(1), ld(1), and nm(1) had the most changes between the research
versions and what eventually landed in SGS, with nm(1) especially being wildly
incompatible from a command-line option standpoint. As for as(1), one PDP-11 option drops
and one VAX option changes switches (but not functionality as far as I can tell). Finally,
ld(1) seems to drop a few options that aren't needed in a non-PDP world and adjusts
the version-assignment option to allow -V to be a universal version request of the various
object utilities.
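As a concrete illustration of the as(1) changes, here is a minimal Python sketch of how an old command line might be translated for the SGS assembler. It assumes the -d1, -d2, and -d4 forms map to -b, -w, and -l respectively, and that the lone '-' of the PDP assembler simply has no SGS counterpart. The names AS_D_TO_SGS and translate_as_args are hypothetical.

```python
# Assumed mapping of the VAX as(1) -dN switches onto the SGS as(1) switches.
AS_D_TO_SGS = {"-d1": "-b", "-d2": "-w", "-d4": "-l"}


def translate_as_args(args):
    """Return (translated_args, dropped_args) for an old as(1) argument list."""
    translated = []
    dropped = []
    for arg in args:
        if arg == "-":
            # PDP as(1): treat undefined labels as globals; no SGS equivalent.
            dropped.append(arg)
        elif arg in AS_D_TO_SGS:
            translated.append(AS_D_TO_SGS[arg])
        else:
            # Anything else passes through untouched.
            translated.append(arg)
    return translated, dropped
```

The dropped list is returned rather than silently discarded so a wrapper script can warn that the behavior of '-' is being lost.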
- Matt G.
------- Original Message -------
On Wednesday, February 22nd, 2023 at 2:20 PM, Clem Cole <clemc(a)ccc.com> wrote:
below are some thoughts/hopefully answers to your
questions....
On Wed, Feb 22, 2023 at 3:16 PM segaloco via TUHS <tuhs(a)tuhs.org> wrote:
Good day all, figured I'd start a thread on
this matter as I'm starting to piece enough together to articulate the questions
arising in my research.
So based on my analysis of the 3B20S UNIX 4.1 manual I've been working through, all
evidence points to the formalized SGS package and COFF originating tightly coupled to the
3B-20 line, then growing legs to support VAX, but never quite absorbing PDP-11 in
entirety. That said, there are bits and pieces of the manual pages for the object format
libraries that suggest there was some providence for PDP-11 in the development of COFF as
well.
Where this has landed though is a growing curiosity regarding:
- Whether SGS and COFF were tightly coupled to one another from the outset, with SGS
being supported by the general library routines being developed for the COFF format
@scj - any enlightenment -- your team in USG must have been part of all that.
- Whether COFF was envisioned as a
one-size-fits-all object format from its inception or started as an experiment in 3B-20
development that wound up being general enough for other platforms
That I can not say, but I can say that to the UNIX source licensees (i.e. not the
Universities in the Research system or inside of the Bell System) - it was used in the
"consider it standard" campaign that AT&T marketing in NC was starting to
push. This was around the time that PCC2 was coming out to replace the original PCC, but I
remember getting PCC2 was an extra cost.
Most of the BSD based kernels (DEC, HP, etc..) were originally using a modified a.out of
their own flavor but I think almost all of them switched to COFF post the System III license.
What I have forgotten is whether that was a requirement; it may have been mixed up in the license.
I do remember this was right around when gcc first starts coming out, and they had a tool
called robitussin to "cure coffs" as they were using a.out when they could.
- If, prior to this format, there were any other efforts to produce a unifying binary
format and set of development tools, or if COFF was a happy accident from what were a
myriad of different architectural toolset streams
MIT had a modified a.out format for the NU machine ports - that might have been called
b.out.
CMU had macho which again was an extended a.out but even more flexible.
- One of the curious things is how VAX for a
brief moment did have its own set of tools and a.out particulars before SGS/COFF.
Why is that curious - all original Vax development was just using the original PCC stream
from V7 (and pre-Judge Green more in a minute).
What I don't remember is if PCC2 was COFF when introduced, or COFF came first, but I
think they were separate things - again someone like scj would be authoritative.
The three tools that have to care are the assembler (as), the linker (ld), and the program
loading code in the kernel itself.
For instance, many of the VAX-targeted utilities
in 3.0/System III bear little in common option/manual-wise with the general common SGS
utilities in System V. The "not on PDP-11" pages for various SGS components in
System V much more closely resemble the 3B-20 utilities in 4.1 than any of the non
PDP-11/VAX-only bits in System III.
Some examples:
- The VAX assembler in System III contains a -dN option indicating the number of bytes to
set aside for forward/external references for the linker to fill in.
- The VAX assembler in System V contains among others the -n and -m options from 4.1
which indicate to disable address optimization and use m4 respectively
- The System V assembler goes on to also include -R (remove input file after completion)
-r (VAX only, add .data contents to .text instead) and options -b, -w, and -l to replace
the -d1, -d2, and -d4 options indicated in the previous VAX assembler
- System V further adds a -V to all the SGS software indicating the version of the
software. This is new circa 5.0, absent from the 4.1 manual like the R, r, b, w, and l
options
- The 4.1 manual's singular ar(1) entry still agrees with the System III version. No
arcv(1) is listed, implying the old ar format never made it to 3B-20
Hmm, this is confusing. The change from the old v[456] ar format to the new ar format
happened between Research V6 and Research V7. By the time of any Vax development the old
format had pretty much been killed. I'd check what PWB 1.0 and 2.0 used. The new ar
format was independent of what was in it.
i.e. V7: man 5 ar
[AR(5)](http://man.cat-v.org/unix_7th/5/AR)
NAME
ar - archive (library) file format
SYNOPSIS
#include <ar.h>
DESCRIPTION
The archive command ar is used to combine several files into
one. Archives are used mainly as libraries to be searched
by the link-editor ld.
A file produced by ar has a magic number at the start, fol-
lowed by the constituent files, each preceded by a file
header. The magic number and header layout as described in
the include file are:
#define ARMAG 0177545
struct ar_hdr {
        char    ar_name[14];
        long    ar_date;
        char    ar_uid;
        char    ar_gid;
        int     ar_mode;
        long    ar_size;
};
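For anyone wanting to poke at such an archive on a modern machine, here is a minimal Python sketch of decoding that header. It is not from the manual page: it assumes PDP-11 conventions (16-bit int, 32-bit long stored high word first with each word little-endian, no padding, so 26 bytes per header) and that the magic number is a 16-bit word at the start of the file. The names HDR, pdp_long, has_magic, and parse_header are hypothetical.

```python
import struct

ARMAG = 0o177545  # magic number from the include file

# Assumed on-disk layout: name[14], date (2 words), uid, gid, mode, size (2 words).
HDR = struct.Struct("<14s2HBBH2H")


def pdp_long(high_word, low_word):
    """Combine two 16-bit words into a 32-bit value, high word first (assumed)."""
    return (high_word << 16) | low_word


def has_magic(data):
    """Check whether the data begins with the ar magic number as a 16-bit word."""
    (magic,) = struct.unpack_from("<H", data)
    return magic == ARMAG


def parse_header(raw):
    """Decode one 26-byte member header into a dict of plain Python values."""
    name, d_hi, d_lo, uid, gid, mode, s_hi, s_lo = HDR.unpack(raw)
    return {
        "name": name.rstrip(b"\0").decode("ascii"),
        "date": pdp_long(d_hi, d_lo),
        "uid": uid,
        "gid": gid,
        "mode": mode,
        "size": pdp_long(s_hi, s_lo),
    }
```

Note the header says nothing about what the member contains, which fits the point that the archive format is independent of the object format stored in it.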
- The System V manual has both this ar(1) version
as well as the new COFF-supporting version.
Why would ar(1) care?
- Not sure if this implies the VAX ar format was
expanded to support the COFF stuff for a little while until they decided on a new one or
what.
- The System III ld (which is implied to support PDP and VAX) survives in System V, but
is cut down to supporting PDP-11 only
- The COFF-ish ld shows up in 4.1, is then extended to VAX presumably in the same breath
as the other COFF-supporting bits by Sys V, leading to two copies like many others,
PDP-11-specific stuff and then COFF-specific stuff
The picture that starts to form in the context of all of this is, for a little while in
the late 70s/early 80s, the software development environments for PDP-11, VAX-11, and
3B-20 were interplaying with each other in often times inconsistent ways. Taking a peek at
the 32V manuals, the VAX tools in System III appear to originate with that project, which
makes sense. If I'm understanding the timeline, COFF starts to emerge from the 3B-20
project and USG probably decides that's the way to go, a unified format, but with
PDP-11 pretty much out the door support wise already, there was little reason to apply
that to PDP-11 as well, so the PDP-11 tools get their swan song in System V, original
VAX-11 tools from 32V are likely killed off in 4.x, and the stuff that started with the
3B-20 group goes on to dominate the object file format
That makes sense - but be careful - the 3B and WE32000 ISA may have been the driver but I
would expect that compiler folk in Summit were more in the driver seat. The 3B20 kernel
would use what they were getting from the tools team and core kernel team in USG.
Remember the politics at the time: Judge Green has unleashed AT&T and they are now
allowed to be in the biz, and the sales/marketing folks at AT&T were pushing the 3B20 and
the WE32000 - so there are big forces behind the scenes that are not obvious/clear.
and development software stuff until ELF comes
along some time later.
Yep - never quite understood what the push for ELF was over COFF after all the effort to
drive COFF down people's throats. Note Microsoft "embraced and extended"
COFF as their format -- originally because of Xenix I believe. Someone like Paul W may
have some insights on this and that was before the 3B20.
What was the format that the original Xenix used - when it was targeting PDP-11, 68000,
x86 and Z8000? Again I'm fuzzy on the details here. But I do remember during the
license discussions that would lead to System III that this was one of the things the
Microsoft team was worried about -- IIRC it was Bob Greenberg pushing all that. I lost contact with Bob
a few years ago, but if we can find him, I would expect Bob to know what Xenix was doing.
And again that negotiation >>starts<< all pre-Judge Green, but finishes up soon
afterwards.
I guess other questions this raises are:
- Were the original VAX tools built with any attention to compatibility with the PDP-11
bits Ken and Dennis wrote many years prior (based on some option discrepancies, possibly
not?)
hrmph... folks started with the PDP-11 tools and changed them as needed. I'm not
sure compatibility is the right term. They were retargeted and moved forward by people
trying to support a new machine they got and did not want to run DEC's OS.
- Do the VAX utilities derive from the Interdata
8/32 work or if there was actually another stream of tools as part of that project?
I guess I don't understand the question. The original V7 tools were retargeted. When
useful features were added, they might be offered/returned to other folks, but remember,
Research is not "supporting" UNIX. USG is where things start to think in terms
of multiple targets >>before Judge Green<< and then after Judge Green, there
was a push to stop using non-AT&T based equipment or chips in the Bell System and make
what Western Electric was selling be attractive [which sometimes was a little bit of
putting lipstick on porcine as it were]. For instance, Rob and Bart's original JERQ
is 68000 based, but by the time it becomes a product as the 5620 it has to be refactored as a
WE32000.
- Was there any interplay between the existing
tool streams (original PDP-11, 32V's VAX utilities, possibly Interdata 8/32) and the
eventual COFF/SGS stuff, or was the latter pretty well siloed in 3B-20 land until
deployment with 4.1?
I think you are putting too much on the 3B program itself. The 3B was the task at hand at
the time and a solid opportunity to bring to bear business choices being made. You need to
look at the greater business to understand a lot of the choices. A lot of things were
happening in parallel in the market that had other impacts on technology and how it was
delivered -- the 3B program was the "technology train" leaving the station that
some of them got attached to/delivered using.
But, as I said to you when we chatted, you really can not underestimate what was
happening (or not happening) as AT&T changed its business focus - pre/post-Judge
Green. It was a large company with lots of different spheres of interest (read - different
executives), each being measured with different things that they might value.