The "assembly code in the Bourne shell" comment is in the same London/Reiser
paper. The full quote is:
"The (Bourne) shell is the standard user command interpreter. It required
by far the largest conversion effort of any supposedly portable program,
for the simple reason that it is not portable. Critical portions are coded
in assembly language and had to be painstakingly rewritten. The shell uses
its own sbrk which is functionally different from the standard routine in
libc. The shell wants the routine which fields a signal to be passed a
parameter giving the number of the signal being caught; signal was also a
private rou- tine. This was handled by having the operating system provide
the parameter in the first place, doing away with the private code for
signal. The code in fixargs (for constructing the argument list to an exec
system call) had to be diddled."
The files in the V7 tree on the TUHS website are dated January 1979, so it
would seem that the fixes for 32V were immediately taken back to Research.
As you point out, this means that the comments above do not refer to the
well known source code, but to a predecessor of that (which I don’t think
survived).
We have ample evidence that V7 was really something more akin to a rolling
release. Let me explain: We know from the leaked '50 changes' tape that
many of the features were set earlier rather than later. This leaked in
1978 (if my notes are right), but I found references to it from as early as
November 1976 in
. This
was 18 months after V6 was released, but over 2 years before V7 was
released. In addition, we know from the AUUS newsletters in the archive
document that the V7 release process took a while to get through
AT&T's legal department (IIRC a year, but I've not gone back to the AUUS
newsletters to refresh my recollection). A big push of V7 was to make it
portable as well (with AT&T doing an Interdata 8/32 port themselves, as
well as at least looking at the Wollongong Interdata 7/32 port and the
Harvard VM/370 port). In talking to Kirk and others that have been around
from approximately that time, 32V was widely viewed as V7 for Vaxen. We can
see evidence in the surviving 32V files of evolution from the 'PDP-11-like
swapping to a more sophisticated paging algorithm' since we have the
slowsys directory. It's my contention, as someone that coded in the era
before good source code control, that it's evidence that somebody got it
working, then renamed/copied it to slowsys while they got paging working so
they could build either kernel for A/B testing. Kirk has also told me that
the 32V port was started well in advance of V7's release to be both a
useful product inside of Bell Labs (since Vaxen were starting to appear) as
well as to prove that V7 was portable enough. I'll be the first to admit
this is at best conjecture that matches the available facts, artifacts and
old-timers' recollections (sorry Kirk), but for which we have no direct
evidence.
It also allowed the 3BSD efforts to get going before the official V7
release due to the close ties between Bell Labs and Berkeley and the DARPA
project around Unix.
I believe that we can conclude that the original 'hard to port' Bourne
shell was produced around the time of the 50 changes tape, give or take.
I also believe that all the Unix porting efforts that pre-dated the V7
release rolled what appeared in 32V into V7 to reduce the amount of PDP-11
assembler, and that those efforts are what we read about in the paper.
It also goes a ways to explain the 32V meme of 'it was pdp11 swapping'
because originally, in a version we no longer have, it was. But many of the
32V tapes that we have represent a later version where that had been
abandoned in favor of what would evolve into System III's and later paging
code.
Despite all the criticism voiced above, I think it is well understood that
the original Bourne shell is an amazing piece of work that managed to fit
an enormous amount of functionality into a cramped address space. Its
longevity attests to that. That its internals became difficult to
understand is par for the course -- the 1980’s in essence needed a Lions
commentary on sh.
On 30 Dec 2022, at 20:57, segaloco <segaloco(a)protonmail.com> wrote:
I'll have to double check later but I'm fairly certain the remaining L/R
cheats are gone by SysV. From what I can tell much of that portability
work may have been done prior to the V7 release code base we're familiar
with, as I did some comparison and found only one significant change
between V7 and 32V code as I know it at least. Either the claims of
portability issues came between 32V and System III (meaning the shell was
accepted as "broken"? in 32V) or the code we actually see in V7 has already
been tidied up significantly and doesn't represent the "non-portable"
version lamented in the famous quote. Does this observation hold with
reality? Is there an earlier, more PDP-11 bound version of the Bourne
Shell out there? I seem to recall reading something about some bits of it
even being in assembly at one point, but can't remember the quote source.
- Matt G.
------- Original Message -------
On Friday, December 30th, 2022 at 10:25 AM, Paul Ruizendaal <pnr(a)planet.nl> wrote:
> London and Reiser report about porting the shell that “it required by far the
> largest conversion effort of any supposedly portable program, for the simple
> reason that it is not portable.” By the time of SysIII this is greatly
> improved, but also in porting the SysIII user land it was the most complex of
> the set so far.
>
> There were three aspects that I found noteworthy:
>
> 1. London/Reiser apparently felt strongly about a property of casts. The code
> argues that casting an l-value should not convert it into an r-value:
>
> <quote from "mode.h">
>
> /* the following nonsense is required
> * because casts turn an Lvalue
> * into an Rvalue so two cheats
> * are necessary, one for each context.
> */
> union { int _cheat;};
> #define Lcheat(a) ((a)._cheat)
> #define Rcheat(a) ((int)(a))
> <endquote>
>
>
> However, Lcheat is only used in two places (in service.c), to set and to clear
> a flag in a pointer. Interestingly, the 32V code already replaces one of these
> instances with a regular r-value cast. So far, I’d never thought about this
> aspect of casts. I stumbled across it, because the Plan 9 compiler did not
> accept the Lcheat expansion as valid C.
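
To make the l-value/r-value point concrete, here is a minimal sketch in present-day C. In standard C the result of a cast is not an lvalue, so something like "(int)p |= 1" is rejected; a union member access, by contrast, is an lvalue. The names word_t, as_ptr, as_int, MARK and the helper functions are hypothetical and do not come from mode.h or service.c, and the sketch assumes the usual case where a pointer and uintptr_t share size and representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a word that holds a pointer plus a flag bit. */
typedef union {
    char     *as_ptr;
    uintptr_t as_int;
} word_t;

#define MARK ((uintptr_t)1)   /* hypothetical flag kept in the low bit */

/*
 * "L-value" style: the union member is an lvalue, so compound
 * assignment is legal. Writing ((uintptr_t)w->as_ptr) |= MARK instead
 * would not compile, because a cast expression is not an lvalue.
 */
static void set_mark(word_t *w)
{
    assert(w != NULL);
    w->as_int |= MARK;
}

static void clear_mark(word_t *w)
{
    assert(w != NULL);
    w->as_int &= ~MARK;
}

static int is_marked(const word_t *w)
{
    assert(w != NULL);
    return (w->as_int & MARK) != 0;
}

/* "R-value" style: cast, compute, cast back, then assign the result. */
static char *with_mark(char *p)
{
    return (char *)((uintptr_t)p | MARK);
}
```

The r-value form needs a plain assignment at the call site (p = with_mark(p)), which appears to be the kind of rewrite the 32V code applied to one of the two Lcheat uses.
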
>
> 2. On the history of dup2
>
> The shell code includes the following:
>
> <quote from “io.c”>
>
> rename(f1,f2)
> REG INT f1, f2;
> {
> #ifdef RES /* research has different sys calls from TS */
> IF f1!=f2
> THEN dup(f1|DUPFLG, f2);
> close(f1);
> IF f2==0 THEN ioset|=1 FI
> FI
> #else
> INT fs;
> IF f1!=f2
> THEN fs = fcntl(f2,1,0);
> close(f2);
> fcntl(f1,0,f2);
> close(f1);
> IF fs==1 THEN fcntl(f2,2,1) FI
> IF f2==0 THEN ioset|=1 FI
> FI
> #endif
> }
> <endquote>
>
>
> I’ve checked the 8th edition source, and indeed it supports using DUPFLG to
> signal to dup() that it really is dup2(). I had earlier wondered why dup2()
> did not appear in research until 10th edition, but now that is clear. It would
> seem that the dup of 8th edition is a direct ancestor to dup() in Plan 9. I
> wonder why this way of doing things never caught on in the other Unices.
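
For readers who do not have the numeric fcntl commands memorized, the non-RES branch can be sketched in POSIX C as below. This assumes that commands 0, 1 and 2 in the quoted code correspond to F_DUPFD, F_GETFD and F_SETFD, as they do on Linux today, and that the value 1 is the close-on-exec flag. The name move_fd is hypothetical, and the ioset bookkeeping is left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make descriptor `to` refer to what `from`
 * refers to, then close `from`. Mirrors the fcntl sequence in the
 * quoted io.c, including restoring the target's close-on-exec flag.
 * Returns `to` on success and -1 on failure.
 */
static int move_fd(int from, int to)
{
    int flags, got;

    assert(from >= 0 && to >= 0);
    if (from == to)
        return to;

    flags = fcntl(to, F_GETFD);        /* fcntl(f2,1,0): remember the flag */
    close(to);                         /* free the target slot */
    got = fcntl(from, F_DUPFD, to);    /* fcntl(f1,0,f2): lowest free fd >= to */
    if (got != to) {
        if (got >= 0)
            close(got);
        return -1;
    }
    close(from);
    if (flags != -1 && (flags & FD_CLOEXEC))
        fcntl(to, F_SETFD, FD_CLOEXEC); /* fcntl(f2,2,1): restore the flag */
    return to;
}
```

With dup2() available, the middle of this collapses to dup2(from, to) followed by close(from); the remaining difference is that dup2() clears close-on-exec on the new descriptor, whereas the sequence above puts the old flag back.
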
>
> 3. Halfway to demand paging
>
> I stumbled across this one because I had a bug in my signal handling. From
> early days onwards, Unix supported dynamically growing the stack allocation,
> which arguably is a first step towards building the mechanisms for demand
> paging. It appears that the Bourne shell made another step, catching page
> faults and expanding the data/bss allocation dynamically:
>
> <quote from “fault.c”>
>
> VOID fault(sig)
> REG INT sig;
> {
> signal(sig, fault);
> IF sig==MEMF
> THEN IF setbrk(brkincr) == -1
> THEN error(nospace);
> FI
> ELIF ...
> <endquote>
>
>
> This was already present in 7th edition, so it is by no means new in 32V or
> SysIII -- it had just escaped my attention as a conceptual step in the
> development of Unix memory handling.
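
The grow-on-fault idea can be tried on a current system with a rough analogue. Note the swap: rather than raising the break with setbrk() from the handler as the shell does, this sketch reserves address space with mmap() and enables it in steps with mprotect(), which I would expect to be less fragile today. ARENA_MAX, GROW_STEP, arena_init and on_fault are all hypothetical names, and calling mprotect() from a signal handler is common practice rather than something POSIX promises to be safe.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define ARENA_MAX ((size_t)1 << 20)  /* hypothetical: space reserved up front */
#define GROW_STEP ((size_t)1 << 16)  /* hypothetical analogue of brkincr */

static char *arena;                        /* start of the reserved region */
static volatile size_t arena_used;         /* bytes made accessible so far */
static volatile sig_atomic_t grow_count;   /* how often the handler grew it */

/* On a fault inside the arena, enable one more step and return. */
static void on_fault(int sig, siginfo_t *si, void *ctx)
{
    char *addr = (char *)si->si_addr;

    (void)sig;
    (void)ctx;
    /* Not our region, or no room left: give up, like error(nospace). */
    if (addr < arena || addr >= arena + ARENA_MAX ||
        arena_used + GROW_STEP > ARENA_MAX ||
        mprotect(arena + arena_used, GROW_STEP,
                 PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    arena_used += GROW_STEP;
    grow_count++;
    /* Returning restarts the faulting instruction, which may now succeed. */
}

/* Reserve the arena with no access and install the fault handler. */
static int arena_init(void)
{
    struct sigaction sa;

    arena = mmap(NULL, ARENA_MAX, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (arena == MAP_FAILED)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Some systems report this kind of fault as SIGBUS, so catch both. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

After arena_init(), a store to arena[0] faults once, the handler enables the first step, and the store is retried. A store further out is retried after each additional step until its address is covered, which is the same loop the quoted fault() relies on.
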
>