Dave Horsfall wrote:
>On Tue, 7 Jan 2020, Bakul Shah wrote:
>
>> In Algol68 # ... # is one of the forms for block comments!
>
>Interesting... All we had at university though was ALGOL W (as far as I
>know; there were several languages that mere students could not use, such
>as FORTRAN H).
Yes, but when was it implemented? Kernighan's use is still the first if
ALGOL's is not before 1974. So I decided to look, and it took me down a
rabbit hole of ALGOL that leads back to the Bourne shell and then right
back to # (but in C).
According to the ALGOL 68 wiki page, the language seems to have had a
character set problem since day one, and it seems that if you didn't have
the cent sign you were to use PR (for pragmat) for comments. And since it
had problems it was continually extended. I just can't find when # was defined.
I looked at various old implementations (none pre 1974 list #) --
- CDC's ALGOL 68 compiler from 1975: you could only use PR .. PR
(both # and CO were not defined)
http://www.bitsavers.org/pdf/cdc/Tom_Hunter_Scans/Algol_68_version_1_Refere…
- The official revised ALGOL 68 spec from 1978 lists all these ways to enter
them (bottom of page 112) in this order --
brief comment symbol: cent-sign
bold comment symbol: comment
style 1 comment symbol: co
style 2 comment symbol: #
bold pragmat symbol: pragmat
style 1 pragmat symbol: pr
Seeing that # is "style 2", it looks like a later extension to me
http://www.softwarepreservation.org/projects/ALGOL/report/Algol68_revised_r…
- ALGOL68/19 from 1975 lists these 4 symbols as comments: # % co pr
http://www.softwarepreservation.org/projects/ALGOL/manual/Gennart_Louis-Alg… 68_19_Reference_Manual.pdf
- DEC's ALGOL (1976 printing, but first released in 1971) for the DECsystem-10 uses
a ! for a comment, as # means "not equal" --
http://www.bitsavers.org/www.computer.museum.uq.edu.au/pdf/DEC-10-LALMA-B-D… decsystem10%20ALGOL%20Programmer's%20Reference%20Manual.pdf
- CMU's ALGOL68S from 1978 lists all these ways --
co comment
comment comment
pr pragmat
pragmat pragmat
# (comment symbol) comment
:: (pragmat symbol) pragmat
(it's for UNIX v6 or v7, so not surprising # is a comment)
http://www.softwarepreservation.org/projects/ALGOL/manual/a68s.txt/view
- Rutgers ALGOL 68 interpreter from 1987 for UNIX does not implement
PR nor PRAGMAT and says comments are # CO or COMMENT
https://www.renyi.hu/~csirmaz/algol-68/linux/manual
I could not find a freely accessible manual for ALGOL68R (the very first one) nor
Cambridge's ALGOL68C. What's interesting here is that Stephen Bourne was on the
team that made ALGOL68C before he moved to Bell Labs. It'd be pretty funny
if he implemented a language with 7 or 8 ways to enter a comment
(cent, co, comment, pr, pragmat, #, ::, %) yet there were zero ways
to enter a comment in the Bourne shell.
Also, the style of "COMMENT put a note here COMMENT" is very un-ALGOL-like
(compare DO .. OD, IF .. FI); shouldn't it be like this?
COMMENT put a note here TNEMMOC
CO put a note here OC
PRAGMAT directive here TAMGARP
PR directive here RP
So then I remembered Bourne used the C preprocessor to make C look like ALGOL when
he wrote the shell. If you've never seen it, his C looks like this --
    case TSW:
        BEGIN
           REG STRING  r = mactrim(t->swarg);
           t=t->swlst;
           WHILE t
           DO  ARGPTR  rex=t->regptr;
               WHILE rex
               DO  REG STRING  s;
                   IF gmatch(r,s=macro(rex->argval)) ORF (trim(s), eq(r,s))
                   THEN execute(t->regcom,0);
                        t=0; break;
                   ELSE rex=rex->argnxt;
                   FI
               OD
               IF t THEN t=t->regnxt FI
           OD
        END
        break;
    ENDSW
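
For readers who have not seen the trick, here is a minimal sketch of how a handful of preprocessor macros can give C this ALGOL-like surface. The macro bodies are written from memory in the spirit of the shell's mac.h and should be treated as an assumption, not a quotation of that file; count_even is a hypothetical function made up for the illustration.

```c
/* Assumed definitions, in the spirit of the Bourne shell's mac.h. */
#define IF    if(
#define THEN  ){
#define ELSE  } else {
#define FI    ;}
#define WHILE while(
#define DO    ){
#define OD    ;}

/* Hypothetical example: count the even numbers in v[0..n-1]. */
int count_even(const int *v, int n)
{
    int i = 0, evens = 0;

    WHILE i < n
    DO  IF v[i] % 2 == 0
        THEN evens++
        FI
        i++
    OD
    return evens;
}
```

After preprocessing, the loop body is ordinary C: the FI and OD macros supply both the closing brace and the semicolon, which is why the statement just before them can be written without one.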
So I wanted to see if he remapped C comments /* */.
I am not even sure you could do that with the C preprocessor,
but I took a look anyway, in
https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/xec.c
Its first lines are these --
#
/*
* UNIX shell
*
* S. R. Bourne
* Bell Telephone Laboratories
*
*/
#include "defs.h"
#include "sym.h"
So nope, just regular C comments (which came from PL/I, btw, which was
what Multics was programmed in).
But look! The very first line of that file! It is
a single # sitting all by itself. Why, you ask? Well, this is a holdover
from when the C preprocessor was new. C originally did not
have one; it was added later. PL/I had a %INCLUDE, so Ritchie eventually
made a #include -- but before 7th Edition the C preprocessor would not be
invoked unless the very first character of the C source file was a #.
Since v7 the preprocessor always runs. The first C preprocessor
was Ritchie's work, with no nested includes and no macros. v7's was by
John Reiser, who added those parts.
That first line with a single # sitting by itself reminds me of the
csh construct as well.
-Brian
A bit more on this.
csh(1) was written around 1978, and yes, # as a comment was only for
scripts; I think the reasoning was, why would you need to comment interactively?
The addition of # as a comment in the Bourne shell had to be around 1980,
as that is when Dennis Ritchie added #! to exec(2) in the kernel. From this
point on, all UNIX scripting languages were forced to use # as a comment, as
the kernel just exec'd the first string after the #! with the name of the current
file being exec'd as the single argument. So things like perl(1) and python(1)
had to use # if they wanted the #! mechanism to work for them too.
So this worked great for shell scripts, but it didn't work for awk(1), sed(1),
or s(1) (that is R(1) now) scripts (all needed a -f for input from a file),
nor for dc(1) scripts, as dc had no comment character.
While Research UNIX got #! in 1980 (this was after the 7th Edition release,
and the 8th Edition wasn't released until 1985), BSD got it around 1982-83,
and System V didn't implement it until 1988. Eventually #! was extended
so that #!/usr/bin/awk -f would work.
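
To make the mechanism concrete, here is a rough sketch of pulling the interpreter and one optional argument out of a script's first line, in the extended form that lets something like "#!/usr/bin/awk -f" work. parse_shebang and its parameters are hypothetical names for illustration; this is not the kernel's code, and real kernels differ in length limits and in how they treat the rest of the line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if line starts with "#!", copy the interpreter
 * path into interp and the rest of the line (possibly empty) into arg,
 * then return 1. Return 0 if there is no "#!" or a buffer is too small.
 */
int parse_shebang(const char *line, char *interp, size_t icap,
                  char *arg, size_t acap)
{
    size_t n;

    if (line[0] != '#' || line[1] != '!')
        return 0;
    line += 2;
    while (*line == ' ' || *line == '\t')   /* blanks after "#!" are allowed */
        line++;

    n = strcspn(line, " \t\n");             /* interpreter ends at first blank */
    if (n == 0 || n >= icap)
        return 0;
    memcpy(interp, line, n);
    interp[n] = '\0';
    line += n;

    while (*line == ' ' || *line == '\t')
        line++;
    n = strcspn(line, "\n");                /* everything else is one argument */
    if (n >= acap)
        return 0;
    memcpy(arg, line, n);
    arg[n] = '\0';
    return 1;
}
```

A caller would then exec the interpreter with the optional argument followed by the script's own file name, which is why the interpreter has to treat the "#!" line as a comment when it reads the file.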
Also Bill Joy was the first to use # as a comment character in an /etc config
file for his /etc/ttycap (which became /etc/termcap) for vi(1). Most configs
did not have a comment at all at that time, while /etc/master used a * as a
comment (SCCS used * as a comment too btw)
Also, before you say "wait! ALGOL uses # as a comment and is older than
Kernighan's ratfor(1)": that is a later addition. The original used the EBCDIC
cent sign character to start and another cent sign to end the comment
(i.e. programmer's two cents). If you were on an ASCII system this became
"co" (for comment), as the original ASCII does not have a cent sign.
-Brian
McIlroy:
> [vi] was so excessive right from the start that I refused to use it.
> Sam was the first screen editor that I deemed worthwhile, and I
> still use it today.
Paulsen:
> my sam build is more than 2 times bigger than Gunnar Ritter's vi
> (or Steve Kirkendall's elvis) and even bigger than Bram Moolenaar's vim.
% wc -c /bin/vi bin/sam bin/samterm
1706152 /bin/vi
112208 bin/sam
153624 bin/samterm
These numbers are from Red Hat Linux.
The 6:1 discrepancy is understated because
vi is stripped and the sam files are not.
All are 64-bit, dynamically linked.
Clem Cole wrote:
>A heretic!! Believers all know '*Bourne to Program, Type with Joy' *and*
>'One true bracing style' *are the two most important commandments of UNIX
>programmer!
>
>Seriously, I still write my scripts as v7 and use (t)csh as my login shell
>on all my UNIX boxes ;-)
>
>Clem
You know what's amazing? That Bill Joy code to launch either
csh or Bourne shell based on the first character of the file is
still in the tcsh codebase today. It even has #! support just in case
your kernel does not. However, this code almost never gets run, as
who writes scripts without #! anymore? But here's a little test ---
$ tcsh
You have 2 mail messages.
> cat x1.sh
PATH=/bin
echo $SHELL
> ./x1.sh
/bin/sh
> cat x2.csh
#
setenv path /bin
echo $shell
> ./x2.csh
/usr/local/bin/tcsh
> exit
you can see it in https://github.com/tcsh-org/tcsh/blob/master/sh.exec.c
-Brian
Doug McIlroy wrote:
>Brian Walden's discussion of sh #, etc, is right on.
>However, his etymology for unary * in C can be
>pushed back at least to 1959. * was used for
>indirect addressing in SAP, the assembler for
>the IBM 7090.
Thank you for both the confirmation and also that history update.
-Brian
>
>> From: Warner Losh <imp(a)bsdimp.com>
>
>> There's no wumpus source before V7.
>
> If you look at Clem's original message:
>
>>> From: Clem Cole <clemc(a)ccc.com>
>>> Date: Mon, 6 Jan 2020 16:08:50 -0500
>
>>> You got my curiosity up and found the V5 and V6 source code
>
> (the one Will was replying to), Clem's talking about the source to crt0.s,
> etc.
>
> Noel
>
Sorry. I could have been clearer. I thought Clem was saying that he found the Wumpus code in v5/v6. Now, I see that he was just talking about the crt files. When I said I couldn’t find the source prior to v7, I meant the wumpus source :).
On another note, porting the v7 code to MacOS is tricky, lots of minor differences, but I’m giving it a go. Prolly easier to just figure out what it’s supposed to do and do it with modern idioms, but it’s a fun puzzle to try to replicate the same functionality with only minor adjustments.
Will
> From: Warner Losh <imp(a)bsdimp.com>
> There's no wumpus source before V7.
If you look at Clem's original message:
>> From: Clem Cole <clemc(a)ccc.com>
>> Date: Mon, 6 Jan 2020 16:08:50 -0500
>> You got my curiosity up and found the V5 and V6 source code
(the one Will was replying to), Clem's talking about the source to crt0.s,
etc.
Noel
> I'm interested in the possible motivations for a redirection to be
> a simple command.
I use it to truncate a file to zero length.
Or to create (an empty) file.
Doug
> From: Will Senn
> On another note, you said you looked in v5 and v6 source code? I looked
> at tuhs and didn't see anything earlier than v7. Where did you find
> them?
Huh? https://www.tuhs.org/cgi-bin/utree.pl
Noel
i started on the 7th edition on a perkin elmer (ne interdata) - this was v7 with some 2.1bsd sprinkled on top.
i remember the continual annoyance of unpacking shar files starting with hash comments : only on ed7. in the end i wrote a trivial sed to remove them called unshar.
i haven't thought of that for decades...
-Steve
Mike Haertel:
That's amusing, considering that the 5620 stuff was in /usr/jerq on
Research systems! Apparently the accident became institutionalized.
=====
I remember the name Jerq being tossed around to mean 5620
when I was at 1127. That doesn't mean it was historically
accurate, but it is consistent with the directory names, and
the latter are probably where I got my mistaken idea of the
history.
Thanks to Rob, who certainly should know, for clearing it up.
Norman Wilson
Toronto ON
Brian Walden's discussion of sh #, etc, is right on.
However, his etymology for unary * in C can be
pushed back at least to 1959. * was used for
indirect addressing in SAP, the assembler for
the IBM 7090.
Richard Salz wrote:
>> not the kernel. This had traditionally been done after the exec() failed
>> then shell would run "sh argv[0]", but with two shells this was now a
>> problem.
>>
>
>It seems the kernel did that; http://man.cat-v.org/unix_7th/2/exec since
>argv[-1] was altered.
As a user of these systems, I can say the official 7th Edition kernel most certainly
could not execute a script, only binaries. That happened after the release in
1979 and took time to make its way out, which it did via BSD before 8th Ed
was finalized in 1985.
The Usenet announcement of this new functionality from Dennis is dated
Jan 10, 1980. It is listed here: https://en.wikipedia.org/wiki/Shebang_(Unix)
Dennis stated the idea was not his; it came up during conversations at
a conference.
-Brian
More than you ever wanted to know about #
The first shell to use it as a comment was csh(1); Bill Joy did this.
This was also pre-#! in the kernel, so the shell had to exec scripts,
not the kernel. This had traditionally been done after the exec() failed:
the shell would then run "sh argv[0]", but with two shells this was now a problem.
So csh would look at the first line of the script, and if it was a #\n
it would exec csh on it; if not, it would exec sh(1) on it. This check
was also placed into BSD's (not v7's nor AT&T's) Bourne shell so it could
run csh scripts as well.
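
As a rough illustration of that check, the decision boils down to looking at the first byte of the script. pick_shell is a hypothetical name and the returned strings are placeholders; the real csh and tcsh code does more (tcsh also handles #! itself), so take this as a sketch of the idea only.

```c
#include <stddef.h>

/*
 * Hypothetical helper: given the first bytes of a script that the
 * kernel refused to exec, choose which shell should interpret it.
 * A leading '#' marks a csh script; anything else goes to sh.
 */
const char *pick_shell(const char *buf, size_t len)
{
    if (len > 0 && buf[0] == '#')
        return "csh";
    return "sh";
}
```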
However, this was not the first use of # as a comment character. That award
goes to Brian Kernighan's ratfor(1) (rational Fortran) compiler in 1974-75.
Then Feldman used it in make(1) in 1976, followed by Kernighan's m4(1), learn(1),
and most famously awk(1) in 1977.
The Bourne shell, written around 1976, eventually picked this up, but after
the initial v7 release. And as some noted, the : was kind of a comment; it
was a command that did an exit(0), originally for labels for the Thompson
shell's goto command. The : command was eventually hard linked to the
true(1) command.
Remember, # was hard to type on teletypes as that was the erase character, so
to enter it, you needed to type \#
(# as erase and @ as line kill came from Multics, btw).
It was so hard to type that in the original assembler, based on DEC PAL-11R,
the addressing syntax changed @ to * and # to $.
In DEC it would be --
MOV @X, R0;
In UNIX asm it became --
mov *x, r0
So this is also why C pointers use * notation.
-Brian
> From: Dave Horsfall dave at horsfall.org
>
>On Sat, 4 Jan 2020, Chet Ramey wrote:
>
>>> Which reminds me: which Shell introduced "#" as a true comment?
>>
>> Define "true comment." The v7 shell had `#' as the comment character, but
>> it only worked when in non-interactive shells. I think it was the Sys III
>> shell that made it work when the shell was interactive.
>
>Yes, that's what I meant.
>
>> This is, incidentally, why bash has the `interactive_comments' option,
>> which I saw in another message. BSD, which most of the GNU developers were
>> using at the (pre-POSIX) time, used the v7 shell and didn't have
>> interactive comments. When a sufficiently-advanced POSIX draft required
>> them, we added it.
>
>I never did catch up with all the options on the various shells; I just
>stick with the defaults in general. Eg:
>
> aneurin% man bash | wc -l
> 5947
>
>Life's too short...
>
>-- Dave
Hoi,
in a computer forum I came across a very long command line,
including `xargs' and `sh -c'. Anyway, throughout the thread
it was modified several times, and at some point a pipe symbol
accidentally appeared between the command and the output redirection. The
command line did nothing; it ran successfully. I was confused,
because I expected to see a syntax error in case of
``cmd|>file''. This made me wonder ...
With the help of Sven Mascheck, I was able to clear up my understanding.
The POSIX shell grammar provided the answer:
    pipeline         :      pipe_sequence
                     ...
    pipe_sequence    :                             command
                     | pipe_sequence '|' linebreak command
                     ;
    command          : simple_command
                     ...
    simple_command   : cmd_prefix cmd_word cmd_suffix
                     | cmd_prefix cmd_word
                     | cmd_prefix               <--- HERE!
                     | cmd_name cmd_suffix
                     | cmd_name
                     ;
    cmd_prefix       :            io_redirect
                     ...
    io_redirect      :           io_file
                     ...
    io_file          : '<'       filename
                     | LESSAND   filename
                     | '>'       filename
                     ...
A redirection is a (full) simple_command ... and because
``simple_command | simple_command'' is allowed, so is
``io_file | io_file''. This can lead to such strange (but
valid) command lines like:
<a | >b
>b | <a
Sven liked this one:
:|>:
Here are some further fun variants:
:|:>:
<:|:>:
They would provide nice puzzles. ;-)
My understanding was helped most by detaching from the
semantics and focussing on syntax. This one is obviously
valid, even though it has no effect:
:|:|:
From there it was easier to grasp:
>a | >a | >a
Which is valid, because ``>a'' is a (complete) simple_command.
Thus, no bug but a consistent grammar. ;-)
If one had wanted to forbid such a corner case,
additional special case handling would have been necessary
... which is contrary to the Unix way.
Sven checked the syntax against various shells with these
results:
- Syntax ok in these shells:
SVR2 sh (Ultrix), SVR4 sh (Heirloom)
ksh93
bash-1.05, bash-current
pdksh-5.2.14
ash-0.4.26, dash-0.5.6.1
posh-0.3.7, posh-0.12.3
mksh-R24, mksh-R52b
yash-2.29
zsh-3.0.8, zsh-4.3.17
- Exception to the rule:
7thEd sh:
# pwd|>>file
# echo $?
141
At first sight OK, but with a silent error ... SIGPIPE (128+13).
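
The arithmetic behind that 141 can be sketched as follows, assuming the 128+signal convention used in the line above. signal_from_status is a hypothetical helper, and the 128 to 255 range check is an assumption of this sketch, not something taken from the 7th Edition shell.

```c
/*
 * Hypothetical helper: return the signal number encoded in a shell
 * exit status under the 128+signal convention, or 0 if the status
 * looks like an ordinary exit code.
 */
int signal_from_status(int status)
{
    if (status > 128 && status < 256)
        return status - 128;
    return 0;
}
```

For the status reported above, signal_from_status(141) gives 13, the SIGPIPE number quoted in the message.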
I'd be interested in any stories and information around this
topic.
What about 7thEd sh?
meillo
> I was always sad that the development of C that became Alef never got off
> the ground.
It eventuated in Go, which is definitely aloft, and responds
to Mike Bianchi's specific desires. Go also has a library
ecosystem, which C does not.
With its clean parallelism, Go may be suitable for handling
the complexity of whole-paragraph typesetting in the face
of unexpected traps, line-length changes, etc.
Doug
I'm having a party on Saturday January 11 (and if any of you are in Tucson,
or want to come to Tucson for it, you're invited; email me for the address
and time).
Although the party is Elvis-themed, it's really about boardgaming and
classic videogaming.
So I kind of wanted to put a general-purpose Z-machine interpreter on my
PiDP-11, so that people could play Infocom (and community) games on a real
terminal.
Turns out there wasn't really one, so I ported the venerable ZIP (which I
have renamed "zterp" for obvious reasons) to 2.11BSD on the PDP-11, and I
also wrote a little utility I call "tmenu" to take a directory (and an
optional command applying to files in the directory) and make a numbered
menu, so that my guests who are not familiar with Actual Bourne Shell can
play games too.
These things are at:
https://github.com/athornton/pdp11-zterp
and
https://github.com/athornton/pdp11-tmenu/
Both are K&R C, and compile with the 2.11BSD system C compiler.
My biggest disappointment is that the memory map of Trinity, my favorite
Infocom game, is weird and even though it's only a V5 game, I can't
allocate enough memory to start it. Other than that, V5 and below seem to
work mostly fine; V8 is in theory supported but no game that I've tried has
little enough low memory that I can malloc() it using C on 2.11BSD.
Adam
I have always marveled at folks who can maintain multiple
versions of software, but Larry's dispatch from the
trenches reveals hurdles I hadn't imagined. Kudos for
keeping groff alive.
Speaking of which, many thanks to all who pitched in
on the %% nit that I reported. The instant response
compares rather favorably to an open case I've been
following in gcc, which was originally filed in 2002.
Doug
The use of %% to designate a literal % in printf is not
a recent convention. It was defined in K&R, first edition.
Doug
Ralph Corderoy wrote:
Though that may seem odd to our modern C-standardised eyes, it's
understandable in that if it isn't a valid %f, etc., format specifier
then it's a literal percent sign.
According to K&R the behavior of % followed by something
unexpected is undefined. So the behavior of Ralph's example
is officially an accident. (It's uncharacteristic of Dennis
to have defined printf so that there was no guaranteed way
to get a literal % into a format.)
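
For reference, the portable way to get a literal % out of a format is the doubled form. The sketch below uses snprintf, which is a later (C99) addition and not something from K&R; format_percent is a hypothetical name used only for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format value followed by a literal percent
 * sign. "%%" is the conversion that produces a single '%'.
 * Returns the number of characters that snprintf reports.
 */
int format_percent(char *buf, size_t cap, int value)
{
    return snprintf(buf, cap, "%d%%", value);
}
```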
Doug
------------------------------------------------
Ralph Corderoy wrote:
$ printf '%s\n' \
.PS 'print sprintf("%.17g %.0f% % %%", 3.14, 42, 99)' .PE |
> pic >/dev/null
3.1400000000000001 42% % %%
Though that may seem odd to our modern C-standardised eyes, it's
understandable in that if it isn't a valid %f, etc., format specifier
then it's a literal percent sign.
The linux kernel never implemented support for a few features of obsolete
terminals. I find myself wanting to use Raspberry Pi-style linux machines
with old hardware, so this became quite frustrating.
So, I've put together a patch to the n_tty line discipline that adds some
things needed for using a Teletype model 33 or similar natively:
- XCASE, escaping uppercase (and a few special characters) for input and
display;
- CRDLY, delay to allow time for the carriage-return function;
- NLDLY, delay to allow time for the newline function.
With XCASE and ICANON, the terminal outputs a backslash before uppercase
characters; and accepts a backslash escape to set input to uppercase. The
usual way to use this is `stty lcase`, which also down-cases all input by
default. The special character escapes are:
\^ to ~
\! to |
\( to {
\) to }
\' to `
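
As a sketch of the input side of that table (not the code in the patch), the character that a backslash escape stands for could be computed like this. xcase_unescape is a hypothetical name, and the letter rule assumes `stty lcase` mode, where unescaped input has already been down-cased.

```c
#include <ctype.h>

/*
 * Hypothetical helper: given the character c typed after a backslash
 * on an XCASE terminal, return the character the pair stands for.
 * Special characters follow the table above; a lowercase letter
 * becomes uppercase; anything else is returned unchanged.
 */
int xcase_unescape(int c)
{
    switch (c) {
    case '^':  return '~';
    case '!':  return '|';
    case '(':  return '{';
    case ')':  return '}';
    case '\'': return '`';
    default:
        return islower((unsigned char)c) ? toupper((unsigned char)c) : c;
    }
}
```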
With CRDLY there are three options, CR0 through CR2; and with NLDLY there
are options NL0 (no delay) and NL1 (one delay). This patch uses fill
characters for delay, not timing, so these flags only take effect when
OFILL is also set.
Note: this doesn't change `agetty`, which I don't think implements
uppercase login detection right now. I have a Teletype running with
auto-login; and then `stty 110 icanon lcase ofill cr1 nl1`.
Code changes and some brief build instructions are here:
https://github.com/hughpyle/ASR33/tree/master/rpi/kernel
Compare with the raspberrypi tree, here,
https://github.com/raspberrypi/linux/compare/rpi-4.19.y...hughpyle:teletype
Not yet submitted upstream - the changes are in quite a high-traffic code
path, and also I just don't know how :) Feedback is very welcome!
-Hugh
All, I got a new printer with a better duplex scanner. I've just scanned
all the Unix Review magazines that I've got (1984-85 period) and uploaded
them to www.archive.org:
https://archive.org/search.php?query=title%3A%28Unix%20Review%29%20AND%20me…
Merry festive-season-of-your-choice,
Warren
P.S. I have a bunch of Unix/World magazines, just waiting for a stronger
guillotine to arrive.
Computer History Museum curator Dag Spicer passed along a question from former CHM curator Alex Bochannek that I thought someone on this list might be able to answer. The paper “The M4 Macro Processor” by Kernighan and Ritchie says:
> The M4 macro processor is an extension of a macro processor called M3 which was written by D. M. Ritchie for the AP-3 minicomputer; M3 was in turn based on a macro processor implemented for [B. W. Kernighan and P. J. Plauger, Software Tools, Addison-Wesley, Inc., 1976].
Alex and Dag would like to learn more about this AP-3 minicomputer — can anyone help?
I sense a hint of confusion in some of the messages
here. To lay that to rest if necessary (and maybe
others are interested in the history anyway):
As I understand it, the Blit was the original terminal,
hardware done by Bart Locanthi (et al?), software by
Rob Pike (et al?). It used an MC68000 CPU. Western
Electric made a small production run of these terminals
for use within AT&T. I don't think it was sold to the
general public.
By the time I arrived at Bell Labs in late 1984, the
Standard Terminal of 1127 was the AT&T 5620, locally
called the Jerq. This was a makeover with hardware
redesigned by a product group to use a Bellmac 32 CPU,
and software heavily reworked by a product group.
This is the terminal that was manufactured for general
sale.
I'm not sure, but I think the Blit's ROM was very basic,
just enough to be some sort of simple glass-tty or
perhaps smartass-terminal* plus an escape sequence to
let you load in new code. The Jerq had a fancier ROM,
which was a somewhat-flaky ANSI-ish terminal by default,
but an escape sequence put it into graphics-window-manager
mode, more or less like what had run a few years earlier
on the Blit.
By then the code used in Research had evolved considerably,
in particular allowing the tty driver to be exported to
the terminal (those familiar with 9term should know what
I mean). In 1127 we used a different escape sequence to
download a standalone program into the terminal and
replace the ROM window manager entirely, so we could run
our newer and (to my taste anyway) appreciably better code.
The downloaded code lived in RAM; you had to reload it
whenever the terminal was power-cycled or lost its connection
or whatnot. (It took a minute or so at 9600bps, rather
longer at 1200. This is not the only reason we jumped at
the chance to upgrade our home-computing scheme to use
9600bps over leased lines, but it was an important one.)
The V8 tape was made in late 1984 (I know that for sure
because I helped make it). It is unlikely to have anything
for the MC68000 Blit, only stuff for the Mac-32 Jerq.
Likewise for the not-really-a-release snapshots from the
9/e and 10/e eras. The 5620 ROM code is very unlikely to
be there anywhere, but the replacement stuff we used should
be somewhere.
Norman Wilson
Toronto ON
> If 5620s were called Jerqs, it was an accident. All the software with that
> name would be for the original, Locanthi-built and -designed 68K machines.
>
> The sequence is thus Jerq, Blit, DMD-5620
Maybe the “Jerq” name had a revival. If the processor switch came with some upheaval it is not hard to see how that revival could have happened.
The Dan Cross tar archive with the source code has two top level directories, one named “blit” with the 68K based source and another one named “jerq” with the Bellmac based source. The tar archive seems to have been made in the summer of 1985, or at least those dates are on the top level directories.
I am of course not disputing that the original name was Jerq. There are many clues in the source supporting that, among which this funny comment in mcc.c:
int jflag, mflag=1; /* Used for jerq. Rob Pike (read comment as you will) */