To close the loop a bit...
I really appreciate the anecdotes and background. It's helpful to those
of us who didn't live it.
On the best resources front:
The Unix Programmer's Manual for v7 contains:
"A Tutorial Introduction to the UNIX Text Editor" by B. W. Kernighan -
excellent coverage of Context Searching using a limited subset of regex.
"Advanced Editing on UNIX" by B. W. Kernighan - lots of examples.
"ed(1)" by authors of the manpages - super concise but thorough coverage
of the regex rules (great followup to the tutorial).
Articles:
"Regular Expression Search Algorithm", by K. Thompson - an Algol-60
implementation of regex described in 4 pages... in 1968... I was 2 1/2.
"Regular Expression Matching Can Be Simple and Fast", by Russ Cox - how
can an article be both simple and deep? Great concision.
Other Books:
"The AWK Programming Language" by A. V. Aho, B. W. Kernighan, & P. J.
Weinberger - the discussion on pp. 28-31, Regular Expressions, is the
best I've seen.
"Chapter 9. Regular Expressions" in the XBD section of the SUS (IEEE
Std 1003.1-2017) - Comprehensive presentation of the spec (good stuff,
even if nobody perfectly implements it).
There are plenty more, but with the tutorial, ed(1), and AWK book in
hand, I think a beginner is covered.
BTW, awk is awesome (particularly with the new CSV additions) - I don't
"need" the new Unicode support, but it's nice. I didn't get awk at first, but
when I figured out you could do this:
    awk '/SYS.*\(write,/, /\)/' */*
    SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
    size_t, count)
in the kernel source, I was sold. I've never really wrapped my head
around how to efficiently search over multiple lines, but awk's range
patterns... just make sense :). Even if it looks crazy, it works.
Ranges bounded by regexes... who'd have thunk it?
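
For anyone who wants to try the range pattern without a kernel tree handy, here is a minimal sketch. The sample file contents are made up for illustration (only the two SYSCALL_DEFINE3 lines mirror the output above), and it assumes mktemp is available.

```shell
# Hypothetical stand-in for a kernel source file; not real kernel code.
sample=$(mktemp)
cat > "$sample" <<'EOF'
static int unrelated(int x)
{
    return x;
}
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
size_t, count)
{
    /* body elided */
}
EOF

# Range pattern: begin printing at a line matching the first regex,
# keep printing through the next line that matches the second regex.
result=$(awk '/SYS.*\(write,/, /\)/' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The range opens on a line matching the first regex and closes on the next line matching the second, and both endpoints are printed. If the opening line also matches the closing regex, the range is just that one line, which is why a definition that fits on a single line still prints cleanly.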
Will
On 3/3/24 8:03 PM, Marc Rochkind wrote:
Will, here's my recollection, when I got to UNIX in late 1972 or
thereabouts:
First, there was ed. grep and sed were derived from ed, so came along
later. awk came along way later.
There were only manual pages. You typed "man ed" and there it was. The
man pages were very accurate, very clear, and very authoritative. Many
found them too succinct, especially as UNIX got more popular, but all
of us back in the day found them perfect. Maybe you had to read the
man page a few times to understand it, but at least that's all you had
to read. No need to hunt around for more documentation!
(Well, there was more documentation: The source code, which was all
online. But reading the ed source to understand regular expressions
was impossible. It was in assembler, and Ken was generating code on
the fly as the expression was compiled.)
Also, it should be noted that ed produced a single error message: a
question mark. No wasting of teletype paper!
The motivation for learning regular expressions was that that's how
you edited files. ed was the only game in town.
(sh used a greatly restricted form of regular expressions, which were
documented on the sh man page.)
Marc Rochkind
On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn(a)gmail.com> wrote:
Hi All,
I was wondering, what were the best early sources of information
for regexes and why did folks need to know them to use unix? In my
recent explorations, I have needed to have a better understanding
of them, so I'm digging in... awk's my most recent thing and it's
deeply associated with them, so here we are. I went to the
bookshelf to find something appropriate and, as usual, I've traced
things back to primary sources to some extent. I started with Mastering
Regular Expressions by Friedl, and I won't knock it (it's one of
the bestsellers in our field), but it's much too long for my
personal taste and it's not quite as systematic as I would like
(the author himself notes that his interests are less technical
than authors preceding him on the subject). So, back to the
shelves... Bourne's, The Unix Environment, and Kernighan & Pike's,
The Unix Programming Environment both talk about them in the
context of grep, ed, sed, and awk. Going further back, the Unix
Programmer's Manual v7 - ed, grep, sed, awk...
After digging around it seems like folks needed regexes for ed,
grep, sed and awk... and any other utility that leveraged the
wonderful nature of these handy expressions. Fine. Where did folks
go learn them? Was there a particularly good (succinct and
accurate) source of information that folks kept handy? I'm
imagining (based on what I've seen) that someone might cut out the
ed discussion or the grep pages of the manual and tape them to
their monitors, but maybe I'm stooopid and they didn't need no
stinkin' memory device for regexes - surely they're intuitive
enough that even a simpleton could pick them up after seeing a few
examples... but if that were really the case, Friedl's book would
have been a flop and it wasn't :). So seriously, if you remember
that far back - what was the definitive source of your regex
knowledge and what were the first motivators for learning them?
Thanks,
Will
--
/My new email address is mrochkind(a)gmail.com/