To close the loop a bit...
I really appreciate the anecdotes and background. It's helpful to
those of us who didn't live it.
On the best resources front:

The Unix Programmer's Manual for v7 contains:

  "A Tutorial Introduction to the UNIX Text Editor" by B. W.
    Kernighan - excellent coverage of Context Searching using a
    limited subset of regex.
  "Advanced Editing on UNIX" by B. W. Kernighan - lots of examples.
  "ed(1)" by the authors of the manpages - super concise but thorough
    coverage of the regex rules (great followup to the tutorial).

Articles:

  "Regular Expression Search Algorithm", by K. Thompson - an
    Algol-60 implementation of regex described in 4 pages... in
    1968... I was 2 1/2.
  "Regular Expression Matching Can Be Simple and Fast", by Russ Cox
    - how can an article be both simple and deep? Great concision.

Other Books:

  "The AWK Programming Language" by A. V. Aho, B. W. Kernighan,
    & P. J. Weinberger - the discussion on pp. 28-31, Regular
    Expressions, is the best I've seen.
  "Chapter 9. Regular Expressions" in the XBD section of the SUS
    (IEEE Std 1003.1-2017) - Comprehensive presentation of the spec
    (good stuff, even if nobody perfectly implements it).

There are plenty more, but with the tutorial, ed(1), and AWK book
in hand, I think a beginner is covered.
BTW, awk is awesome (particularly with the new csv additions) - I
don't "need" the new unicode support, but it's nice. I didn't get
awk, but when I figured out you could do this:

    awk '/SYS.*\(write\,/, /\)/' */*
    SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                    size_t, count)

in the kernel source, I was sold. I've never really wrapped my
head around how to efficiently search over multiple lines, but awk's
range patterns... just make sense :). Even if it looks crazy, it
works.
Ranges bounded by regexes... who'd have thunk it?
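
For anyone without a kernel tree handy, here is a minimal sketch of the same range pattern run against a made-up sample. The helper names (sample_source, print_write_signature) and the sample text are assumptions for illustration only, not anything from the kernel or from awk itself. The comma is left unescaped here, since it is an ordinary character in an awk regex.

```shell
# sample_source: hypothetical stand-in for a kernel source file.
sample_source() {
cat <<'EOF'
static int unrelated(void)
{
        return 0;
}
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
                size_t, count)
{
        return ksys_write(fd, buf, count);
}
EOF
}

# print_write_signature: hypothetical wrapper around the range pattern.
# awk still reads one line at a time; the range just switches printing
# on at a line matching the first regex and off again after the next
# line matching the second regex (which may be the very same line).
print_write_signature() {
    awk '/SYS.*\(write,/, /\)/'
}

sample_source | print_write_signature
```

Run as written, this should print only the two SYSCALL_DEFINE3 lines: the first has no closing paren, so the range stays open until "size_t, count)" closes it. No multi-line matching is involved, just a bit of state carried from line to line.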
Will
On 3/3/24 8:03 PM, Marc Rochkind wrote:
Will, here's my recollection, when I got to UNIX in
late 1972 or thereabouts:
First, there was ed. grep and sed were derived from ed, so
came along later. awk came along way later.
There were only manual pages. You typed "man ed" and there
it was. The man pages were very accurate, very clear, and very
authoritative. Many found them too succinct, especially as
UNIX got more popular, but all of us back in the day found
them perfect. Maybe you had to read the man page a few times
to understand it, but at least that's all you had to read. No
need to hunt around for more documentation!
(Well, there was more documentation: The source code, which
was all online. But reading the ed source to understand
regular expressions was impossible. It was in assembler, and
Ken was generating code on the fly as the expression was
compiled.)
Also, it should be noted that ed produced a single error
message: a question mark. No wasting of teletype paper!
The motivation for learning regular expressions was that
that's how you edited files. ed was the only game in town.
(sh used a greatly restricted form of regular expressions,
which were documented on the sh man page.)
Marc Rochkind
Hi All,
I was wondering, what were the best early sources of
information for regexes and why did folks need to know
them to use unix? In my recent explorations, I have needed
to have a better understanding of them, so I'm digging
in... awk's my most recent thing and it's deeply
associated with them, so here we are. I went to the
bookshelf to find something appropriate and, as usual, I've
traced back to primary sources to some extent. I started with
Mastering Regular Expressions by Friedl, and I won't knock
it (it's one of the bestsellers in our field), but it's
much too long for my personal taste and it's not quite as
systematic as I would like (the author himself notes that
his interests are less technical than those of the authors
preceding him on the subject). So, back to the shelves... Bourne's
The Unix Environment and Kernighan & Pike's The Unix
Programming Environment both talk about them in the context
of grep, ed, sed, and awk. Going further back, the Unix
Programmer's Manual v7 - ed, grep, sed, awk...
After digging around it seems like folks needed regexes
for ed, grep, sed and awk... and any other utility that
leveraged the wonderful nature of these handy expressions.
Fine. Where did folks go learn them? Was there a
particularly good (succinct and accurate) source of
information that folks kept handy? I'm imagining (based on
what I've seen) that someone might cut out the ed
discussion or the grep pages of the manual and tape them
to their monitors, but maybe I'm stooopid and they didn't
need no stinkin' memory device for regexes - surely
they're intuitive enough that even a simpleton could pick
them up after seeing a few examples... but if that were
really the case, Friedl's book would have been a flop and
it wasn't :). So seriously, if you remember that far back
- what was the definitive source of your regex knowledge
and what were the first motivators for learning them?
Thanks,
Will