Yes, I googled it per Clem's suggestion
and wound up on that exact link after wandering around admiring
the scenery. I envy the more mathematically inclined among us
their view of matters technical. This piece, being in C and having
step by step articulation of the diagrams, is better for me than
the more formal Wikipedia article, although, when I get enough
background, that looks like it'll be good, too.
Thanks Rob, Clem and Bakul.
Will
On 7/31/20 7:36 PM, Rob Pike wrote:
I think this link - https://swtch.com/~rsc/regexp/regexp1.html -
is the best place to start. Superb exposition on the background,
theory, and implementation as well as a bit of history of how
the industry lost its way with regular expressions.
Regular expressions are beautiful, simple, and widely
misunderstood.
-rob
On Jul 31, 2020, at 3:57 PM, Will Senn <will.senn@gmail.com> wrote:
>
> I've always been intrigued with regexes. When I was first
> exposed to them, I was mystified and lost in the greediness of
> matches. Now, I use them regularly, but still have trouble
> using them. I think it is because I don't really understand
> how they work.
> ...
> 1. What's the provenance of regex in unix (when did it
> appear, in what form, etc)?
> 2. What are the 'best' implementations throughout unix
> (keep it pre 1980s)?
> 3. What are some of the milestones along the way (major
> changes, forks, disagreements)?
> 4. Where, in the source, or in a paper, would you point
> someone wanting to better understand the mechanics of
> regex?
Start here: https://en.wikipedia.org/wiki/Thompson%27s_construction
[I learned about regular expressions in an automata theory class,
before I knew anything about Unix. What helped me was learning
about finite state machines. You won't need more than paper and
pencil to construct one. Reading source code would make more
sense once you grasp how to construct an FSM corresponding to
an RE.]
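
As a rough illustration of the paper-and-pencil exercise described
above, here is a minimal sketch of a hand-built finite state machine
for one small regular expression. The expression (ab*c), the state
numbering, and the names TRANSITIONS, ACCEPTING and matches are all
made up for this example; they do not come from any of the linked
articles, and this is not Thompson's construction itself, only the
kind of machine one might draw by hand.

```python
# Hand-built deterministic FSM for the illustrative regex  ab*c
# (the regex, states and names below are assumptions for this sketch).
#
#   state 0: start, nothing read yet
#   state 1: read 'a', followed by zero or more 'b'
#   state 2: read the closing 'c' (accepting state)

TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,  # the b* loop: stay in state 1
    (1, "c"): 2,
}
ACCEPTING = {2}


def matches(text):
    """Return True if the whole of text is accepted by the machine."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            # No arrow for this character from this state: reject.
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ["ac", "abbbc", "ab", "abcc", ""]:
        print(repr(sample), matches(sample))
```

Each entry in TRANSITIONS is one arrow in the hand-drawn diagram, so
adding a state or an arrow on paper is one more line in the table.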
--
GPG Fingerprint: 68F4 B3BD 1730 555A 4462 7D45 3EAA 5B6D A982 BAAF