On Mon, 4 Mar 2024, 08:27 Rob Pike, <robpike(a)gmail.com> wrote [to Larry]
Oh happy days. Hi Rob, loved the book.
If that's really true, that you learned from Spencer's library, then you
didn't learn the most important thing about them, which is the automata
theory that guarantees their performance is always linear. Not to take
anything away from Henry, who admitted at the time that it could be slow
for bad expressions, but we're still paying the price for refusing to
connect "regex" with the theory that created them, ignoring it in fact.
I once got into a bunfight with a Googler on the topic of coding interview
questions, on a related matter. He was promulgating a regular expression to
correctly match/parse-out legitimate dotted-quad IPv4 addresses, including
bounds-checking the octets to be in the range 0..255, and arguing that,
since it was going to be run through a DFA, it was a sunk cost for
efficiency and therefore perfect.
The result looked like line noise, and he was perturbed that I said I would
prefer to take a much simpler (NFA?) RE, parse out the ints and
bounds-check them, just to reduce cognitive load and increase
maintainability of code.
We didn't really come to an agreement.
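
For concreteness, here is a rough Python sketch of the two styles in that argument. Neither pattern is the one from the actual discussion: STRICT_RE is a hypothetical stand-in for the "bounds in the regex" approach, and SIMPLE_RE plus parse_ipv4 is an assumed rendering of "match the shape, then check the ints". Note that Python's re module is a backtracking engine, not a DFA, so this only illustrates the readability trade-off, not the performance claim.

```python
import re

# Hypothetical "everything in the regex" style: the 0..255 bound is
# encoded in the pattern itself, one alternative per digit range.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
STRICT_RE = re.compile(rf"{OCTET}\.{OCTET}\.{OCTET}\.{OCTET}")

# The simpler style: the pattern only recognises the dotted-quad shape;
# ordinary code does the range check afterwards.
SIMPLE_RE = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def parse_ipv4(text):
    """Return the four octets as a tuple of ints, or None if invalid."""
    m = SIMPLE_RE.fullmatch(text)
    if m is None:
        return None
    octets = tuple(int(g) for g in m.groups())
    # Bounds check done in code rather than in the pattern.
    if any(o > 255 for o in octets):
        return None
    return octets
```

One difference worth noting: this parse_ipv4 accepts leading zeros ("010" becomes 10), while STRICT_RE as written rejects them. Which behaviour is right is a policy choice; in the simple version it is a one-line change in code rather than a change to the pattern.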
-a