[COFF] Re: Requesting thoughts on extended regular expressions in grep.

3 Mar 2023

On 3/3/23 6:47 AM, Dan Cross wrote:
...
  Oh, for sure; to be clear, it was obvious that in the earlier
 discussion the original was just part of something larger.
Good.  For a moment I thought that you might be thinking it was standalone.
...
  FWIW, this RE seems ok to me; the additional context makes it unlikely
 to match something else accidentally.
:-)
...
  It needn't be special.  The point is simply that there's some external
 knowledge that can be brought to bear to guide the shape of the REs.
ACK
I've heard "domain (specific) knowledge" used to refer to both extremely
specific training in a field and -- as you have -- knowledge of the data
that is being processed.
...
  In this case, you know that log lines won't begin with `___ 123 456:789`
 or other similar junk.
They darned well had better not.
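
To make the trade-off concrete, here is a minimal sketch in Python. Both patterns are hypothetical illustrations, not the REs from the earlier discussion, and the sample log line is invented. The loose pattern accepts the junk prefix quoted above; the strict one rejects it, at the cost of being harder to read. If the input is known to contain only real log lines, the loose pattern is good enough.

```python
import re

# Hypothetical patterns for a syslog-style timestamp prefix (assumed format:
# "Mon DD HH:MM:SS "). Neither is taken from the original thread.
LOOSE = re.compile(r"^\w{3} [ :0-9]{11} ")
STRICT = re.compile(
    r"^[A-Z][a-z]{2} [ 1-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] "
)

# Invented sample input.
real_line = "Mar  3 06:47:00 myhost example[123]: hello"
junk_line = "___ 123 456:789 not a log line"

for name, pattern in (("loose", LOOSE), ("strict", STRICT)):
    # bool(...) turns the match object (or None) into True/False.
    print(name, bool(pattern.match(real_line)), bool(pattern.match(junk_line)))
```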
...
  Kinda. The "machine" in this case is actually an abstraction, like a
 Turing machine. The salient point here is that REs map to finite state
 machines, and in particular, one need not keep (say) a stack of prior
 states when simulating them. Note that even in an NDFA simulation,
 where one keeps track of what states one may be in, one doesn't need
 to keep track of how one got into those states.
ACK
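
A minimal sketch of that kind of NDFA simulation, in Python. The automaton is a textbook one for the RE (a|b)*abb, chosen here purely for illustration; the table and the function name are assumptions, not anything from the thread. The only thing carried from one input character to the next is the set of states the machine might be in, never the path that led there.

```python
# Transition table for a small NDFA recognizing (a|b)*abb (illustrative).
# Keys are (state, input character); values are the possible next states.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPT = {3}


def nfa_matches(text):
    """Return True if the whole of text is accepted by the NDFA above."""
    active = {START}  # every state the machine could currently be in
    for ch in text:
        nxt = set()
        for state in active:
            nxt |= NFA.get((state, ch), set())
        active = nxt  # the previous set is discarded; no history is kept
        if not active:
            return False  # no state can consume this character
    return bool(active & ACCEPT)
```

The active set can never hold more states than the automaton has, so the extra memory is bounded by the size of the RE rather than by the length of the input.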
...
  Obviously in a real implementation you've got the program counter,
 register contents, local variables, etc, all of which consume
 "memory" in the conventional sense. But the point is that you don't
 need additional memory proportional to anything other than the size
 of the RE. A DFA could be implemented entirely with `switch` and
 `goto` if one wanted, as opposed to a bunch of mutually recursive
 function calls; NDFA simulation is similar, except that you need
 some (bounded) additional memory to hold the active set of states.
 Contrast this with a pushdown automaton, which can parse a
 context-free language, and in which a stack is maintained that can
 store additional information relative to the input (for example,
 an already seen character). Pushdown automata can, for example,
 recognize matched parentheses, while finite automata cannot.
I think I understand the gist of what you're saying, but I need to
re-read it and think about it a little bit.
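
As an aid to that re-reading, a minimal sketch of the contrast, in Python. The DFA is an illustrative one for (a|b)*abb, written as an if/elif chain standing in for the `switch` mentioned above; the bracket matcher is a simplified stand-in for a pushdown automaton and, as an assumption of this sketch, handles three bracket types so that the stack visibly stores already seen characters. All names are hypothetical.

```python
def dfa_matches(text):
    """DFA for (a|b)*abb: the only memory is one small integer."""
    state = 0
    for ch in text:
        if ch not in "ab":
            return False  # character outside the alphabet
        if state == 0:
            state = 1 if ch == "a" else 0
        elif state == 1:
            state = 1 if ch == "a" else 2
        elif state == 2:
            state = 1 if ch == "a" else 3
        else:  # state == 3, the accepting state
            state = 1 if ch == "a" else 0
    return state == 3


PAIRS = {")": "(", "]": "[", "}": "{"}


def balanced(text):
    """Bracket matcher: needs a stack that can grow with the input."""
    stack = []
    for ch in text:
        if ch in "([{":
            stack.append(ch)  # remember the opener we have seen
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False  # closer with no matching opener
    return not stack  # leftover openers mean the input is unbalanced
```

The first function uses the same amount of memory for any input length; the second has to remember every unclosed opener, which is why no fixed number of states is enough for it.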
...
  Anyway, sorry, this is all rather more theoretical than is perhaps
 interesting or useful.
Apology returned to sender as unnecessary.
You are providing the requested thought provoking discussion, which is
exactly what I asked for.  I feel like I'm going to walk away from this
thread wiser based on the thread's content plus all additional reading
material on top of the thread itself.
...
  Bottom line is, I think your REs are probably fine. `egrep` will
 complain at you if they are not, and I wouldn't worry too much about
 optimizing them: I'd "stop" whenever you're happy that you've got
 something understandable that matches what you want it to match.
Thank you (again) Dan.  :-)
--
Grant. . . .
unix || die
