On Fri, Mar 3, 2023 at 11:12 AM Dave Horsfall <dave(a)horsfall.org> wrote:
[snip]
# Yes, I have a warped sense of humour here.
/^[JFMAMJJASOND][aeapauuuecoc][nbrrynlgptvc] [ 0123][0-9] / \
{
date = sprintf("%4d/%.2d/%.2d",
year, months[substr($0, 1, 3)], substr($0, 5, 2))
If I may, I'd like to point out something fairly subtle here that, I
think, bears on the original question (paraphrased as, "where does one
draw the line between concision and understandability?").
Note Dave's class to match the first letter of the month:
`[JFMAMJJASOND]`. One may notice that a few letters are repeated (J,
M, A), and one _could_ shorten this to: `[JFMASOND]`. But I can see a
serious argument that doing so would be a mistake; in particular, the
original is easy to validate by just saying the names of the months out
loud as one scans the list. With the shorter version, I'd worry that I
would miss something or get a letter wrong. The lesson here is to keep
it simple and not over-optimize!
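For what it's worth, here is a rough sketch of how one might check
mechanically that the two classes accept the same letters, rather than
trusting one's eyes. The function name `same_class` is made up for
illustration, and I'm assuming a POSIX shell and any reasonably
standard awk:

```sh
# Hypothetical helper: compare two bracket expressions by testing every
# uppercase ASCII letter against both and reporting the first disagreement.
same_class() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        for (i = 1; i <= length(letters); i++) {
            c = substr(letters, i, 1)
            # a and b are strings used as dynamic regexes
            if ((c ~ a) != (c ~ b)) {
                print "differ on " c
                exit 1
            }
        }
        print "same"
    }'
}

same_class '[JFMAMJJASOND]' '[JFMASOND]'   # prints "same"
```

This only shows that the shortened class is equivalent; it says nothing
about which one is easier for the next reader to verify by eye, which
is the point above.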
Etc. The idea is not to validate so much as to grab a line of interest to
me and extract the bits that I want.
[snip]
Too true.
A few years ago, Rob Pike gave a talk about lexing in Go that bears on
this and is worth a listen:
https://www.youtube.com/watch?v=HxaD_trXwRE
- Dan C.