The paper was called "The Competence/Performance Dichotomy" by
Pratt. He borrowed an idea from Chompsky and distinguished between
being able to understand something and being able to produce it. The
idea I took from it was to recognize grammar rules even if the parse
was ambiguous and then use other mechanisms (e.g., Shift/Reduce,
precedence) to decide what to do.
About empty productions: one of the most useful versions of this was
in handling scopes in programming languages, especially in C where xyz
could be a variable in one scope and a typedef in an inner one, or
vice versa. The empty production could get called at the right
moment to make the appropriate symbol table fixes, where otherwise we
would have had a syntax error.
Steve
----- Original Message -----
From: "Bakul Shah" <bakul(a)bitblocks.com>
To:"Steve Johnson" <scj(a)yaccman.com>
Cc:<arnold@skeeve.com>, <ecashin(a)noserose.net>, <dave(a)horsfall.org>,
<tuhs(a)tuhs.org>
Sent:Wed, 16 Jan 2019 12:50:35 -0800
Subject:Re: [TUHS] Knuth and Unix
On Tue, 15 Jan 2019 14:32:16 -0800 "Steve Johnson" <scj(a)yaccman.com>
wrote:
I remember reading Knuth's paper, and certainly
heard
DeRemer's name, but it didn't affect much of what I did.
There was a paper out of Stanford about that time that
influenced me greatly -- it was about pattern matching
languages, and proposed separating two ideas: 1. "Here are
the patterns that match this tree". And 2. "If more than one
pattern matches, here's how to decide which one to use."
Given the constraints of size on the PDP 11, anything but
LR(1) was infeasable. But using ambiguous grammars and
broadening the shift/reduce test to trest operator precedence
fit right into that pattern. Another thing that I think was
unique to Yacc at the time was introducing symbols that
matched the empty string whose reduction caused program
actions. Many similar parser systems at the time could not
deal with these "empty" symbols.
Good to know all this. The Stanford paper you refer, would
that be "fast pattern matching" by Knuth, Morris & Pratt?
I remember struggling with empty strings and I also missed
other yacc tricks. In my formulation I had two kinds of
rules: set rules and sequence rules. Set rules avoided
unnecessary reductions in rules such as foo: bar|...
For example:
// sets
expr = relexpr | expr1
expr1 = addexpr | expr2
expr2 = mulexpr | expr3
...
// sequences
relexpr: expr1 RELOP expr1
addexpr: expr1 ('+'|'-') expr2
mulexpr: expr2 ('*'|'/') expr3
...
Basically I completely avoided empty symbols -- even an empty
file has an EOF! [I didn't try it on anything more complicated
than (extended) Pascal so no idea how it would have fared on
complexificated lanuages]