On Wed, May 22, 2024 at 11:37:39AM -0400, Paul
Winalski wrote:
On Tue, May 21, 2024 at 2:12???PM Luther Johnson
<luther.johnson(a)makerlisp.com>
wrote:
I like this anecdote because it points out the
difference between being
able to handle and process bizarre conditions, as if they were something
that should work, which is maybe not that helpful, vs. detecting them and
doing something reasonable, like failiing with a "limit exceeded" message
That is in fact precisely how the DEC compiler handled the 100 nested
parentheses condition.
. A silent, insidious failure down the line
because a limit was exceeded
is never good.
Amen! One should always do bounds checking when dealing with fixed-size
aggregate data structures. One compiler that I worked on got a bug report
of bad code being generated. The problem was an illegal optimization that
never should have triggered but did due to a corrupted data table. Finding
the culprit of the corruption took hours. It finally turned out to be due
to overflow of an adjacent data table in use elsewhere in the compiler.
The routine to add another entry to that table didn't check for table
overflow.
We invented a data structure that gets around this problem nicely. It's
an array of pointers that starts at [1] instead of [0]. The [0]
entry encodes 2 things:
In the upper bits, the log(2) the size of the array. So all arrays
have at least [0] and [1]. So 2 pointers is the smallest array and
that was important to us, we wanted it to scale up and scale down.
In the lower bits, we record the number of used entries in the array.
We assumed 32 bit pointers and with those we got ~134 million entries
as our maximum number of entries.
Usage is like
char **space = allocLines(4); // start with space for 4 entries
space = addLine(space, "I am [1]");
space = addLine(space, "I am [2]");
space = addLine(space, "I am [3]");
space = addLine(space, "I am [4]"); // realloc's to 8 entries
freelines(space, 0); // second arg is typically 0 or free()
It works GREAT. We used it all over BitKeeper, for stuff as small as
commit comments to arrays of data structures. It scales down, scales
up. Helper functions:
/*
* liblines - interfaces for autoexpanding data structures
*
* s= allocLines(n)
* pre allocate space for slightly less than N entries.
* s = addLine(s, line)
* add line to s, allocating as needed.
* line must be a pointer to preallocated space.
* freeLines(s, freep)
* free the lines array; if freep is set, call that on each entry.
* if freep is 0, do not free each entry.
* buf = popLine(s)
* return the most recently added line (not an alloced copy of it)
* reverseLines(s)
* reverse the order of the lines in the array
* sortLines(space, compar)
* sort the lines using the compar function if set, else string_sort()
* removeLine(s, which, freep)
* look for all lines which match "which" and remove them from the array
* returns number of matches found
* removeLineN(s, i, freep)
* remove the 'i'th line.
* lines = splitLine(buf, delim, lines)
* split buf on any/all chars in delim and put the tokens in lines.
* buf = joinLines(":", s)
* return one string which is all the strings glued together with ":"
* does not free s, caller must free s.
* buf = findLine(lines, needle);
* Return the index the line in lines that matches needle
*/
It's all open source, apache licensed, but you'd have to tease it out of
the bitkeeper source tree. Wouldn't be that hard and it would be useful.