A few more comments about style and optimization:

First, the thing that
makes optimized code hard to understand is that (at least when a
compiler does it) the optimized program goes through a different set
of states than the user expected from the source code. When a loop
is unrolled, or invariant expressions moved out of a loop, or indexing
is turned into pointer manipulation, if you don't know what's
happening it's mysterious. But once you've seen this a few times, it's
only a momentary hitch...

Also, one lesson I learned from Unix is that
if you can make a program 10x faster it becomes a qualitatively
different experience. The first version of Yacc I wrote, using the
textbook algorithms, took 20 minutes to process a 30-rule grammar.
On the shared PDP-11, when I ran Yacc everyone else's jobs slowed to
a crawl, and people were heard to mutter "*%&#! Johnson's running
Yacc again!" A couple of years later, Yacc produced parsers faster
than the C compiler could compile them.

Consider what using Google
would be like if it took 7 seconds rather than 0.7 seconds to respond.
Probably still useful, but a different user experience.

We had many
decades to be spoiled as far as program performance was concerned.
As our tools got more bloated, faster processors and more memory hid
the true effect from most of us. Now we have to rethink everything
to make it parallel. And complexity is enemy number one when making
something parallel...
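
To make the first point concrete, here is a minimal sketch in C of the
kind of rewriting described above: an invariant expression moved out of
the loop, indexing turned into pointer manipulation, and the loop
unrolled by two. The function names and the unroll factor are made up
for illustration; a real compiler picks its own strategy and works on
its internal representation, not on source text.

```c
#include <stddef.h>

/* What the programmer wrote: indexed access, with scale * offset
 * written inside the loop body. */
void add_scaled_source(int *dst, const int *src, size_t n,
                       int scale, int offset)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] + scale * offset;
    }
}

/* Roughly what an optimizer might effectively execute (hypothetical,
 * hand-written to show the idea). Same results, different sequence
 * of states. */
void add_scaled_transformed(int *dst, const int *src, size_t n,
                            int scale, int offset)
{
    const int k = scale * offset;   /* invariant moved out of the loop */
    const int *end = src + n;       /* index replaced by an end pointer */

    /* Loop unrolled by two: two elements per pass. */
    while (end - src >= 2) {
        dst[0] = src[0] + k;
        dst[1] = src[1] + k;
        dst += 2;
        src += 2;
    }

    /* Leftover element when n is odd. */
    if (src < end) {
        *dst = *src + k;
    }
}
```

Stepping through the second version in a debugger, there is no i to
inspect and the multiplication never appears inside the loop, which is
exactly the "different set of states" problem.
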
Steve