[TUHS] PDP-11 legacy, C, and modern architectures
bakul at bitblocks.com
Fri Jun 29 12:02:11 AEST 2018
On Thu, 28 Jun 2018 10:15:38 -0400 "Theodore Y. Ts'o" <tytso at mit.edu> wrote:
> Bakul, I think you and Steve have a very particular set of programming
> use cases in mind, and are then over-generalizing this to assume that
> these are the only problems that matter. It's the same mistake
> Chisnall made when he assrted the parallel programming a myth that
> humans writing parallel programs was "hard", and "all you needed" was
> the right language.
Let me try to explain my thinking on this topic.
1. Cobol/C/C++/Fortran/Go/Java/Perl/Python/Ruby/... will be
around for a long time. Lots of computing platforms &
existing codes using them so them languages will continue
to be used. This is not particularly interesting or worth
2. A lot of processor architecture evolution in the last few
decades has been to make such programs run as fast as
possible but this emulation of flat shared/virtual address
spaces and squeezing out parallelism at run time from
sequential code is falling further and further behind.
The original Z80 had 8.5E3 transistors while an 8 core
Ryzen has 4.8E9 transistors (about 560K Z80s worth of
transistor). And yet, it is only 600K times faster in
Dhrystone MIPS even though it has > thousand times faster
clock rate, its data bus is 8 times wider and there is no
I make this crude comparison to point out how much less
efficient these resources are used due to the needs of
3. As Perry said, we are using parallel and distributed
computing more and more. Even the RaspberryPi Zero has a
several times more powerful GPU than its puny ARM "cpu"!
Most all cloud services use multiple cores & nodes. We may
not set up our services but we certainly use quite a few of
them via the Internet. Even on my laptop at present there
are 555 processes and 2998 threads. Most of these are
indeed "embarrassingly" parallel -- most of them don't talk
to each other!
Even local servers in any organization run a large number
Things like OpenCL is being used more and more to benefit
from whatever parallelism we can squeeze out of a GPU for
4. The reason most people prefer to use one very high perf.
CPU rather than a bunch of "wimpy" processors is *because*
most of our tooling uses only sequential languages with
very little concurrency. And just as in the case of
processors, most of our OSes also allow use of very little
parallelism. And most performance metrics focus on single
CPU performance. This is what is optimized so given these
assumptions using faster and faster CPUs makes the most
sense but we are running out that trick.
5. You may well be right that most people don't need faster
machines. Or that machines optimized for parallel languages
and codes may never succeed commercially.
But as a techie I am more interested in what can be built
(as opposed to what will sell). It is not a question of
whether problems amenable to parallel solutions are the
*only problems that matter*.
I think about these issues because
a) I find them interesting (one among many).
b) Currently we are using resources rather inefficiently
and I'm interested in what can be done about it.
c) This is the only direction in future that may yield
faster and faster solutions for large set of problems.
And in *this context* our current languages do fall short.
6. The conventional wisdom is parallel languages are a failure
and parallel programming is *hard*. Hoare's CSP and
Dijkstra's "elephants made out of mosquitos" papers are
over 40 years old. But I don't think we have a lot of
experience with parallel languages to know one way or
another. We are doing adhoc distributed systems but we
don't have a theory as to how they behave under stress.
But see also 
7. As to distributed vs parallel systems, my point was that
even if they different, there are a number of subproblems
common to them both. These should be investigated further
(beyond using adhoc solutions like Kubernetes). It may even
be possible to separate a reliability layer to simplify a
more abstract layer that is more common to the both. Not
unlike the way disks handle reliability so that at a higher
level we can treat them like a sequence of blocks (or how
IP & TCP handle reliability).
Here is a somewhat tenuous justification for why this topic does
make sense on this list: Unix provides *composable* tools. It
isn'just that one unix program did one thing well but that it
was easy to make them worked together to achieve a goal + a
few abstractions were useful from most programs. You hid
device peculiarities in device drivers and filesystem
peculiarities in filesystem code and so on. I think these same
design principles can help with distributed/parallel systems.
Even if we get an easy to use parallel language, we may still
have to worry about placement and load balancing and latencies
and so forth. And for that we may need a glue language.
 What I have discovered is that it takes some experience
and experimenting to think in a particular way that is natural
to the language at hand. As an example, when programming in k
I often start out with a sequential, loopy program . But if I
can think about in terms of "array" operations, I can iterate
fast and come up with a better solution. Not have to think
about locations and pointers make this iterativer process very
Similarly it takes a while to switch to writing forth (or
postscript) code idiomatically. Writind idiomatic code in any
language is not just a matter learning the language but being
comfortable with it, knowing what works well and in what
I suspect the same is true with parallel programming as well.
Also note that Unix hackers routinely write simple parallel
programms (shell pipelines) but these may seem quite foreign
to people who grew up using just GUI.
More information about the TUHS