On 21 Jun 2018, at 14:46, Doug McIlroy <doug(a)cs.dartmouth.edu> wrote:
Whether it was the (significant) mechanical part or the electronics
that typically broke is unclear. Failures in a machine that's always
doing the same thing are easier to detect quickly than failures in
a machine that has a varied load. Also the task at hand could fail
for many other reasons (e.g. mistranscribed messages) so there was
no presumption of correctness of results--that was determined by
reading the decrypted messages. So I think it's a stretch to
argue that reliability was known to be a manageable issue.
I think it's reasonably well-documented that Flowers understood things about making
valves (tubes) reliable that were not previously known: Colossus was well over ten times
larger than any previous valve-based system at the time and there was huge scepticism
about making it work, to the extent that he funded the initial one substantially himself
as BP wouldn't. For instance he understood that you should never turn the things
off: even quite significant maintenance was done on Colossi with them on (I believe they
were kept on even when the room flooded on occasion, which led to fairly exciting
electrical conditions). He also did things with heaters (the heater voltage was ramped up
& down to avoid thermal stresses when powering things on or off) and other tricks such
as soldering the valves into their bases to avoid connector problems.
The mechanical part (the tape reader) was in fact one significant reason for Colossus: the
previous thing, Heath Robinson, had used two tapes which needed to be kept in sync (or,
rather, needed to be allowed to drift out of sync in a controlled way with respect to each
other), and this did not work well at all as the tapes would stretch & break.
Colossus generated one of the tapes (the one corresponding to the Lorenz machine's
settings) electronically and would then sync itself to the tape with the message text on
it.
It's also not the case that the Colossi did only one thing: they were not
general-purpose machines but they were more general-purpose than they needed to be and
people found all sorts of ways of getting them to compute properties they had not
originally been intended to compute. I remember talking to Donald Michie about this (he
was one of the members of the Newmanry, which was the bit of BP where the Colossi were).
There is a paper, by Flowers, in the Annals of the History of Computing which discusses a
lot of this. I am not sure if it's available online (or where my copy is).
Further, of course, you can just ask people who run one about reliability: there's a
reconstructed one at TNMOC which is well worth seeing (there's a Heath Robinson as
well), and the people there are very willing to talk about reliability -- I learnt about
the heater-voltage-ramping thing by asking what the huge rheostat was for, and later
seeing it turned on in the morning -- one of the problems with the reconstruction is that
they are required to turn it off at night (this is the same problem that afflicts
people who look after steam locomotives, which in real life would almost never have been
allowed to get cold).
As I said, I don't want to detract from Whirlwind in any way, but it is the case that
Tommy Flowers did sort out significant aspects of the reliability of relatively large
valve systems.
--tim