Colossus - The Greatest Secret in the History of Computing

My name is Chris Shore, down there. I'm not here in my ARM capacity, although I am in product marketing at ARM and I've been there for just over 20 years now. I'm here in my personal capacity, as a dude who has a fascination with old computing stuff, and I'm sure I have that in common with a lot of you. A bit of background about me: I've been in Cambridge for the last 37 years. I came here as a student and I studied, first of all, physics. I stuck that for two years, which was tough, believe me, and then I switched to computer science. So by that reckoning I think I've done two of the top degrees guaranteed to terminate conversations at parties. The number one, I believe, is tax law, so I've not been anywhere near that. Anyway, tonight: Colossus. I've always had this kind of fascination with Colossus, and when the 70th anniversary of Colossus came round, I think it was in 2014, I sort of got my act together and wrote a presentation about it, which I delivered for the first time at a conference in São Paulo, Brazil, believe it or not. I think they knew even less about Colossus than anybody else in the world, so I'm not quite sure what they made of it. Conventional wisdom, if you like, has given us this accepted history of the development of electronic computing, but the emergence of increased knowledge of Colossus changes that, and it changes it significantly. So I started to burrow into it and try to work that out, because Colossus, as I'm sure you know, was until relatively recently kept an almost total secret.
Knowledge of it started to emerge in the mid to late 1970s: there was this "thing" called Colossus which had been built at Bletchley, but really nobody knew what it did and nobody knew what it was supposed to solve. The first people who gave presentations about it were very carefully vetted by the government and were not allowed to say that it was anything to do with cryptography, oddly enough. They were just allowed to say that there was this "big thing", this big valve thing, that was very important in electronic computing. With the publication, I think it was in 2000, of a document called the "General Report on Tunny" (Tunny was the codename that Bletchley gave to the cipher that Colossus was built to break), people suddenly started to realize: ah, that's what Colossus did. At that point Tony Sale was already rebuilding one at Bletchley, so the government was a little way behind. So maybe you've been to Bletchley, maybe you've seen one; if you haven't, I encourage you to do it. And maybe you know something about the origins of it and some of the names that are associated with it. But as I started to dig into that, the biggest problem was teasing it out of the documentation, because there is very little available, even now: teasing out exactly what Colossus was built to do, the problem that it solved, and how it solved it. I eventually managed to tease that out. It's a very frustrating process, and I've tried to build it into here. I hope to take you through that, and I hope by the time you come out of the other end you understand it at least to the extent I do, hopefully a little bit more. A lot of the material about Colossus is known to exist still in the National Archives, and is still classified as top secret. Nobody quite knows why. I'm intrigued whether any of it ever becomes declassified, and we'll see what happens.
So I've tried to do a historical context, a bit of a technical context, and a bit of a personal context around the people who designed it and built it. So: Colossus. What's the connection, and I'm sure you're going to work it out, between a country house roughly halfway between Cambridge and Oxford, on a railway line that doesn't exist but might now get rebuilt; the Battle of the Atlantic; a Cambridge mathematician; one of those, which is a Lorenz type SZ42 encrypting teleprinter, believe it or not; and a telephone engineer? The connection between all of those is Colossus, designed and built roughly in the second half of 1943, and when it was built it was by far the largest and most complex electronic computing machine of its time. It was conceived in total secrecy, and deliberately destroyed immediately after the war to protect that secret. But there's no denying that these people and those places changed history, principally in two ways. First of all, it's estimated and widely accepted now that the work at Bletchley Park, principally around Enigma and Colossus, shortened the war by anything up to two years, and you can just imagine how many lives and how much pain that saved. And secondly, and more of interest to us I guess, it represents the birth of an industry in which I work, and I'm guessing that many of you work too. That industry is very exciting, and we owe its birth to Colossus. So I've got to take you back to July 1st 1943, and that's a map of Europe in July 1943 which shows the extent of the domination of the Nazi regime. You can see it goes from Norway in the north down to the African coast in the south, and from Asia Minor in the east all the way to the Atlantic coast of France in the west. And you can see that Britain, the red bit over there, the bit that we are proud to be part of, sort of stands alone on the western edge of Europe.
The Battle of Britain's already happened. Hitler's already given up, at least for the time being, any hopes that he might have had of invading England, and has decided to invade Russia instead. But the UK, kind of standing there, is vital. It is strategically very, very important to the Allied forces: it is, or will be, the stepping stone for the inevitable re-invasion of the European mainland. And the UK is kept alive, if you like, by convoys of supplies crossing the Atlantic from America and dodging the U-boat wolfpacks of the German Kriegsmarine. So the most crucial struggle at this part of the war is on sea, not on land. The Kriegsmarine command, the German navy, communicates with U-boats by radio, and they encrypt it using an Enigma machine. I'm sure you all know about Enigma. Enigma is very well known and has been known about since the mid-70s, when that particular secret came out, and we know that British codebreakers at Bletchley comprehensively broke Enigma, using work originally started by Polish intelligence. The Enigma machine is a lovely machine. It's beautifully simple to use, it's very robust, great to use in the field. You set the rotors, you press a key, one of the letters lights up and the rotors move, and you just write down what comes out. It's very easy. And the decryption process is exactly the opposite: it's a reversible machine, so you type in the encrypted message and the lights light up and bingo, out comes the plain text again. The resulting cipher text is just transmitted as standard Morse, and the listening stations on the coast of England would listen to this, transcribe it, and send it off to Bletchley, and Bletchley would decrypt it. Now the Enigma, you see, relies on the fact that there are 17,576 unique ways of setting up the initial position of those rotors.
And there are lots of other complications: you can change the order of the rotors, you can choose three particular rotors from a set of up to five that you have, and then they add things like a plugboard at the front, which you can see down there. That increases the possible number of letter substitutions to over a hundred billion. It's a very, very clever machine, it's beautifully simple, and used properly it should have been unbreakable with the technology of the time. But the Government Code and Cypher School had relocated out of London to Bletchley Park, and they assembled this motley group of mathematicians, crossword solvers, linguists and goodness knows how many other people to monitor, intercept, and break German codes. I think it's part of the fascination for me about Bletchley and all the stuff that happened there: it's possibly the last time in history when a band of complete amateurs essentially changed history by doing something they were completely not expected to do. And that I find absolutely fascinating. The name of Alan Turing, the first of our Cambridge mathematicians here (Cambridge 1, Oxford nil, we're doing OK), is the one best associated with Enigma. He was already famous in the maths community at the time. He had very recently published his seminal work on computation theory, the famous paper "On Computable Numbers, with an Application to the Entscheidungsproblem", published in 1936, and in that paper he describes something which he called the Universal Computing Machine. We now call it a Universal Turing Machine, and that's widely acknowledged as part of the birth of modern computing. Now he and his team, and the many, many hundreds and thousands that worked with him, analysed and broke Enigma comprehensively. They essentially destroyed it as a communications device.
And they did it using machines like this. That's a Bombe, which Turing and his crew devised. It's basically the guts of 36 Enigma machines in a single chassis, and it chunters round making an incredible noise. That's a reconstruction that you can see at Bletchley, and it's capable of trying out wheel settings at an enormous rate. They essentially turned Enigma codebreaking into an industry with this thing, and a Bombe like that, set up properly, took about 20 minutes to try every possible starting position for an Enigma machine. You needed some hand analysis to set it up, and then some hand analysis to finish the decryption, but almost any Enigma message could be broken within a very small number of hours. The common saying is that very often the codebreakers at Bletchley were reading Enigma messages, decrypted on our side of the Channel, before the German recipients were reading them on the other side, because we could break them so quickly. The Bombe is quite a large piece of machinery. You can't get an idea of the scale of it there: it's about seven feet long, six feet high, and about two feet deep, and it weighs a ton. They built over 200 of these things by the end of the war. At the peak there were over 9,000 people working at Bletchley and, as I say, over 200 machines like this; the Americans also built 120 of their own for the naval Enigma. So it's a huge commitment of brain power and resources and money and time. Was it worth it? Well, yes. Being able to read U-boat transmissions was the key to winning, or at least surviving, the Battle of the Atlantic and making sure these convoys kept getting through. And just to show you the value, there's a graph of the tonnage of Allied shipping being sunk month by month through the war, and also the number of U-boats that we were sinking through the war.
Now if I tell you that there was a period in the middle of the war when the German Navy decided to upgrade their Enigma machine, and they added a fourth rotor to it, and that this shut Bletchley out of naval Enigma for about ten months until they'd caught up and rebuilt their processes, then if you look at that chart you can tell exactly when those ten months were: it's right there. That is the value of being able to read your enemy's communications and being able to decrypt them, because convoy losses here went up from 150,000 tonnes a month to 700,000 a month almost overnight, and if losses had continued at that rate the war would probably, certainly in the Atlantic, have been lost. So that's the context of the cryptanalysis, the cryptography game at Bletchley. Now let's take a little bit of a digression, because we've got to introduce some terminology about cryptography, and we'll do this quite gently because it gets quite complex later on. So we have a plain text, and we're going to call that P. You put that through some process that munches it all up and enciphers it, and you end up with the cipher text, and we call that Z. So P becomes Z. The simplest way of doing that, and we probably all did this as children at school, is called an alphabetic substitution cipher, where you've just got two alphabets: an input alphabet and an output alphabet. You encode letter by letter, just by translating from one to the other. Very, very simple. That was used over 2,000 years ago by Julius Caesar, and we often refer to it as a Caesar cipher. Now that, and techniques like it, are very easy to break using simple frequency analysis. Different letters in the English language occur with different frequencies, so you can count up the frequencies, guess which cipher letter corresponds to which plain letter, and work backwards, and you can break things fairly easily.
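To make the frequency-analysis idea concrete, here is a minimal sketch in Python. It assumes the simplest case, a Caesar cipher that shifts every letter by a fixed amount, and it assumes the commonest letter in the message stands for E. The sample sentence, the shift of 3 and the function names are invented for illustration and are not from the talk.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def caesar(text: str, shift: int) -> str:
    """Shift every letter A-Z by `shift` places; leave other characters alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(cipher_text: str, most_common_plain: str = "E") -> int:
    """Guess the shift by assuming the commonest cipher letter stands for E."""
    counts = Counter(ch for ch in cipher_text if ch in ALPHABET)
    top_letter, _ = counts.most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(most_common_plain)) % 26

# Illustrative message, deliberately rich in the letter E.
plain = "MEET ME NEAR THE GREEN TREES WHERE THE THREE STREETS MEET"
cipher = caesar(plain, 3)
shift = guess_shift(cipher)          # frequency analysis on the cipher text only
recovered = caesar(cipher, -shift)   # undo the guessed shift
print(cipher)
print(recovered)
```

A real message needs enough text for the letter counts to be reliable, and a general substitution alphabet needs more than one guessed letter, but the principle is the same.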
Now to make it a bit more complicated, you use a more complicated scheme called a polyalphabetic substitution cipher. Here you've got a multitude of alphabets (that example there has three) and you just use them in rotation. So the first character gets enciphered using the first alphabet, the second with the second one, and the third with the third one, then you go back to the first one, and so on. That's much, much harder to break, because unless you know how many alphabets there are, you don't know how to break down the cipher text into subsets, carry out frequency analysis on the subsets, and work backwards to what all the individual alphabets are. It's possible, but it's much, much harder, and that is what Enigma does. Enigma generates essentially a non-repeating sequence of 16,900 alphabets, one after the other. Now the average message sent with Enigma was only a few hundred characters, so the alphabets never repeat within an Enigma message. That's what makes Enigma much, much harder to break: frequency analysis is essentially impossible. An Enigma message as it arrived at Bletchley looked like that. That's a photograph of a real one, and you can see it's been transcribed from Morse by one of the listening stations. Up here tells you when it was transcribed and likely where it came from, what frequency it was received on, and what have you. And that was fed into the codebreaking industry at Bletchley and broken. But then in early 1941 they started hearing a completely different kind of transmission, and it looked like this. It's a very high-speed automated transmission, so it's not standard Morse sent with a key, it's been generated by a machine. So they built little ticker-tape machines that would transcribe this into ones and zeros and print it out on a little ticker tape, and then it would be hand transcribed into letters.
Because it turns out it's a standard international telegraph alphabet. And if you look, as they did, at the network on which this traffic appeared, you can use traffic analysis and direction-finding to work out where that network is and who's connected to what. The first thing you notice about it is that it's not used very much. This is quite a sparse network. It's also very long distance, and it centres on Berlin. So the guess is that this is very high-grade traffic that goes direct to Nazi High Command in Berlin, and if we could break into it, this would be very, very useful indeed. So they started looking at it, and the first person to look at it was a guy called John Tiltman, and he established that it was something called a Vernam cipher. He was unable to analyse it far enough to do anything with it, but he established that it works like this. You take your message and you convert your plain text, using a standard teleprinter alphabet, into a 5-bit binary representation. That's the standard International Telegraph Alphabet No. 2. It's perfectly standard, nothing unusual about that. Then you combine it with a key stream using what they called "carry-less binary addition" but we know as "exclusive or". So you exclusive-or that, bit-wise, with some kind of pseudorandom key stream. The key stream we call K, and the result is your cipher text, Z. And this process works backwards as well, just like Enigma does: you take the cipher text and you add it back to the key stream and you get the plain text back, so it's a reversible process. That's one of the properties of exclusive or, if you've studied that, as I'm sure you have: if you add the same thing twice, the original comes back out. So very, very simple.
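As a rough illustration of the Vernam scheme just described, the sketch below XORs 5-bit character codes with a key stream and then XORs again to decrypt. It is only a sketch: the letter-to-number mapping is a stand-in (A=1 to Z=26), not the real International Telegraph Alphabet No. 2 table, the key stream comes from Python's random module rather than from wheels, and the word "ANGRIFF" is just a sample.

```python
import random

def to_codes(text):
    """Map A-Z to 1-26. A stand-in 5-bit code, NOT the real ITA2 table."""
    return [ord(c) - ord("A") + 1 for c in text]

def from_codes(codes):
    """Inverse of to_codes, valid only for values 1-26."""
    return "".join(chr(c - 1 + ord("A")) for c in codes)

def vernam(codes, key):
    """Bit-wise exclusive-or of each 5-bit character with the key stream."""
    return [c ^ k for c, k in zip(codes, key)]

rng = random.Random(1941)                  # fixed seed so the demo is repeatable
plain = to_codes("ANGRIFF")                # P
key = [rng.randrange(32) for _ in plain]   # K: one 5-bit key character per letter
cipher = vernam(plain, key)                # Z = P xor K
decoded = vernam(cipher, key)              # Z xor K = P, the same operation again

print(cipher)
print(from_codes(decoded))
```

The same function encrypts and decrypts, which is why the receiving operator only needed a machine set up identically to the sender's.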
So the receiving operator would simply take this, punch it on a tape, and feed it into a machine set up exactly the same way as the originating machine, and out would come the plain text. Very, very simple. So Tiltman got that far and no further. Then they had an enormous stroke of luck. On the 31st of August 1941, an operator in Vienna sent a message that did not get through. So the recipient asked him, "Could you send that again please, 'cause I missed it." And, breaking all the rules, he reset his machine to the same original settings and typed out the message again. But he didn't quite send it exactly the same way. 'Cause he was a bit bored and probably a bit tired, maybe it was the end of the day, he used some abbreviations in the second message, so it wasn't quite exactly the same. It was a little bit shorter, but it saved him time typing it out. So there are the two messages, and you can see that they were both over 4,000 characters long. The first 12 characters are the same, and the message starts the same, and then you can see it starts to diverge. Tiltman guessed that standard operating procedures in German encryption meant the first 12 characters basically told the receiving guy "this is the way I set my machine up", so we can ignore those because we don't know what they mean. But the rest of it was the message, and he noticed the second message started the same but then started to differ. Now, this being a Vernam cipher, and we know how that works now, he knew that they would both be encrypted using the same key stream, and if you add the two messages together in binary, what happens is the key disappears. Okay? Because they're both added together with the same key. Add them together and the key actually disappears, so what he ended up with was the two plain texts added together. So: cipher text 1, cipher text 2.
Add them together. And so by comparing these two messages and very carefully guessing at what the first character was, subtracting it from the other one and seeing if that made sense, and then going letter by letter, forwards and backwards, constantly subtracting each one from the other to see if he could make sense of it, he did eventually, after ten days, manage to break out the message, both of them in fact, 4,000 characters long. Now that, in intelligence terms, is useless, because the message is about six weeks old by now. So the content of it is useless, but he then realized that if you take one of these plain texts and add it back to its cipher text, what you get is the keystream. So he managed to produce 4,000 characters of the keystream, which doesn't sound massively useful but actually turns out to be worth its weight in gold. They looked at it for about three months, they analysed it to death, and couldn't come up with any deductions from it at all. Until they gave it to a guy called Bill Tutte, another Cambridge mathematician. Cambridge 2, Oxford nil. He took it away, locked himself in his room, and by writing the keystream out in grids again, and again, and again, in grids of different sizes and different repeats, with doubtless huge quantities of pencils and coffee and paper, he managed to work out the entire structure of the machine from four thousand characters of the keystream. Now just remember that I showed you a picture of one of those teleprinters right at the beginning of the presentation. No-one in Britain had seen one of those machines, and no-one did until after the war ended.
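The trick Tiltman exploited, two messages sent on the same key, can be sketched in a few lines. This is an illustration under stated assumptions: the two 12-letter plain texts are sample strings chosen to start the same and then diverge, the letters are mapped to stand-in 5-bit numbers (A=1 to Z=26, not real ITA2), and the key is random. The sketch shows only the algebra; the hard part, guessing the plain texts from their sum, was done by hand.

```python
import random

def xor_streams(a, b):
    """Character-by-character exclusive-or of two equal-length streams."""
    return [x ^ y for x, y in zip(a, b)]

def codes(text):
    """Stand-in 5-bit code (A=1 ... Z=26), not the real teleprinter alphabet."""
    return [ord(c) - ord("A") + 1 for c in text]

rng = random.Random(0)
p1 = codes("SPRUCHNUMMER")   # first message (sample text)
p2 = codes("SPRUCHNREINS")   # retransmission, abbreviated so it diverges
key = [rng.randrange(32) for _ in p1]   # the SAME key stream used for both

z1 = xor_streams(p1, key)    # cipher text 1
z2 = xor_streams(p2, key)    # cipher text 2

# Adding the two cipher texts makes the key vanish: Z1 xor Z2 = P1 xor P2.
depth = xor_streams(z1, z2)

# Once a plain text has been worked out, adding it to its own cipher text
# gives back the key stream: Z1 xor P1 = K.
recovered_key = xor_streams(z1, p1)

print(depth)                  # zeros wherever the two plain texts agree
print(recovered_key == key)
```

The leading zeros in `depth` are where the two messages were typed identically; everything after the first difference is the sum of two plain texts, with no key left in it.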
But without seeing one, Tutte managed to work out the entire internal structure of the machine. What he deduced is that it didn't just have three wheels like Enigma does, it had twelve. Quite a complex machine. He also managed to work out the number of bit positions, the number of steps, on each wheel, and he identified that they were in three sets of wheels. The first set he called the Chi wheels, and they advanced as a group one position for every character that went through the machine, generating a pseudo-random bit stream of five bits. The second set he called the Psi wheels, nobody quite knows why. They moved approximately 50% of the time, so roughly every other character, but in a random sequence, and they generated another pseudo-random sequence of five bits. And this third set of two wheels in the middle he called the motor wheels, or the Mu wheels, and they controlled this stuttering movement of the Psi wheels on the end. Tutte, after the war, moved to Canada, and he was made an Officer of the Order of Canada. I guess that's a Canadian equivalent of an OBE or something, I don't know. And in the citation for that, what he achieved here was described as one of the greatest intellectual feats of World War II. It's quite an astonishing piece of work. So now we can work out exactly how the Lorenz machine actually works. The input tape is the plain text, which is on the left over there, and you add that successively to the Chi stream and the Psi stream, and what you end up with is your cipher text, Z. That comes out on another tape, you feed it through the machine, and it automatically transmits it as Morse. The receiving guy puts it back in his machine, it adds the Psi stream and the Chi stream, and out comes the plain text. Very simple, because it's a reversible process. Now you see it's a two-stage combination, and the repeat lengths of the Chi stream and the Psi stream are huge.
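The size of those repeat lengths can be checked with a little arithmetic. The talk does not give the individual wheel sizes; the figures below are the cam counts usually quoted in published descriptions of the Lorenz SZ40/42 and are an assumption here. Because the sizes within each set share no common factors, each set repeats only after the product of its wheel sizes.

```python
from itertools import combinations
from math import gcd, prod

# Wheel sizes as usually quoted for the Lorenz SZ40/42 (not stated in the talk).
CHI_WHEELS = (41, 31, 29, 26, 23)
PSI_WHEELS = (43, 47, 51, 53, 59)

def pairwise_coprime(sizes):
    """True if no two wheel sizes share a common factor greater than 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(sizes, 2))

# With pairwise co-prime sizes, a set of wheels stepping together only
# returns to its start after the product of the sizes.
chi_period = prod(CHI_WHEELS)
psi_period = prod(PSI_WHEELS)          # counted in Psi-wheel steps
first_two_chi = CHI_WHEELS[0] * CHI_WHEELS[1]

print(f"Chi stream repeats after {chi_period:,} steps")
print(f"Psi stream repeats after {psi_period:,} steps")
print(f"First two Chi wheels repeat after {first_two_chi:,} steps")
```

The Psi figure counts wheel movements, not characters; since the Psi wheels only move on some characters, the repeat measured in characters is longer still.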
The Chi stream repeats after 22 million steps, the Psi stream after 322 million steps. So there are huge repeat lengths here, and if you combine those and factor in this stuttering movement of the Psi wheels (you can see they don't move every character, they move roughly every other, roughly 50% of the time), then to search through every possible setting, every possible combination of that key stream, involves trying out 10 to the 19 combinations. Even with modern computing technology that's a challenge. In 1943 it's probably one of the definitions of impossible. With the technology they had at the time, it's just not going to work out. Even to identify one bit position in the stream would take about 4 million trials. Now, we're going to notice something interesting, you see, as the Bletchley team did: the Psi stream doesn't change for every character, the wheels move about 50% of the time. The German designer probably thought that was clever. He probably thought he was doing something very nifty here by not advancing the Psi stream with every character. But actually it turns out that this is one tiny, tiny flaw in the machine, which the codebreakers at Bletchley Park worked out how to exploit. What they realized, you see (there's the Karnaugh map of exclusive-or), is that adding zero to something does nothing, adding something to itself always gives zero, and if you add the same thing twice, it undoes the first addition and you get the original result back. Tutte noticed that, because of the way Boolean logic works, adding something to itself results in zero, and that using this you can magnify the effect of repeats in the keystream or in the plain text, because repeats, when you add the characters together, generate zeros. So you can exploit that. Now, okay, this is the tricky bit. So just pay attention, bear with me, and I'll see if I can get this right.
So, plain text in German, it turns out, contains roughly 20% repeated letters in a lot of messages. German has, for instance, a lot of double letters: mm, ee, tt, ss are very common in written German. Also, the way that the teleprinter alphabet works, it's a 5-bit code, so you can only encode 31 characters with that. In order to get access to punctuation and numbers, the alphabet implements a shift technique. To encode a full stop, for instance, you pressed a key called figure shift, which sent a shift character, you then pressed full stop, and you then sent a letter shift to go back to letters again. Now it turns out that operators realized that if the receiving station missed the figure shift character, the rest of the message was complete gobbledygook, because it would all be in the shifted alphabet. So they would send the figure shift character twice, then they would very often type full stop twice, and they'd type the letter shift character twice at the end. So that's three repeats in six characters. So on a good day, you see, the plain text contained approximately 20% repeats. You also notice the Psi stream contains about 50% repeats, and about 50% of those result in a repeated bit position. So Delta Psi is about 70% zero in any given bit position. Now, what can you do to detect these repeats and exploit them? What they realized is that if you add each character to the next one, the result is zero when the two bits in those characters are the same. Right, that's the key realization, because adding something to itself results in zero. So they generated what they called a delta stream: they would take the cipher text and add each character to the next one, and they called the result the delta stream. In any position of it, a zero indicated a repeat in the cipher stream. Now then, Delta P, the Delta plain text, is about 60% zero.
For an entirely random sequence it would be 50% zero, but Delta P turns out actually to be approximately 60% zero for an average message. And Delta Psi is about 70% zero, and using this gives you a tiny little window through into the underlying message. So if you remember that Z is P + Chi + Psi, it turns out the Delta operator distributes very nicely over that, so the Delta stream that you generate from the cipher stream is just Delta P added to Delta Chi added to Delta Psi. You rearrange that a little bit, and it turns out that the Delta cipher stream added to the Delta Chi stream is Delta P + Delta Psi. And Delta P + Delta Psi is zero approximately 55% of the time. If the plain text was totally random, it would be 50% zero. It turns out it's 55%, on a good day, for a good message; sometimes the excess is a lot smaller than that. So if you could generate a candidate Chi stream, Delta it, add it to the Delta of the cipher stream, and count the zeros, then IF you had the starting position of the Chi stream correct, you'd see slightly too many zeros in the result, more than you were expecting. Now that means you've got to generate a candidate Chi stream, and remember how long that Chi stream is: it's 22 million characters long. So that means you've got to try 22 million possible start positions in order to get a possibly significant result. Now you can't do that, that's not going to work. So what about if you try just a subset of the bits? It turns out that yes, that works. If you take Chi 0 and Chi 1, Delta them, and add them to Z 0 and Z 1, Delta'd, you can get a significant result just out of two bits. Now the Chi stream for those two bits is only 1271 characters long, so you've got to try 1.6 million combinations to see if you've got the right start position. So what that does is reduce the number of tests you have to do from roughly 7 times 10 to the 15 down to 1.6 million.
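The claim that the Delta operator distributes over the addition can be checked numerically. The sketch below uses random 5-bit streams standing in for P, Chi and Psi (they are not real traffic or real wheel output), takes the delta of each by XORing every character with the next one, and confirms both identities used in the argument above.

```python
import random

def delta(stream):
    """XOR each character with the next one; 0 means the two were identical."""
    return [a ^ b for a, b in zip(stream, stream[1:])]

def xor(*streams):
    """Character-by-character exclusive-or of any number of streams."""
    out = []
    for chars in zip(*streams):
        value = 0
        for c in chars:
            value ^= c
        out.append(value)
    return out

rng = random.Random(42)
n = 1000
# Random stand-ins for the plain text and the two key streams.
p, chi, psi = ([rng.randrange(32) for _ in range(n)] for _ in range(3))

z = xor(p, chi, psi)                      # Z = P + Chi + Psi

# Delta Z = Delta P + Delta Chi + Delta Psi
lhs = delta(z)
rhs = xor(delta(p), delta(chi), delta(psi))

# Rearranged: Delta Z + Delta Chi = Delta P + Delta Psi
stripped = xor(delta(z), delta(chi))

print(lhs == rhs)
print(stripped == xor(delta(p), delta(psi)))
print(delta([5, 5, 9]))                   # a repeat shows up as a zero
```

With random stand-in streams the identities hold exactly but no bias is visible; the excess of zeros only appears when P and Psi have the repeats described in the talk.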
Right, that's a factor of a trillion reduction in the number of tests you've got to do, by using this technique. By hand, at one a second, that would take about 18 days. They tried it, believe it or not, and it took them between four and six weeks for one message, but it worked. A message that takes four to six weeks to break is of course useless, but the technique does work, and if you could do it fast enough it would be useful. What if you could do it at 2,000 characters a second? Then it would take about 14 minutes. This is useful. Enter our third Cambridge mathematician (3-nil), Max Newman, a St John's College mathematician. Between January and June 1943 he conceived and built quite a clever machine that could do just that. It could run the tests at 2,000 characters a second. Basically you punch your message text on one tape, you punch the Chi stream on another tape, and you run the two tapes through a pair of tape readers, and you put some clever logic in the middle that does the Delta-ing, does the combining, and counts the zeros. If you glue those tapes into loops and you make sure that the lengths of the loops are co-prime, then you can just run the loops continuously, and over time, if you do it at least 1271 times through the machine, you will have tried all the possible start positions. And there is a picture of it. They called it Heath Robinson, after the cartoonist who specialised in drawing weird machines, and it worked. It made it possible to break out those Chi wheel settings, two bits at a time, and then you could break out the Psi wheels and the motor wheels by hand; that wasn't too difficult. You can see two tapes on that bedstead frame there. It ran those through at 2,000 characters a second, and it completed a single candidate run in roughly 15 minutes.
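The counting test that Heath Robinson mechanised can be imitated in software. What follows is a toy simulation, not a reconstruction of the real procedure: the two wheel patterns are random bits, the wheel sizes 41 and 31 come from published descriptions (their product is the 1271 mentioned above), the "message" is synthetic with the non-Chi part built so that its two-bit delta is zero 55% of the time, and the message is made 10,000 characters long so the result is clear-cut. The names `true_start`, `rest1` and `rest2` are invented for the sketch.

```python
import random

CHI1_LEN, CHI2_LEN = 41, 31   # assumed sizes of the first two Chi wheels

def delta(bits):
    """XOR each bit with the next one; a 0 marks a repeat."""
    return [a ^ b for a, b in zip(bits, bits[1:])]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def wheel_stream(pattern, start, n):
    """n bits read off a wheel that steps once per character from `start`."""
    return [pattern[(start + i) % len(pattern)] for i in range(n)]

def pack(bits):
    """Pack a list of bits into one integer so XOR and counting are fast."""
    return int("".join(map(str, bits)), 2)

rng = random.Random(1944)
n = 10_000
chi1 = [rng.randrange(2) for _ in range(CHI1_LEN)]   # random stand-in patterns
chi2 = [rng.randrange(2) for _ in range(CHI2_LEN)]
true_start = (17, 5)                                  # hidden wheel settings

# Synthetic P + Psi on two channels: s is their XOR, and consecutive values
# of s are equal 55% of the time, so delta(s) is 55% zeros.
s = [rng.randrange(2)]
for _ in range(n - 1):
    s.append(s[-1] if rng.random() < 0.55 else s[-1] ^ 1)
rest1 = [rng.randrange(2) for _ in range(n)]
rest2 = xor(rest1, s)

# Cipher bits on the two channels: add the Chi wheels at the true setting.
z1 = xor(rest1, wheel_stream(chi1, true_start[0], n))
z2 = xor(rest2, wheel_stream(chi2, true_start[1], n))
dz = pack(xor(delta(z1), delta(z2)))

# Pre-compute the delta'd wheel stream for every possible start position.
d1 = [pack(delta(wheel_stream(chi1, a, n))) for a in range(CHI1_LEN)]
d2 = [pack(delta(wheel_stream(chi2, b, n))) for b in range(CHI2_LEN)]

# For each of the 41 x 31 = 1271 settings, strip the candidate Chi deltas
# and count the zeros that remain.
scores = {
    (a, b): (n - 1) - bin(dz ^ d1[a] ^ d2[b]).count("1")
    for a in range(CHI1_LEN)
    for b in range(CHI2_LEN)
}
best = max(scores, key=scores.get)
print("best setting:", best, "zeros:", scores[best], "of", n - 1)
```

Only the correct pair of start positions cancels the Chi contribution and leaves the biased stream behind, so it stands out as the setting with too many zeros; every wrong setting hovers around half.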
So using Heath Robinson they reduced the time to break a Lorenz message from four to six weeks down to less than a few days, which is fantastic, and it works, and it's useful. But Heath Robinson was very hard to use, and not without its problems. Synchronizing two tapes like that at very high speed was very, very difficult. The Chi tapes had to be prepared by hand every single time, and after you'd used them a few times they stretched and they broke, so you had to keep remaking them. Heath Robinson could only help with breaking out the Chi stream, it couldn't do anything else, and it could only do that two bits at a time. It couldn't help, you see, with the Psi wheels and the motor wheels. But it worked. Now, okay, enter our final player, Tommy Flowers, whose name you may have heard. He built that combining unit for Max Newman. He was a Post Office research engineer at Dollis Hill, and there he worked on telephone exchanges. He had a background in mechanical engineering, because all those telephone exchanges were mechanical at the time, and he took evening classes in electronics at University College London. He became convinced that it would be possible to build an entirely electronic exchange using valve technology, but his managers wouldn't believe him and they wouldn't let him try it. So he built this combining unit for Max Newman, to Max Newman's design, from a combination of valves and relays. And when he saw his combining unit in use, when he saw the use it was being put to and the problems they had with all these tapes, he realized that there was a much, much better way of doing this, and he pitched to Max Newman a quite incredible idea.
He said "I can build you a machine that will generate the Chi stream internally, entirely automatically, electronically, and it will do all of that calculation against an internally generated stream using a single tape, and it will do all that differencing, all that statistical analysis," and he said "I'll even connect it to a teleprinter for you and it can print out the results so you don't have to write them down by hand." Now Newman was interested, thought it was a good idea, but he wasn't convinced that Flowers could do it. So Flowers went away back to Dollis Hill, back to his research lab, and he came back to Bletchley a few months later, early January 1944, with a big lorry, and on the back of the lorry was an enormous machine. It was instantly dubbed Colossus just because it was so huge: it's 7 feet high, 16 feet long, two feet wide, and the first version was constructed from roughly 1500 valves. Flowers later expanded it to Colossus Mark II, which had 2400 valves, by far and away the biggest electronic machine that had been built to date. And it processed text at 5,000 characters a second. Colossus 2, the expanded version, was five times faster than that, 25,000 characters a second. And that meant running the tape at 30 miles an hour - about 40 feet per second. Which is quite astonishing. Now Colossus could actually read and consume characters at 10,000 characters a second, but they discovered that the tape would physically disintegrate at 53 miles an hour, so they couldn't actually run Colossus as fast as it was capable because of the limitation of the tape. Not the last time, I'm willing to bet, that a computer has been limited by the speed of its input device. Now Colossus was hugely parallel, and it was capable of processing, say, all five channels at once, and doing up to a hundred Boolean calculations simultaneously on all of those five channels.
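The tape speeds quoted above can be sanity-checked with a little arithmetic. The sketch below assumes the standard teleprinter tape pitch of 10 characters per inch, which is not stated in the talk; the function names are made up for illustration.

```python
CHARS_PER_INCH = 10  # assumption: standard 5-hole teleprinter tape pitch


def tape_speed(chars_per_second: float) -> tuple:
    """Return (feet per second, miles per hour) for a given read rate."""
    inches_per_second = chars_per_second / CHARS_PER_INCH
    feet_per_second = inches_per_second / 12
    miles_per_hour = feet_per_second * 3600 / 5280
    return feet_per_second, miles_per_hour


def read_rate_at(miles_per_hour: float) -> float:
    """Characters per second passing the reader at a given tape speed."""
    inches_per_second = miles_per_hour * 5280 * 12 / 3600
    return inches_per_second * CHARS_PER_INCH


fps, mph = tape_speed(5000)
limit = read_rate_at(53)
print(round(fps, 1), round(mph, 1), round(limit))
```

Under that assumption, 5,000 characters a second works out at about 41.7 feet per second, or roughly 28 miles an hour, which suggests the "30 miles an hour, about 40 feet per second" figure refers to the 5,000-character tape rate. The 53 mile-an-hour breaking point corresponds to a little over 9,300 characters a second, just short of the 10,000 the reader could handle.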
Using Colossus over the next year or so, the next year-and-a-half, Bletchley Park successfully deciphered over 63 million characters of traffic. Now if you compare this to Heath Robinson, and there's a few more pictures of it which survived from the war, it wasn't just massively faster, but it could do a huge amount more as well. Being completely reconfigurable and somewhat programmable (but not in the way we would understand it), it could help with not just determining the starting positions of the Chi wheels but also the Psi wheels and the motor wheels as well, all at the same time. It could also do some other things. Every now and then the German operators, for instance, would change the bit patterns on all the wheels. And when they did that, Bletchley Park basically had to go back and start regenerating candidate Chi streams and break the wheel patterns out again. Colossus could do that for them every time they did it. The Germans did that roughly every month throughout the whole war, and towards the end of the war they started doing it once a week. So without something like Colossus being able to regenerate those bit patterns on the wheels, it would have shut them out almost completely. So Flowers' machine did all of that, and it did it at really quite astonishing speed. I've got a couple of excerpts from Flowers' diaries here which I think are absolutely fascinating. He's an incredibly pragmatic and very practical guy, not given to huge emotion. So you look at this: Sunday the 16th of January 1944: "Made Colossus work." That's good, innit! Tuesday the 18th of January: "Delivered Colossus to Bletchley Park." And you look at the entry in the middle: "Took John Senior and Eileen to Peter Pan." I mean yeah, you've got to take a break from work somewhere. And there's another lovely entry here, 5th of February 1944 down here: "Colossus did its first job". That was it.
So it had taken really only a few days to install and commission Colossus and actually get it working and decrypting real traffic, it really only took a few days right at the beginning of 1944. And you look underneath, at the rest of that entry: his car broke down on the way home, picked up by Farthing in the radio car. Home at 1am. So he wasn't the first and probably won't be the last computer guy to pull an all-nighter either. So they built a total of ten Colossi by the end of the war, decrypting over sixty million characters. So, if we take a look at Colossus and its basic architecture, there's a very basic block diagram, but on the face of it it looks incredibly simple, and you can see all that parallelism throughout, it's processing all of those characters in parallel. One very clever touch, invented by Flowers, was to make the machine self-timing. If you look at a piece of teleprinter tape you can see that down the centre is an additional row of holes - the sprocket holes. And in a standard tape reader you've got a mechanical gear wheel that engages with those sprocket holes and pulls the tape through one character at a time. So what Flowers did was he put an additional photocell underneath the sprocket holes and generated a timing signal from the sprocket holes. So with Colossus, you didn't have to synchronize the tape to the machine. The machine synchronized itself to the tape, so it didn't matter how fast you ran the tape, it didn't matter whether the tape stretched a bit, Colossus would automatically synchronize to the tape, which is a very, very clever innovation from Flowers. There are an awful lot of other innovations that came out of it. It's the first recorded use of what we now call shift registers, for instance.
And he used that to shift every character that it read in, to shift them through a shift register so he could keep each pair of characters next to each other and do the differencing calculation. And Colossus moved on from just doing single Deltas to doing Deltas at two steps and three steps and four steps away, so he would keep all the characters in shift registers. Using a single clock to synchronize the entire machine, again, very clever idea, hugely parallel. It had what I reckon has got to be one of the first instances of an interrupt. If the printer was already printing out a result, the tape would just carry on going, looking for the next one. If it found the next possible result before the printer had finished, the printer would just flip a little signal to say hang on a minute, and the tape side of the machine would basically go to idle and carry on idling until the printer had finished, and then it would carry on from where it left off. Now that's pretty close to an interrupt if you ask me, from a hardware peripheral. An optical tape reader capable of up to 10,000 characters a second. When I first started computing in 1976 or '77, we used 10 character per second tape readers. And there was Flowers building a 10,000 character optical reader in 1943. He also came up with a concept called systolic arrays, which is a totally revolutionary concept which no one had ever come up with before. It's a data-driven computation architecture which people are starting to get interested in again in the field of artificial intelligence and machine learning; there's possibilities of using systolic arrays to speed up the execution of neural networks, for instance. They were independently - huh, "discovered" by a couple of researchers in 1973, who published a paper on it and didn't realize at the time that Flowers had already discovered them in 1943.
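The shift-register differencing described above can be illustrated in software. This is a loose sketch rather than a description of the actual circuit: it treats each tape character as a 5-bit integer, uses a fixed-length `deque` as a stand-in for the shift register, and the `deltas` function name and the sample characters are invented for the example.

```python
from collections import deque


def deltas(chars, distance=1):
    """Yield each character XORed with the one `distance` steps before it.

    The deque plays the role of the shift register: it holds the last
    `distance` characters, and appending a new one pushes the oldest out.
    """
    register = deque(maxlen=distance)
    for c in chars:
        if len(register) == distance:
            yield c ^ register[0]  # oldest entry is `distance` steps back
        register.append(c)


# Four made-up 5-bit characters, one bit per tape channel.
tape = [0b10101, 0b10001, 0b00001, 0b11111]

single = list(deltas(tape))      # Delta between adjacent characters
two_apart = list(deltas(tape, 2))  # Delta at a distance of two steps

print([format(d, "05b") for d in single])
print([format(d, "05b") for d in two_apart])
```

Because XOR works on all five bits of the integer at once, one operation differences all five channels together, which echoes the channel-level parallelism the talk describes.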
So its clock speed being 5 kilohertz, which is basically the speed of the tape, it carried out, say, a hundred Boolean operations simultaneously on each of these five independent bit positions in the input stream. And by my rough calculation that's 2.5 million Boolean operations per second, which comes out at 2.5 megabops, and I like megabops as a unit, I think that's good. So, what about the impact of Colossus? Did it do what it was supposed to do, did it repay its investment? Well, Flowers and the team were put under huge pressure to have this machine operational by June 1944. They had no idea why. Now, with hindsight, we do. We know that Eisenhower and the Allied command were planning the re-invasion of the European mainland - they'd been planning it for a while. That's the operation that we now know as D-Day, and it was planned for June 1944. It's hard to underestimate, or overestimate even, how much rested on that operation succeeding. It's very, very risky. There's a heavily defended coastline on the other side, and the enemy essentially know that sooner or later that invasion is going to come. So the Allies put in place a massive deception operation: fake army groups, fictitious intelligence, false plans that were conveniently dropped all over the place to try and hoodwink the Nazi command that the invasion would not come in Normandy, but would come in Calais. So the success of that deception was absolutely vital, but how would they know that they'd been successful in deceiving the Nazi command? They needed to know what the Nazi command would be thinking - had they been fooled? And this is what Colossus could tell them by reading the Nazi High Command transmissions. Now, when you look at D-Day, the weather, season, tide, and the moon meant that there was a very narrow window of only a few days each month when all of those things coincided and the operation could actually be launched.
And if you delayed more than four or five days each month, you then had to hop to the next month when all these things would come in sequence again. If, beyond June 1944, they delayed more than two or three months, winter would set in, and they'd have to wait till 1945. So being able to launch that invasion in early June was absolutely crucial, and they needed that up-to-date intelligence. So they originally set it for June 1st. It was delayed to June 2nd, and then the 3rd and then the 4th and then the 5th. June the 6th was the last opportunity that month before they'd have to delay until July. Now on June the 5th, Eisenhower met with his generals to decide whether to go on June the 6th. As they were meeting, so the story goes, and I believe this story is corroborated, a courier arrived from Bletchley Park, walked into the room and handed Eisenhower a folded piece of paper. Eisenhower opened that and read it. And what he read was a message that had been decrypted by Colossus at Bletchley Park just a few hours earlier, from Hitler in Berlin to Rommel, commander of the Nazi forces on the Atlantic wall. And that message stated Hitler's categorical belief that yes, there would be an invasion in Normandy, but that it would be a diversion and the real invasion would happen 5 days later in Calais. And that message specifically refused Rommel permission to move his tank divisions from Calais to Normandy to reinforce his defences. And this was exactly the intelligence that Eisenhower had been waiting for - confirmation that the Nazi command had been completely deceived as to the timing and location of the attack. So Eisenhower didn't reveal what he'd just read, he wasn't allowed to, because people weren't allowed to know that we were reading the Germans' communications. He folded the paper up, he looked around at the assembled group of generals and he said simply "Gentlemen, we go tomorrow."
June the 6th, D-Day, the invasion went ahead, and the war ended a year later. So Churchill credited Bletchley Park and the cryptanalysts who worked there with winning the war, largely. Most historians agree that it shortened the war by at least, or up to, two years, and it saved I don't know how many hundreds of thousands of lives. That estimate is attributed, by the way, to a guy called Harry Hinsley, one of the first historians of Bletchley Park, and he made that remark at a conference in Newcastle in 1979. He was a historian from John's in Cambridge, he worked at Bletchley Park during the war, and he came back here and eventually became Master of John's and Vice Chancellor of the University, another Cambridge connection, a very pleasing one. Another story that came out just towards the end of that is an amazing piece of luck, which Tommy Flowers reported many, many years later. He was actually in Berlin on Post Office business a few days before hostilities broke out at the beginning of the war. He was called by the British Embassy and told to return home immediately, and he crossed into Holland about three hours before the German frontier closed. If the Germans had stopped him and interned him, there really possibly might not have been anyone at the Post Office with that unique combination of mechanical technology, electronic technology, and high-speed electronics at Bletchley Park. No-one with the knowledge to build what became Colossus. So the Germans had him just there, and let him go, without knowing it. Lovely thing. At the end of the war, what happened to it all? Well, very sadly, as we know, desperate to preserve the secret, Churchill ordered specifically that Colossus be destroyed. All of them were to be broken into, in his own words, "pieces no bigger than a man's hand."
And Tommy Flowers, writing after the war, recorded what he did. He said: "It was a terrible mistake. I was instructed to destroy all the records, which I did. I took all the drawings and all the plans and all the information about Colossus on paper and I put it in the boiler and I saw it burn." That must have been an absolutely heart-breaking moment. But we do know that two Colossi were moved to GCHQ at Eastcote in April 1946. They were moved to Cheltenham in the early 50s; one of them was dismantled in 1959, and the other in 1960. So they did at least survive for a little while, but no documentation about them survived. The main reason they were kept is that UK intelligence used Enigma-like machines and Lorenz-like machines in their communications after the war, and we promoted them and we sold them to lots of other governments, without telling them that we could read everything they wrote. So that's why they kept Colossus, for a lot of research on that. But as we moved into the 60s and all-digital computerized encryption became common, they essentially became obsolete. So to bring it up to date, the first books to describe Enigma came out in about 1973, and over the few years after that, books started to hint at the existence of a much, much bigger secret at Bletchley without being able to reveal what that secret was, and the name Colossus kept coming up occasionally. The British government finally admitted in 1976 that there was something called Colossus - principally because they were pushed by a guy called Brian Randell, who did more to reveal Colossus than anybody else. He's professor of computer science at Newcastle University, and he was given a few pictures like the ones that we've seen that had survived of Bletchley during the war, and he was allowed to give a presentation on it in Los Alamos in June 1976.
He wasn't allowed to say that Colossus had anything to do with codebreaking, but he was roughly allowed to describe the scale and the type and the function of the machine. And there was a witness in the audience that day in June 1976 at Los Alamos who happened to be sitting next to a guy called John Mauchly. You may remember Eckert and Mauchly, who designed ENIAC, and ENIAC at the time was regarded as the birth of electronic computing, and it actually became operational shortly after the end of the war. And sitting next to Mauchly in the audience, this witness says, "that's the first time I realized what jaw-dropping actually meant." Because Mauchly's jaw dropped as he realized he'd been comprehensively upstaged. The real detail about Colossus, the detail of what we've seen today about how it actually worked and what it actually did, didn't come out until 2000. That's 55 years later, when the government published a secret report on it that was published simultaneously, I think, in America and the UK. And that's really the sum total of what we know about Colossus these days. But look at its legacy. Some parts, we know, were taken to Newman's Computing Machine Laboratory at Manchester University, where he went after the war, and several historians have commented on the explosion of electronic computing research that happened in England immediately after the war. And there were several laboratories - Turing at the National Physical Laboratory, Max Newman at Manchester, and Maurice Wilkes and his team in Cambridge. As I say, we know that Newman took some Colossus parts to Manchester, which he re-used. And on June the 21st 1948 his Manchester Baby machine became the first operational stored-program computer. Newman knew about Colossus, he'd helped design it, he'd helped build it, he'd seen it work.
Now, he knew such machines were possible, and it's inconceivable that he didn't take that knowledge with him to Manchester, and it's inconceivable that it didn't accelerate the research that he did in winning the race to build the first operational computer. There was lots of other work going on separately from this, a lot of pioneering work on radar at the Telecommunications Research Establishment and elsewhere in England during the war, to do with high-speed counting electronics, and a lot of that was very influential in the building of EDSAC. So a conventional history of computing runs something like this: a machine called EDVAC, designed by the late great John von Neumann, that had the greatest influence, arguably, on early architectures, but EDVAC wasn't delivered until 1949. It didn't begin operation until 1951. Then the Manchester Baby, or the Small-Scale Experimental Machine as it was called. That first ran on the 21st of June 1948 and was the first thing that we would recognize today as a stored-program computer. And then EDSAC followed very shortly afterwards, built here in Cambridge by Maurice Wilkes; that ran its first program on May the 6th 1949. There's another machine called ENIAC which is often credited in a lot of literature as being the first operational computer, but it wasn't begun until June 1943. It wasn't announced and operational until February 1946, it wasn't reprogrammable and it wasn't Turing complete. And it was definitely later than Colossus. So, I took this picture from a U.S. publication in 1957, which purports to show an entire family tree of the development of computing devices at the time. So you can see it recognizes ENIAC and EDVAC, really, as the foundation of that tree. Now Colossus might not have been Turing complete, but neither was the Harvard Mark 1, which you can see down there, completed in February 1944 and first running in May 1944; that wasn't Turing complete either.
ENIAC, announced February 1946, that wasn't Turing complete either. Colossus pre-dates everything, literally everything on that tree, and it doesn't appear on that publication from 1957, and it would not appear in any publication for another 30 years after that at least, because it was kept secret for 40 years after the war. And that really denied Flowers, principally, and others like Newman, denied them their true place in history as far as I'm concerned, in the development of electronic computing. Flowers incidentally, and very sadly, ended the war in debt because he'd used some of his own money to buy components in order to build Colossus, because he couldn't get it funded by Newman, he couldn't get it funded by the Post Office either. Others, like Newman, say, went back into academic fields and were recognized in the official history of the development of computing. Flowers went back to the Post Office. And he remained in the shadows probably for about another 50 years, if not more, and was never really recognized in his own lifetime. Even more sadly, he went back to the Post Office and he re-pitched his idea about building an entirely electronic telephone exchange to his bosses, and his bosses said no, you can't build a machine that big out of valves. Flowers knew that it could be done because he'd done it, but he wasn't allowed to say how he knew, and he wasn't allowed to give the proof, so he was never allowed to build his electronic exchange. But the best thing about Colossus, its position in history, is that you can still see it. Or at least you can see a rebuild of it. Although, as we've said, all the extant machines were destroyed either immediately or very shortly after the war, all the drawings were burned, all the plans had been burned, when the existence of Colossus emerged, one man undertook to recreate it.
And as all engineers do, it turns out that some of the people involved had kept bits, of course they do. How many engineers do you know who've got a garage full of interesting bits of stuff that they keep just in case, because it was interesting? So people had kept photographs, they kept odd little bits of circuit diagrams, and they kept odd little bits of components, and using those, Tony Sale started to try and rebuild and recreate Colossus through the 1990s. It was completed in 2007. Took a huge amount of time to do it, mainly because of the scarcity of information. And if you go to Bletchley Park today, you can see it in action, and it's an incredible piece of engineering. If you haven't been to see it, please do. Because what you can see, and if you're lucky and if they let you over the barrier, actually touch, is a machine that literally changed history. Not only did it shorten the war and save a huge number of lives and a huge amount of pain, but it gave birth to an industry as far as I'm concerned, and as far as many historians are now starting to agree. And it gave birth to modern computing technology and the computing industry. A few more pictures there. So who do you dedicate something like this to? Well, I think it's well known that a lot of people who worked at Bletchley in the war, you know, they were all told to sign the Official Secrets Act, they never talked about what they did. One of the most poignant quotes that I can find about that comes from a lady called Katherine Cocking who worked there through the war. She said afterwards "I was in fear of the secret that I had to keep, but keep it I did, and my greatest sadness is my beloved husband died in 1975 without ever knowing what I did in the war." That applied to a huge number of people who worked there, not least of all to Tommy Flowers.
So if you're going to dedicate this to anyone, there's 12,000 people who worked at Bletchley Park, and 12,000 people who kept a secret for as long as they had to. Bill Tutte and that incredible feat of analysis, an entirely mental exercise that deciphered the internal structure of an incredibly complex machine from 4000 essentially pseudo-random characters of a key text. Tommy Flowers, who sadly died in 1998; he never saw Colossus rebuilt, although he knew that Tony Sale was trying to do it, and was never really recognized in his lifetime. Max Newman, who took what had happened at Bletchley and, if you like, won the race after the war to build the first computer at Manchester. And then Tony Sale, bless him, who devoted the last 15 years of his life to bringing Colossus back to life again. And doubtless the million possible lives that Colossus saved by shortening the war. So, thank you. I hope you found that as fascinating as I did, trying to tease out what Colossus did, and some of the incredible people who were involved in doing it. Thanks.
Info
Channel: The Centre for Computing History
Views: 244,817
Rating: 4.9162059 out of 5
Id: g2tMcMQqSbA
Length: 60min 26sec (3626 seconds)
Published: Mon May 04 2020