Colossus - The Greatest Secret in the History of Computing

My name is Chris Shore. I'm not here in my ARM capacity, although I am in product marketing at ARM and I've been there for just over 20 years now. I'm here in my personal capacity as a dude who has a fascination with old computing stuff, and I'm sure I have that in common with a lot of you. A bit of background about me: I've been in Cambridge for the last 37 years. I came here as a student and I studied, first of all, physics. I stuck that for two years, which was tough, believe me. And then I switched to computer science, so by that reckoning I think I've done two of the top degrees guaranteed to terminate conversations at parties. The number one, I believe, is tax law, so I've not been anywhere near that. Anyway, tonight: Colossus. I've always had this kind of fascination with Colossus, and when the 70th anniversary of Colossus came round, I think it was in 2014, I sort of got my act together and wrote a presentation about it, which I delivered for the first time at a conference in São Paulo, Brazil, believe it or not, and I think they knew even less about Colossus than anybody else in the world, so I'm not quite sure what they made of it. Conventional wisdom, if you like, has given us this accepted history of the development of electronic computing, but the emergence of an increased knowledge of Colossus changes that, and it changes it significantly. So I started to burrow into it and try to work that out, because Colossus, I'm sure you've heard of it, and I'm sure you know that until relatively recently it was kept an almost total secret.
Knowledge of it started to emerge in the mid to late 1970s: that there was this "thing" called Colossus which had been built at Bletchley. But really nobody knew what it did, nobody knew what it was supposed to solve, and the first people who gave presentations about it were very carefully vetted by the government and were not allowed to say that it was anything to do with cryptography, oddly enough. They were just allowed to say that there was this "big thing", this big valve thing that was very important in electronic computing. With the publication, I think it was in 2000, of a document called the "General Report on Tunny" (Tunny was the codename that Bletchley gave to the cipher that Colossus was built to break), that was when people suddenly started to realize: ah, that's what Colossus did. At that point Tony Sale was already rebuilding one at Bletchley, so the government was a little bit behind at that point. So maybe you've been to Bletchley, maybe you've seen one; if you haven't, I encourage you to do it. And maybe you know something about the origins of it and some of the names that are associated with it. But as I started to dig into that, the biggest problem was teasing it out of the documentation, because there was very little available, even now. Teasing out exactly what Colossus was built to do, and the problem that it solved, and how it solved it. I eventually managed to tease that out. It's a very frustrating process, and I've tried to build it into here, and I hope to take you through that, and I hope by the time you come out of the other end you understand it at least to the extent I do, hopefully a little bit more. A lot of the material about Colossus is known to exist still in the National Archives, and is still classified as top secret. Nobody quite knows why. I'm intrigued to see if any of it ever becomes declassified, and we'll see what happens.
So I've tried to give a historical context, a bit of a technical context, and a bit of a personal context around the people who designed it and built it. So: Colossus. What's the connection, and I'm sure you're going to work it out, between a country house roughly halfway between Cambridge and Oxford on a railway line that doesn't exist but might now get rebuilt, the Battle of the Atlantic, a Cambridge mathematician, one of those, which is a Lorenz type SZ42 encrypting teleprinter, believe it or not, and a telephone engineer? The connection between all of those is Colossus, designed and built roughly in the second half of 1943, and when it was built it was by far the largest and most complex electronic computing machine of its time. Conceived in total secrecy, and deliberately destroyed immediately after the war to protect that secret. But there's no denying that these people and those places changed history, principally in two ways. First of all, it's estimated and widely accepted now that the work at Bletchley Park, principally around Enigma and Colossus, shortened the war by anything up to two years. And you can just imagine how many lives and how much pain that saved. And secondly, and of more interest to us I guess, it represents the birth of an industry in which I work and in which I'm guessing many of you work too. That industry is very exciting, and we owe its birth to Colossus. So I've got to take you back to July 1st 1943, and that's a map of Europe in July 1943 which shows the extent of the domination of the Nazi regime. You can see it goes from Norway in the north down to the African coast in the south, and from Asia Minor in the east all the way to the Atlantic coast of France in the west. And you can see that Britain, the red bit over there, the bit that we are proud to be part of, sort of stands alone on the western edge of Europe.
The Battle of Britain's already happened. Hitler's already given up, at least for the time being, any hopes that he might have had of invading England, and has decided to invade Russia instead. But the UK, standing there, is vital. Strategically very, very important to the Allied forces, it is, or will be, the stepping stone for the inevitable re-invasion of the European mainland. And the UK is kept alive, if you like, by convoys of supplies crossing the Atlantic from America and dodging the U-boat wolfpacks of the German Kriegsmarine, so the most crucial struggle at this time, this part of the war, is at sea, not on land. The Kriegsmarine command, the German navy, communicates with U-boats by radio, and they encrypt it using an Enigma machine. And I'm sure you all know about Enigma. Enigma is very well known and has been known about since the mid-70s, when that particular secret came out. And we know that British codebreakers at Bletchley comprehensively broke Enigma, building on work originally started by Polish intelligence. The Enigma machine is a lovely machine. It's beautifully simple to use, it's very robust, great to use in the field: you set the rotors, you press a key, one of the letters lights up and the rotors move, and you just write down what comes out. It's very easy. And the decryption process is exactly the opposite. It's a reversible machine, so you type in the encrypted message and the lights light up and bingo, out comes the plain text again. The resulting cipher text is just transmitted as standard Morse, and the listening stations on the coast of England would listen to this, transcribe it, and send it off to Bletchley, and Bletchley would decrypt it. Now the Enigma, you see, relies on the fact that there are 17,576 unique ways of setting up the initial position of those rotors.
And there are lots of other complications: you can change the order of the rotors, you can choose three particular rotors from a set of up to five that you have, and then they add things like a plugboard at the front, which you can see down there. And that increases the possible number of letter substitutions to over a hundred billion. It's a very, very clever machine, it's beautifully simple, and used properly it should have been unbreakable with the technology of the time. But the Government Code and Cypher School, which had relocated out of London to Bletchley Park, assembled this motley group of mathematicians, crossword solvers, linguists and goodness knows how many other people to monitor, intercept and break German codes. I think it's part of the fascination for me about Bletchley and all the stuff that happened there: it's possibly the last time in history when a band of complete amateurs essentially changed history by doing something they were completely not expected to do. And that I find absolutely fascinating. The name of Alan Turing, the first of our Cambridge mathematicians here (Cambridge 1, Oxford nil, we're doing OK), is the one best associated with Enigma. He was already famous in the maths community at the time. He had very recently published his seminal work on computation theory, the famous paper "On Computable Numbers, with an Application to the Entscheidungsproblem". That was published in 1936, and in that paper he describes something which he called the Universal Computing Machine, which we now call a Universal Turing Machine, and that's widely acknowledged as part of the birth of modern computing. Now he and his team, and the many hundreds and thousands that worked with him, analysed and broke Enigma comprehensively. They essentially destroyed it as a communications device.
And they did it using machines like this. That's a Bombe, which Turing and his crew devised. It's basically the guts of 36 Enigma machines in a single chassis, and it chunters round making an incredible noise. That's a reconstruction that you can see at Bletchley, and it's capable of trying out wheel settings at an enormous rate. They essentially turned Enigma codebreaking into an industry with this thing, and a Bombe like that, set up properly, took about 20 minutes to try every possible starting position for an Enigma machine. You needed some hand analysis to set it up, and then some hand analysis to finish the decryption, but almost any Enigma message could be broken within a very small number of hours, and the common saying is that very often the codebreakers at Bletchley were reading Enigma messages decrypted on our side of the Channel before the German recipients were reading them on the other side, because we could break them so quickly. The Bombe is quite a large piece of machinery. You can't get an idea of the scale of it there: it's about seven feet long, six feet high and about two feet deep, and it weighs a ton. They built over 200 of these things by the end of the war. Now at the peak there were over 9,000 people working at Bletchley. Building, you see, 200 machines like this, and the Americans also built 120 of their own for the naval Enigma. So it's a huge commitment of brain power and resources and money and time. Was it worth it? Well, yes. Being able to read U-boat transmissions was the key to winning, or at least surviving, the Battle of the Atlantic and making sure these convoys kept getting through. And just to show you the value, there's a graph of the tonnage of Allied shipping being sunk month by month through the war, and also the number of U-boats that we were sinking through the war.
And if I tell you that there was a period in the middle of the war when the German Navy decided to upgrade their Enigma machine, and they added a fourth rotor to it, and that shut Bletchley out of naval Enigma for about ten months until they'd caught up and rebuilt their processes: if you look at that chart, you can tell exactly when that ten months was, and it's right there. That is the value of being able to read your enemy's communications, and being able to decrypt them. Because convoy losses here went up from 150,000 tonnes a month to 700,000 a month almost overnight, and if losses had continued at that rate the war would probably, certainly in the Atlantic, have been lost. So that's the context of the cryptanalysis, the cryptography game at Bletchley. Now let's take a little bit of a digression: we've got to introduce some terminology about cryptography, and we'll do this quite gently because it gets quite complex later on. So we have a plain text, we're going to call that P, and you put that through some process that munches it all up and enciphers it, and you end up with the cipher text, and we call that Z. So P becomes Z. The simplest way of doing that, and we probably all did this as children at school, is called an alphabetic substitution cipher, where you've just got two alphabets: an input alphabet and an output alphabet. And you encode letter by letter, just by translating from one to the other. Very, very simple. That was used over 2,000 years ago by Julius Caesar, and we often refer to it as a Caesar cipher. Now that, and techniques like it, are very easy to break using simple frequency analysis: different letters in the English language occur with different frequencies, so you can count up the frequencies, guess which cipher letter corresponds to which plain letter, and work backwards, and you can break things fairly easily.
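As a rough illustration of that last point, here is a minimal Python sketch of a shift-style Caesar cipher and a one-line frequency attack on it. The sample sentence, the function names and the assumption that the most common plain letter is E are all made up for the example; a general substitution alphabet would need a fuller frequency table than this.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def caesar(text: str, shift: int) -> str:
    """Shift each letter A-Z by `shift` places; other characters pass through."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(cipher_text: str, assumed_most_common: str = "E") -> int:
    """Guess the shift by assuming the most frequent cipher letter
    stands for `assumed_most_common` in the plain text."""
    counts = Counter(ch for ch in cipher_text if ch in ALPHABET)
    top_letter, _ = counts.most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(assumed_most_common)) % 26

# Hypothetical message, chosen so that E really is the most common letter.
plain = "MEET ME BESIDE THE GREEN TREE AT SEVEN"
cipher = caesar(plain, 3)
shift = guess_shift(cipher)
recovered = caesar(cipher, -shift)
print(cipher)
print(shift, recovered)
```

With only one alphabet in play, a single letter count is enough to undo the whole message, which is the weakness the polyalphabetic schemes try to remove.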
Now to make it a bit more complicated, you use a more complicated scheme called a polyalphabetic substitution cipher. Here you've got a multitude of alphabets, that example there has three, and you just use them in rotation. So the first character gets enciphered using the first alphabet, the second with the second one, and the third with the third one, then you go back to the first one, and so on. That's much, much harder to break, because unless you know how many alphabets there are, you don't know how to break down this cipher text into subsets, and then carry out frequency analysis on the subsets and work backwards to what all the individual alphabets are. Now, it's possible, but it's much, much harder, and that is what Enigma does. Enigma generates essentially a non-repeating sequence of 16,900 alphabets one after the other. Now the average message sent with Enigma was only a few hundred characters, so the alphabets never repeat in an Enigma message. That's what makes Enigma much, much harder to break. Frequency analysis is essentially impossible. And an Enigma message as it arrived at Bletchley looked like that. That's a photograph of a real one, and you can see it's been transcribed from Morse by one of the listening stations. Up here tells you when it was transcribed and likely where it came from, and what frequency it was received on and what-have-you. And that was fed into the codebreaking industry at Bletchley and broken. But then in early 1941 they started hearing a completely different kind of transmission. And it looked like this. It's a very high-speed automated transmission. So it's not standard Morse with a key, it's been generated by a machine. So they built little ticker tape machines that would transcribe this into ones and zeros and print it out on a little ticker tape, and then it would be hand transcribed into letters. Because it turns out it's a standard international telegraph alphabet.
And if you look, as they did, at the network on which this traffic appeared, you can use traffic analysis and direction-finding to work out where that network is and who's connected to what. And the first thing you notice about it is it's not used very much. This is quite a sparse network. It's also very long distance and it centres on Berlin. So the guess is that this is very high-grade traffic, that it goes direct to Nazi High Command in Berlin, and if we could break into it, this would be very, very useful indeed. So they started looking at it, and the first person to look at it was a guy called John Tiltman, and he established that it was something called a Vernam cipher. He was unable to analyse it far enough to do anything with it, but he established that it works like this. You take your message and you convert your plain text, using a standard teleprinter alphabet, into a 5-bit binary representation. That's the standard International Telegraph Alphabet number two. It's perfectly standard, nothing unusual about that. And then you combine it with a key stream using what they called "carry-less binary addition" but we know as "exclusive or". So you exclusive-or that, bit-wise, with some kind of pseudorandom key stream. The key stream we call K, and the result is your cipher text, Z. And this process works backwards as well, just like Enigma does: you take the cipher text and you add it back to the key stream and you get the plain text back, so it's a reversible process. That's one of the properties of exclusive or, if you've studied that, as I'm sure you have: if you add the same thing a second time, the original comes back out. So very, very simple. The receiving operator would simply take this, punch it on a tape, feed it into the machine set up exactly the same way as the originating machine, and out would come the plain text. Very, very simple.
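The reversibility described above can be sketched in a few lines of Python. This is only an illustration: the 5-bit characters are represented as plain integers from 0 to 31 rather than through a real teleprinter code table, and the random key stream stands in for whatever the machine actually generated.

```python
import random

def xor_stream(a, b):
    """Combine two streams of 5-bit characters with bit-wise exclusive or."""
    return [x ^ y for x, y in zip(a, b)]

rng = random.Random(1941)                          # fixed seed so the run is repeatable
plain = [rng.randrange(32) for _ in range(20)]     # P: stand-in 5-bit plain text
key = [rng.randrange(32) for _ in range(20)]       # K: stand-in key stream

cipher = xor_stream(plain, key)                    # Z = P + K
decrypted = xor_stream(cipher, key)                # Z + K = P again

print(decrypted == plain)
```

Encryption and decryption are literally the same operation, which is why the receiving machine only needs to be set up identically to the sending one.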
So Tiltman got that far and no further. Then they had an enormous stroke of luck. On the 31st of August 1941, an operator in Vienna sent a message that did not get through. So the recipient asked him, "could you send that again please, 'cause I missed it." And, breaking all the rules, he reset his machine to the same original settings and typed out the message again. But he didn't quite send it exactly the same way. He was a bit bored and probably a bit tired, maybe it was the end of the day, so he used some abbreviations in the second message. It wasn't quite exactly the same, it was a little bit shorter, but it saved him time typing it out. So there are the two messages, and you can see that they were both over 4,000 characters long. The first 12 characters are the same, it starts the same, and then you can see it starts to diverge. So Tiltman guessed that standard operating procedure in German encryption meant the first 12 characters basically told the receiving guy "this is the way I set my machine up", so we can ignore those because we don't know what they mean. But the rest of it was the message, and he noticed the second message started the same but then it started to get different. Now, this being a Vernam cipher, which we know how that works, he knew that they would both be encrypted using the same key stream, and if you add the two messages together in binary, what happens is the key disappears. Okay? Because they're both added together with the same key. Add them together and the key actually disappears, so what he ended up with was the two plain texts added together. So: cipher text 1 plus cipher text 2 is plain text 1 plus plain text 2. So he set about comparing these two messages: very carefully guessing at what the first character was, subtracting it from the other one, and seeing if that made sense, and then letter by letter, going forwards and backwards through the messages, constantly subtracting each one from the other to see if he could make sense of it.
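The cancellation Tiltman relied on can be sketched as follows. This is a toy under loud assumptions: the two messages are invented, 8-bit bytes are used instead of 5-bit teleprinter characters so the text stays readable, and the guessed opening word is simply supplied rather than worked out letter by letter.

```python
import random

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bit-wise exclusive or of two byte strings (truncated to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))

# Two invented messages: the second is a lazily abbreviated retyping of the first.
p1 = b"SPRUCHNUMMER 1234 ANGRIFF"
p2 = b"SPRUCHNR 1234 ANGRIFF"

rng = random.Random(3108)
key = bytes(rng.randrange(256) for _ in range(len(p1)))   # same key used for both

z1 = xor_bytes(p1, key)
z2 = xor_bytes(p2, key)

# Adding the two cipher texts makes the key disappear: Z1 + Z2 = P1 + P2.
combined = xor_bytes(z1, z2)
print(combined == xor_bytes(p1, p2))

# Guess a stretch of one plain text and "subtract" it to expose the other.
guess = b"SPRUCHNUMMER"
exposed = xor_bytes(combined[:len(guess)], guess)
print(exposed)
```

If the guess is right, the exposed fragment reads as sensible text from the other message, which suggests the next guess; if it is wrong, it comes out as rubbish.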
He did, eventually, after ten days, manage to break out the message, both of them in fact. 4,000 characters long. Now that, in intelligence terms, is useless, because the message is about six weeks old by now. So the content of it is useless, but he then realized that if you take one of these plain texts and add it back to its cipher text, what you get is the keystream. So he managed to produce 4,000 characters of the keystream, which doesn't sound massively useful but actually turns out to be worth its weight in gold. They looked at it for about three months, they analysed it to death, and couldn't come up with any deductions from it at all. Until they gave it to a guy called Bill Tutte, another Cambridge mathematician. Cambridge 2, Oxford nil. So he took it away, locked himself in his room, and by writing the keystream out in grids again, and again, and again, in grids of different sizes and different repeats, with huge quantities doubtless of pencils and coffee and paper, he managed to work out the entire structure of the machine from four thousand characters of the keystream. Now just remember that I showed you a picture of one of those teleprinters right at the beginning of the presentation. No-one in Britain had seen one of those machines, and no-one did until after the war ended. But without seeing one, Tutte managed to work out the entire internal structure of the machine. What he deduced is that it didn't just have three wheels like Enigma does; it had twelve. Quite a complex machine. He also managed to work out the number of bit positions, the number of steps, on each wheel, and he identified that they were in three sets of wheels. The first set he called the Chi wheels, and they advanced as a group, one position for every character that went through the machine, generating a pseudo-random bit stream of five bits. The second set, he called those the Psi wheels, nobody quite knows why.
But they moved approximately 50% of the time, so roughly every other character, but in a random sequence. And they generated another pseudo-random sequence of five bits. And this third set of two wheels in the middle, he called those the motor wheels or the Mu wheels, and they controlled this stuttering movement of the Psi wheels on the end. Tutte, after the war, moved to Canada and he was made an Officer of the Order of Canada. I guess that's a Canadian equivalent of an OBE or something, I don't know. And in the citation for that, what he achieved here was described as one of the greatest intellectual feats of World War II. It's quite an astonishing piece of work. So now we can work out exactly how the Lorenz machine actually works. The input tape is the plain text, which is on the left over there, and you add that successively to the Chi stream and the Psi stream, and what you end up with is your cipher text, Z. That comes out on another tape, you feed it through the machine, and it automatically transmits it. The receiving guy puts it back in his machine, it adds the Psi stream and the Chi stream again, and out comes the plain text. Very simple, because it's a reversible process. Now, you see, it's a two-stage combination, and the repeat lengths of the Chi stream and the Psi stream are huge. The Chi stream repeats after 22 million steps, the Psi stream after 322 million steps. So the repeat lengths here are huge, and if you combine those and you factor in this stuttering movement of the Psi wheels (you can see they don't move every character, they move roughly every other, roughly 50% of the time), then to search through every possible setting, every possible combination of that key stream, involves trying out 10 to the 19 combinations. Even with modern computing technology that's a challenge. In 1943 it's probably one of the definitions of impossible. With the technology they have at the time, it's just not going to work out.
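Those repeat lengths can be checked with a little arithmetic. The individual wheel sizes are not given in the talk; the figures below are the commonly published numbers of positions on the twelve Lorenz wheels, brought in here as an assumption so the totals can be reproduced.

```python
import math

# Wheel sizes: NOT stated in the talk, taken from commonly published descriptions.
CHI_WHEELS = [41, 31, 29, 26, 23]
PSI_WHEELS = [43, 47, 51, 53, 59]
MOTOR_WHEELS = [37, 61]

chi_period = math.prod(CHI_WHEELS)      # steps before the Chi stream repeats
psi_period = math.prod(PSI_WHEELS)      # steps before the Psi stream repeats
motor_period = math.prod(MOTOR_WHEELS)

print(f"Chi stream repeats after {chi_period:,} steps")
print(f"Psi stream repeats after {psi_period:,} steps")
print(f"Chi x Psi settings: {chi_period * psi_period:.2e}")
print(f"All twelve wheels:  {chi_period * psi_period * motor_period:.2e}")
print(f"First two Chi wheels alone: {CHI_WHEELS[0] * CHI_WHEELS[1]} positions")
```

Under these assumed sizes the products come out at roughly 22 million, 322 million, and about 10 to the 19 for all twelve wheels, in line with the figures quoted in the talk.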
Even to identify 1 bit position in the stream would take about 4 million trials. Now, we're going to notice something interesting, you see, as the Bletchley team did: the Psi stream doesn't change for every character. The wheels move about 50% of the time. Now the German designer probably thought that was clever. He probably thought he was doing something very nifty here by not advancing the Psi stream with every character. But actually it turns out that that is one tiny, tiny flaw in the machine, which the codebreakers at Bletchley Park worked out how to exploit. And what they realized, you see (there's the Karnaugh map of exclusive-or), is that adding zero to something does nothing, adding something to itself always gives zero, and if you add the same thing twice, it undoes the first addition and you get the original back. So Tutte noticed this, and noticed that because of the way Boolean logic works, adding something to itself results in zero, and using that, you can magnify the effect of repeats in the keystream or in the plain text. Because when you add adjacent characters together, repeats generate zeroes. So you can exploit that tiny flaw. Now okay, this is the tricky bit. So just pay attention, bear with me, and I'll see if I can get this right. Plain text in German, it turns out, contains roughly 20% repeated letters in a lot of messages. German has, for instance, a lot of double letters: mm, ee, tt, ss are very common in written German. Also, the way that the teleprinter alphabet works, it's a 5-bit code, so you can only encode 31 characters with that. So in order to get access to punctuation and numbers, the alphabet implements a shift technique. To encode a full stop, for instance, you pressed a key called figure shift, which sent a shift character, you then pressed full stop, and you then sent a letter shift to go back to letters again.
Now it turns out that operators realized that if the receiving place missed the figure shift character, the rest of the message was complete gobbledygook, because it would all be in the shifted alphabet. So they would send the figure shift character twice, and then they would very often type full stop twice, and they'd type the letter shift character twice at the end. So that's three repeats in six characters. So on a good day, you see, the plain text contained approximately 20% repeats. You also notice the Psi stream contains about 50% repeats, and about 50% of those result in a repeated bit position. So Psi is about 70% repeats in any given bit position. Now, what can you do to detect these repeats and exploit them? What they realized is that if you add each character to the next one, the result is zero wherever the two bits in those characters are the same. Right, that's the key realization. Because adding something to itself results in zero. So they generated what they called a delta stream: they would take the cipher text and add each character to the next one, and they called this the delta stream. And in that stream, a zero in a given position indicated a repeat in the cipher stream. Now then, Delta P, right, the Delta of the plain text, is about 60% zero. For an entirely random sequence it would be 50% zero. But Delta P turns out actually to be approximately 60% zero for an average message. And Delta Psi is about 70% zero, and using this gives you a tiny little window through into the underlying message. So if you remember that Z is P + Chi + Psi, it turns out the Delta operator distributes very nicely over that, so the Delta stream that you generate from the cipher stream is just Delta P added to Delta Chi added to Delta Psi. So you rearrange that a little bit and it turns out that the Delta cipher stream added to the Delta Chi stream is Delta P + Delta Psi. And Delta P + Delta Psi is zero approximately 55% of the time.
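The claim that the Delta operator distributes over the addition can be checked directly. A minimal sketch, with random 5-bit integers standing in for the three streams (the stream length and seed are arbitrary choices for the example):

```python
import random

def xor_streams(*streams):
    """Character-by-character exclusive or of any number of equal-length streams."""
    out = []
    for chars in zip(*streams):
        value = 0
        for c in chars:
            value ^= c
        out.append(value)
    return out

def delta(stream):
    """Add each character to the next one; the result is one character shorter."""
    return [a ^ b for a, b in zip(stream, stream[1:])]

rng = random.Random(42)
p = [rng.randrange(32) for _ in range(50)]      # stand-in plain text
chi = [rng.randrange(32) for _ in range(50)]    # stand-in Chi stream
psi = [rng.randrange(32) for _ in range(50)]    # stand-in Psi stream
z = xor_streams(p, chi, psi)                    # Z = P + Chi + Psi

# Delta Z = Delta P + Delta Chi + Delta Psi
print(delta(z) == xor_streams(delta(p), delta(chi), delta(psi)))
# Rearranged: Delta Z + Delta Chi = Delta P + Delta Psi
print(xor_streams(delta(z), delta(chi)) == xor_streams(delta(p), delta(psi)))
```

The identity holds for any streams at all, because exclusive or is associative and commutative; the statistical bias only appears when P and Psi have the repeats described above.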
If the plain text was totally random, it would be 50% zero. Turns out it's 55%, on a good day. For a good message, right; sometimes it's a lot smaller than that. So if you could generate a candidate Chi stream, Delta it, add it to the Delta of the cipher stream, and count the zeros, then IF you had the starting position of the Chi stream correct, you'd see slightly too many zeros in the result. More than you were expecting. Now that means you've got to generate a candidate Chi stream, and remember how long that Chi stream is: it's 22 million characters long. So that means you've got to try 22 million possible start positions in order to get a possibly significant result. Now you can't do that, that's not going to work. So what about if you try just a subset of the bits? And it turns out that yes, that works. If you take Chi 0 and Chi 1, and you Delta that and you add it to Z 0 and Z 1, Delta'd, you can get a significant result just out of two bits. Now the Chi stream, for the two bits, is only 1271 characters long. So you've got to try 1.6 million combinations to see if you've got the right start position. So what that does is reduce the number of tests you have to do from roughly 7 times 10 to the 15 tests to 1.6 million. Right, that's a huge reduction in the number of tests you've got to do by using this technique. So by hand, at one a second, that would take about 18 days. They tried it, believe it or not, and it took them between four and six weeks for one message, but it worked. And a message that takes four to six weeks to break is of course useless, but if you can do it a little bit faster, then this technique does work, and if it's fast enough it would be useful. What if you could do it at 2,000 characters a second? Then it would take about 14 minutes. This is useful. Enter our third Cambridge mathematician, 3-nil, Max Newman.
St John's College mathematician, and between January and June 1943 he conceived and built quite a clever machine that could do just that. It could run the tests at 2,000 characters a second. Basically you punch your message text on one tape, you punch the Chi stream on another tape, and you run the two tapes through a pair of tape readers, and you put some clever logic in the middle that does the Delta-ing, and does the combining, and counts the zeros. And if you glue those tapes into loops and you make sure that the lengths of the loops are co-prime, then you can just run the loops continuously, and over time, if you do it at least 1271 times through the machine, you will have tried all the possible start positions. And there is a picture of it. They called it Heath Robinson, after the guy, the cartoonist, who specialised in drawing weird machines, and it worked. It made it possible to break out those Chi wheel settings, two bits at a time, and then you could break out the Psi wheels and the motor wheels by hand, that wasn't too difficult. You can see two tapes on that bedstead frame there. It ran those through at 2,000 characters a second, and it completed a single candidate run in roughly 15 minutes. So using Heath Robinson they reduced the time to break a Lorenz message from four to six weeks down to less than a few days, which is fantastic, and it works, and it's useful. But Heath Robinson was very hard to use, and not without its problems. Synchronizing two tapes like that at very high speed was very, very difficult. The Chi tapes had to be prepared by hand every single time, and after you'd used them a few times they stretched and they broke, so you had to keep remaking them. Heath Robinson could only help with breaking out the Chi stream. That was all it could do, and it could only do that two bits at a time. And it couldn't help, you see, with the Psi wheels and the motor wheels.
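The counting test that Heath Robinson mechanised can be imitated in software. The following is a simulation sketch, not a reconstruction of the real procedure: the wheel patterns are random, the wheel lengths 41 and 31 are assumed (their product is the 1271 mentioned in the talk), the combined Delta P + Delta Psi contribution on the two channels is replaced by a single biased random stream, and the bias is exaggerated to 60% zeros so that a short message gives a clear answer.

```python
import random

rng = random.Random(1943)
LEN0, LEN1 = 41, 31        # assumed lengths of the first two Chi wheels
MESSAGE_LENGTH = 4000
P_ZERO = 0.60              # assumed bias, exaggerated from the talk's ~55%

def random_wheel(n):
    """A made-up wheel pattern of n bits."""
    return [1 if rng.random() < 0.5 else 0 for _ in range(n)]

def delta_wheel_stream(wheel, start, length):
    """Delta of the bit stream a wheel produces when started at `start`."""
    n = len(wheel)
    return [wheel[(start + i) % n] ^ wheel[(start + i + 1) % n]
            for i in range(length)]

wheel0, wheel1 = random_wheel(LEN0), random_wheel(LEN1)
true_start = (17, 5)       # the setting the "codebreaker" does not know

# Precompute the Delta stream for every possible start of each wheel.
deltas0 = [delta_wheel_stream(wheel0, s, MESSAGE_LENGTH) for s in range(LEN0)]
deltas1 = [delta_wheel_stream(wheel1, s, MESSAGE_LENGTH) for s in range(LEN1)]

# Biased stream standing in for (Delta P + Delta Psi) summed over both channels.
residue = [0 if rng.random() < P_ZERO else 1 for _ in range(MESSAGE_LENGTH)]

# What the interceptor sees: Delta Z0 + Delta Z1.
delta_z = [a ^ b ^ r for a, b, r in
           zip(deltas0[true_start[0]], deltas1[true_start[1]], residue)]

def count_zeros(s0, s1):
    """Add candidate Delta Chi 0 + Delta Chi 1 to Delta Z and count the zeros."""
    return sum(1 for z, a, b in zip(delta_z, deltas0[s0], deltas1[s1])
               if z ^ a ^ b == 0)

scores = {(s0, s1): count_zeros(s0, s1)
          for s0 in range(LEN0) for s1 in range(LEN1)}
best = max(scores, key=scores.get)
print("candidates tried:", len(scores))
print("best setting:", best, "zeros:", scores[best])
```

Wrong settings hover around half zeros, while the correct one stands out with noticeably more; with a weaker bias the same test needs a longer message before the right setting rises clear of the noise.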
But it worked. Now okay, enter our final player, Tommy Flowers, whose name you may have heard. He built that combining unit for Max Newman. He was a Post Office research engineer at Dollis Hill, and there he worked on telephone exchanges. He had a background in mechanical engineering, because all those telephone exchanges were mechanical at the time, and he took evening classes at University College London in electronics. And he became convinced that it would be possible to build an entirely electronic exchange using valve technology. But his managers wouldn't believe him and they wouldn't let him try it. So he built this combining unit for Max Newman, to Max Newman's design, from a combination of valves and relays. And when he saw his combining unit in use, when he saw the use it was being put to and the problems they had with all these tapes, he realized that there was a much, much better way of doing this, and he pitched to Max Newman a quite incredible idea. He said, "I can build you a machine that will generate the Chi stream internally, entirely automatically, electronically, and it will do all of that calculation against an entirely generated stream using a single tape, and it will do all that differencing, all that statistical analysis." And he said, "I'll even connect it to a teleprinter for you and it can print out the results so you don't have to write them down by hand." Now Newman was interested, thought it was a good idea, but he wasn't convinced that Flowers could do it. So Flowers went away back to Dollis Hill, back to his research lab, and he came back to Bletchley a few months later, in early January 1944, with a big lorry, and on the back of the lorry was an enormous machine. It was instantly dubbed Colossus, just because it was so huge: it's 7 feet high, 16 feet long, two feet wide, and the first version was constructed from roughly 1500 valves.
Flowers later expanded it to Colossus Mark II which had 2400 valves, by far and away the biggest electronic machine that had been built to date. And it processed text at 5,000 characters a second. Colossus 2 - the expanded version, was five times faster than that, 25,000 characters a second. And 5,000 characters a second meant running the tape at 30 miles an hour - about 40 feet per second. Which is quite astonishing. Now Colossus could actually read and consume characters at 10,000 characters a second, but they discovered that the tape would physically disintegrate at 53 miles an hour, so they couldn't actually run Colossus as fast as it was capable because of the limitation of the tape. Not the last time, I'm willing to bet, that a computer has been limited by the speed of its input device. Now Colossus was hugely parallel, and it was capable of processing, say, all five channels at once, and doing up to a hundred Boolean calculations simultaneously on all of those five channels. Over the next year or so, the next year-and-a-half, Bletchley Park successfully deciphered over 63 million characters of traffic using Colossus. Now if you compare this to Heath Robinson, there's a few more pictures of it which survived from the war. It wasn't just massively faster, but it could do a huge amount more as well. Being completely reconfigurable and somewhat programmable (but not in the way we would understand it), it could help with not just determining the starting positions of the Chi wheels but also the Psi wheels and the motor wheels as well, all at the same time. It could also do some other things; every now and then the German operators for instance would change the bit patterns on all the wheels. And when they did that, Bletchley Park basically had to go back and start regenerating candidate Chi streams and break the wheel patterns out again. Colossus could do that for them every time they did it.
The Germans did that roughly every month throughout the whole war, and towards the end of the war they started doing it once a week. So without something like Colossus being able to regenerate those bit patterns on the wheels, it would have shut them out almost completely. So Flowers' machine did all of that and it did it at really quite astonishing speed. I've got a couple of excerpts from Flowers' diaries here which I think are absolutely fascinating. He's an incredibly pragmatic and very practical guy, not given to huge emotion. So you look at this: Sunday the 16th of January 1944: Made Colossus work. That's good, innit! Tuesday the 18th of January: delivered Colossus to Bletchley Park. And you look at the entry in the middle: Took John Senior and Eileen to Peter Pan. I mean yeah, you've got to take a break from work somewhere. And there's another lovely entry here, 5th of February 1944 down here: "Colossus did its first job". That was it. So it had taken really only a few days to install and commission Colossus and actually get it working and decrypting real traffic, it really only took a few days right at the beginning of 1944. And you look underneath, the rest of that entry: his car broke down on the way home, picked up by Farthing in the radio car. Home at 1am. So he wasn't the first and probably won't be the last computer guy to pull an all-nighter either. So they built a total of ten Colossi by the end of the war. Decrypting over sixty million characters. So, if we take a look at Colossus and its basic architecture, there's a very basic block diagram, but on the face of it it looks incredibly simple, and you can see all that parallelism through it, it's processing all of those characters in parallel. One very clever touch, invented by Flowers, was to make the machine self-timing. If you look at a piece of teleprinter tape you can see that down the centre is an additional row of holes - the sprocket holes.
And in a standard tape reader you've got a mechanical gear wheel that engages with those sprocket holes and pulls the tape through one character at a time. So what Flowers did was he put an additional photocell underneath the sprocket holes and generated a timing signal from the sprocket holes. So with Colossus, you didn't have to synchronize the tape to the machine. The machine synchronized itself to the tape, so it didn't matter how fast you ran the tape, it didn't matter whether the tape stretched a bit, Colossus would automatically synchronize to the tape, which is a very, very clever innovation from Flowers. There are an awful lot of other innovations that come out of it. It's the first recorded use of what we now call shift registers, for instance. And he used that to shift every character that it read in, to shift them through a shift register so he could keep each pair of characters next to each other and do the differencing calculation. And Colossus moved on from just doing single Deltas to doing deltas at two steps and three steps and four steps away, so he would keep all the characters in shift registers. Using a single clock to synchronize the entire machine, again, very clever idea, hugely parallel - it had what I reckon has got to be one of the first instances of an interrupt. If the printer was already printing out a result, the tape would just carry on going, looking for the next one - if it found the next possible result before the printer had finished, the printer would just flip a little signal to say hang on a minute, and the tape side of the machine would basically go to idle and carry on idling until the printer had finished, and then it would carry on from where it left off. Now that's pretty close to an interrupt if you ask me. From a hardware peripheral. An optical tape reader capable of up to 10,000 characters a second. When I first started computing in 1976 or '77, we used 10 character per second tape readers.
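The shift-register differencing described above can be sketched in a few lines. This is an illustration under stated assumptions rather than Flowers' actual circuit: characters are modelled as 5-bit integers, the "difference" is taken to be the bitwise XOR of two characters a fixed number of steps apart, and the names `delta_stream` and `count_zero_bits` and the sample tape are invented here.

```python
from collections import deque


def delta_stream(chars, span=1):
    """Yield chars[i] XOR chars[i - span], holding recent characters in a
    fixed-length shift register as they arrive from the tape."""
    register = deque(maxlen=span)   # holds the last `span` characters
    for ch in chars:
        if len(register) == span:
            yield ch ^ register[0]  # oldest entry is `span` steps back
        register.append(ch)         # shifting in drops the oldest entry


def count_zero_bits(chars, channel):
    """Count characters whose bit in `channel` (0 to 4) is zero."""
    return sum(1 for ch in chars if not (ch >> channel) & 1)


tape = [0b10110, 0b10011, 0b00011, 0b11100]   # made-up 5-bit characters
print(list(delta_stream(tape, span=1)))       # deltas one step apart
print(list(delta_stream(tape, span=2)))       # deltas two steps apart
```

Because the register only ever holds the last few characters, the delta at any span is available as soon as each new character is read, which is the property the talk attributes to keeping the characters in shift registers.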
And there was Flowers building a 10,000 character per second optical reader in 1943. He also came up with a concept called systolic arrays, which is a totally revolutionary concept which no one had ever come up with before. It's a data-driven computation architecture which people are starting to get interested in again in the field of artificial intelligence and machine learning, there's possibilities of using systolic arrays to speed up the execution of neural networks, for instance. They were independently - huh, "discovered" by a couple of researchers in 1973, who published a paper on it and didn't realize at the time that Flowers had already discovered them in 1943. So its clock being 5 kilohertz, which is basically the speed of the tape, it carried out, say, a hundred Boolean operations simultaneously on each of these five independent bit positions in the input stream. And by my rough calculation that's 2.5 million Boolean operations per second, which comes out at 2.5 megabops, and I like megabops as a unit, I think that's good. So, what about the impact of Colossus, did it do what it was supposed to do, did it repay its investment? Well, Flowers and the team were put under huge pressure to have this machine operational by June 1944. They had no idea why. Now with hindsight, we do. We know that Eisenhower and the Allied command were planning the re-invasion of the European mainland - they'd been planning it for a while. The operation that we now know as D-Day, and it was planned for June 1944. It's hard to underestimate, or overestimate even, how much rests on that operation succeeding. It's very, very risky. There's a heavily defended coastline on the other side, and the enemy essentially know that sooner or later that invasion is going to come.
So the Allies put in place a massive deception operation, fake army groups, fictitious intelligence, false plans that were conveniently dropped all over the place to try and hoodwink the Nazi command that the invasion would not come in Normandy. But it would come in Calais. So the success of that deception was absolutely vital, but how would they know that they'd been successful in deceiving the Nazi command? They needed to know what the Nazi command would be thinking - had they been fooled? And this is what Colossus could tell them by reading the Nazi High Command transmissions. Now, when you look at D-Day, the combination of the weather, season, tide, and the moon meant that there was a very narrow window of only a few days each month when all of those things coincided, and the operation could actually be launched. And if you delayed more than four or five days each month you then had to hop to the next month when all these things would come in sequence again. If, beyond June 1944, they delayed more than two or three months, winter would set in, and they'd have to wait till 1945. So, being able to launch that invasion in early June was absolutely crucial and they needed that up-to-date intelligence. So they originally set it for June 1st. It was delayed to June 2nd, and then the 3rd and then the 4th and then the 5th. June the 6th was the last opportunity that month before they'd have to delay until July. Now on June the 5th, Eisenhower met with his generals to decide whether to go on June the 6th. As they were meeting, so the story goes, and I believe this story is corroborated, a courier arrived from Bletchley Park, walked into the room and handed Eisenhower a folded piece of paper. Eisenhower opened that and read it. And what he read was a message that had been decrypted by Colossus at Bletchley Park just a few hours earlier, from Hitler in Berlin to Rommel, commander of the Nazi forces on the Atlantic wall.
And that message stated Hitler's categorical belief that yes, there would be an invasion in Normandy, but that it would be a diversion and the real invasion would happen 5 days later in Calais. And that message specifically refused Rommel permission to move his tank divisions from Calais to Normandy to reinforce his defences. And this was exactly the intelligence that Eisenhower had been waiting for - confirmation that the Nazi command had been completely deceived as to the timing and location of the attack. So, Eisenhower didn't reveal what he'd just read, he wasn't allowed to because people weren't allowed to know that we were reading the Germans' communications. He folded the paper up, he looked around at the assembled group of generals and he said simply "Gentlemen, we go tomorrow." June the 6th, D-Day, the invasion went ahead, and the war ended a year later. So Churchill credited Bletchley Park and the cryptanalysts who worked there with winning the war, largely. Most historians agree that it shortened the war by at least, or up to, two years, and it saved I don't know how many hundreds of thousands of lives. That estimate is attributed by the way to a guy called Harry Hinsley, one of the first early historians of Bletchley Park, and he made that remark at a conference in Newcastle in 1979. He was a historian from John's in Cambridge, he worked at Bletchley Park during the war and he came back here and eventually became master of John's and Vice Chancellor of the University, another Cambridge connection, a very pleasing one. Another story that came out just towards the end of that is an amazing piece of luck, which Tommy Flowers reported many many years later. He was actually in Berlin on Post Office business a few days before hostilities broke out at the beginning of the war. He was called by the British Embassy and told to return home immediately, and he crossed into Holland about three hours before the German frontier closed.
If the Germans had stopped him and interned him, there really possibly might not have been anyone at the Post Office with that unique combination of mechanical technology, electronic technology, and high-speed electronics at Bletchley Park. No-one with the knowledge to build what became Colossus, so the Germans had him just there, and let him go, without knowing it. Lovely thing. At the end of the war, what happened to it all? Well, very sadly as we know, desperate to preserve the secret, Churchill ordered specifically that Colossus be destroyed. All of them, to be broken into, in his own words, "pieces no bigger than a man's hand." And Tommy Flowers, writing after the war, recorded what he did. He said "It was a terrible mistake, I was instructed to destroy all the records, which I did, I took all the drawings and all the plans and all the information about Colossus on paper and I put it in the boiler and I saw it burn." That must have been an absolutely heart-breaking moment. But we do know that two Colossi were moved to GCHQ at Eastcote in April 1946. They were moved to Cheltenham in the early 50s, one of them was dismantled in 1959, and the other in 1960. So they did at least survive for a little while, but no documentation about them survived. The main reason they were kept is that UK intelligence used Enigma-like machines and Lorenz-like machines in their communications after the war, and we promoted them and we sold them to lots of other governments, without telling them that we could read everything they wrote. So that's why they kept Colossus, for a lot of research on that. But as we moved into the 60s and all-digital computerized encryption became common, they essentially became obsolete.
So to bring it up to date, the first books to describe Enigma came out in about 1973, and over the few years after that, books started to hint at the existence of a much much bigger secret at Bletchley without being able to reveal what that secret was, and the name Colossus kept coming up occasionally. The British government finally admitted in 1976 that there was something called Colossus - principally because they were pushed by a guy called Brian Randell who did more to reveal Colossus than anybody else. He's a professor of computer science at Newcastle University and he was given a few pictures like the ones that we've seen that had survived from Bletchley during the war, and he was allowed to give a presentation on it in Los Alamos in June 1976. He wasn't allowed to say that Colossus had anything to do with codebreaking, but he was roughly allowed to describe the scale and the type and the function of the machine. And there was a witness in the audience that day in June 1976 at Los Alamos, who happened to be sitting next to a guy called John Mauchly. You may remember Eckert and Mauchly, who designed ENIAC, and ENIAC at the time was regarded as the birth of electronic computing, and it actually became operational shortly after the end of the war. And sitting next to Mauchly in the audience, this witness says "that's the first time I realized what jaw-dropping actually meant." Because Mauchly's jaw dropped as he realized he'd been comprehensively upstaged. The real detail about Colossus, and the detail of what we've seen today about how it actually worked and what it actually did, didn't come out until 2000. That's 55 years later, when the government released a secret report on it, which was published simultaneously, I think, in America and the UK. And that's really the sum total of what we know about Colossus these days. But, look at its legacy.
Some parts we know were taken to Newman's Computing Machine Laboratory at Manchester University where he went after the war, and several historians have commented on the explosion of electronic computing research that happened in England immediately after the war. And there were several laboratories - Turing at the National Physical Laboratory, Max Newman at Manchester and Maurice Wilkes and his team in Cambridge. As I say, we know that Newman took some Colossus parts to Manchester which he re-used. And on June the 21st 1948 his Manchester Baby machine became the first operational stored-program computer. Newman knew about Colossus, he'd helped design it, he'd helped build it, he'd seen it work. Now, he knew such machines were possible and it's inconceivable that he didn't take that knowledge with him to Manchester, and it's inconceivable that it didn't accelerate the research that he did in winning the race to build the first operational computer. There was lots of other work going on separately from this, a lot of pioneering work on radar at the Telecommunications Research Establishment and elsewhere in England during the war, to do with high speed counting electronics, and a lot of that was very influential in the building of EDSAC. So a conventional history of computing runs something like this: a machine called EDVAC, designed by the late great John von Neumann, that had the greatest influence arguably on early architectures, but EDVAC wasn't delivered until 1949. It didn't begin operation until 1951. Then the Manchester Baby, or the small-scale experimental machine as it was called. That first ran on the 21st of June 1948 and was the first thing that we would recognize today as a stored-program computer. And then EDSAC followed very shortly afterwards, built here in Cambridge by Maurice Wilkes, that ran its first program on May the 6th 1949.
There's another machine called ENIAC which is often credited in a lot of literature as being the first operational computer, but it wasn't begun until June 1943. It wasn't announced and operational until February 1946, it wasn't reprogrammable and it wasn't Turing complete. And it was definitely later than Colossus. So, I took this picture from a U.S. publication in 1957, which purports to show an entire family tree of the development of computing devices at the time. So you can see it recognizes ENIAC, and EDVAC really is the foundation of that machine. Now Colossus might not have been Turing complete, but neither was the Harvard Mark 1 which you can see down there, completed in February 1944 and first running in May 1944, that wasn't Turing complete either. ENIAC, announced February 1946, that wasn't Turing complete either. Colossus pre-dates everything, literally everything on that tree, and it doesn't appear on that publication from 1957, and it would not appear in any publication for another 30 years after that at least, because it was kept secret for 40 years after the war. And that really denied Flowers, principally, and others like Newman, denied them their true place in history as far as I'm concerned, in the development of electronic computing. Flowers incidentally, and very sadly, ended the war in debt because he'd used some of his own money to buy components in order to build Colossus, because he couldn't get it funded by Newman, he couldn't get it funded by the Post Office either. Others, like Newman say, went back into academic fields and were recognized in the official history of the development of computing. Flowers went back to the Post Office. And he remained in the shadows probably for about another 50 years, if not more. And was never really recognized in his own lifetime.
Even more sadly he went back to the Post Office and he re-pitched his idea about building an entirely electronic telephone exchange to his bosses, and his bosses said no, you can't build a machine that big out of valves. Flowers knew that it could be done because he'd done it, but he wasn't allowed to say how he knew, and he wasn't allowed to give the proof, so he was never allowed to build his electronic exchange. But the best thing about Colossus, its position in history, is that you can still see it. Or at least you can see a rebuild of it. Although as we've said all the extant machines were destroyed either immediately or very shortly after the war, all the drawings were burned, all the plans had been burned, when the existence of Colossus emerged, one man undertook to recreate it. And as all engineers do, it turns out that some of the people involved had kept bits, of course they did. How many engineers do you know who've got a garage full of interesting bits of stuff that they keep just in case, because it was interesting? So people had kept photographs, they kept odd little bits of circuit diagrams, and they kept odd little bits of components, and using those, Tony Sale started to try and rebuild and recreate Colossus through the 1990s. It was completed in 2007. Took a huge amount of time to do it, mainly because of the scarcity of information. And if you go to Bletchley Park today, you can see it in action and it's an incredible piece of engineering. If you haven't been to see it, please do. Because what you can see, and if you're lucky and if they let you over the barrier, actually touch - is a machine that literally changed history. Not only did it shorten the war and save a huge number of lives and a huge amount of pain, but it gave birth to an industry as far as I'm concerned. And as far as many historians are now starting to agree. And it gave birth to modern computing technology and the computing industry. A few more pictures there.
So who do you dedicate something like this to? Well, I think it's well known that a lot of people who worked at Bletchley in the war, you know, they were all told to sign the Official Secrets Act, they never talked about what they did. One of the most poignant quotes that I can find about that comes from a lady called Katherine Cocking who worked there through the war. She said afterwards "I was in fear of the secret that I had to keep, but keep it I did, and my greatest sadness is my beloved husband died in 1975 without ever knowing what I did in the war." That applied to a huge number of people who worked there, not least of all to Tommy Flowers. So if you're going to dedicate this to anyone, there's 12,000 people who worked at Bletchley Park, and 12,000 people who kept a secret for as long as they had to. Bill Tutte and that incredible feat of analysis, an entirely mental exercise that deciphered the internal structure of an incredibly complex machine from 4000 essentially pseudo-random characters of a key text. Tommy Flowers, who sadly died in 1998. He never saw Colossus rebuilt, although he knew that Tony Sale was trying to do it, and was never really recognized in his lifetime. Max Newman, who took what had happened at Bletchley and, if you like, won the race after the war to build the first computer at Manchester, and then Tony Sale, bless him, who devoted the last 15 years of his life to bringing Colossus back to life again. And doubtless the million possible lives that Colossus saved by shortening the war. So, thank you. I hope you found that as fascinating as I did, trying to tease out what Colossus did, and some of the incredible people who were involved in doing it. Thanks.

Info

Channel: The Centre for Computing History

Views: 244,817

Rating: 4.9162059 out of 5

Id: g2tMcMQqSbA

Length: 60min 26sec (3626 seconds)

Published: Mon May 04 2020