Cracking Enigma in 2021 - Computerphile

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments

I loved this. If the video is too long for you, spoiler alert - he cracks it. The video is really about why it is not as trivial to crack as one might think. Definitely worth watching the whole journey. He explains in great detail how surprisingly complex enigma encryption is and why his solution even has a chance.

👍︎︎ 22 👤︎︎ u/j1mmie 📅︎︎ Apr 19 2021 🗫︎ replies

Always felt the plugboard was substantially less secure than the raw numbers would suggest, and I'd say this video also shows this.

If we went with letter frequency tables rather than IoC, and we had the rotor settings, we'd be able to determine with quite a lot of confidence when we had the correct plugboard for the most common letters (t and e in English, e and n in German). They're frequent enough that there's a good chance of seeing them just from the unplugged cypher text letters.

With a known plaintext, as Bletchley park had, it's even easier. A Bombe simply tried different combinations until it reached a contradiction, and could backtrack, eliminating whole swathes of possibilities.

👍︎︎ 6 👤︎︎ u/squigs 📅︎︎ Apr 19 2021 🗫︎ replies

I could listen to Dr Mike Pound for an hour

👍︎︎ 19 👤︎︎ u/Humberd 📅︎︎ Apr 19 2021 🗫︎ replies

It's sensationalized, but if you haven't seen Imitation Game, definitely check it out! It was very cool in the movie to watch Turing's computer run, trying to decrypt the Enigma messages.

👍︎︎ 8 👤︎︎ u/prismatic-io-taylor 📅︎︎ Apr 19 2021 🗫︎ replies

Does anyone know of any other old timey encryption methods that can be easily cracked by a hobbyist today?

👍︎︎ 2 👤︎︎ u/JB-from-ATL 📅︎︎ Apr 21 2021 🗫︎ replies

10 years ago I would have enjoyed this style of video a lot better. Now I want the conclusion at the start, and the video explaining it. Having to watch the whole thing or poke the timeline at random is ridiculous these days.

👍︎︎ 10 👤︎︎ u/DragonCalypso 📅︎︎ Apr 19 2021 🗫︎ replies

How hard is it to multithread work in Java? I was looking at his code, and the longest part of the search, the rotor search, is an embarrassingly-parallel problem, but still being done in a single-threaded way.

👍︎︎ 1 👤︎︎ u/MEaster 📅︎︎ Apr 20 2021 🗫︎ replies
Captions
so it was in the news that there's a 50-pound  note coming out with alan turing on it now alan   turing has been featured from time to  time on our our channel and rightly so   dave's done some fabulous videos on all the  work that he and a lot of others were doing at   bletchley park you know cracking enigma and it got  me thinking you know now that we're in the 21st   century and i've got a laptop that's much more  powerful than all the cheering bombs put together   how easy is it to break enigma like today right  so i thought let's code up and find out the first   thing i should say is they've done some really  interesting videos on the history of enigma and uh   do go and watch those right and it'll give you an  idea for sort of things they were doing to try and   break enigma back during the 40s not to trivialize  it but it's really difficult isn't it it's really   really difficult right you know this isn't  something one does by hand right not quickly um   the enigma machine is not a stupid idea right it's  well designed the only thing that we've got now is   much much more processing power and so things that  we couldn't possibly brute force back then maybe   we can start to begin to brute force now right  and that's what we're looking at today so let's   look very briefly at what the knitting machine is  and then we'll have a go at actually breaking one   and see whether it holds up if you recall it has  some lights on some letters and it has this is my   technical description has some buttons that you  press keys are they cool sometimes so you press   an a right you press an a and it goes into the  machine and it goes through something called a   plug board and the plug board will swap it so this  is the plug board here now plugs just swap certain   letters in pairs so maybe this a comes out as a an  f right or something like that then this goes into   the first rotor and so that maybe comes out as a q  or something like that so you know it sort of just   wiggles around in here and then this comes in this  comes out as maybe you know a p i'm making this up   and then this goes into the next rotor to  stick to three rotary enigmas for today hey   uh comes out as a sort of an s and then it  goes through something called a reflector so   this is getting quite complicated but this is all  mechanical right these rotors literally physically   turn around and they just have wires soldered  from one end to the other that connect up so the   reflector just bounces s goes to something else i  can't remember which led is and then it goes back   through this like this and then it comes around  through the plug board again so maybe that d   goes to i mean a's over here that was that  was a bit silly of me wasn't it um over here   for the purposes of this it's all absolutely fine  and this comes out as a zed right so in my in in   my weird botched enigma diagram the a went in and  it came out as a zed it was encrypted as a zed   now what's hard about this problem well you  can take these rotors out right each rotor has   different wiring there's usually five or eight  rotors to choose from and you can put them in   any position you like the next one is that every  time you press a letter one of the rotors rotates   and sometimes it rotates the next rotors along and  that means that this mapping this transformation   changes every time you press a character so  you press a it's not going to do the same   thing again no actually if you press a the rotor  first turns and then lights up a letter right so   it's not going to do the same again if you press  aaaa the only thing you're guaranteed due to one   of the quirks of enigma is you won't get any  a's back but you'll get a lot of random letters   back during the war they solved this by trying  to find possible rotor configurations that   definitely couldn't work or could work based  on some guest playing text so maybe they had   this idea that the first part of the message  contained the word weather report and so if   they put that in and they then they could find all  the encryptions that couldn't possibly have come   out as weather report and so on and they could  start to narrow down what their options were now   that's called a known plane text attack but during  the war if they hadn't known the plaintext they   wouldn't be able to crack enigma they didn't have  the brute force power to do it nowadays we have   pretty fast laptops right and you know beyond my  laptop other computers are even faster than that   would you believe so theoretically we could  start to try out some of these combinations   even if we didn't know any plain text what we're  going to be looking at at least to begin with is a   ciphertext text only attack that's an attack where  we've only got the cipher text we don't have any   idea what the plain text is and we want to see if  we can guess what some of these settings would be   there's a few interesting weaknesses of enigma  that make it a little bit practical to brute force   but not actually as much as you think we talk  a lot about how a letter doesn't encrypt ever   to itself and that's quite relevant for plain  text attacks known plain text because you can   then work out where plaintext definitely can't  or could be we're not doing that today so i'm   less worried about that particular property of  enigma more interesting to me is the fact that   if you get some of the settings right like you  get this rotor correct but these two wrong that   will often be better at decrypting than if you get  them all wrong right so as we start to move slowly   towards a solution we get a little hint that maybe  this ciphertext isn't quite as good as it was   and we can start measuring it so really what  we need to do is find some way of putting some   some ciphertext into an enigma machine with  some configuration and then reading the output   and saying is that a plausible sentence or not  right and preferably we'd like to do it really   really quickly because otherwise this could take a  long time so we're not going to be doing any deep   learning can't be bored with that um we're going  to use very simple statistical properties of text   to try and measure whether one sentence is better  as english than another sentence right and there's   a few of these and i've implemented all of them  because i thought well let's just test them out   right so the first one is something called index  of coincidence so let's suppose you have a cipher   text so for example oh here it is i'm just looking  at random text right so yeah okay i'm just going   to copy some random cipher text from here right  so why i can't just randomize my own soft text   i don't know but i feel more comfortable copying  it than i do just thinking up clever interesting   letters on my own said well so there's no  number files on the fact that i don't pick   as random as i think and stuff like that yeah  so you then do s e uh b h w this is some actual   cipher text that we'll be breaking later does it  honestly start with zeus as in conrad's use yeah   the reality of random yeah yeah yeah yeah um you  see all kinds of words in the random ciphertext   and they mean absolutely nothing because it hasn't  been decrypted yet uh oh g yeah well exactly   e u so let's suppose we have some type of  text like this we guess some rotor settings   and we put it through our enig machine enigma  machine and that will decrypt it right now it   will probably decrypt it incorrectly but where  we accidentally stumble upon the right plugs   or we stumble upon the right rotor configuration  even for a briefly we'll find that this decryption   is slightly better than completely random right  because actually mostly this is completely random   right yes we have this stipulation that letters  can't turn into themselves but generally speaking   it looks completely random so this is what we get  out so h f my writing is bad today v v f l i n   g now the interesting thing about this is i mean  it's complete nonsense right let's let's i'm not   cracking any words on this but i do recognize ing  now ing is a fairly common trigram or set of three   characters in the english language right now that  doesn't necessarily mean this is correct but it's   slightly more english should we say than this one  right so if we were measuring some amount of how   english is this sentence how you know then it's  a little bit closer than this so maybe one of our   rotors is in the right position and the others  are wrong right or our rotor's in the correct   position but our plugs are wrong or something  like this right and the idea is that we slowly   go through different configurations of our enigma  machine i say slowly as fast as we can right and   we measure statistical properties of these  output sentences to find the ones that clo   most closely resemble correct decrypted text  right and we can do this without looking at them   we don't have to look at them and say well  that's a real word we just measure statistical   properties so what are some of these statistical  properties well the first one is called the index   of coincidence or i o c this is the probability  that when you pick two letters at random   they'll be the same so for example if we randomly  pick p and then we randomly pick p that's versus   the same if we pick p and then l they're not the  same you know we won't write the formula the form   is not that complicated so but what you have to  do is go through and count count every single   character and how many of each one there are you  produce a histogram and then you can calculate   the index of coincidence based on this now  for random text that is text that's been put   through the machine are not decrypted the index of  coincidence is usually something like 0.038 right   which is basically everything's evenly distributed  there's nothing interesting going on there at all   but for decrypted english text we usually  get a higher index of questions about 0.067   and i think it's something like 0.072 for german  text one way of looking at it is it measures the   the the fact that some characters have more higher  probabilities than others if everything's equally   likely you get something like this if some  characters are quite common and so they tend   to come up in pairs it starts to look a little  bit higher right it never goes higher than this   so well not really what we can do is we can work  through our rotors different rotors different   positions different settings and we can calculate  our index of coincidence and we take the best   scoring texts right so where our output has  a higher index equations we think maybe we've   got the rotor settings correct right and that's  basically how it works there's been a number file   right where they actually got to use the enigma  machine and i'm super jealous they talked about   the number of different combinations so it's all  very well saying okay we'll just go through all   the rotor settings and work out you know what  the best one is right maybe if i have you know   a super super super computer um but actually  enigma has a nifty weakness in this sense right   which is that if you get some of the settings  correct this will improve right if i get the voter   positions correct even though the rotors are in  slightly wrong position the result will be better   than if i've got the wrong rotors in place right  if i get one of the plug board settings right   the results would be better than if i got none of  the plug board settings right because basically   fewer characters will be incorrect right  subtly so you've got three out of let's say   five rotors or eight rotors so that's physically  the three that happen to be in the machine at   the time yeah and you've got their different  positions right so if for today we just talk   about five rotors just because i've always been  sitting here a little bit longer then we've got 60   possible positions one we've got to choose three  out of five and then they can all go in any slot   you don't tend to have the same motor twice right  because i mean they didn't have those duplicates   of rotors yeah one set of them yeah yeah yeah yeah  um so we've got that then for each of these we   have a start position from from uh 1 to 26 which  is you know what letter is showing on the top   right basically how how rotated it is so you've  got the start position or the indicator setting   so there's you know times let's say 26 of those  times by three of the different motors right then   you've got the ring setting now the ring setting  is essentially rotating the internals of the   rotor right now actually if you if we ignore the  notch for a minute which i'll talk about if you   rotate the ring and you rotate the the um actual  whole rotor they kind of cancel each other out so   it's about the notch position really the notch is  when the rotor turns it turns the next one along   and so the combination of your start position and  your ring setting will mean where your notch is   and then when it turns around so you've got the  ring settings right which is going to be 26 x   3 again then we've got the plug board which  as you know swaps random characters and that's   got something like 10 different pairs out  of 13 possible pairs that's 150 trillion   i think different combinations that's you know  out of reach of my laptop um certainly when i'm   doing all these decryptions as well and we're  multiplied by all these things right the number   file goes into much better detail on this we're  looking at five today because uh again i don't   want to be here all day it does get harder to  solve and we'll talk about that so this is a   lot of combinations it's too many combinations  for me to go through even with this nice little   industrial coincidence thing right even though  when i get this exactly right i will just get   the plain text out and it will have a very nice  illness coincidence and i might be able to find it   so what do we do well the weakness of enigma is  that if we get some of these things right even if   the others are wrong we get a little bit closer  to the answer usually so for example if you get   the correct three rotors in the correct positions  and you get their start positions roughly correct   if your ring settings are wrong all that'll do  is mess around with the notches so you'll get   bits of your plain text correct and then bits  of cipher text and then bits of plain text and   you get these kind of pockets of valid characters  coming out of into the into the decryption it will   still score better on ioc or any other metric  so that's what we're going to do and this is   the same with the plugs if we get the rotors and  the start positions and the ring settings correct   then we can start to guess plugs and  generally speaking if we guess one correctly   the output would better and we can then move  towards a solution there's a lot of possible   variations but the fact that we can deal with some  of them at a time makes this practical right if we   had to brute force through all the different  variations it wouldn't be possible that's the   idea so i've written some code to do this um if  you want to have a go i'll make my code available   but also there's a really good online tool called  crypto which lets you do this in a visual way   uh we'll put a link to that in the description  but i've written some pretty simple code here i   implemented an enigma machine because it was fun  um and then i implemented a number of different   fitness functions right which is how good is our  decryption in index of questions is one right   i also um maybe we can talk about some others  another time so you were kind enough to send me   some ciphertext i don't know what it is and it's  been encrypted by some enigma configuration with   i think five plugs the first thing i do is i go  through all the different rotor configurations   and i find the one that has the highest index of  coincidence score when it decrypts that message   so this is of five different rotors each one tried  in each position and at each starting position   so that's 26 for each one so that's quite a few  combinations about 17 000 but 17 000 for a laptop   in 2021 not such a big deal right takes somewhere  around 10 seconds or something like that right so   you can see what it's doing now is it's stepping  through the different rotors so one two three one   two four one two five and we've done about 10 or  15 configurations already and for each of these   it's going through all the different starting  positions but we're not looking at ring settings   and we're not looking at plug board because that  will just multiply this astronomically by the   number of things we have to do so we're already on  rota 3 in the left-hand side we're keeping going   this same thing works exactly the same for eight  rotors it doesn't really change anything it just   takes slightly longer and i'm a bit lazy so i  have actually coded up the other motors as well   um interestingly enough some of the later  rotors have two notches on right which is   not it doesn't make any difference in terms of  the cracking because um so that just means it   turns the next one twice as well yeah yeah twice  often yep um only really affects the first two   rotors the last one doesn't ever really turn that  often and it doesn't have any other motor to turn   so here what we've got is we've got a list of  the top performing rotor configurations so 2   5 3 is the best performing rotor configuration  with start positions of 21 3 and 25   i'm using zero indexing right which is not how  you would normally do it in enigma but it was   easier for my array indexing to do this right  and that has an index of coincidence of 0.043   which is a lot higher than 3 8. i say it's a  lot higher it's a little bit higher good enough   so that suggests to me i mean we actually i listed  the top 10 here because sometimes you might not   get the one on the top one you might get the next  rotor configuration or something like this it's   worth maybe if you were trying to really actually  pay attention to this what you would do is maybe   start doing further attacks on the top three  rotor configurations just to keep your options   open so we're gonna fix at two five and three  because you know it saves time right um so given   rotors two five three from left to  right and their starting positions   what we now can do is we can start to brute  force through the ring settings so we can   find the best possible ring settings right now  this is almost instant because there's now only   600 or so of those right um we don't need to  try the left-hand rotor because it doesn't   really rotate and so we do that and it's already  happened and the best ring positions were 0 3 23   right now the zero we ignore because it's not  the ring possessing remember affects where the   notch is and the left rotor doesn't turn anything  over so it doesn't have any effect so given that   this is the ciphertext we've got out this was  our original ciphertext and this is our slightly   better ciphertext now it still looks like total  nonsense right but it has a much higher index of   coincidence score than the original which means  in some sense it's less random so if you look you   might start to see groups of letters that might be  a real word they might not i don't know right but   some of the real letters are going to come out  here we might not be able to see what they are   so given this we can finally start addressing this  really problematic plug board situation remember   there's far too many plug ball combinations to  realistically just try them all but again we have   this wonderful benefit that if we get one of the  plugs correct the result will probably be better   than if we got none of them right so what we do  is we go through each of the first 300 and so   different possible plugs just one at a  time and we see which the best one is   right and then we fix it and we do the next for  the next one and that's two plugs and then we do   for next to the next one and that's three plugs  right and so on and if we do that we very quickly   come up with a few sets of plugs and our  ciphertext is starting to look a lot better right   so this is our ciphertext here the first letter  is a nonsense but then is reposed to consider   the quest que consider the quest can machines  think ah see now i'm starting to guess what   this might be oh i see what you've done here this  is the alan turing paper so some of the letters   are wrong right so it should be i propose to  consider and it's j brapos to consider right   and we're nearly there but that's because when  we optimized the motor configuration we fixed   the rings at zero zero zero so it was never going  to find the exact correct thing so essentially the   turnover is slightly wrong everything's slightly  wrong but it's still pretty good now if we go   back to our original question as to how secure  is enigma the answer is not very secure right   and the reason is not because it's trivial to  break right this took me a little bit of effort   and for short messages where these fitness  functions start to break down because you don't   have enough information they're actually actually  very robust right for a 50 character message   very very difficult to break using something like  an index of coincidence because even if some of   the letters start to appear right there's not very  many of them the index is just noise in the war   they limited or they they attempted to limit  messages to something like 200 250 characters   for this reason because index of coincidence was  already known right and there's now more powerful   metrics like trigram scores and quadrants which  i've also implemented which often work better   particularly for the plug board and so if  you have a short message you don't get very   much information on the different frequencies of  different groups of letters and so there's really   no way to know what's going on at all right and  you get very lucky or you don't and most likely   you don't right the other issue is of course the  number of plugs i've only done five plugs here so   i've cheated a bit right for most german messages  were sent using 10 plugs you're going to need 1200   to 1500 characters before fitness functions are  going to start to give you something right if you   know what the playing text might be this becomes  much much easier right because if you can fix   these characters have to be exactly this your  fitness function is much less noisy right i know   actually i've implemented that as well and it just  starts breaking it no no like nobody's business   right so it is crackable right if you know if you  can guess what plaintext is and of course modern   cryptography assumes you know what the plain  text is right at least for some of the message   for example whenever you send an http message  to a web server the beginning bit always says   http get or something similar right but there's a  very known structure to these things you can start   to guess what they would be we can't assume that  you wouldn't know what any of the plaintext was   but even if we don't you can see that these index  of coincidence and trigram scores and things can   start to tease out some information um so going  back to the beginning enigma is actually harder   to crack than i thought right people always talk  about how hard it was to crack during the war and   that's absolutely fine but you just kind of assume  that now it's 2021 laptops should be able to just   click go for all the settings find yourself  the ciphertext doesn't really work that way   right you have to try and stumble your  way towards it and often it doesn't work   and there's noise in the output and so you have  to try and work out whether what you're seeing is   actually improvement or not and so on which  i think is quite interesting modern ciphers   don't have this problem if you have a 128 bit as  key you can't brute force the first bit because   different like the zero or one won't give you any  better or worse playing text right it will just be   nonsense each time and that's true of any amount  so you can't do like the first half of the key   and then the second half of the key which is kind  of what we're doing here so modern ciphers don't   have this problem and so on we can't do a lot of  interesting things with this image after just one   set of convolutions but we're getting there so  this one is starting to be transformed some of   them are noisier than others paint associated with  it showing which was the proper setting with a at   the top an a against the dot of paint AUTOCAPS MANUALLY PUBLISHED
Info
Channel: Computerphile
Views: 787,747
Rating: 4.8946471 out of 5
Keywords: computers, computerphile, computer, science, Enigma, Turing, Mike Pound, University of Nottingham, Cipher, Cracking, Crypto, Cryptography, Alan Turing, Bletchley Park
Id: RzWB5jL5RX0
Channel Id: undefined
Length: 21min 20sec (1280 seconds)
Published: Mon Apr 12 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.