hello and welcome, i'm James Murphy, and in this
video we're going to crack some passwords. thanks to this video's sponsor, Anvil, a browser-based
full stack development platform allowing you to make web apps with nothing but python. more about
Anvil at the end of the video. sorry to disappoint any wannabe hackers in the crowd, but this is not
going to be a hacking tutorial. after watching this video you're not going to be able to just
go out and hack your friends, or your colleagues, or government databases. what you are going to get
from this video is a quick dip into the deep end: what are some of the things that make it so
incredibly difficult to write secure software? in my opinion the biggest difference between the
way that code is modeled in the theoretical world and how it actually is in the real world is the
amount of time that code takes to execute. well in the real world there's an entire class of
attacks that exploit the physical amount of time that an algorithm took to execute. these are
called timing attacks. that's what we're going to investigate in this video. we'll be able to
crack a password that's 20 characters long in no time. okay so let's get started. what's the
system that we're going to hack in this video? well i have a server, which
is really just a function, and all this does is i pass in a user and my guess
for the password and it's going to look up the password in the database, which is actually just
a dictionary up here, and after it looks it up in the database all it does is tell you whether or
not your guess was equal to the actual password. in this video, we're going to be taking the
attacker's perspective, so we don't get to see the check_password function. check_password is just a
black box. we plug in the username and the guess and we either get "you got it correct" or "you
didn't". we don't get to see anything else. i know this is silly, check_password is just a function
in the same file, this is totally ridiculous. but think about check_password as actually going out
to the internet and trying to log into a server. this is a pretty bad server because it's going
to allow us to make many many many guesses and it's not going to ban us for spamming or anything
like that, too many wrong attempts, but i think it gets the point across. okay let's get to cracking.
at first glance it might seem completely hopeless. passwords can be extremely long, 30 characters or
more, and all we're getting out of this is "is the password correct or not". if we get even a single
bit wrong, then we basically gain no information, right? well not completely. the whole theme of
this video is that we do get one extra piece of information that the server didn't tell us about.
we can measure how much time it took to respond. if we suspected that the server was
using a bad implementation like this, just equality checking the actual password
against the guess, which, by the way, the server shouldn't even store your actual
password, then we might be able to exploit something about how long it takes to compute
the answer when the guess is closer to correct. in pretty much every programming language, the
first thing that happens when you compare two strings for equality is it checks to see if
their lengths are the same. if the lengths are different, then you can pretty quickly return
false, so if we query the server with a guess, if the length of the password is wrong, then the
server will respond faster. let's see if we can use that to crack the length of this password.
okay here's what we'll do. we'll take the user and the maximum possible length that we think the
password could be. then we'll just loop over the different lengths and use the timeit.repeat method
to time how long each password attempt takes. the statement that we're timing is this check_password
of the user and a guess x, where here the user is inputted as whatever the argument was, and x is a
random string generated by this function. repeat returns a list and we just take the fastest time
from that list to use as the server response time. we take the fastest time here because the server
might have just lagged for some unknown reason and we don't want that to slow down our timing. we
want to measure its performance assuming it didn't lag at all. to recap, all i do is guess passwords
of a bunch of different lengths and then return the length that takes the longest to respond, and
here's what we see when we run it. passwords that were randomly guessed of length 20 took a little
bit longer than the rest. notice that the second longest choice was 93 percent as fast. and what do
you know? the length of the actual password is 20. and this is not a fluke. if i change the length
of the string so that now it's length 19 instead of 20 and run the program again, now we
see that the most likely length is 19. of course if you have other things on your machine
going on that are affecting the timing, like recording video, then it's going to make it a
lot less likely to give you the right answer. when i ran it again i got that 25 is the most
likely length, which is not correct, but 19 was still in the top three. so if you were a real
hacker trying to do a real timing attack you would probably be doing this on a dedicated machine that
doesn't have lots of random programs executing at the same time that might mess up the timing.
okay, finding out the length of the password is one thing but guessing the actual password is
something entirely different. so can the timing attack do that as well? well if the implementation
was one of these simple equality checks, then yes, because here's what an equality check for strings
looks like under the hood. as we mentioned before, first it does a length check, and that's what
we use to crack the length of the password, but after it does the length check it goes character
by character and if any of them are not equal then it will return immediately. because string
comparison will return early if the password is incorrect at an earlier position, again we find
that the more correct the password is, the longer the server will take to respond. this is exactly
the same thing that we can use to exploit as with the length. so here's the code that i came up with
in order to crack a password using a timing attack given that you've already cracked the length
of the password. it's similar to cracking the length itself except we have to go position by
position and get each character right in turn. we start with just a completely random guess of
the correct length. we don't know how much of the password that we've guessed so far is correct,
so we're just going to repeatedly loop over the entire string, try to get the first character
correct, then the second, then the third, and so on until the end, and then start back at
the beginning. we'll just keep repeating until we get the right password. for each position in the
password, we try every possible character that can be in the password, and then we see what happens
when we put that character in that position. we'll time how long it takes for our new guess,
we'll time how long it takes for our old guess, and then whichever one took more time we keep as
the old guess for the next iteration. notice that, by random chance, we could have a more correct
guess get replaced with a less correct guess. i'm sure that we'll see that happen when we
actually run it, but nevertheless the more correct guess is more likely to take longer and so
eventually we will build up the correct password. here we check to see if our new guess is the
correct password and if it is then we're done, we return. otherwise what we do is we take the one
that took longer and use that as the guess for the next iteration. and that's it. all we have to do
now is try it. so first we will crack the length of the password and then use that length pass it
into the cracked password function and see how long it takes to find the actual password. all
right let's just see how well it does. [music] so just on the first iteration it's gotten
a lot of the characters right already, let's see if it can finish off the
last few in the next one. [music] and there you go. it finished, password
cracked: "subscribe to mcoding". so there you have it. we just cracked a password
that was 20 characters long using nothing but timing information. this was just one example of
one timing attack. and timing attacks themselves are just a single class of examples in a huge
sea of things that you can mess up, and if you mess up anything then you're going to get yourself
hacked. don't trust the security of your customer data or of your web app in general to something
that's not extremely well tested and proven. and speaking of writing web apps, if you're
interested in that kind of thing, be sure to stick around and hear about our sponsor Anvil. that's
all i've got for now, see you in the next one. Anvil has everything you need to develop and
deploy a web app written completely in python, including the front end, so no javascript or html
required. and it's free to try. create your user interface by dragging and dropping components.
Anvil's web-based ide allows you to develop in your browser, no need to install anything. you
can use Anvil's built-in database and let Anvil handle the tricky parts like user authentication.
when you're ready, deploy your app to the cloud with a single click. check out the link in the
description to get started with Anvil today.