The Parts of JWT Security Nobody Talks About | Philippe De Ryck, Google Developer Expert

Captions
Good evening, everybody. I'm glad to be here. There was an OWASP conference yesterday and today here in Tel Aviv, so I'm actually in town for that conference. Whenever I travel, I like to do a talk for a local meetup as well, because organizers are usually looking for speakers and I don't mind speaking. I usually speak for days on end, so I'm pretty good there.

Tonight we're going to talk about JSON Web Tokens, and security in particular. Not about the traditional things: I want to talk about some of the hard parts, the stuff that tutorials on JWTs and people talking about JWTs usually don't cover. I think that's kind of a pity, because there are definitely a lot of things you should be aware of.

Before we start, let me ask you a question: what do you know about JSON Web Tokens? Here's an example of a JSON Web Token at the bottom. What do you know about these things? Anyone? Yes. All right, other things? I like that answer, especially the claims part. What's the data you see here, what kind of format is that? Three parts, yes: two JSON objects and a signature. Absolutely, and in Base64 encoding, for one. How do you use JSON Web Tokens, where are they used? You get them and you send them, yeah, sure.

Essentially it's what you see: it's a way to represent claims in a secure way, with a signature. How you use that is totally up to you. Many people have custom implementations, but we also have protocols built on JWTs. For example OpenID Connect, the protocol you use for the "log in with Google" or "log in with GitHub" kind of thing, uses JWTs under the hood to represent information about your authentication to the party you're authenticating to. So that's a very good introduction; those are the basics of what's happening.

Let's talk about some of the aspects, and let's start easy. Here's a decomposed JSON Web Token, so you can
see the two JSON objects that you mentioned. Is this a valid token, yes or no? What do you think? Yes, that is a correct observation: the algorithm here is none, so it's indeed not signed. But is it valid, yes or no? Yes, it actually is valid. Very good. This is allowed by the spec. The spec says you can have a signature that's missing, none, and that's totally fine.

When people implemented the JWT libraries a couple of years ago, about five years ago, they actually followed the spec. They said: if it's none, then we don't check a signature, we simply take the claims in the payload of the JWT and return them, because that's what we're supposed to be doing. Of course somebody figured out that this is probably not the best idea, because as an attacker I can generate any JWT I want, with any content. I don't even have to provide a signature, and a service that uses such a standard library will accept that token. So if you're using JWTs for something sensitive, like authentication or identity information, I can simply give you a fraudulent identity token and your service is very likely to accept it.

This happened in a lot of libraries. Even today, if you go to the jwt.io website, where they have an overview of all the libraries that support JWTs, there is still a warning there: watch out, in 2015 we discovered this major problem, and you should probably update your libraries to prevent those things from happening. There was a second problem, a confusion between symmetric and asymmetric signatures, that also allowed an attacker to craft arbitrary tokens with arbitrary claims, and they would be accepted by a backend using these libraries.

So that's an example of how tricky it is to use JWTs. This is something that was introduced five or six years ago, and we have run into some major problems. These are implementation problems, so these are not necessarily
things a developer does wrong; it's mainly a library problem. But nonetheless, your applications would have been vulnerable if you were using these libraries. This is just to set the scene. I'm going to talk about a few other things in the remainder of this presentation.

First, a small word about myself. I'm Philippe De Ryck, I'm from Belgium. Like I said, I'm in town for the OWASP conference and I'm doing another talk here. I've been nicely introduced: I'm a Google Developer Expert, which means I'm not employed by Google. It's just a recognition by Google for the work I've been doing for the community to spread knowledge about security. My main activity today is running Pragmatic Web Security, a company based in Belgium, but that doesn't matter much because I travel the world to teach developers about secure software. I essentially have training courses that I teach at companies. I'll be back in town in a month to teach one of the tech companies here in Tel Aviv about secure coding and all aspects of web security. That's what I do, and I kind of like what I do. That's also why I don't mind presenting; I do that for days in a training.

So let's go on with JSON Web Tokens. I'm not going to talk about how you use them to represent authorization state. There's a whole discussion going on about whether you can actually do that or not. Some people say that's fine, other people say: are you crazy, don't do that. I'm not going to talk about that, so I'm just going to use a simple claims example.

We're going to talk about this part here, the signature, and what it means to have the signature. What you probably know about the signature is that it's important. The signature is generated over the header and the payload, and it is generated by the one who issues the JWT. That service generates the signature and sends it along in the token. Whenever you change something here, and somebody verifies that signature, they will
detect that something has been changed and they will reject the token. That's the whole point: it's to represent claims securely. These are the claims you want to represent securely, this is some metadata, and the signature is essentially your tamper detection mechanism. The moment I change this roles claim and add something like admin there, the signature will no longer be valid, and the server accepting that JWT will say: no, that's not good, I'm not going to do that.

How does this work? A very simple mechanism for this is called an HMAC, and an HMAC works as follows. You have the data, which is essentially the header and the payload of your token, and you push that through a signing algorithm. There's an HMAC function in the crypto library, and the library you use for handling JWTs will do that for you. You give it a secret key, and out comes this HMAC. That HMAC is essentially the blue part you saw on the previous slide. It will be added as the third part of the JWT and sent to the client along with the rest.

When you get this back, you'll have to verify the validity. When you get a token, you get the data and a signature, and you have to verify: is this indeed a valid token, yes or no? What a library is going to do for you if you call the verify function is recalculate that signature and compare it to the original signature attached to the token. If they match, you know that the data is the same as the data that was signed before. All good. If they don't match, something has changed in the input to this algorithm. It's not going to be your key, so it must be the data. You don't care whether it's a single white space or 15 megabytes of additional data. If it doesn't match, it doesn't match, and we reject that token. That is a very sound mechanism of doing that. All right, libraries handle all of this for you if you use them correctly. What are
some of the problems with this mechanism? If you look at this setup again, what are some of the problems with this scheme? Why is this maybe not very useful in most cases? Very correct, sir: both services in this case need to know the secret key. And why is that such a big problem? Sure: in the real world these might be different services controlled by different parties. This is a problem because the same key is used to generate signatures and to verify signatures. Meaning: if you set up this scheme and you have a partner who wants to verify your JWTs, and they say, hey man, just give me your secret so I can verify whether a token is valid, and you give it to them, they're going to be very happy. They can verify your JWTs, but they can generate arbitrary JWTs as well. So you should never share your key. That's one thing to remember.

But there are other problems. What is this secret key? It's called a key, but it's actually just a string, and usually it's a silly string. Let's go back a slide, so I can show you. This is the secret string here, and in my case it's "super-secret HMAC key", which is probably not the best secret to use as input for your HMAC algorithm. Especially not if you put it on PowerPoint slides and show them to everybody who wants to listen.

So this is definitely a problem, and because of that people have started attacking it. You find attacks in the wild where people take a JWT from a service and simply start brute-forcing the key. Why not? It's simply a string, and if the string is not very long and it is predictable, they can eventually find the string you used to generate the JWTs. The moment they have that, they can start generating arbitrary tokens, which is a very big problem. This is an attack that is often observed in the wild, especially if you have weak secrets for your HMAC function. It's actually a known problem. This is something they envisioned from the start, and they know
this is a problem in crypto use. There is advice in the spec saying you should pay attention to your key size: you should have a key that's at least as long as the output of the hash used in the HMAC. For JWTs this is going to be SHA-256, which is used by default, so your key should be at least 256 bits. Longer is better, of course. And don't publish it in a repo, don't put it in the codebase. Just keep it somewhere secret that no one else can get a hold of.

That's just one example of a problem with HMACs. Your colleague, friend, random attendee of the meetup, whatever, has a very good point: you don't want to start sharing that key between services. But what if I want to validate a JWT from someone else? What if I have a Google login scenario, where I get a token from Google and I want to check whether it's valid or not? That's where we use a second signature scheme. JWTs support HMACs, and they also support asymmetric signatures, which work like this.

You have your data, and you're going to generate the signature using a private key. You again use a cryptographic function for that. It uses RSA under the hood, and some other algorithms are supported as well, but you don't have to worry about this. You simply call the function and out comes the signature, which is attached to the JWT and sent to the client.

This is where it becomes interesting. When you get this back and want to verify the signature, you use the library function again, verify signature, and it's going to ask for a key as input, but this time a public key. It's going to give you a boolean result, true or false, saying either the message is the same as the one that has been signed, or something is wrong. That means the data going in might be different, or the public key may not be the right public key to verify the signature. This is a much stronger mechanism. It's more complicated to set up: it's not simply adding a string, and it's a bit messy to deal with public and private keys, to
generate them, to store them, to distribute them. That's the whole point, and I'm going to talk about that in a lot more detail in the rest of the slides.

The key property here: generating a signature is done with a private key. The word kind of implies you want to keep the thing private. If you ask Google, hey, can I get your private key, I'm pretty sure they're not going to respond to your email. It doesn't work like that. The public key, like the word says, is supposed to be public. If you want to find out what Google's public key is for signing identity tokens, you can easily find it. I'm going to show you later how OpenID Connect makes it even easier to find. But public keys are supposed to be public, and signatures generated with a private key can be verified with the public key. Those are the properties we have here.

So, a quick recap on JWT signatures. HMACs are actually only useful if you have one service generating and consuming its own tokens. If you have a service generating a token, sending it to a client, getting it back and verifying the data, then sure, an HMAC is going to be easy and you're going to be up and running in no time. Valid use case. For everything else you need asymmetric signatures; you need the public/private key pair. Even if you're running multiple services within your own organization, use public/private keys. Don't start sharing HMAC keys across different services, because if one gets compromised the whole system comes crumbling down, which is a very big problem. That's what I explained here. And asymmetric is used a lot more often, even though not every developer is aware of the existence of that signature scheme.

With signatures and public/private keys, there's one very big problem that arises, and it's called key management. Before we talk about solving it: why is key management such a challenge? What's so hard about key management? Absolutely, the key rotation. You cannot use the same key for eternity. It depends on how much you
use it, but if you sign enough stuff with one key, you have to rotate it and move to another key. Or if somebody steals your key, you probably want to switch it out and start using another key. That brings us to key management. I'm not going to talk about when you rotate, that's too deep down in the crypto, but I am going to talk about how. Rotating is easy: just throw away the old one, generate a new one, and you have rotated your keys. But what if a JWT comes in that was signed with the old key? How do you verify that? How do you even know which key was used to sign a JWT? Those are challenging questions.

All right, key management. If you receive a JWT, you have to verify the signature. To verify the signature you need the proper key, and to get the proper key you need to figure out which key was used to sign this token. Key management for JWTs can be very simple or very complex. Let's start with the simple case. The simplest mechanism is a key identifier, but there are also a lot more distributed mechanisms to set up key management; I'll come to the details on the next slides.

Here's the key identifier. It's a header claim, kid. The spec allows you to specify this claim in the header, and it can hold a string. That string is a key identifier. What you call it is completely up to you. If you want to call your key "Monday four" because you use it on Mondays, and then use a new key on Tuesday, I don't recommend it, but nobody stops you from doing that. In this case we use a random identifier that we generated when we generated the key, and with that identifier we can refer to that key. So if we receive a token and we are all running on internal services, we can go to the key vault and say: hey, what's the key belonging to this ID? Whether it's an HMAC key or a public/private key pair, you can
grab the public key with that identifier, and then you can actually verify the signature of that token. You can have multiple keys in existence: a token might be signed with one key and be valid for a few days, while you rotate your keys every 12 hours, so you're going to have multiple keys in use. The key identifier solves that whole problem. It's one way of doing these things.

For key pairs you can have, like I said, a central key vault where you fetch these keys, and you can use the kid for that. But of course you can go a bit further, and one of the ways to go further is to use claims like jku, jwk, x5u and x5c. I don't know, they're big fans of three-letter abbreviations for all of these claims. Essentially they allow you to specify either keys directly in the token, or a way to retrieve the keys you need to verify the signature. Let me show you a few examples of what this means in practice.

By the way, this is fully specified in the JSON Web Key specification, RFC 7517, if you click on the RFC numbers. Essentially it's a separate spec detailing all the aspects of keys for JWTs, how to represent them and whatnot. There is a way to create a JSON Web Key, and that's essentially a JSON format to represent the public key. I don't have it on a slide because it's absolutely ugly. If you look at an example, it's like: what the hell is this? But the libraries know how to handle that, so you don't have to worry about it. If you want to use a JWK representation, the library will read that fine and process it.

What you can do is use the jwk parameter to embed a key directly in the header of the token. You tell the service you're sending it to: here's also the key used to sign this token. In practice people don't use this that often, because first of all it's a mess to put keys in there, it's rather big, and you have to send it over the wire over and over again. In reality people
usually use the jku parameter. The name jku stands for JSON Web Key Set URL, and you can see an example here. You have the jku claim, and it points to a location, a JSON file containing a JWKS, which is a JSON Web Key Set. That file contains a set of keys. So now we're telling the receiver: this token has been signed, you can retrieve a set of keys from this URL (Restograde is one of my training applications, by the way), and the key we used has this identifier. What the receiver can now do is go to that server, to that endpoint, get the key set, parse the keys so it has a set of public keys, and then select the one that has been identified by this token and verify the signature.

If you want to switch out keys, that's fine. You generate a new one and add it to your key set, and from then on everyone going there will get the new key and will be able to verify the signature. It's just as easy as that. If you want to revoke a key, you can even kick it out of that file and it will no longer be valid.

Here's an example of how you generate these things. What you typically need is a more advanced JWT library. If you use the default Java JWT library from Auth0, it only has basic support for these things, but a more advanced one supporting the full set allows you to specify a URL and add it to the header, just like you see here.

There's an alternative: you have the x5c and x5u claims, which again are fairly complicated name-wise. What they allow you to do is the same thing, but with a key that belongs to a certificate. We have these X.509 certificates, TLS certificates they're called, and they also contain a public key. With these claims you can either add the certificate directly with x5c, which again almost nobody does, or you can include a URL, as you can see here, pointing to a location that contains that certificate with the public key. It's essentially the same way of doing
things, except it's a different way of representing keys. That's, in a nutshell, what this comes down to.

This makes your question maybe a bit more interesting: how do you know whether the key is valid or not? Well, these keys are X.509 certificates, so they're signed by a CA. Whether it's your internal CA or a public CA doesn't matter here, but they're signed by another party, so that already gives you a way to decide whether you want to trust a key or not. But like I said, you had a very valid question: how do you know if you can trust that key? Because what you're referring to is the following scenario. What if somebody gives you a token that looks like this? Will your service accept it, yes or no?

Well, it depends. It depends on whether the one building that service asked that specific question. Say they implemented the service simply as: hey, let's take this header, oh, there's a jku, let's fetch the key material from evil.example.com, let's look up this particular key, verify the signature, oh yeah, everything checks out, let's take the claims and decide whether the caller is authorized to perform this operation. If you did not think about "what if somebody gives me a token signed with a malicious key hosted on a malicious domain", you're in trouble. And I can show you that applications that do exactly this do exist. So it's not because you're using the proper claims and key management mechanisms that your application is secure.

Fixing this requires effort. Like I said, if it's a certificate-based key, you could verify the certificate chain and decide whether it's valid or not. But that alone would not be enough either, because even if the certificate is signed by a CA, that doesn't tell you much. Sure, that CA signed that certificate, but how would you know whether it was issued to me or to an attacker? You really don't. You just know that we both are customers of that one CA. So even there you would have to validate
individual keys or the locations where you fetch the keys from. That's absolutely crucial to prevent mistakes in your application.

What you can do is check keys using their fingerprints. A public key always has a fingerprint, a short representation unique to that specific key, and you could have a list of approved keys in your application. You could say: these keys are approved because we know they are valid. But of course that makes it a bit challenging to make this behavior dynamic, because every time there's a new key you'll have to update your list of approved keys. Dynamic whitelisting, dynamic approval of keys, is possible, but only using an out-of-band communication channel, and that's exactly what OpenID Connect is doing. I'm going to show you that in the next two slides.

Something you can also do is limit the valid sources of your keys. You have this jku, this key URL, saying: load the keys from there. That's simple, that's a URL, and you can verify where that URL points to. You can say: okay, this is all fine, but we only accept URLs pointing to https://keys.google.com. I'm just making stuff up now, but only there. We don't go to third-party sites, we don't go to strangers' websites, stuff like that.

Yeah? Okay, so the remark is that controlling DNS is very easy and you cannot rely on that. I would beg to differ, depending on what your definition of easy is. Yes, DNS can be controlled, sure, and there's a potential risk that that might happen. Of course we're dealing with HTTPS here; you should be fetching your keys over HTTPS, so that already makes it a bit more tricky for the attacker to intercept or manipulate traffic. But yeah, there's always a risk. There's also DNS over HTTPS if you really want to make it more secure. You're right that if you drill down deep enough you might have a risk there as well, but HTTPS should help prevent the first wave of attacks. All right, so that's a short
word about key management. How does this work in OpenID Connect? Like I said before, OpenID Connect heavily uses JSON Web Tokens. It has to allow you to verify those identity tokens without having to approve certain keys up front. If you offer an application that does login with Google and login with GitHub, you don't know what the keys are going to be. You might check today, like, oh, Google is using this public key, and tomorrow Google is like, yeah, we're going to switch out the keys, and now we're using this key. How do you handle that in your application?

OpenID Connect solves that with discovery. Discovery means every identity provider will have a folder called .well-known, which is a reserved name, and in there they have this openid-configuration file. That's essentially a JSON file that explains how this identity provider is configured. It allows you to discover everything about the supported features of that identity provider, but also stuff like jwks_uri, the URI for the web key set. This one is from one of my applications as an example: it points to an Auth0 endpoint with a jwks.json containing the keys of that provider. The identity token will be signed with one of the keys listed in this key set, and a service consuming that token in the backend can use this feature to get the proper keys and verify it.

In code this looks something like this. This is again a more advanced library for handling these things. What you can do is get the proper key material. You have the domain here, which is hard-coded in this case for one identity provider. You always need that to initialize your OpenID Connect protocol; it's essentially the location of the discovery file. You feed that into the library to get a provider, and it will fetch that discovery file, it will look at
the configuration of the OpenID Connect provider, it will look at the jwks_uri, fetch those keys, and load them into the library. Then you use the kid, which is part of the header: you get the kid out of the header, you get the proper key out of the key set, and then you can verify the token.

The reason this works, unlike the example we had before, is that we now fix the URL of the identity provider. We tell the application up front that our identity providers are Auth0 and Google and GitHub. It can fetch all of their configuration files, fetch all of their keys, and start verifying the tokens that they issue. Then, of course, you verify the signature the proper way. You can see we generate an RSA public key from the key we received, we initialize the algorithm, and here we verify the actual signature, along with the properties that we are supposed to check. Once we have that, we know we can trust the token, and we can use the identity information in that token to link the identity of the person at Google with a person within the application. This is one of my training applications: if you use it with Auth0, it will link your unique identifier with a user in the application, so that you can resume your session if you already had one from before. That's essentially the proper way of doing these things.

This is actually a very tricky problem, and many people don't think about it. They simply hard-code a secret, start using it, and forget about it because it works. But it only works if you get it right.

So let me try to put this back into context. JWTs heavily rely on cryptography, and whenever cryptography is involved, it's going to be messy. There are going to be a lot of details you need to get right. Just look at library implementations: they all got it wrong. Every JWT library in the beginning had these vulnerabilities, the none algorithm or the
mix-up between symmetric and asymmetric, and these are built by pretty knowledgeable people. So it's very tricky to deal with cryptography, implementation-wise but also management-wise. You have to think about a lot of things up front. How are you going to get the proper keys? How can you manage the keys? Where are you going to store them? How are you going to rotate them? How will you know which key has been used? The JWT specs actually have support for these features, but many people don't know what all of these claims mean and how to use them. I hope that after this talk you actually do know what these things mean in practice.

The reason JWTs are so tricky, and why many people dislike them or flat-out hate them, is that they contain metadata about your crypto. They contain the algorithm being used, they contain a key identifier, and the problem is that you have to use this information before you can verify the validity of the JWT. Every piece of information we take out of the header to verify the signature has not been verified until we verify the signature. That's why these things are so messy. That's why the attacker can provide a URL controlled by the attacker: we have to use it before we know whether it's valid or not. That makes this hard.

There are attacks against the confidentiality part, the encryption of JWTs, that abuse these features as well. If you have to decrypt a token, you have to use part of the header to decrypt it, and your attacker can start manipulating that and start extracting the keys. Again, these attacks are very technical, so I'm not going to talk about them here; I'm actually wrapping up the talk. But be aware that this is a problem. The less you have to rely on the header information, the better. Don't take the algorithm from the header: if you know it's going to be RSA, RS256, just hard-code it or configure it in your application without taking it from the token itself. That already saves
you, or reduces your exposure. The only thing you actually should be using is the key identifier. That's the only thing you probably cannot do without, unless you want to start guessing keys until one of them succeeds, which is also not recommended.

All right, I didn't want to make it too heavy for this evening talk. If you want a good overview of what this means in practice, I built a security cheat sheet about JSON Web Tokens. You can go to this URL here and grab a copy of that cheat sheet. They're also available on my Twitter. Yeah, sure, go ahead. It gives you a checklist of all of the things that might go wrong and that you should be watching out for when you're implementing JSON Web Tokens. There's also one about Angular applications; if you're doing Angular, that might be interesting for you as well, but of course that's a bit out of scope for the topic of this talk.

I know I maybe went through it rather quickly, and we saw some more complex aspects. Are there any questions about these things? Anything you still want to see answered?

So the question is: is there a level of trust you can have in these systems, and do some systems need more secure solutions than others? Honestly, from my experience, it doesn't matter too much, in the sense that JWTs by themselves are not insecure. Yes, we have had some implementation issues for sure, and we have some misuses, but you have the same in the XML world. That's a part of this that is very frustrating, because a lot of the vulnerabilities we see now we already had in the XML world. People are doing the same thing with JSON now and they're making the same mistakes, because they are different people and there are no lessons learned that they can take and apply. Is this good to use? Yes, I think it is, because many protocols actually build on that. We have OpenID Connect and OAuth, and they both use JSON Web Tokens to represent claims. In Europe you have the open banking
APIs now. PSD2 requires banks to have an open API, and what they use for an authorization framework is OAuth under the hood. So I haven't checked the details, but I'm fairly confident that they'll also be using JSON web tokens, and that's the financial world. So are they secure enough? If you implement them correctly, yes. I don't think there's that big of an issue to represent the claims in that way. The bigger question in those cases might be: how do you use the claims? What are you using the JWT for? Is it to simply represent a set of claims, or is it a trick to push state to the clients? Because if you're doing that, things might get tricky again. If you're putting sensitive information there, personally identifiable data, you probably want to encrypt that JWT before you push it to a client application, because it's gonna be stored on the client one way or another, and that sensitive information will remain there for a long time. So I think these things matter more, or are tied to the real questions, and I think the technology of JWTs is definitely good enough to use in all of these cases. It depends what you put in there. If the claims you're storing in your token are not sensitive if somebody would read them, then it doesn't matter. So the confidentiality, the encrypting, would help against unauthorized reading of the content. The tamper protection is actually your signature, so if you want to detect changes of the data when it comes back, the signature suffices. I was talking with somebody before my talk, and they said, like, yeah, we just put a random identifier in a JWT and we use a signature as kind of an authenticity mechanism, showing it comes from the proper service, by verifying that with a public key. And you don't need encryption, because it's random stuff anyway. If you put stuff in there where you say, yes, this is data, but it's not personal data from a user, it doesn't have any privacy impact if somebody would read
this, then I would stay away from encryption, because encrypting JWTs is again a fairly complicated topic, and getting that right is definitely not easy. However, if you are putting sensitive information in there, I would first of all check whether you actually need to expose that information. But if it's a valid use case, and if you decide that you actually want to keep that in a JWT on the client, then yes, I think encryption is important. To give you an idea, OpenID Connect by default does not encrypt the JWTs, but the spec, the protocol, has configuration options where you can require or request that the data is encrypted by the provider. But I haven't seen any implementations using that in practice. Yeah. Yeah, that's a very good remark. If you encrypt a JWT, well, I would say that a client usually shouldn't rely on the data in a JWT anyway, but if you encrypt that, the clients will not be able to read it, because the moment you put your key in the clients you're making a very, very big mistake. So no, in the case of encrypted tokens only the receiver will be able to actually read the data in there, so it might be something to consider there as well. It depends on the scenario. If you are building a service where you want third-party services to verify tokens you have issued, let's say you're building an identity provider like Google does and you want other applications to be able to verify your JWTs, then yes, publishing your keys is the best way to go, preferably through a configurable mechanism like OpenID Connect, but otherwise just having the key URL is fine. If you're building internal services, if you're building an enterprise set of applications and they all use JWTs issued by one service internally, I wouldn't publish your keys if nobody needs to use them. Then I would highly recommend to use a key vault internally, where the applications can fetch the keys using the key ID. There you don't need the whole URL setup. The URL service usually, or
it's mainly useful in a distributed scenario where one party issues JWTs that other parties have to verify. Then the URL is a very easy way to get that key material and get a proper key to verify those JWTs. No, in a dedicated key vault. So the best place to store the keys is a dedicated key vault. There are services, like HashiCorp has Vault, it's called, and those things are designed to store secrets and to manage secrets, and you can assign permissions on which applications can get access to what secrets. So that's the proper way of doing things. The most popular one is HashiCorp Vault. There are implementations; for example, Spring has an implementation that connects to that mechanism to retrieve keys whenever they're needed. Thanks for listening. I guess we have a short break, so if you have any more questions that you don't want on the record, we can discuss those during the break. All right, thank you. [Music]
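The advice near the end of the talk (fix the algorithm in configuration, and take only the key identifier from the still-unverified header) can be illustrated roughly as follows. This is a minimal sketch, not the speaker's code: it uses HS256 with Python's standard library instead of the RS256 mentioned in the talk so that it needs no third-party packages, and the `KEYS` dictionary, the key ID `key-2019-06`, and the helper names `sign` and `verify` are all hypothetical stand-ins (a real deployment would fetch keys from a key vault, as discussed in the Q&A).

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key store mapping key ID -> secret. Assumption for this sketch:
# in practice the keys would come from a key vault, not from source code.
KEYS = {"key-2019-06": b"demo-secret-not-for-production"}

# The algorithm is fixed by configuration and never chosen by the token.
EXPECTED_ALG = "HS256"


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign(claims: dict, kid: str) -> str:
    """Issue a token with the configured algorithm and the given key ID."""
    header = {"alg": EXPECTED_ALG, "typ": "JWT", "kid": kid}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    mac = hmac.new(KEYS[kid], signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + b64url_encode(mac.digest())


def verify(token: str) -> dict:
    """Return the claims if the signature checks out, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        signature = b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict):
        raise ValueError("malformed token")

    # The header is unverified at this point; only the key ID is used,
    # and only to look up a key we already trust.
    kid = header.get("kid")
    key = KEYS.get(kid) if isinstance(kid, str) else None
    if key is None:
        raise ValueError("unknown key ID")

    # The declared algorithm is compared against configuration, never obeyed.
    if header.get("alg") != EXPECTED_ALG:
        raise ValueError("unexpected algorithm")

    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("bad signature")

    # Only now is the payload trustworthy enough to parse and return.
    return json.loads(b64url_decode(payload_b64))


if __name__ == "__main__":
    token = sign({"sub": "alice"}, "key-2019-06")
    print(verify(token))
```

The point of the sketch is the order of operations in `verify`: nothing from the header influences how the signature is computed, so a token that declares a different algorithm is rejected rather than processed on the attacker's terms.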
Info
Channel: Full Stack Developers Israel
Views: 25,475
Rating: 4.9411764 out of 5
Id: DPrhem174Ws
Length: 37min 18sec (2238 seconds)
Published: Sun Jun 16 2019