Introduction to HashiCorp Vault with Armon Dadgar

Captions
I'm Armon, and today I want to spend some time talking about Vault. When we talk about Vault, the problem we're really talking about solving is the secret management problem. And so when we start talking about secret management, the first question that naturally comes up is: what is a secret?

When we talk about secret management, what we're really talking about is managing a set of different credentials. What we mean by credentials is anything that might grant you authentication to a system or authorization to a system. Some examples of this might be usernames and passwords, database credentials, API tokens, or TLS certificates. The point is that we can use any of these things either to log in to a system and authenticate, such as with a username and password, or to prove our identity, such as with a TLS certificate, and so we're using it to authorize access, potentially. All of these things fall in the realm of secrets, and these are things we want to carefully manage. We want to understand who has access to them, we want to understand who's been using them, and in the case of most of these, we want some story around how we can periodically rotate them.

When we look at the state of the world of how these things get managed in practice, what we see is secret sprawl. What we mean by secret sprawl is that these end up everywhere. They're in plain text inside of our source code, so maybe the username and password are hard-coded in a header. They end up inside of things like configuration management, so again this is living in plain text in Chef or Puppet or Ansible, and anyone can log in and see what these credentials are. And ultimately all of this typically ends up living in a version control system like GitHub or GitLab or Bitbucket. And so these things end up strewn about, or sprawled, all over our
infrastructure. So what are the challenges with this world? Well, we don't really know who has access to all of these things. Can anyone in our organization with access to GitHub log in, see the source code, and thus see what our database credentials are? And even if they could do it, we don't know if they have done it. We have no audit trail that says: just because I, Armon, could have seen that secret, did he go and do it? And so we really have no fine-grained ability to manage who has access, or even to audit who's done what with it. Worse still, how do we actually rotate any of these things? If we realize we do need to change our database credential, because there's been a compromise or we're doing a periodic rotation, it's very, very difficult if it's hard-coded in our source code or strewn about in so many different systems that it's hard to know how to effectively do this rotation. This state of the world is what we refer to as secret sprawl.

One of our first goals when we started working on Vault was to really look at this problem and say, how can we improve it? This is really where Vault came from. Vault starts by looking at the secret sprawl problem and saying we can only solve it by centralizing. So instead of having things live everywhere, we move all these secrets to a central location, and Vault promises that we're going to encrypt everything, both at rest inside of Vault and in transit between Vault and any of the clients that want to access it. This gives us a few properties. One, unlike these systems where we're storing this stuff in plain text, at least now if you could see where the secret is stored, at rest it's encrypted, so you don't get implicit access to just be able to see this secret. The next thing is that Vault lets us overlay fine-grained access control on top of all this. So instead of it being anyone in our organization who has access to GitHub and can
see the source code, now we can go much more fine-grained and say the web server needs access to the database credential and the API server needs the API tokens, but everyone shouldn't have access to everything. And then on top of this we have an audit trail, so now we can actually see what credentials the web server accessed, and what credentials Armon accessed from the system. And so we have much more visibility and control over how these things are all being managed. This is the level-one challenge with Vault: at least moving from a world of sprawl, where things are everywhere, to a world of centrality, where we have strong guarantees that it's encrypted, strong guarantees around who has access, and strong visibility into this. So this becomes our first-level thing.

The next-level challenge becomes realizing who we're giving these credentials to. So great, we've stored all these credentials safely in Vault, and now we're going to thread these out and provide them to an application. The challenge is that applications do a terrible job keeping secrets. Inevitably the application will log its credentials out to a logging system: it might write them to standard out, which gets shipped off to Splunk, and now they're in a central log that anyone can see. They show up in any sort of diagnostic output, so maybe our application has an exception and it shows the username and password in the traceback or inside of an error report. It might be shipping them out to external monitoring systems when there is an error. So in general what we find is that applications do a poor job keeping things secret. Even if we do a great job centralizing it, strongly controlling it, and encrypting it on the way to the application, the app isn't trusted.

So one of the second-level capabilities that Vault introduces is what we call dynamic secrets. The idea behind a dynamic secret is that instead of providing a long-lived credential to the application, which it inevitably leaks, we provide short-lived, ephemeral
credentials. These things are dynamically created, but they're ephemeral, so we might only give an application a credential that's valid for, say, 30 days. The value of this is a few fold. Now even if the application leaks this credential, it's only valid for a bounded period of time. It might write it to a logging system where it becomes visible, but we create a moving target for an attacker by constantly revoking and issuing new certificates.

The other thing that's valuable is that now each credential is unique to each client. Previously, if I had 50 web servers, all of them would come in and read a static database credential. This means that if there's a compromise and that database credential gets out, it's very hard to pinpoint where the point of compromise was: there are 50 servers and they're all sharing the exact same credential. In a dynamic secret world, each of those 50 web servers has a unique credential, so you know very specifically that web machine 42 was the point of compromise.

The last thing this lets us do is have a much better revocation story. If we know web machine 42 was our point of compromise, we can revoke the username and password for just web machine 42 and isolate that leak. But if all 50 machines were sharing the same username and password, the moment we try to revoke it we would cause the entire service to have an outage. So the blast radius of a revocation is much larger when you have a shared secret versus a dynamic secret.

The third challenge we found was that applications are often storing data, ultimately, and so the challenge becomes: how do the applications protect their own data at rest? Because we're not going to be able to store all information within Vault. Vault is meant just to manage secrets, not anything that might be confidential. So what we often see is that when Vault is being used as a centralized secret management store, people are storing encryption keys. So we might
put an encryption key inside of Vault and then distribute that key back out to the application, and the application does cryptography to protect data at rest. What we find, though, is that applications generally don't implement cryptography correctly. There are lots of subtle nuances and it's easy to get wrong, and oftentimes these kinds of mistakes compromise the whole cryptography. So one of the challenges we often look at is how do we get away from Vault just storing an encryption key, handing it to the application, and assuming the app will do cryptography right?

This has evolved into a capability that Vault calls encryption as a service. The idea here is that instead of expecting that we're just going to deliver a key to a developer and the developer implements cryptography correctly, Vault will do a few things. One is that it will let you create a set of named keys. So I might create a key that I call credit card information, a separate one I call social security number, and one for PII. These are just names: I'm going to name this key, and I'm not going to actually give its value out. Then what we expose is a set of high-level APIs to do cryptography. These APIs are the classic operations you'd expect: things like encrypt, decrypt, sign, or verify. So now as a developer, what I'm really doing is calling Vault with an API and saying I want to do an HMAC using my credit card key and some piece of data. The implementation is provided by Vault, so we don't have to trust that the developers implemented these high-level operations correctly, and the key management is also provided by Vault: the developer never actually sees the underlying key.

This lets us do a few things. One, it ensures that the cryptography is correctly implemented, because we're using a vetted implementation from Vault. This implementation is vetted by us, by the open source community, and by
external auditors that we use. It also lets us offload key management. If we think cryptography is hard, key management is even harder. In practice, when you ask how many applications properly implement key versioning, key rotation, key decommissioning, and the full lifecycle of key management, the answer is very few, because it's challenging. But by offloading this to Vault, we can use high-level APIs to do all of this, so we get the full key lifecycle provided by Vault as well.

In practice these end up being the three major challenges that we're trying to help developers with. First, how do we move these credentials out of plain text, sprawled across many different systems, into a scenario where they're centrally managed with tight access control and clear visibility? Then, how do we go even further and protect against applications that can't necessarily be trusted to keep secrets? We do this by being ephemeral: we create this moving target where what we're really managing is that the web server should have access to the database, and the credential that enables it is a dynamic one instead of a static one. And lastly, how do we go further in helping the application protect its own data at rest? That's done through a series of key management and high-level cryptographic offload. So these three are kind of the core principles of Vault.

Now maybe we'll zoom in quickly and talk a bit about the high-level architecture of how this actually gets implemented. When we talk about Vault's architecture, there are a few important things to realize. One is that Vault is highly pluggable: it has many different plugin mechanisms. Vault has a central core, which has many responsibilities, including lifecycle management and ensuring requests are processed correctly, and then there are many different extension points that allow us to fit it into our environment. The first one that's extremely important is the
authentication backends. These are what allow clients to authenticate to Vault from different systems. For example, if we're booting an EC2 VM, this EC2 VM might authenticate using our AWS authentication plugin. This plugin allows us to tie back into Amazon's notion of identity to prove that the caller is, for example, a web server. But if we have a human user, they might be coming in and using something like LDAP or Active Directory to prove their identity. If we're using a high-level platform, maybe something like Kubernetes, we might be using our Kubernetes authentication provider. The goal of these authentication providers is to take some system we trust, whether it's Kubernetes, LDAP, or AWS, and use it to provide application or human identity. At the end of the day, that's what we're getting out of this: a notion of the identity of the caller.

This is great, and then we connect to an auditing backend, which allows us to stream request and response auditing out to an external system that gives us a trail of who's done what. This might be Splunk, as an example, where we're going to send all of our different audit logs. Vault will allow us to have multiple audit logs, so we can send to Splunk as well as to a system like syslog.

The next-level challenge is: where does Vault actually store its own data at rest? If we're going to read and write secrets to Vault, it needs to be able to store these things somewhere, and these are what we call storage backends. Storage backends are responsible for storing data at rest. This can be a couple of different things: it could be a standard RDBMS, so MySQL or Postgres; it could be a system like Consul; it could be a cloud-managed database like Google Spanner. The goal of these backend systems is to provide durable storage in a way that's highly available, so we can tolerate the loss of one of these backend systems.

The last bit is: how does Consul,
actually, I'm sorry, Vault, provide access to different secrets? These are the secret backends themselves, and they come in a few different forms. The biggest use of these is to enable the dynamic secret capability we talked about before. One form of secret backend is a simple one: it's just key-value. I might store a static username and password in there, and this is just a key-value store that's encrypted at rest. However, as we get more sophisticated, we might want to use the dynamic secret capability, and that is where these different plugins come in. We have different database plugins, so a database plugin will allow us to dynamically manage MySQL, Postgres, Oracle, and other credentials. We have things like RabbitMQ, so maybe we're doing dynamic credentialing for our message queues. And this goes on. You can even apply the same principle to something like AWS: we might have applications that need to read and write from S3, but we don't want to give them long-lived access to IAM, so instead we define a role in our AWS backend and dynamically generate short-lived credentials as needed. So this is an extension point that allows Vault to apply this same dynamic secret principle to many different things.

One common use of this is PKI. In practice, certificate management tends to be a nightmare, and what we often see is very long-lived certificates, maybe five- to ten-year certificates, because we don't want to go through the process of generating them. With Vault, we can define them and programmatically generate them, so in practice people use very short-lived certificates, maybe as short as 72 or 24 hours, and this way you're constantly moving and creating a moving target. This list goes on and includes things like SSH, as an example. We can broker access to SSH as well, so you don't have a single
PEM to rule them all across a large estate of machines.

At its core, this is what makes Vault so flexible. It allows Vault to manage clients that are authenticating against a different set of identity providers; we can audit against a variety of different trusted sources of log management; we can store data in almost any durable system; and we can extend the surface area of what types of secrets can be either statically or dynamically managed by adding new secret backends. So this becomes Vault in a single-instance nutshell.

When we talk about running a Vault instance, each instance is one of these, and in a broader deployment we run multiple instances to provide high availability. At the highest level we'd have a shared backend. For example, this might be Consul, which internally is, say, three different servers providing us HA. Then we run multiple Vaults in front, and what Vault does is coordinate with the shared backend to perform leader election. One of these might be elected our current leader, and so as a client, when we're making requests, we're talking to the leader. Even if we talk to a non-leader, we'll be transparently forwarded to the active leader. In this way, if any particular node dies (power loss, a process crash, maybe a network connectivity issue), we will detect this and promote a new one to leader automatically, and that instance takes over active operation, and our other secondaries will begin to promote.

And so this is what Vault looks like at a high level. It operates as a shared network service, and we talk to it just as an API client over the network. What Vault typically exposes is a RESTful JSON API, so it's JSON over HTTP, making it relatively easy to integrate with our applications.

I hope this was useful as a high-level introduction to Vault. Please check out our other resources to learn more. Thank you.
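
The dynamic secrets idea from the talk (short-lived credentials, one per client, individually revocable) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class `DynamicSecretBroker`, its methods, and the `web-N` client names are hypothetical names made up for illustration, and none of this is Vault's actual API or implementation.

```python
import secrets
import time


class DynamicSecretBroker:
    """Toy in-memory model of per-client, short-lived credentials.

    Hypothetical illustration only; this is not how Vault is implemented.
    """

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.leases = {}  # client name -> (username, password, expiry time)

    def issue(self, client, now=None):
        """Create a fresh credential that is unique to this client."""
        now = time.time() if now is None else now
        username = f"{client}-{secrets.token_hex(4)}"
        password = secrets.token_urlsafe(16)
        self.leases[client] = (username, password, now + self.ttl_seconds)
        return username, password

    def is_valid(self, client, username, password, now=None):
        """A credential is valid only if it matches and has not expired."""
        now = time.time() if now is None else now
        lease = self.leases.get(client)
        if lease is None:
            return False
        known_user, known_pass, expiry = lease
        return (
            secrets.compare_digest(known_user, username)
            and secrets.compare_digest(known_pass, password)
            and now < expiry
        )

    def revoke(self, client):
        """Revoke one client's credential without touching the others."""
        self.leases.pop(client, None)


def demo():
    broker = DynamicSecretBroker(ttl_seconds=3600)
    # Fifty web servers each receive their own credential at time 0.
    creds = {f"web-{i}": broker.issue(f"web-{i}", now=0) for i in range(1, 51)}
    # Suppose web-42 is the point of compromise: revoke only that one.
    broker.revoke("web-42")
    unique_usernames = len({user for user, _ in creds.values()})
    valid_after_revoke = sum(
        broker.is_valid(c, u, p, now=10) for c, (u, p) in creds.items()
    )
    valid_after_ttl = sum(
        broker.is_valid(c, u, p, now=3600) for c, (u, p) in creds.items()
    )
    return unique_usernames, valid_after_revoke, valid_after_ttl
```

In this model, revoking `web-42` leaves the other 49 credentials usable, which is the smaller blast radius the talk describes, and every credential stops working once the TTL has elapsed even if it leaked into a log.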
Info
Channel: HashiCorp
Views: 127,635
Rating: 4.9299693 out of 5
Keywords: Vault, HashiCorp Vault, HashiCorp, Armon Dadgar, secrets management, Dynamic Secrets, Data Encryption, security, IT security
Id: VYfl-DpZ5wM
Length: 16min 52sec (1012 seconds)
Published: Fri Mar 23 2018