How HashiCorp Vault Solves The Top 3 Cloud Security Challenges

Captions
Hello everyone, and welcome to today's webinar, How HashiCorp Vault Solves the Top 3 Cloud Security Challenges. Please feel free to type your questions during the webinar; we compile all questions and answers at the end and send the results to everyone registered. This webinar is recorded, and the recording will be made available after post-processing, usually within two days. With that, let's go ahead and get started.

Awesome, thank you so much, and welcome everybody to our webinar on Vault and how we can use it to solve our top security challenges. Briefly, a little bit of context: my name is Armon Dadgar (you'll find me around the internet as armon), and I'm one of the co-founders and the CTO of HashiCorp. For those of you who are new to HashiCorp or just learning about Vault, HashiCorp as a company is focused on the entire application delivery process. The way we like to think about it, there are three distinct layers in how we manage and deliver our applications. There's the provisioning challenge of providing the core underlying compute, whether that's public cloud, private cloud, or some mix. There are the security challenges of how we provide access to secret material and secure our applications and data on that infrastructure. And at the highest level is the runtime: how we run the applications, services, and appliances that we need for our infrastructure to work together. The way this comes together is six different open-source tools. Many developers are familiar with or use Vagrant on a daily basis; that's where our journey starts. Then we have tools like Packer, Terraform, Vault, Nomad, and Consul. Today we're going to spend our time talking about Vault specifically. For several of these products there's an enterprise equivalent that's designed to be used by organizations and teams that are leveraging the open-source version.

So with that, let's talk about today's agenda. We'll start by talking about the top-level use cases for Vault: under what scenarios would you even consider Vault as a possible solution? Then I'll do a brief introduction to what Vault is and how it works, and we'll spend some time on the new features of Vault 0.7, plus a few from Vault 0.6.5, which was the last major release. Then I'll do a quick demo and run through what replication actually looks like with the service, and how you'd leverage something like encryption as a service.

The top three use cases we see driving people to adopt Vault are secrets management, encryption as a service, and privileged access management. To talk briefly on each of them: secrets management is the challenge of how we distribute to our machines and applications the secret material they need to function. This could be things like database credentials, API tokens such as your AWS access key and secret key, or TLS certificates used to secure traffic. Anything that you might use to authorize or authenticate an application falls into this category. The next major use case is encryption as a service: if we have data at rest or data in transit that we'd like to protect, how do we do that? This has challenges of key management. If we have various encryption keys, how do we ensure that only a subset of people have access to those keys, how do we rotate them, and how do we ensure the key lifecycle is properly handled?
Then, in addition to the keys, there's the cryptography itself, where Vault can help us by doing cryptographic offload: instead of implementing cryptography in our end applications and making sure the producers and consumers all implement it the same way, we can offload the challenge to Vault and use its API to do encryption for us as a service. The last use case is privileged access management. If we have a central system that's storing our encryption keys, our database passwords, and so on, how do human operators get access to it as well? If I need to perform some maintenance operation against the database, I don't want to maintain that secret in a separate location and have one system for my apps and a separate system for my operators; I'd like to centrally manage all of it. That more human-oriented aspect is privileged access management. These are the top challenges in this space.

What I often find useful is a brief primer on how to think about the problem space itself: when we talk about secrets management, what challenges are we talking about? It really starts with defining what a secret is. The way we think about it, a secret is anything that's used for authentication or authorization. I can use a username and password to authenticate myself, I can use an API token, and TLS certificates can be used to verify identity, so these are very secret pieces of information. Sensitive, on the other hand, is anything we'd like to hold confidential but that can't really be used directly for authentication or authorization: for example, social security numbers, credit cards, and emails. These are sensitive pieces of information, but they're not secret, and the delineation matters because of volume. Where we might have a thousand, ten thousand, or at the outside a hundred thousand pieces of secret material, we might easily have millions, tens of millions, or billions of pieces of sensitive information.

As we talk about secrets management, there are a number of important questions we have to be able to answer. How do our applications get these secrets? How do our humans, our operators and DBAs, get access to these secrets? How are these secrets updated, say if I change the database's password or my Amazon token expires? How do I revoke a secret, in the case that an employee who had access to it leaves, or we have a rotation policy, or something is compromised? When were secrets used? As part of building a comprehensive audit trail, we want to know when secrets were actually accessed. And lastly, what do we do in the event of a compromise? If we find that our database password has found its way to a forum somewhere, what do we do now?

The state of the world we generally see in answering these questions is what we refer to as secret sprawl: the keys, the secrets, the tokens, and the certificates are distributed all over the place. They live hard-coded in applications, they live in configuration files, they're in Chef and Puppet, they're in GitHub, they're in Dropbox and on wikis, sort of all over the place. The challenge is this extreme decentralization of where the material actually lives, and there's very limited visibility into where secrets live, how they're used, whether they're updated, and who has access to them.
So from a security perspective there's a real challenge in how we reason about that secret material. And lastly, there's a very poorly defined break-glass procedure: in the event of a compromise, the procedure we follow is hard to define, because we don't have much visibility or even central control, thanks to the secret sprawl problem.

This is where Vault comes in; this is where we look to improve the state of the world. Vault's goal from the onset was to be a single source for secrets, merging the privileged access management of how our operators get access with the secrets management of how our applications get access, to give us that centrality. We want programmatic access for applications that fetch secrets in an automated way, and a friendly access path for operators who do it manually or on an as-needed basis. For this to work, we really wanted to focus on practical security. What we mean by that is there's always a trade-off between theoretical security (if I have access to the physical hardware, can I freeze the memory and pull out a key in transit?) versus how difficult the system is to use or how expensive it is to operate; our goal is to find a practical medium there. And lastly, when we say modern-datacenter friendly, we really mean thinking about clouds: doing this in a pure-software way that doesn't depend on special hardware that we probably don't have access to in cloud environments.

As for the key features of Vault, the most basic core is secure storage of secrets: making sure that in transit and at rest everything is encrypted. Past basic storage, there are dynamic secrets, which we'll talk about, a mechanism for generating credentials on the fly. There's a leasing, renewal, and revocation lifecycle, which gives us auditability and visibility plus that key-rolling and revocation story. Auditing is a critical part of the whole story; we want that visibility, to get away from the low-visibility state of the world. There are rich ACLs, so we have very fine-grained access controls over who can do what. And lastly, multiple authentication methods, which matter because humans will log in with things like username and password, while applications won't be logging in that way.

At the most basic level, how do we make sure Vault can operate as a secure storage site? That promise starts by saying all data is encrypted in transit, on the way from the client to Vault, as well as at rest, where Vault is actually storing data on disk. Encryption is done with 256-bit AES in GCM mode, TLS 1.2 is mandated between Vault and its clients, and all of this is done in pure software, so there's no hardware requirement to make it work. The goal is to encourage using the state of the art, the highest current recommendations for things like TLS versions and cipher modes. What does this actually look like? A very simplistic example: we write to the path secret/foo the value bar=bacon. The takeaway is that Vault lets us store relatively arbitrary things; bar=bacon doesn't actually mean anything to Vault, but maybe it means something to our application, as a username or password for instance. Vault lays out its data in a hierarchical filesystem: here secret/ is the directory and foo is an entry within that directory, and we can come back later and do a read against secret/foo.
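For readers who want to follow along, here's a minimal sketch of that interaction with the Vault CLI. It assumes a running, unsealed Vault with the generic backend mounted at secret/ (the default at the time); on current Vault versions the KV syntax (vault kv put / vault kv get) replaces these commands:

    # Write an arbitrary key/value pair to the path secret/foo.
    # "bar=bacon" means nothing to Vault itself; only our application
    # gives the value meaning (a username, a password, and so on).
    vault write secret/foo bar=bacon

    # Read it back; Vault returns the data along with lease metadata
    # such as the lease duration.
    vault read secret/foo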
(Okay, sorry about that; it looks like we had a temporary connectivity glitch.) Like I was saying, every secret has a lease associated with it, and what this lets us do is automatically revoke a secret at the end of its lease unless the lease is renewed. This buys us a few things. One, it gives us certainty that the client is checking in every once in a while to renew its lease, so we have an audit trail of when a secret was accessed; in this case, with a refresh interval of 30 days, every 30 days we can see the client is still making use of it. We can also change that value to time-box how long clients take to upgrade to the newest version of a secret: we know that within 30 days all of our clients will have moved to the newest version. And lastly, if we believe a secret is compromised, this gives us a break-glass procedure. If we know our MySQL credential was leaked, we revoke it, and every client that had access to that credential can no longer make use of it.

So the question is, how do we make this enforceable? How does Vault actually have the ability to revoke a credential? This is really where the idea behind dynamic secrets comes in. If we gave the end client the true credential, let's say the root MySQL password, there's no way to revoke it; once the client knows it, we can't wipe it from memory remotely. What we'd like is to give out a different set of credentials, specific to that client, that can be revoked. So what Vault does instead is generate, on demand, a set of credentials that are limited-access, following the principle of least privilege: we give you the least-privileged capability into the system, and that username and password becomes enforceable by revocation. Every client has a unique credential, generated on demand, that is tracked, and if we decide to revoke it, we can delete that particular credential without affecting all of the clients at once. This also gives us an audit trail that can pinpoint a point of compromise, since only a single client, for example, has that username and password.

Under the hood, the flow looks like this. Suppose we have an operator who wants access to a database. That request goes to Vault; Vault verifies the user has the right privileges to actually do this, then connects to the database and issues a CREATE command, generating a dynamic new username and password with just the permissions the user needs. This all flows through the audit brokers, so we get audit visibility: this user requested that we generate such-and-such credential. Then we hand the dynamic credential back to the user, who goes and connects to the database with it. Vault sits in the pre-auth flow; critically, the data isn't flowing through Vault, it's just a pre-authentication flow.
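As a rough sketch of that dynamic-secrets flow, here's roughly what it looked like with the standalone PostgreSQL backend of this era (the connection details and role SQL below are illustrative placeholders; newer Vault versions consolidate these engines under a single database backend):

    # One-time setup by an operator: mount the backend, point it at
    # the database, and define a role that maps to a CREATE ROLE statement.
    vault mount postgresql
    vault write postgresql/config/connection \
        connection_url="postgresql://vault:vaultpass@db.example.com:5432/postgres"
    vault write postgresql/roles/readonly \
        sql="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' \
             VALID UNTIL '{{expiration}}'; \
             GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";"

    # Each client then requests its own short-lived credential. Vault
    # runs the SQL above and returns a unique username/password tied
    # to a lease that can be renewed or revoked independently.
    vault read postgresql/creds/readonly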
The way we support this is with a notion of pluggable backends within Vault. There's some last-mile integration glue, where Vault needs to understand the API of the particular system, but by making this very pluggable it's easy to grow support over time. Today there are almost a dozen providers, and it's growing all the time: everything from cloud providers like AWS, to RDBMS systems like MSSQL, MySQL, and Postgres, to message queues like RabbitMQ, and so on. There's a broad range of support for generating these dynamic credentials.

The key with all of this, as we centralize our secret access, is to provide authentication, authorization, and auditing in a uniform way across all of these different systems. On the authentication side, we have two real classes of entities that need to authenticate. There are machines, which may use things like mutual TLS or tokens; we also have a mechanism called AppRole, which integrates with configuration management. These are automated workflows for authenticating machines and applications. And there are user-oriented methods: username and password, LDAP, GitHub, the more traditional username/password/MFA authentication methods. The system then exposes a single rich ACL system for authorization. Everything in the system is default-deny, so access is provided on a need-to-know basis, by whitelist. In our experience this tends to be much more scalable as your organization grows, simply because it's easier to reason about what you need access to than about what you shouldn't have access to. And lastly, you get auditing across everything, request and response, and the system is designed to fail closed: if it can't write to any of the configured audit backends, it will reject the client request, preferring to reject an operation rather than perform one that can't be audited and has no visibility.

As you might imagine, a system like Vault, sitting in this pre-authorization request flow, is highly availability-sensitive. If you need access to your database and Vault is down so you can't get your credential, that's a problem. So from the get-go the system was designed with an active/standby model: it does leader election, with Consul, to elect a primary instance, and if that primary fails it fails over to any number of standbys, so you can have a high level of availability for the service. Going one step further, 0.7, the latest release, adds multi-datacenter replication. Where before you had one datacenter with an active instance and many standbys, now you can have many clusters, operating in a primary/secondary model: one cluster is the source of truth, the authority on what the records should be, and it mirrors to any number of secondaries, so you can lose entire datacenters and Vault can continue to function.
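As a concrete illustration of that active/standby setup, here's a minimal server configuration sketch, assuming a local Consul agent and placeholder certificate paths; note that around the 0.7 era this stanza was spelled backend, while newer Vault versions call it storage:

    # vault-config.hcl: minimal HA sketch. With Consul as the storage
    # backend, standby Vault instances use Consul's locking to elect
    # the active instance and fail over automatically.
    cat > vault-config.hcl <<'EOF'
    backend "consul" {
      address = "127.0.0.1:8500"   # local Consul agent
      path    = "vault/"           # key prefix for Vault's data
    }

    listener "tcp" {
      address       = "0.0.0.0:8200"
      tls_cert_file = "/etc/vault/server.crt"  # TLS 1.2 to clients
      tls_key_file  = "/etc/vault/server.key"
    }
    EOF

    vault server -config=vault-config.hcl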
One of the interesting challenges with Vault is that if its data at rest is encrypted, Vault itself needs a decryption key, so there's a chicken-and-egg problem of how Vault accesses its own data. The choice Vault makes is that this key must be provided online to the system. The reason is to avoid the key ending up in a config file, which is itself managed in plaintext and eventually ends up in something like GitHub with your root encryption key in it. That key that needs to be provided to the system, the master key, is in some sense the key to the kingdom: if you have it, you can ultimately decrypt all of the data at rest. So how do we protect against the insider threat of an operator who has this key and decides to bypass the ACLs and attack the data at rest? The way we do it is with what's known as the two-man rule, or in our case more of an N-person rule; you can imagine the Red October scene of turning two keys to launch the nuke. How this actually works: there's the encryption key, which protects data at rest; there's the master key, which itself protects the encryption key; and this master key gets split into N different shares, T of which are required to recombine into the master key. The default is that we generate five shares, three of which are required, so we need a quorum, a majority of our key holders, to be present. What we're really distributing to our operators is a key share that on its own isn't particularly valuable; you need a majority of the shares to reconstruct access to the system. This lets us avoid worrying about a single malicious operator attacking the data at rest.

In brief summary, the challenge Vault looks to solve is the secret sprawl problem, and it really looks at two classes of threat: it protects against insider threats, largely through the ACL system and the secret-sharing mechanism, and it protects against external threats using a modern and sophisticated cryptosystem.

Looking briefly at encryption as a service: it's a bit different from privileged access and secrets management in that there's a different set of challenges. We have a large volume of sensitive data that needs to be protected; unfortunately, cryptography is hard and it's very easy to get the subtle nuances wrong; and securely storing keys and managing their lifecycle is also a challenge. What Vault does is provide a backend called transit, which lets us create named encryption keys (foo, bar, baz, or named after an application, maybe "api"), and then there are APIs that let us do operations with those named keys, such as encryption and decryption. This prevents our applications from ever having access to the underlying encryption keys, so we don't have to worry about the keys being exposed, and by leveraging the APIs we don't have to worry about the applications implementing encryption, decryption, and the other operations correctly. Data can be encrypted with keys managed by Vault, but the resulting ciphertext, the sensitive data, can be decoupled from Vault and stored in our traditional databases, in Hadoop, or in other systems holding a potentially very large number of records. This lets our applications outsource some of the heavy-duty challenges of key management and encryption to Vault. From a high level it looks like this: a user submits a request to our web servers with sensitive data; the application sends the plaintext to Vault and says "please encrypt this using, let's say, the key web-server"; Vault audits the request, generates the ciphertext, and returns it to the web server, which is then free to store it in its database. The equivalent decryption route is to fetch the ciphertext, send it to Vault, ask it to decrypt, and the web server receives the plaintext.
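Here's a sketch of that round trip with the CLI, assuming the key name web-server from the example above; transit expects the plaintext to be base64-encoded, and returns base64-encoded plaintext on decrypt:

    # Mount the transit backend and create a named key for the web tier.
    vault mount transit
    vault write -f transit/keys/web-server

    # Encrypt: the application sends base64 plaintext and gets back a
    # ciphertext; it never sees the underlying encryption key.
    vault write transit/encrypt/web-server \
        plaintext=$(base64 <<< "ssn=123-45-6789")

    # Decrypt: hand the returned ciphertext (it looks like vault:v1:...)
    # back to Vault, then base64-decode the plaintext in the response.
    vault write transit/decrypt/web-server \
        ciphertext="vault:v1:<ciphertext-from-above>"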
In this way, Vault only sees the sensitive data in transit; it's not actually storing it at rest. So Vault doesn't have to scale to store billions of pieces of sensitive data: you can continue to use your existing scale-out storage, but leverage Vault to do the encryption.

Cool, that was a high-level intro into the use cases and the architecture of Vault, and I want to spend a little time highlighting some of the new features that have just landed. The biggest feature that came in 0.7, as I said, is multi-datacenter replication. Its model is primary/secondary: one cluster is elected the source of truth, and it gets mirrored into all of the others. What this really focuses on is the availability story: it allows us to lose the primary datacenter, or lose connectivity to it, or lose any of our secondaries, and have all of the other sites continue to function. Another big challenge, especially if you're using things like encryption as a service, is that you might be doing thousands or tens of thousands of requests per second, so this replication model also lets us scale the request load by sharing it across multiple clusters. The design of the system is fundamentally asynchronous; replication is not synchronous on write. This is because availability is a top priority for us: if we're unable to replicate to a secondary because it's offline or there's a network connectivity problem, we want the system to continue to function. Replication is transparent to the client: clients of Vault continue to speak the same protocol, unmodified. Their reads can be serviced locally by a primary or a secondary, and any write they do gets forwarded to the primary, so the source of truth gets updated.

The core of the implementation uses a mechanism called write-ahead logging, which will be familiar to folks from the database world: transactions get written to an append-only log, and we ship that log down to our secondary sites. If two sites become too far out of sync, either because one is a brand-new secondary or because a datacenter has been offline for hours or days, there may be too many logs to ship, and we have to resort to reconciling the underlying data; the system uses a hash index to recover. We also use the ARIES recovery algorithm from database land to deal with things like power loss in the middle of a transaction. So the general model is: we have our primary and secondary clusters; within each cluster we have our active and standby instances, sharing access to their storage backend, in this case for example Consul; and we're simply shipping logs, so as new things change on the primary, we ship the logs to all of our secondaries. If things get overly out of date, instead of the very lightweight log shipping, the system switches into a more active index reconciliation to figure out exactly which keys differ, uses that to bring the secondary back up to speed, and then switches back to log shipping.
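Setting up that relationship is driven through Vault's API. Here's a rough CLI sketch, assuming Vault Enterprise on both clusters (the sys/replication endpoints are Enterprise-only) and an activation token passed out-of-band; the "dc2" identifier and token placeholder are illustrative:

    # On the primary cluster: enable replication and mint an
    # activation token for a secondary identified as "dc2".
    vault write -f sys/replication/primary/enable
    vault write sys/replication/primary/secondary-token id="dc2"

    # On the secondary cluster: activate replication with that token.
    # The secondary then mirrors the primary and serves local reads.
    vault write sys/replication/secondary/enable token="<activation-token>"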
The next big feature is an overhaul of the Enterprise UI. I'll show that briefly in my demo, which demonstrates replication specifically, some of the new changes around doing encryption as a service from the UI, and management of replication itself.

Some other cool tweaks have come through in enhancing the ACL system. It's already an incredibly fine-grained system in terms of what you can control (can I create a key, versus use the key, versus delete the key?), and now we can go even more fine-grained: in particular, we can allow and disallow, whitelist and blacklist, different parameters to Vault's API endpoints. A brief example of that, with the transit backend I mentioned: we can use transit to create named keys and use those keys for encrypt and decrypt operations. We might specify a policy saying the only type of key you're allowed to create is an AES-type key (Vault supports a number of other key types as well), and deny the ability to mark a key as exportable. In rare cases you may want to support exporting a key, if you need to share it with third parties, but if you don't, why even allow an application to potentially export it? Here the ACL system enforces that a key never gets exported.

Another option is response wrapping. This is a mode Vault supports where, any time we do a read against it, we can have Vault wrap the response: instead of giving us the response directly, you can think of it as putting the answer in a one-time-unwrap shell. This lets us have an audit trail ensuring that only the app that was supposed to see, say, the database password was the one that actually unwrapped it, and that the multiple people along the chain weren't exposed to it. This has largely been a mode you could request when doing a read, and Vault would do it for you; now you can force it through the ACL system as well. Here's an example: maybe we're generating a certificate authority, and we'd like to say that at minimum you have to wrap the response for 15 seconds and at maximum 300 seconds. We time-box the availability of this thing and force it to be wrapped, so only a single person is ever exposed to it.

The SSH certificate authority is a very neat new extension. The SSH backend lets Vault broker access to SSH into machines: instead of giving every developer SSH access everywhere, by distributing the one PEM file or putting every developer's key on every machine, we give Vault the ability to broker that access. It can do this in three different ways. One is dynamic generation of RSA keys; it can generate SSH keys on the fly. One is a one-time-password-based mechanism. And the newest mode is a certificate authority. In this mode, a developer sends their SSH public key to Vault, saying "I'd like to SSH into this target machine." Vault verifies they have permission to do that, and then signs the key using a well-known certificate authority key. The user then SSHes into the machine normally, but using their signed key instead of the plain key, and the target host, on the server side, verifies that signature. The host isn't communicating with Vault at all; it's just making sure the cryptographic signature checks out, and as long as it does, the user is allowed to SSH in.
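Here's a sketch of the CA mode with the CLI. The role name, allowed user, and key paths are illustrative, and the target host must be configured (via sshd's TrustedUserCAKeys option) to trust Vault's CA public key:

    # Mount an SSH backend and have Vault generate its CA signing key.
    vault mount ssh
    vault write ssh/config/ca generate_signing_key=true

    # Define a role describing what Vault is willing to sign.
    vault write ssh/roles/dev \
        key_type=ca \
        allow_user_certificates=true \
        allowed_users="ubuntu" \
        default_user="ubuntu" \
        ttl="30m"

    # A developer submits an existing public key for signing; ssh will
    # automatically pick up the certificate saved next to the private key.
    vault write -field=signed_key ssh/sign/dev \
        public_key=@$HOME/.ssh/id_rsa.pub > ~/.ssh/id_rsa-cert.pub

    # SSH in normally with the signed key; the host checks the
    # signature locally and never talks to Vault.
    ssh -i ~/.ssh/id_rsa ubuntu@target.example.com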
What's nice about this approach is a few things: the client only communicates with Vault pre-flight, so Vault doesn't have to be in the SSH path; there's very minimal computational overhead, since Vault is basically just doing a signature over an existing public key; there's no operating-system-specific integration; and it's a simple, secure mechanism built on well-known and well-studied cryptographic primitives.

Next, batch encryption and decryption. If we're making heavy use of the transit backend for cryptographic offload, it can often be more efficient to encrypt or decrypt many pieces of data in a single request rather than one request per item. This is a relatively new enhancement: we can batch many different inputs to be encrypted or decrypted (there's a sketch of this after the demo below). There's also improved auditing of limited-use tokens. We can generate a Vault token that says this user is allowed to perform one operation, or five operations, and it used to be hard to audit exactly how many operations had been used; now this shows up in the audit log in a very obvious way, and we can see the number of remaining uses on every request. Finally, a number of new backends were added: an Okta backend for authenticating against Okta, RADIUS similarly, and etcd version 3 support for storing data at rest.

Now, a very quick demo. I start by bringing up two Vault instances, fresh, locally running, nothing special about them, and fire up the web UI for both: in tab 1 we have vault 1, and in tab 2 we have vault 2. First I configure vault 1 to be my primary instance: I go to Replication (currently replication is not enabled) and simply turn it on. Then I generate a secondary token, which will let us authenticate a secondary; I generate it and copy the activation token. I come back over to our other instance of Vault, currently running as its own independent instance, go to Replication, mark it as a secondary, paste the token, and enable. Great: vault 1 and vault 2 are now connected, this one configured as a secondary, that one as a primary. Clicking in, we should see that they both have the same backends mounted. Now I'll add a new backend on the primary: the transit backend, with a little description. If we go to the secondary and refresh, we should see, boom, the new backend has replicated and shows up over here. I'll create a new encryption key, call it foo, put in my name as the plaintext, and encrypt it; we put in plaintext and got our ciphertext back, and I'll copy that ciphertext. Then I'll decrypt it on the secondary cluster: flipping back over, going into the transit backend, we can see the encryption key foo has replicated; I go to decryption, paste in the ciphertext that vault 1 encrypted, hit decrypt, and we get the same plaintext back.

Zooming out, we can think of vault 1 and vault 2 as two different datacenters. We replicated from datacenter 1 to datacenter 2, we used Vault as encryption as a service via the transit backend, and we defined an encryption key named foo that got replicated across. Now we can see how an application could span multiple datacenters, using Vault to encrypt and decrypt data in a way that ensures consistency, with both sides able to access it.
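As mentioned before the demo, here's a sketch of the batch interface to transit via the HTTP API. It assumes VAULT_ADDR and VAULT_TOKEN are set and reuses the hypothetical web-server key from earlier; each plaintext entry must be base64-encoded, and the response carries a matching batch_results list of ciphertexts:

    # Build a batch payload of three records.
    cat > batch.json <<EOF
    {"batch_input": [
      {"plaintext": "$(base64 <<< record-one)"},
      {"plaintext": "$(base64 <<< record-two)"},
      {"plaintext": "$(base64 <<< record-three)"}
    ]}
    EOF

    # One round trip encrypts all three entries instead of using
    # one request per record.
    curl -s -X POST -H "X-Vault-Token: $VAULT_TOKEN" \
        -d @batch.json "$VAULT_ADDR/v1/transit/encrypt/web-server"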
That's all I had for my demo. I'm going to flip back and hand it over to Derek now, who's going to tell us how Atlassian is using Vault.

Thanks. My name is Derek, I'm a senior security architect at Atlassian, based out of Austin, Texas, and like Armon, you can find me pretty much everywhere online under the same handle. I'm talking today about how we at Atlassian manage secrets at scale. For those of you who are unfamiliar with who we are, we make software development and collaboration tools, both on-premise and in the cloud. Here's a bit of our portfolio: we make Bitbucket, which is the Git solution for the enterprise; Confluence, which is team collaboration software; HipChat, which is instant messaging and chat rooms for teams; and Jira, for issue and project tracking. Through some recent acquisitions we added StatusPage, which monitors your page status, as well as Trello, for creating project boards. Where are we today? We're over 1,700 employees now, and although we started in Sydney, we have seven offices across three other countries. We have many physical datacenters, but we're rapidly increasing our cloud presence in AWS.

There's a quote that reminds us that everyone has secrets, and in some ways we hide them in the most unlikely places. That being said, how did we get here? Meaning, how did we realize we had a problem with secrets management? Well, we grew. We grew as a company: when you go from 10 employees to over 1,700, the tools you develop with should grow and scale as well. Our development teams grew considerably, and unfortunately we realized secrets management had become a bit of an oversight. When we looked into what our dev teams were using for secrets management, we found we lacked a solution the entire company could use: some teams were using ad hoc or homegrown solutions, and others had nothing. Without a global solution, you're also vulnerable when it comes to recovery procedures: imagine having to update a ton of secret stores for multiple services, with that whole process being extremely time-consuming. At that point we realized that building a cloud-ready secrets management solution had become a requirement. Without a global solution we couldn't enforce consistent policies; concepts like key rolling, revocation, and auditing are not consistent across homegrown solutions developed internally, and we realized we could do a lot better.

So we laid out the requirements for our solution. First and foremost, we needed secure storage for existing keys. Second, we wanted the ability to create keys dynamically for encryption; this would let us encrypt data from applications and then store the encrypted data securely. Third, we wanted to extend our existing PKI program to build out trusted service-level CAs, allowing our services to get X.509 certificates dynamically based on some kind of role. And finally, we wanted the ability to generate and store AWS keys and temporary credentials, generated when needed and revoked when no longer needed. From our use cases we started defining our feature criteria. We wanted a solution with fine-grained access control policies that our service teams could administer for their own specific use cases.
We wanted the ability to bring your own key, which is a requirement for our existing services as well as for the many services that still provision their own API keys. We needed the ability to rotate and roll keys, to comply with our current password rotation policies, a requirement we share with our key management; and, following from that, the ability to audit key usage, so we can track things like expired-key usage. Creating dynamic keys and wrapping them with a separate key was important as well; that's the envelope encryption Armon mentioned, bringing encryption as a service. At the same time, we wanted a solution that would be accessible to services living either in our physical datacenters or in a cloud environment, and we needed it to be highly available and resistant to service disruption. HSM integration was important to us because we wanted to ensure master keys were securely backed up. And the last feature matters because it's related to our service mirroring model, which I'll show you on the next slide. As a form of disaster recovery, we provision mirrored services across multiple regions as a way of recovering from a total regional collapse, should it happen. We didn't think it could happen previously, but as you're aware, AWS suffered a massive S3 outage a few weeks ago, and this type of design is meant to mitigate that type of collapse. First we provision a set of service clusters within the US, as an example here, along with their relevant API keys. Then we create a mirror of the same service clusters in a separate region, and since they're the same services, we want them to share the exact same keys in some secure fashion. Unfortunately, up to this point we couldn't find a solution that could do this for us.

So we started the vendor selection process based on our feature criteria, and it really came down to two potential options. First we started with KMS, short for Key Management Service; for those who are unfamiliar, it's an Amazon Web Services managed service for creating and controlling encryption keys. Taking our feature criteria in turn: we found that AWS did support per-service policies in the form of IAM policies, and while they can be fine-grained, you really need to understand the full IAM model to ensure you're limiting access to specific keys. Unfortunately, you cannot bring your own key, so this wouldn't address existing keys or the practice of services creating their own API keys. KMS does support rotation, and auditing through CloudTrail, and it supports creating dynamic keys and wrapping them with a customer master key, inheriting the customer master key's attributes. But it does not address services hosted in our physical datacenters; KMS is only meant for AWS resources. KMS is also highly available, and because it's backed by an HSM, it provides that additional layer of protection for its master keys. The one thing KMS couldn't support is our service mirroring model, the ability to replicate keys across multiple regions: KMS is backed by that HSM, and key material does not leave its respective region. After spending weeks with the product team, we realized this wasn't something they could do.
So after KMS we looked at Vault, in particular Vault Enterprise, to see how it matched our criteria. Right off the bat, the access control policies in Vault are very fine-grained, mapping an authentication backend and a set of policies to a user or service. In the next slide I'll show you our design, where we use Consul for the storage backend. This allows our teams' existing keys to be stored securely, sticking to our model of services bringing their own keys, and keys can be leased and rolled as well as revoked, all actions compatible with our criteria. The transit backend, as seen in the demo, allows Vault to handle data in transit by doing that envelope encryption. Since we present Vault as an internally reachable service, both our on-premise and cloud services are capable of reaching it. We've designed it to be highly available, and we've integrated Vault with our HSM service, which is a key feature of Vault Enterprise. And as Armon demonstrated, keys are capable of being replicated to different Vault clusters.

So here's our design. There's a lot going on, so I'll break it down a bit and show you the flow. First, we launch a Vault server and a Consul cluster in the same availability zone; a Consul agent sits on the Vault server and forwards requests to its respective Consul cluster. Partitions are created on CloudHSM for the Vault clusters; we use CloudHSM as our HSM because, as an AWS service, it was easily available to us. Redundancy is built into the HSM client binary: we point the client at the PKCS#11 interfaces for each HSM instance. This interface is used for master key generation and escrow, as well as to securely auto-unseal Vault during a service restart, and since data is effectively written to both partitions, we always have redundancy built in if one HSM fails. As for normal traffic, the flow is as follows: our internal clients at the top make an API call through our shared core infrastructure, which is connected to Amazon Web Services through AWS Direct Connect with redundant links, so if one link fails, we still have other links into each region. The API call is forwarded to an ELB instance within the respective region, and with the new replication feature, if a client writes a secret, it replicates to all clusters.

In conclusion, by providing a single enterprise-wide solution, we avoid piecemeal management models and provide a solution that's available to both cloud and physical environments. This allows our development teams to focus on building their services rather than worrying about where to store keys, thus avoiding secret sprawl. With a centralized solution we can build policies that match our security criteria, per-service access and key rolling and revocation, as well as provide a full audit trail. And finally, we can expand our platform into more regions thanks to key features like replication, which leaves us at the company feeling pretty darn good. That's all I have; thank you very much.

Thank you everyone for joining us for today's webinar, and thank you to Armon and Derek for presenting. As we said at the beginning, this webinar was recorded and we will make the recording available after processing. We will also send an email to everyone who registered
with the complete list of questions and answers from the webinar, after we summarize all that data. Thank you so much for joining, and have a great day!
Info
Channel: HashiCorp
Views: 15,937
Id: 3q9_gxqkZA8
Length: 44min 25sec (2665 seconds)
Published: Fri Mar 24 2017