[MUSIC PLAYING] SETH VARGO: Hello, everyone. [CHEERING] Welcome to the final session
of the day in this room, Using HashiCorp Vault
with Kubernetes. My name is Seth Vargo. I'm a developer advocate
for Google Cloud. You might recognize me from
the "DevOps versus SRE" series, where we say, "class
SRE implements DevOps." I'm excited to be here today. Prior to Google Cloud, I
actually worked at HashiCorp. I worked for our other
speaker, Armon Dadgar, who is the CTO and
cofounder of HashiCorp. And I will bring him onstage in
a minute to talk about Vault. Just a quick overview
of today's agenda-- we have four distinct sections. You'll recognize
them by the colors that they're
associated with here. We're going to start
with a quick overview that Armon is going to give us. I know some of you might
be familiar with Vault, but we want to make sure that
we have a common baseline. Next, I'll tell you how
to run Vault in Kubernetes as a service. We're going to get
really technical, so I hope everyone's ready. There will be a live demo. Then, I'll show you how to
connect to that same Vault cluster using
Kubernetes applications from different pods and
different services, again, accompanied by a live demo. And then finally, Armon
will talk about some of the new and exciting things
that are coming on the Vault roadmap in the future. So with that, I'd like
to welcome Armon onstage. [APPLAUSE] ARMON DADGAR: Thanks
so much, Seth. I'm hopeful we will get
through this session without running out of disk
space, but fingers crossed. So like Seth said,
what I was hoping to do was, for folks who are
less familiar with Vault, start off with a bit of a
high-level introduction. What is the motivating problem? I think we're probably all
a little tool-overwhelmed. So hopefully, there's a
good enough reason for us to consider adding yet another. So before we really start
talking about Vault, I think it's useful to
talk about the larger problem, right-- secret
management as a whole, which is really what Vault
attempts to solve. And so when we talk
about secret management, I think the first question, kind
of the most natural question is, what exactly
are we managing? What is a secret? And so one of the things we
talk about is, what is a secret, and how is it different
from sensitive information? So when we talk about a
piece of secret information, it's something that allows
us to either authenticate or authorize against a service. So an example might be
a username or password. We can provide that to, let's
say, a database to authenticate a session against that
database, prove our identity, and then the
database is allowing us to request data from it. Similarly, something
like an API token, we can present that when
we make a call up to, let's say, Google Cloud to read
and write from Blobstore, it's authenticating the client. It's authorizing us to
actually do that request. And finally, something
like a TLS certificate-- and you might not
log into a system exactly with a TLS certificate. But you're presenting
it to a client, and that's authorizing you
as, you know, let's say, www.hashicorp.com,
and so it's allowing us to sort of elevate ourselves
into this trusted position. So any of these types
of materials-- username and password, certificates--
are sort of secrets, right? And the reason we want
to keep them secret is, if they fell into
the wrong party's hands, they could use it as a
mechanism to authenticate themselves or authorize
themselves to perform actions they shouldn't, right? They are a mechanism of
access, and so those mechanisms need to be tightly kept. On the other hand, we have
sensitive information, right? This is a little different
because it's things that you'd like to
keep confidential, but can't directly be
used to authenticate. I can't use my customer's
social security number to log into the
database, but it doesn't mean I'd like that
piece of information on the public internet. So it's important to make this
distinction between sensitive things, which Vault
can help us manage-- and we'll talk a
little bit about how-- versus secret
things, which are what Vault is intended to hold. You use it as a database to
hold onto secret information. So when we go from saying, this
is what a secret is to, then, talking about secret
management, the way I like to talk about it is,
what are the set of questions we want to be able to answer? So when we talk about
secret management, these are the key ones, right? How do our applications
get these secrets? My web application boots, it
needs to talk to a database, how does it get a
database password? In the same way, how do our
human operators get access? If I'm a DBA, and I need
to connect to the database to do some maintenance on it
or fix some issue with it, how do I get that same
database credential? Then, how do we think about
updating these things? We don't want them to be, sort
of, static credentials that live forever. We want to be
periodically rotating them, whether for compliance
or just good hygiene. So how do we think about, if
I change my database password, how does that impact
my applications? How does that
impact my operators? How do I actually
automate that flow? Then there are the bad
things that could happen. What if this thing
is compromised? How do I think about
revocation of access? How do I do forensics
on who's done what when if I'm trying
to figure out, are we in a
compromised situation? Where has this key leaked? And so what do we do in
the event of a compromise, everything from that
forensic understanding, to reacting to it,
to rotating it? So these are the hard questions
that we'd like answers to. So the state of the
world, when we generally talk about secret
management as a problem is not necessarily great. What we often describe
it as is secret sprawl. This is a scenario in
which secret material is sort of sprawled all
over our infrastructure. It's hard coded in source code. It's in plain text
in our configuration. It's checked into
GitHub in plain text. It's littered everywhere. And the challenge with this
is it's very decentralized, and we have very limited
access controls around it. Like, who can access
GitHub in our organization? Probably anyone with
single sign-on, right? Who can see this plain
text configuration? Who knows? It's not something we're
tightly access-controlling. And so we have limited
access control around it, but also even worse visibility. So then, if we
actually wanted to do an audit of who all has looked
at this particular source file in GitHub, do we
actually have an ability to do fine-grained auditing
and forensics on that? The state of the world--
not necessarily great when it comes to answering
some of those secret management questions. And so this is really
where Vault enters. Vault's original purpose
was to try and look at secret management and say,
can we provide a good answer to this set of questions? And so our goals at the
top line are starting with a single source of truth. How do we move away from
this decentralized sprawl to a world that is centralized
and managed in a single place? And so part of this is really
understanding those two different kinds of consumers. There are our application consumers, which want programmatic access. The web server is going to boot,
it wants to fetch a credential and do it in an automated way. We don't want a point
and click interface for that-- so API-driven
automated access for applications
doing automation. On the flip side, we still have
our human operators, right? They want something
nicer than an API. They want command lines. They want user interfaces. And so we want these
two access patterns to both be really
optimized so that we have the single source of
truth, and we want a solution that's cloud-friendly. So what that really means is
it should be easy to operate. It should be not dependent
on specific hardware that we need to deploy
in a private data center. It should be pure
software that can easily operate in a cloud environment. So then, when we
translate that into, OK, these are Vault's
top-line goals-- how do we actually achieve
that, the sort of table stakes where things start, is just
a secure place to put bits. We want to put bits in,
have it securely stored, get those same bits out,
whether that's, let's say, database username and password. And so in-transit, when
clients are talking to Vault, Vault forces end-to-end
TLS unless you change it; by default, it requires a minimum of TLS 1.2. And everything at rest is encrypted and managed with AES. So we have AES-256, both
encrypted and authenticated so we can detect
tampering of data at rest. And this is exposed
to the end user as a hierarchical
key value store. So you can apply
some organization around folder structure
and grouping things together in a way that
makes sense for you. So to just make this a little
bit less theory, a little more concrete, here's an example
from the command line interface of both writing
and reading a secret. So what we see here, the
first command, is we're doing a Vault write
against a path, secret/foo. It's like I said, it's a
hierarchical file system. So in this case, secret is a
directory, and foo is the file, the leaf node we're writing to. And then we can
write opaque values. So we're writing, bar
is equal to bacon. But you could imagine that
would be username is Armon. Password is 1234, as an example. And then we can come
back around, use the CLI, and do a read against
that same path. So we're doing a read
against secret/foo, and what we see is we're getting back bar is equal to bacon.
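For reference, here's roughly what that exchange looks like at the terminal (a sketch; the exact output formatting varies by Vault version):

    $ vault write secret/foo bar=bacon
    Success! Data written to: secret/foo

    $ vault read secret/foo
    Key     Value
    ---     -----
    bar     bacon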
So what this is really masking for us is, as a client: is
this encrypted over the wire? The CLI is talking over an
encrypted channel to Vault, hitting its API. And then Vault is
transparently doing the encrypt as it writes it out to disk,
doing the decrypt before it provides it back
to us as a client-- so doing the right
things, making sure we're doing this with
hygiene, but kind of making this easy as an end user. Just to illustrate the
equivalence of that, here's the same example as
what we just did on the CLI, but hitting Vault's API instead. So like I said, one
primary consumer of Vault is our applications
or automation tools. And they want to consume it
programmatically with an API. So here's just a simple example
of doing cURL, where we're basically doing a write against
that same path of foo and encoding the data
that we want stored, and then doing a readback
against the same thing. So it's meant to be a
simple RESTful API that's easy to integrate with.
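As a sketch of those two calls (the address is illustrative; the token goes in the X-Vault-Token header):

    # Write the secret via the HTTP API
    $ curl \
        --header "X-Vault-Token: $VAULT_TOKEN" \
        --request POST \
        --data '{"bar": "bacon"}' \
        https://vault.example.com:8200/v1/secret/foo

    # Read the same path back
    $ curl \
        --header "X-Vault-Token: $VAULT_TOKEN" \
        https://vault.example.com:8200/v1/secret/foo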
And so that's one level. That's table stakes: putting our secrets somewhere that we know is encrypted. We have audit logging and
access control around it-- all of that kind of good stuff. But there's still a bunch
of hard questions, right? There are questions like,
what if an operator connects to Vault, and they just
find it annoying to have to go to a Vault.
So they're going to copy everything out and put
it in passwords.txt so they don't have to bother to log in? Or an application starts-- and I'm sure we've never seen
this-- the first thing it does is write all of
its configuration out to standard out
or its log file. So it just reads all of the
database username and password, dumps that to disk, and ships
that off to our central logging system in plain text. And the third sort
of a classic case is we had a diagnostic
error, right? It's a 500 error on rendering,
or we're in a debug mode. And it says, here's my
database username and password, by the way, if this
helps you with debugging. So the thing you
find is the entities we're giving these
credentials to, whether it's applications or operators, do
a terrible job keeping secrets. So this kind of raises an
interesting challenge of, how do we still
protect these secrets, even though all of these
scenarios are very likely? So Vault's approach
to this, the way Vault thinks about
this problem is the challenge is we have these
long-lived static credentials. And the challenge with
long-lived static credentials is if they leak anywhere, ever,
either through passwords.txt, or through a log file, or
through a diagnostic page, it's game over. We've arranged it
as a zero-sum game. It's either a perfectly
kept secret, or the secret's out there, and we're hosed. So how do we reduce the stakes? How do we make this less
severe than all or nothing? And the way you have to do it
is making secrets ephemeral. You don't want to have these
long-lived credentials that are valid forever. You want these things to be
short-lived, time-bounded things so that, even if
they get logged to disk-- oops-- they're only
valid for so long before we've moved on to a
different set of credentials. So Vault's approach
to this is what we call dynamic secrets, right? And the idea is, as opposed
to static secrets, what we want to do is generate
a unique set of credentials on demand. So when a web server says, I
want a database credential, we generate that credential at that point in time, right there. We have a unique audit trail
now that ties us back and says, this web server has
this unique credential. And by the way, that's
valid only for 24 hours. So when the web server renders
a 500 diagnostic page and leaks out our database password,
we've time-bounded how long that credential's
valid before moving on to another credential. And if there is a
breach, and we know that that credential's exposed,
we can revoke it early. So what does this actually
look like from the perspective of a client? They're just doing
a read against Vault in the same way we just
saw before where they just did a read of secret/foo. And behind the scenes,
Vault is then going out and talking to the endpoint. So it's going and talking
to the database and saying, create a new dynamic credential. It's creating an audit
trail of that credential, and then it's providing that
credential back to the user.
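From the client's side, that whole flow is still just a read. A sketch of what it might look like, assuming a database secrets engine is mounted at database/ with a preconfigured role named readonly (both names are illustrative):

    $ vault read database/creds/readonly
    Key              Value
    ---              -----
    lease_id         database/creds/readonly/<lease-id>
    lease_duration   24h
    username         v-readonly-x8f2...
    password         <generated-password>

The username and password are freshly generated at read time, and the lease is what Vault uses to revoke them later.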
In this example, I'm using a database because I think it's
sort of an intuitive one. But you can imagine
this exact same diagram where you replace the
database with any sort of endpoint system that
supports credentialing, right? A prime example would
be Google Cloud, itself. So I want to give my
application a credential that lets it read and write from
Blobstore, as an example. But I don't necessarily want
to give it a credential that lives forever. I want to give it a credential
that's valid for 24 hours, and I'm rotating that
credential all the time. So in that same setting, Vault
would connect to Google Cloud, generate a short-lived
credential, create the audit trail, and
provide it to the application or end user, maybe a developer's
doing some debugging, so this example holds. So when we talk about this
secret back end approach-- so far we talked about Google. We talked about using it
with a database-type system. But it's really a general idea. The general idea
is, any system that uses credentialing to manage
authentication or authorization has this similar sort
of a challenge, which is our consumers need that
credential so they can consume the end point system. But we don't want these
things to be static. And so our goal
with Vault is, this is actually a plug-in point. It's an extensibility point
where, over time, we're constantly adding new
systems and adding support for them to be consumed
in this dynamic way. So cloud providers
are one great example where they need API
tokens to interface with, and maybe we're giving those to humans, maybe giving those to applications, maybe your automation systems are consuming them. How do we make sure
that we don't have a forever token in Jenkins? Databases are a great example of systems of record that have data. But then you get into all
sorts of other systems-- messaging queues, things
like RabbitMQ, SSH. I think we could probably--
maybe I'll spare everyone from raising hands. But I'm sure many of us
have seen the master shared PEM file with a few
hundred developers having access to the same PEM
that secures a huge fleet. And so that's an
interesting challenge where we have this shared
mechanism, this PEM file, that too many people
have access to. So the list goes on and on. Other stuff that
we don't have time to touch onto in
much more detail is classic security
bread and butter-- the three A's. As you'd expect, Vault has
many different mechanisms to authenticate against it. So whether it's an app
running in Kubernetes, and we're tying
into the platform, it's a VM just running
in Google Cloud, or it's a user who's using
LDAP to authenticate-- those many different
authentication mechanisms-- all
of that comes into, then, a central
authorization system. So there's a
central way in Vault where you authorize who
is allowed to do what. So one huge goal with
Vault is, how do we get to least privilege? You should only
be able to access exactly what you need,
not everything because you were able to log in. So Vault has a very
fine-grained AuthZ system that lets you scope
down who can do what based on their identity.
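As a small sketch of what such a scoped-down policy can look like (the policy name and path are illustrative):

    $ vault policy write eng-read - <<'EOF'
    # Allow reading secrets under secret/eng/, and nothing else
    path "secret/eng/*" {
      capabilities = ["read", "list"]
    }
    EOF

Whoever authenticates into a token carrying only this policy can read under secret/eng/ but cannot touch anything else in Vault.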
And then lastly is rich auditing. So we want to have that audit
trail of who did what when. Cool. And with that, I'm going
to hand it back off to Seth who's going to talk
to us about how we actually use this and operationalize
this within Kubernetes. Thank you. [APPLAUSE] SETH VARGO: Great. Thank you, Armon. So the goal of the
next section here is to really answer
two questions. Why would I run
this on Kubernetes? Is it a good idea? Should I? And how do I do it? The first answer
is yes, so we'll move on to the second question. So what does this
actually look like? So the following diagram
here is also captured as code on GitHub,
github.com/sethvargo/vault-on-gke. But I want to walk through
it quickly together. And before I do,
I want to mention that this follows the
HashiCorp best practices guide for production deployments
and production hardening. This was built in collaboration
with some of the Google Security engineers as
well as some of the people on the SIC security
team on Kubernetes. So it represents what we like to
think of as the best-practices way to run Vault on GKE
or any Kubernetes cluster. So at the top level, we have
some type of load balancer. In this example,
we're using cloud load balancing because we're making
this Vault server publicly accessible. In your own
environment, you might want to use VPCs or
VPN-type networking, where it's only accessible
in certain subnets. And then we have a highly
available Vault cluster. So we might have three, five,
seven servers all running behind this load balancer. And then below that,
we have, obviously, the Kubernetes engine itself. And then we're using Google
Cloud Storage as the underlying storage back end. We also support
Google Cloud Spanner. And we are using
Google Cloud KMS to do the initial encryption
of the initial root token and the initial unseal keys. So let's walk
through some of those in a little bit more detail. So first, we're using a
dedicated Kubernetes cluster. And there are a few
reasons for this. But we don't want Vault running
alongside the same Kubernetes cluster where your
applications are running. In fact, we might even recommend
running it in a separate Google project that only the security
teams and the networking teams can access-- least privilege. And to the other clusters and
other application developers, Vault is just a thing
at an IP address. We're leveraging
Google Cloud Storage. Again, we could leverage
Google Cloud Spanner. But both of those storage back
ends support high availability. As a production
deployment, we need this to be highly available. There's no reason
why we wouldn't. So we're going to leverage
Google Cloud Storage for our high availability
and our storage back end. Another advantage there is
that we can back that data up. We can do replication. We can do customer-provided
encryption keys as well. We're going to leverage
Google Cloud KMS. Everyone taking
a picture-- nice. We're going to leverage
Google Cloud KMS because our Vault servers
are going to be automatically initialized and unsealed. So those initial unsealed keys
and that initial root token eventually need to get into
the hands of an operator or an automation
system, but we don't want them stored in plain text. So we're going to
leverage Google Cloud KMS to encrypt those values and put
them in the storage back end. Additionally, we're leveraging
our own custom certificate authority for TLS. So one of the questions
I get is, like, oh, why wouldn't you
use Let's Encrypt? Let's Encrypt is
amazing, but the problem is, I don't actually want
to trust any certificate on the internet. Instead, I want to use
mutual TLS validation, and I want to use my own
certificate authority and only trust certificates
signed by that authority, not even my system certificates. This helps secure
Vault and make sure that only people
with the certificates and the proper
certificate authority have the ability to
communicate with the server. Additionally, we're
leveraging L4 load balancing. Instead of using something
like the Global Load Balancer, the GLB, we're
actually using a regional load balancer. And the key reason
here is that an L7 load balancer is going to do
the TLS termination for us. And we don't actually want that. We want Vault to do the
TLS termination, primarily, so that Vault can handle
mutual TLS authentication. But also, we don't want
to decrypt data and then have a different re-encrypted
connection on the back end, right? Using an L4 balancer lets us
pass that through directly to Vault and lets Vault
handle its own TLS. And if Vault ever rotates
those TLS certificates that it's using, those
automatically happen. We don't have to
also rotate them in an upstream load balancer. And then lastly-- and this
is a really key point-- we want to think of Vault
as a service, right? So in my internal operations
team, or my internal security team, yes, I'm running
Vault under Kubernetes, and I'm leveraging a
lot of the functionality that Kubernetes provides. But ultimately, I'm delivering
secrets as a service or secrets management as a service. So there is Vault and an
API and an IP address. And users ultimately
shouldn't care whether Vault is running on a
VM or a container or bare metal. It doesn't really matter. So we really want to think about
this as Kubernetes is providing the building blocks
that let us run Vault as a highly-available service. And we get some
really cool benefits. One of them is you might
see that we're going to leverage a StatefulSet here. And the first thing you
might think of is, wait, this isn't a database. There's actually no state. Vault's state is
stored elsewhere. It's stored in GCS. Why would he use a StatefulSet? And there are a few
reasons why we might want to use a StatefulSet. First, it gives us
predictable naming. Our actual container names
will be like vault-0, vault-1, vault-2. This is really
helpful for debugging, audit logging, et cetera. The real reason, though, is
that the StatefulSets give us exactly-once semantics for free. So as our servers are starting
up, one will start up, it will unseal itself,
it will be ready to go. As soon as its health
check is passing, the next one will move along. So we actually avoid
any real-life race conditions with
initialization or unsealing. The same is true for upgrading. You don't want to upgrade all
of your Vault servers at once. You generally want to upgrade
one, and let the leadership transition to a new server,
upgrade that one, et cetera, until the entire
cluster is upgraded. This is true with any software. Using StatefulSets, a single
change to the YAML file, or whatever you're using for the
description of this deployment, automatically does
a rolling update. And if one of them
fails, it'll stop. We get a free circuit
breaker pattern. So there's a lot of
reasons why we might want to use a StatefulSet here. Additionally, we're leveraging
anti-affinity rules. So anti-affinity rules
are a Kubernetes feature that lets us influence the pod scheduling. By default, if we did
not have this YAML here, Kubernetes would put all
three Vault containers on the first node. And that's not highly available. If that one VM dies, we lose
all of our Vault servers. So what we're doing is
leveraging anti-affinity, basically, suggesting to
Kubernetes, or in this example requiring Kubernetes,
to spread these out across all of our
available nodes or hosts. This means, if one VM
dies or two VMs die, that our Vault servers-- the actual containers
themselves-- are spread across the node pool.
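A trimmed sketch of what those two pieces look like in the manifest (names, labels, and the image tag are illustrative, not the exact file from the repo):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: vault
    spec:
      serviceName: vault
      replicas: 3
      selector:
        matchLabels:
          app: vault
      template:
        metadata:
          labels:
            app: vault
        spec:
          affinity:
            podAntiAffinity:
              # Require pods with app=vault to land on different hosts
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchLabels:
                    app: vault
                topologyKey: kubernetes.io/hostname
          containers:
          - name: vault
            image: vault:0.10.4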
So this adds an additional layer of availability for us. We're also leveraging
auto unsealing. The auto unsealing is
also an auto initializer. This allows you to spin up
a Vault cluster very quickly without operator intervention. And the keys will be
encrypted with KMS and put in the storage back end. We're also doing
some little things that make a big difference. For example, we're
giving the container the IPC_LOCK capability. By default, Vault
supports memory locking. This is something the
Vault does in its codebase. Containers don't give
you that privilege by default, so we
explicitly ask Kubernetes to give our containers the
ability to do memory locking-- again, just an added
layer of security. And then our readiness
check is basically asking Vault for its own
health, but we explicitly want to make sure that
our standbys are also added to the load balancer. Because this
readiness check here is going to determine
when the container is able to receive requests
from the load balancer, and we want to make sure that
our standby servers are also able to receive requests so
that they can forward them appropriately to the leader.
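Here's roughly what those two little-but-important bits of the Vault container spec look like (a sketch; the port and timings are illustrative):

    securityContext:
      capabilities:
        # Allow Vault to mlock() memory so secrets aren't swapped to disk
        add: ["IPC_LOCK"]
    readinessProbe:
      httpGet:
        # standbyok=true makes standby nodes report healthy too,
        # so they stay in the load balancer
        path: /v1/sys/health?standbyok=true
        port: 8200
        scheme: HTTPS
      initialDelaySeconds: 5
      periodSeconds: 5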
Notice that we're communicating via TLS. And then lastly, as
we saw in the diagram, we have this load balancer. Again, this might be an internal
load balancer in your own VPC. Or it might be a
public-facing load balancer. I don't recommend that, but
there's no reason it can't be. So with that, let's
jump over to demo one-- so if we can switch
over to the laptop. All right. Here we go. Did everyone pray
to the demo gods? Here we go. All right. So the first demo
here, I'm just going to show you what it's like
to run Vault as a service on Kubernetes, basically, using
those exact YAML files that I just showed you. First, I want to show you, I
have a Kubernetes cluster here. It's running on GKE. So I'll go to Kubernetes
Engine, and we'll take a look at the workloads here. And there's nothing, right? We have no workloads here. We haven't deployed Vault yet. So what I'm going
to do is, over here, I'll just take a look
at my vault.yaml. And you can see that this is
very, very similar to what I just showed you on the slides. There are some additional
pieces of information here, like, I'm deploying
a replica count of three. So I'm going to have
three Vault servers. Again, there's
that anti-affinity that we saw before. And all of this is open
source, by the way. So don't feel like you have to
capture all of this right now. We have our first
container, which is our init container
that's going to do the Vault initialization. And then we have
some configuration for that container. And then we have the
actual Vault server itself. We're pulling that directly
from the Docker hub. Again, there's that
IPC_LOCK capability. And we're opening
some container ports that will be available
for requests-- some basic memory
and CPU requests, and then, again,
some configuration. And here, you can see
where we're actually supplying Vault its
startup configuration via the magic VAULT_LOCAL_CONFIG
environment variable. So we don't have to worry about
writing out a file to disk. We do this all in the
environment variable.
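As a sketch of that stanza (bucket name and certificate paths are illustrative; the real file is in the repo):

    env:
    - name: VAULT_LOCAL_CONFIG
      value: |
        # Vault server config, read by the official Docker image
        storage "gcs" {
          bucket     = "my-vault-storage-bucket"
          ha_enabled = "true"
        }
        listener "tcp" {
          address       = "0.0.0.0:8200"
          tls_cert_file = "/etc/vault/tls/vault.crt"
          tls_key_file  = "/etc/vault/tls/vault.key"
        }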
So with that, let me go ahead and apply this. I'm terrible at typing,
so I wrote fancy scripts for you all. So what this is
going to do is it's going to just grab that
Kubernetes endpoint, and it's going to run kubectl apply -f with that particular StatefulSet. So what we should see
here in a second-- and this is where everyone's
prayers would be helpful-- OK, there we go. So these pods are
actually starting up. And you'll notice that 1 of
3 of them are starting up. And that's because we're
using the StatefulSet, which is doing that one at a time,
exactly-once semantics. So vault-0 is running. It has some
warnings, but I think that's just because
it's booting up. And if we go back, we'll
start to see, hopefully-- OK, vault-1 is starting as well. And then once we get to
vault-2, this entire replica set will be happy. And at that point, we
will have a vault cluster. Because what's actually
happening, under the hood, is we're not just running
that Vault container, we're also running that
Vault init container. And because this is a
fresh, new container, it's actually connecting
to the storage back end, doing the
initialization, putting those unsealed keys encrypted
via KMS into that storage back end, as well as the
initial route token. And then the other
servers, when they start up, they look,
and they say, oh, I'm already initialized. They grab those encryption
keys and do automatic unsealing for themselves. If we were to
scale-up or scale-down the cluster, because the
initialization has already happened, the initializer
will just unseal them. It won't actually do the
initialization anymore. Let's go ahead and refresh this
and check where our status is. I see a green check
mark, which is great. I'm going to go ahead and jump
over to Google Cloud Storage here, just real quick
to show you all. So this is the storage
bucket that I'm using. And in here, you'll
see the folders here-- core and sys, which
are created by Vault. But then the root token
and the unseal keys are both encrypted
via Google Cloud KMS. And then they're stored in this
Google Cloud Storage bucket. So when I want to interact
with Vault, as an operator, I need to download
those via like gsutil or via the Google
Cloud API, decrypt them using Google Cloud KMS
and the proper IAM permissions, and then I can interact
with Vault directly. So let me go ahead and
switch back to Kubernetes, and I'll show you what
all of that looks like. So I'm going to jump over here. I have another
magical shell script, which I will show everyone. So this is my
magical shell script where I super can't type. But basically, all of this does
is it grabs the CA certificate. Again, we're using a custom CA-- so putting the output of
that, the address to my Vault cluster. And I provisioned this Vault
cluster with Terraform, so I'm delegating
to Terraform there-- and then the token
decrypt command, which is just a really
long gsutl command with some basic Z4
and some G Cloud KMS. So what I'm going to go
So what I'm going to go ahead and do is run this. And again, this is
all open source. Oh, look. See, I left myself
this note because I knew I was going to mess it up. This is when you know
you've done too many demos. So this takes a little
bit because it's actually making about 5 or
10 API calls, but great. Now, what I can do is
I can actually just run env | grep VAULT. We
see that we have our VAULT_TOKEN, our VAULT_ADDR,
and our VAULT_CACERT. These are magic environment
variables that Vault reads. I can also supply them
via the CLI, but I'm lazy.
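Those three variables look something like this (values are illustrative):

    export VAULT_ADDR="https://<load-balancer-ip>:8200"   # the service address
    export VAULT_CACERT="$HOME/ca.pem"                    # our custom CA certificate
    export VAULT_TOKEN="<decrypted-root-token>"           # from the KMS decrypt step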
So now what I can do is, hopefully, run vault status. And look at that. We now have a Vault
cluster up and running-- five key shares, a
threshold of three. It's in high-availability mode,
and this particular node I hit is in standby. So if we look at that
second to last line there, we'll see that this particular
node that we hit is in standby. If I run Vault status again,
now I hit the active node, coincidentally. Because we're doing round-robin
load balancing on that load balancer there. Anytime I make a
request to a standby, that standby will automatically
forward the request to the active leader. So I don't have to know
where the leader is. I just know the load
balancer address. So this is how you run Vault
as a service on Kubernetes. If I kill one of these pods,
it will automatically restart. It'll maintain a
quorum of three. When it comes back online, it
will be automatically unsealed. And until that point, it won't
be added to the load balancer as well, which is why that
health check is so important. So with that, can we
jump back to the slides? Awesome. So that concludes
the first demo. You can clap. [APPLAUSE] So now we have to
forget everything. Forget everything we
just talked about. Because the next section is,
now that I have Vault running, how do I connect to
it from Kubernetes? But I want you to forget that
Vault is running in Kubernetes. Think, it's running in
some third-party service like Heroku, or it's
running on a VM somewhere. Forget that Vault is
running in Kubernetes. Instead, Vault is just a thing
available at an IP address. Everything we're
about to do can work with any Kubernetes cluster. We're going to use
GKE, but it could work with your on-premise
Kubernetes cluster, et cetera. So again, forget that Vault is
running on GKE or Kubernetes. It's just a thing
at an IP address. Because the next
thing we're going to do is create another
Kubernetes cluster. All right. Clicky. All right. So very quickly, we have to
talk about Vault authentication methods. So Armon mentioned this very
briefly in the beginning. Authentication in
Vault is the process by which users or
machines supply information to Vault. Vault
verifies that information with either internal data or
some external third party, and then, assuming
that it is valid, converts that into a token with
policy authorization attached to it. Here's an example of what that
looks like in a graphic form. We have the Vault server
in the middle, which would receive a
request from a user or a machine or an application. It would then take
that information to this third-party system. Assuming that's successful,
it would go to its own policy engine and map that
to a Vault policy, and then return the token,
which is kind of like a session ID or a session token to the
user, which would then be used for all future requests. It's like logging
into a website. You know, sometimes you log in
with a username and a password, and your browser
sets a session ID. But sometimes you can
also log in with Twitter or log in with GitHub. Vault is conceptually
similar to that. So what does that look
like with Kubernetes? Well, it's actually
very straightforward. We just replace that third-party
system with a token reviewer API from Kubernetes. So a user, or in
this case a service account representing
a user or a pod, makes a request to the HashiCorp
Vault server with their service account token, their JWT token. That JWT token is given
to the TokenReview API. The TokenReview API verifies that JWT token's validity, timestamp, et cetera. And assuming that's successful,
it goes back to Vault and says, yeah, this looks great. Vault then goes to its
own internal policy engine and says, oh, OK. Well, the things in this
particular namespace get this particular
policy attached to them. Here's a Vault token with
that policy attached. And this is how the
Kubernetes auth method works at a very, very high level. There are technical
details here, which we're not
going to get into. But this is, at a very high
level, how this functions.
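The heart of that login exchange, as a sketch (the role name is illustrative):

    # Read the pod's service account JWT...
    JWT="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"

    # ...and trade it for a Vault token via the Kubernetes auth method
    curl \
      --cacert "$VAULT_CACERT" \
      --request POST \
      --data "{\"role\": \"myapp-role\", \"jwt\": \"$JWT\"}" \
      "$VAULT_ADDR/v1/auth/kubernetes/login"
    # The JSON response contains auth.client_token -- the Vault token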
So to do this, you could write your own little BASH script. Or I've actually open-sourced
the Vault Kubernetes authenticator, which does
this whole process for you. It will grab your
service account JWT token from /var/run/secrets/kubernetes.io/serviceaccount. It will post that
up to the Vault server which is configured
via an environment variable. It will get the
token back, and it'll put it in an in-memory
volume mount for you. So then, all your applications
and services have to do is extract that token from
that in-memory volume mount. So you don't have
to actually think about Vault authentication
in Kubernetes anymore. You could just
leverage this and know that the token will be there,
assuming authentication was successful. It runs as an init container,
not a sidecar container. So it must succeed in order for
the other containers to run. Then we leverage tools
like Consul Template. So despite its
name, you don't need Consul to use Consul Template. But Consul Template
is a tool that is designed to provide a
configuration language by which we extract secrets and
credentials from Vault and present them on-disk or on
a temp disk or an in-memory disk to an application or service
without the application or service knowing that it's
communicating with Vault. Many of you have brownfield
applications or, even, greenfield applications
that you likely don't want to directly integrate
with Vault. You might not want to call the Vault API
directly using the Golang library or the Ruby library. Instead, you might
want to just continue to leverage a file
system or environment variables for credentials. And we like that abstraction
because then you're not tightly coupling
your applications to a secret manager. Instead, you're saying,
my config is on disk, or my config is from an
environment variable. What Consul Template
allows us to do is to provide that
same experience, but under the hood, as
a sidecar application, it is actually
managing the connection to Vault, the renewal and
the acquisition of all of those secrets. And as we'll talk
about in a minute, what happens when those secrets
are revoked or are rotated? So I know what you're thinking. What happens, though, if
my credentials change? So I have this
sidecar application that wrote this file out. And my application is
reading those credentials from that file. But how does this
Consul Template thing actually inform my application
that my credentials have changed? So let's think of a
real-world scenario. I have a database credential. Vault tells me that
database credential is valid for 24 hours. At the end of 24
hours, Vault is going to revoke that credential,
if it's not revoked earlier by an operator. At that point, my application,
or Consul Template, has to go back to Vault and
say, OK, I'm still authorized. I would like another
database credential. And it will be different. It will be a different username
and a different password. So I'll write that
out to my config file, but my application has probably
already loaded its config. Most web applications don't go
to disk and read their config every user request. They load that into memory,
and they kind of cache it until they're restarted or
receive some sort of signal. So how can Consul Template
signal my application to reread its credentials,
in a Kubernetes world where all those
containers are isolated? Well, an interesting feature,
and an alpha feature right now in Kubernetes is this idea
of shared process namespaces. So shared process namespaces are
the solution to this challenge. And again, this is
an alpha feature. So it's available in GKE alpha. And if you're running your
own Kubernetes clusters, you'll have to enable it with
Kubeadm or a similar tool. But this shared
process namespace allows our containers to
communicate with one another. So instead of having every
container's processes live in their own isolated PID namespace, they actually share a root process namespace, and these sibling containers have the ability to see and signal one another's processes. So if we give our
Consul Template process the SYS_PTRACE
security context capability, we can actually leverage the
standard Unix tools like pidof and kill to signal applications.
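A trimmed sketch of the relevant pod spec fields (image names are illustrative; this was an alpha field at the time):

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp
    spec:
      # All containers in the pod share one process namespace
      shareProcessNamespace: true
      containers:
      - name: consul-template
        image: hashicorp/consul-template:0.19.5
        securityContext:
          capabilities:
            # Allows finding and signaling sibling processes
            add: ["SYS_PTRACE"]
      - name: myapp
        image: gcr.io/my-apps-project/myapp:1.0.0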
And then, all we have to do is provide a command to Consul Template. And it says, hey, when
this file changes, or when this
environment is updated, or this secret is revoked,
just send, in this example, HUP to the pid of my app. And we can leverage the
built-in pidof tool for that. Or perhaps it's
something more complex. We might have some
50-line bash script that we have to run because
of some arcane Java-based application that
is very finicky. We capture that as code,
and Consul Template executes that for us. So this can either be
as simple as a signal, or it can be as complex
as a very systematic and procedural restart.
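For the simple case, the Consul Template stanza can look something like this (paths and the app name are illustrative):

    template {
      # Render secrets from Vault into a config file...
      source      = "/etc/consul-template/config.ctmpl"
      destination = "/var/secrets/config"

      # ...and signal the app whenever the rendered file changes
      command     = "kill -HUP $(pidof myapp)"
    }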
So with that, let's go ahead and jump over to the laptop, and we'll take a look
at what demo two is. So what I'm going to do here is,
in the Google Project Switcher, I'm actually going to switch
to a different project. So I have another project
here called My App. So I'm in a completely
different project now. Remember, one of
our best practices we recommended earlier was
that we have a dedicated project and a dedicated
Kubernetes cluster for Vault. And then, we're only thinking
about Vault as a service right now. It's just a thing
at an IP address. So I have this other
Kubernetes cluster where I run all my applications
and services for production. All my web apps, all
my stateless services, all my stateful
services, they're all in this My Apps
Kubernetes cluster. Right now, I don't
have any apps, but we're going to
create a few in a second. All right. So I have a few scripts here. The first is we're going to set
up the Kubernetes auth method. What the Kubernetes
auth method is going to let us do is, like
we described in that diagram, it's going to configure Vault to
talk to the Kubernetes cluster with the right
certificate authority, so that we can verify via the
TokenReview API, the JWT tokens. So let's go ahead and run that. So it creates a CRB-- a ClusterRoleBinding--
and a few other things. And then we actually configure
the Kubernetes auth method, and we create a new
role for authenticating to that configuration.
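Those configuration steps boil down to something like this (host, file, and role names are illustrative):

    # Turn on the Kubernetes auth method
    vault auth enable kubernetes

    # Tell Vault how to reach the apps cluster's TokenReview API
    vault write auth/kubernetes/config \
      kubernetes_host="https://<apps-cluster-endpoint>:443" \
      kubernetes_ca_cert=@k8s-ca.crt \
      token_reviewer_jwt="$REVIEWER_JWT"

    # Map a service account + namespace to a Vault policy
    vault write auth/kubernetes/role/myapp-role \
      bound_service_account_names=myapp \
      bound_service_account_namespaces=default \
      policies=myapp-policy \
      ttl=15m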
So now, at a very high level, what this means is, when I have a
pod in my cluster, and it goes to Vault
with its service account, that service account
will be authenticated to get a Vault token back
with a predefined policy that I've configured. So this is configuration that
you could capture as code, or you could run it
manually via the API. But then next,
what we want to do is we want to configure
our GCP secrets. So the use case here is that-- Armon's example was
database credentials. You might be Postgres
or MySQL, et cetera. One of the other secrets
engines that we have for GCP is the IAM GCP secrets engine. This lets you programmatically
generate both service accounts and service account
keys for GCP. So instead of going
into the web UI, and clicking, or using
the API directly, you can programmatically
generate on-the-fly tokens that are very short lived
for interacting with the GCP API, the same way that you might
generate like an AWS access key. But we can do that
with the GCP API. So what I've done
here is I've just configured Vault, very briefly,
to interact with those systems.
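As a sketch, configuring the GCP secrets engine looks something like this (project, roleset, and role names are illustrative):

    # Enable the GCP secrets engine
    vault secrets enable gcp

    # Define a roleset that can mint short-lived OAuth access tokens
    vault write gcp/roleset/my-token-roleset \
      project="my-apps-project" \
      secret_type="access_token" \
      token_scopes="https://www.googleapis.com/auth/cloud-platform" \
      bindings=-<<'EOF'
    resource "//cloudresourcemanager.googleapis.com/projects/my-apps-project" {
      roles = ["roles/viewer"]
    }
    EOF

    # Later, an authorized client can generate a token on demand
    vault read gcp/token/my-token-roleset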
So the last thing we need to do then is actually apply
our application. So let me show you what
our application looks like. So I'm doing a
bunch of stuff here, mostly just because
I'm really bad at typing. But I'm grabbing the
credentials from our cluster, and then I'm
setting the context, and then I'm applying
this Kubernetes YAML file. So I'll take a look at
that YAML file here. So there are a few things
that I want to point out here. The first is that we're
running a pod now. We're now no longer
running a StatefulSet. This is just like
a straight-up pod. It could be replicas, but
there's only one at this point. We're leveraging that shared
process namespace again. This is important so that we
can signal our applications so Consul Template can
talk to our other process. We have a few volume mounts. We have our secrets. This is where Consul
Template's actually going to write its
configurations because there are going to be secrets there. We have our own
TLS configuration. This is how our pod talks
to Vault because, again, we need that certificate authority,
and client-side TLS cert, and then the Vault
token, which is an in-memory volume mount where
the token will be written. So that never gets
persisted to disk. We have our init container here,
which is, again, just pulling from the Docker Hub. And we have a few
volume mounts there. We set some
environment variables, like where is the Vault server,
where is its CA certificate. And that's going to run at boot. And assuming that's
successful, it'll put our Vault token at a
well-known path on disk, /home/vault. And at that point,
all of our other containers-- in this example,
Consul Template-- can then pull that token from
the in-memory file system and interact with it. So here we have Consul Template. We've given it the SYS_PTRACE
capability so that it can signal our other application. And if we scroll down
here a little bit, again, same volume mount,
same configuration. And then here's our Consul
Template configuration. And at the very end here,
you'll notice that we have our arbitrary command. Our arbitrary command
is just to send kill -HUP to the pid of our Vault demo app. And then, as you might
suspect, the last thing in our configuration here is
our Vault demo app itself, which is just getting pulled
directly from the Docker Hub. And actually, all it
does is log its own config file, so it's not very exciting. But it's really good for
demonstration purposes. So let me go ahead
and apply this now. So now, if we go back here, we
should see this workload start to provision. And you'll see that the
pod is initializing. And if I'm fast enough-- I wasn't fast enough. If you're fast
enough, you can see that it's waiting for the
init container to run. But if we take a look
at this init container-- and I'm using
Stackdriver Logging here, so you'll actually
see the logs directly in the Google Console UI here. We see that, a few seconds
ago, that the init container successfully stored
the Vault token. So it did that whole walkthrough
of going to the Vault server, presenting its JWT
token, interacting with the Vault cluster. And then what we presented
to Consul Template was just a Vault token
at a well-known path. Consul Template then
grabbed that Vault token and used that token to
interact with the Vault API to grab our service account key. So now, when I go back to my
pod and refresh this here, we can see that both
Consul Template and My App are running. And if we take a look at
the logs for all of these, you'll notice that Consul
Template sent a signal to My App causing it to reload. So at boot, Consul
Template detected a change in configuration. There was no config, and
then it went to some config. So it sent the HUP signal
to the Vault demo app. The Vault demo app, or My App
in this log, got that signal and printed out a log
line that it reloaded. What you see here is a
soon-to-be-revoked access credential for interacting with
this particular GCP project. This is an actual IAM access
token for interacting with GCP. I'm aware of this. I will revoke it, but it's cool. So I'm showing it to all of you. And what's happening
is, at some interval, this token just
gets printed out, basically, every 30 seconds. However, at the end of about
five minutes, what will happen is Vault will revoke that token. So by default-- like
GCP IAM credentials live forever until a human
comes in and deletes them. Vault actually
enforces a lifetime, and it makes the necessary
API calls to Google Cloud to revoke those credentials
when time permits. So, unfortunately, we're
running short on time, so I can't actually
wait and show you that. But if we can switch back
to the slides very quickly. So the last section now I'd like
to invite Armon back on stage to share a little bit
more about the Vault roadmap and the future. So, Armon? [APPLAUSE] ARMON DADGAR: Thank you. Awesome. We are definitely
running short on time. OK, we're going to
make this light speed. So thank you, Seth, for walking
us through a little bit more hands-on than I had a chance
to go into in terms of how we
wanted to talk about-- so for folks that are maybe
familiar with Vault, or have already
used it, I wanted to give a little
bit of a sneak peek in terms of where Vault's
going and a little bit of our thinking in
terms of directionality. I think one of the things
that's become clear to us is that, when we talk about this
space, secret management, tools like Vault, integrating it
with these platforms is, it's complicated. And I think you kind of
saw a snapshot of that is, it's not exactly simple and
obvious how all of these things kind of come together because
it's a hard problem we're trying to solve. We're trying to solve the
problems of application identity, brokering between
many different systems that don't necessarily
understand each other-- like Postgres has really no
idea what Google Cloud's IAM model is and vice versa. So what we're trying to do is
solve this challenge of each of these systems, implement
different mechanisms of access control, and we're sort
of brokering between them. So it's a complex problem. And so how do we start making
that a little bit simpler? So one of the things
we're starting to work on within Vault itself is-- I jokingly refer to
as sort of Clippy for Vault. As you can see
a little pop-in wizard in the corner here,
which says, "It looks like you're trying to
configure Vault. Can I help you with that?" But in more seriousness, how
do we help guide users through, hey, these are
the different kind of core workflows
in terms of how do you configure Vault
to understand Kubernetes as an authentication back end? How do you configure Vault to
integrate it with Postgres? How does the role-mapping
work between these? And so it's meant to be
an unobtrusive wizard-like capability for
different workflows that you can enable as
you hit them to say, hey, please walk me through
how this is supposed to work and have it be a little bit more
interactive as you're getting started with the system. Another common challenge
we see with Vault is, how do I go from-- I think Seth did a great
job laying out, what's a best practice deployment
guide look like in terms of isolating it with a separate
cluster and treating it as just opaquely living behind
a load balancer and building HA. And so an interesting
pattern is, how do we go from providing that
service, Vault as a service, for a single application
owner or a single group to, all of a sudden,
many application owners, many groups? We want Vault as a service
to be highly multitenant between many different potential
groups or lines of business. And so one approach
to this, and the kind of historical
approach, is you lean on Vault's very fine-grained
capabilities in its authorization system. So you can segment out
different parts of the system. Maybe you say, eng/* is managed by one set of people and marketing/* is
managed by a different set of people. And we sort of segment using
Vault's hierarchical folder structure. Another capability, a different
way of thinking about it, is full namespacing. And this is what's coming on
our roadmap is the ability to carve out entire
namespaces within Vault and say, you know what? I don't want to
think about managing permissions of these folders. I want to carve out a
whole namespace and say, this belongs to engineering,
this belongs to marketing, this belongs to another team. And those teams can then sub
administer these namespaces independently. So looking at, as we go and
think about Vault as a service, how do we scale up the
number of consumers, the different folks who can be
onboarded onto this service? Lastly is some of the challenge
around integrating with Vault. As Seth highlighted, there is this tension: what we'd
to do is maintain operational flexibility
and not tightly couple our applications. We don't want all of our
applications to embed the Vault SDK and directly talk to the API
and have to be sort of highly Vault-aware. That sort of limits our
operational flexibility. So instead, we'd like
to decouple that and use tools like Consul Template
and scripts like Seth's auto initializater to thread
through the right credentials so that our app boots,
reads its secrets, and it's blissfully unaware. A really common
challenge, though, is how do we do that
authentication before we can actually read secrets? And so one of the things we're
starting to work on-- and it actually just shipped
this morning in beta-- is the Vault Agent. And so this is an
optional agent you can choose to run alongside
Vault Server, which you have to run. And what it does is
manage some of that flow for you, the flow
of fetching the JWT, authenticating against Vault,
periodically reauthenticating, integrating with tools
like Consul Template to be able to generate and
put configuration on disk in a more
application-friendly format. So these are some of the goals
of the Agent is making it easier to authenticate, easier
to manage token renewal, easier to cache some
of this data locally, and making those integrations
a little more out-of-the-box. So this is something
that just shipped today. So if this is something
that you find useful, please feel free
to play with it. So with that, thank you guys so much for sitting through with us and joining us. And thank you to Seth as
well for helping with this. [MUSIC PLAYING]