[MUSIC PLAYING] WESTON HUTCHINS: Hello,
Google Cloud NEXT. How's everyone doing? Good? All right, lunch session. Everyone's eating while we talk. This is good, brown bag lunch. So welcome to getting
started with containers and Google Kubernetes Engine. We have 50 minutes to
give as many details as we can about GKE to
make you all experts. Before we begin,
quick show of hands. How many people have ever
deployed an app to Kubernetes before? Not that many. That's exactly what this
session's all about then. So my name-- real quick intro,
my name's Weston Hutchins. I'm a product manager on GKE. ANTHONY BUSHONG: And I'm Anthony
Bushong, a customer engineer specialist focusing on
AppDev and Kubernetes. WESTON HUTCHINS: So, everyone
said they hadn't deployed. We're going to go straight
into a demo, no slides. Show you how easy it
is to get started. Take it away, Anthony. ANTHONY BUSHONG:
[CHUCKLES] All right. Thanks, Wes. So today, we're going
to demonstrate, again, how easy it is to get
a few containers up and running on Google
Kubernetes Engine. So let's just dive
right into it. What you're looking at here
is a Google Cloud console that is highlighting a
Kubernetes Engine cluster. I've already created this. If you haven't created a
Kubernetes engine cluster before, it just
takes a few clicks, or a simple gcloud
command, or even a simple Terraform resource. It's actually quite simple. But since we already have
this, let's actually-- how do we use this cluster? So I'm going to click
on this Connect button. And I'll actually get an
auto-generated gcloud command to fetch credentials
to be able to interact with this cluster
via kubectl. So gcloud is our SDK. You'll see here that I have a GKE cluster. But that's not the one we want to use. We want to use this one. So it's this. Great.
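For reference, a rough sketch of that connect step -- the cluster name, zone, and project here are placeholders, not the demo's actual values:

# Fetch credentials so kubectl can talk to the cluster
# (this is the command the console's Connect button generates for you).
gcloud container clusters get-credentials demo-cluster \
    --zone us-central1-b --project my-gcp-project

# Confirm kubectl is pointed at the cluster you expect.
kubectl config current-context

So now let's test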
that we've actually connected to our
Kubernetes cluster by getting the nodes that
comprise our cluster. And there we go. So, as you can see,
here are the three nodes that comprise the cluster
that I showed you earlier. At the end of the day, these
are just virtual machines. What's different is
it that Kubernetes allows us to interact with these
VMs in a better, cleaner way. So we're not going to
SSH into these nodes. We're not going to write
imperative instructions to actually install
our application. Instead, once our app
is built in containers, all we have to do is write
some declarative Kubernetes manifest. And these containers that
comprise our application will be scheduled by Kubernetes
somewhere on these three nodes. Right So the application we'll
be working with today is comprised of an
Angular UI, a Golang API, and a PostgreSQL database. As containers, they allow
us to package and isolate these components
independently, really enables the concept
of microservices. And so within that design,
each of these components has their own code
and their own Docker file which specifies
how to build the image into a container. And so as I mentioned, once
these containers are built, all we need are Kubernetes
manifest to implement our desired state
in the cluster. So let's actually look
at what that looks like. So in this YAML file is
our Kubernetes manifest. And this is for the UI
aspect of our application. So what you'll see here is
that we're telling Kubernetes that we want a load balancer
to actually front what is called a Kubernetes pod
that is based on this template. Now, a Kubernetes
pod is a grouping of one or n containers. So in this case, our UI pod is
going to run our UI and our API containers. And we want two
replicas of that pod. Cool.
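As a hedged sketch of what a manifest along those lines might look like -- the names, labels, image paths, and ports are illustrative, and the exact apiVersion depends on your Kubernetes version:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: quiz-ui
spec:
  replicas: 2                 # two replicas of the UI pod
  selector:
    matchLabels:
      app: quiz-ui
  template:
    metadata:
      labels:
        app: quiz-ui
    spec:
      containers:             # the pod groups the UI and API containers
      - name: ui
        image: gcr.io/my-gcp-project/quiz-ui:v1
      - name: api
        image: gcr.io/my-gcp-project/quiz-api:v1
---
apiVersion: v1
kind: Service
metadata:
  name: quiz-ui
spec:
  type: LoadBalancer          # fronts the pods with a load balancer
  selector:
    app: quiz-ui
  ports:
  - port: 80
    targetPort: 8080

The magic of Kubernetes is that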
once we provide this desired state, Kubernetes
will, A, implement it and B, take action at all
times to reconcile the observed state with the desired
state that we have here. So let's get this
application up and running. We have to build
a container image. We have to apply the
manifest to Kubernetes. And we're going to
use skaffold, which is a tool that was
built in open source by the amazing container
tools team here at Google. We're going to do all of
it with a single command. So let's look at that. And that command
is skaffold dev. Great. So we'll see that all those
steps that I had mentioned are now being kicked
off by this tool. We'll dive into the details
of skaffold a little bit later in the talk. But while it's
doing its thing, Wes I wanted to ask
you, how familiar are you with our GCP logos? WESTON HUTCHINS:
Not really at all. ANTHONY BUSHONG: OK. Well, it's a good thing then
that our demo application is actually a quiz. So we have a pop
quiz for you to see if you can match the product
or use case with its appropriate GCP hexagon. So now we see that this
application is actually deployed. We're going to have to-- before we get into
the app, let's look at the Kubernetes
resources that were created. So I think before we
mentioned that we want pods, so let's look at those. Great. So we have two
replicas of the UI pod and then one replica
of the database pod. I didn't show that
manifest, but that was also recursively applied as it was in
the Kubernetes manifest folder. We also will be able to
look at the load balancers. So right now, GKE
in the back end is actually provisioning
a network load balancer to front our UI pods. And so you'll see that it's
actually also provisioning an external IP address. Again, this is really
the magic of Kubernetes. All we have to do is write
these declarative files, and we get our desired state. So let's look at that again. And we'll see that we now
have an external IP address to access our application. So, let's throw that
into our browser. And Wes, this is
made just for you. So, you're on the spot now. WESTON HUTCHINS: I swear we
didn't practice this at all. And I failed when we earlier on. A hosted key management service? I'm going to go with the one
on the left for that one. ANTHONY BUSHONG: OK. I don't know. Are you sure it's
not the right one? WESTON HUTCHINS: No. ANTHONY BUSHONG: All
right, that's correct. Let's run it again. WESTON HUTCHINS: And Cloud
Tools for Android Studio? I don't think it's
either of those. But probably the one
on the left as well. ANTHONY BUSHONG: So this
is the placeholder for ones that actually don't have a hexagon. WESTON HUTCHINS:
[LAUGHS] OK, all right. ANTHONY BUSHONG: One more time. WESTON HUTCHINS:
Let's do one more. ANTHONY BUSHONG: Yeah,
let's find a different one. This one also does
not have a logo. There we go. It's a debugger. WESTON HUTCHINS:
Debugger's going to be the one on the right. ANTHONY BUSHONG: Awesome. Hey, how about round
of applause for Wes. WESTON HUTCHINS: I passed, yeah. Thank you. [APPLAUSE] You have to drill on
these quite a bit. ANTHONY BUSHONG: So it's
clear that Wes knows GCP. But what's also clear is that
it can be really easy to get up and running with containers
and Kubernetes Engine. So let's quickly recap
what this demo did. We had a GKE cluster. Again, this is really easy to
create with a single click. We've connected to this cluster. We built our applications
in containers. We've declaratively
deployed them to Kubernetes. And we've exposed those
containers behind a load balancer so that we could all
go through the demo today. And so-- and Wes, again,
aced the GCP quiz. So we did this all with skaffold
and Google Kubernetes Engine in just a few commands. And so today, we'll
show you how to get started really easily
with a lot of the things that you'll need to do to be
able to have a demo like this. So back to you, Wes. WESTON HUTCHINS: Cool. Thank you. So I know that was a
quick-- you probably saw a lot of things on
screen that you didn't quite understand. We're going to go into a bunch
of the YAML file definitions and kind of how we build this
out in a little bit later. Can we jump back to the slides? Cool. So a brief history of
containers and Kubernetes. So containers have been
around for a long time. They've been a part of
Linux for quite a while. We really started to see
container adoption pick up with the introduction
of Docker in about 2013. And that was a key moment where
public adoption of containers really started to take off. Docker offered a lightweight
container runtime and an amazing packaging
and deployment model where you could bundle your
app and all its dependencies and deploy that consistently
across different environments. Now, when Docker came
out, they were really focused on that dev workflow
and solving the packaging and deployment for a single
node across a single container as well. And we all realized that it
wasn't the full solution. The eventual goal would be
to manage multiple containers across many different nodes. And that's when Kubernetes
really started to-- the seeds of Kubernetes started
to get planted at Google was around 2013 time frame. So Kubernetes really at its
heart is an orchestration tool. Now, what does that mean? It really means that you
can package and deploy multiple containers
across multiple hosts, and Kubernetes
takes care of most of the heavy lifting for you
with things like scheduling, and application health checking,
and more of those types of-- more of those types of things. But in my opinion, Kubernetes
is really two powerful concepts. The first is that
it's an abstraction over infrastructure. And this is really, really
important for your developers. We wanted to make it so that
most developers didn't really have to retrain their
skills depending on whatever cloud
they were using or if they wanted to move
from something on-prem over to a cloud
platform as well. Kubernetes gives you that
consistent base layer that allows the operators to
worry about infrastructure and the developers to get a
consistent tooling platform. And the second real
powerful tool of Kubernetes is that it's a declarative API. Everything in
Kubernetes is driven off this model of declare
your state, declare what you want to have
happen, and then Kubernetes has built-in feedback loops that
will make that state reality. And they're
constantly observing. We can write even
custom controllers to handle a bunch of different
state reconciliations for even new objects
that you want to create. And the best part
about Kubernetes is that it tries to encapsulate
a bunch of best practices around application
deployment and scheduling, so things like, where does
a particular container need to get scheduled on which node? Is that node healthy? Can I do-- can I restart
that node if it goes down? How do I provision storage? We have storage
providers for a number of different platforms
built into Kubernetes through the ecosystem. Logging and monitoring
is also integrated. And we've seen a
lot of extensibility with tools like
Prometheus and Grafana that work on top of
the Kubernetes system, and even going so far
as things like identity and authorization. There's a ton of work that
happens in the community every month on adding new features to the platform. It's actually quite astounding
how fast Kubernetes iterates. And it's really
important to always make sure you're keeping up
with the latest developments. But, with great power
comes great responsibility. Kubernetes can be
a bit overwhelming. There's a lot of power
under the hood here. And there is inherent complexity
with a system as powerful as Kubernetes. And when we talk to
users, we typically break people up into two camps. One is people that are operating
the cluster, cluster operators. And the second one is
application developers. And from a cluster
operator perspective, installing Kubernetes is
relatively straightforward. The real pain comes when you
have to manage the day two operations. This is things like
setting up security in TLS between my nodes,
encrypting etcd and making sure that no one can access it,
doing disaster recovery, like backup and restore, and
even things like node lifecycle management if a node goes down. These are all things that
cluster operators don't always like to handle themselves. And we see a lot of
questions in forum posts on how to operate
Kubernetes at scale. One of the key missions of
GKE when it first came out was to take the heavy lifting
off of the cluster operators and give it to Google. And that's really the big power
of Google Kubernetes Engine is that we wanted to offer
this fully managed solution where Google runs
your clusters and you as the developers get to
focus on the applications. So GKE is our hosted,
cloud-managed Kubernetes solution. We were GA in August
of 2015, so we've been running a number of
production clusters for three years now. One of the nice
things about GKE is that we take open source Kubernetes, and we try to keep as close to upstream as possible. And we're always releasing
new versions of GKE as new versions of
Kubernetes are coming out. But we wanted to add
deep integration with GCP so we make it really easy to
use other features like logging, and monitoring, and
identity right out of the box with
Kubernetes and GKE. So, let's look at the
architecture of how GKE is deployed on top of GCP. So, there's typically
two parts to Kubernetes, two major parts to Kubernetes. We have the control
plane and then we have the nodes that
make up your cluster. And on GKE, the control plane
is a fully managed control plane controlled by Google. We run this in our own project. And we have a team of
site reliability engineers that are constantly doing
things that I described on the day two operations. They're scaling the clusters. They're health checking it. They're backing it up. They're making sure etcd is
always upgraded with the latest and greatest security patches. This is really the
operational burden taken off of the
end user and put into the hands of
the Google engineers. There are two ways to spin
up the control plane for GKE. We have the classic way,
which is zonal, where we spin up a single node. And just recently we
announced general availability of regional clusters,
which allows you to spin up a multi-master
high-availability control plane, which I'll cover in a
couple of the later slides. Now, the second
piece are the nodes. And these run in your project. These are where
you're actually going to deploy your containers. And we call these the workers
of the Kubernetes cluster. Now, on GKE, we have a
concept of node pools, which is a group of
homogeneous machines with the same configuration. I'll talk about that
on the next slide. For these nodes, we are
the only cloud platform to offer a fully
managed node experience. And what that means is that
we provide the base image, we constantly keep
it updated, and we have features like node
auto repair and node auto upgrade that allow your cluster
to keep revving with the latest and greatest upgrades without
you having to do anything at all. So, node pools are
also a GKE concept. And this is-- there's a lot
of advantages to building out things with node pools. So, node pulls are really just
a group of related machines. It's the same configuration. If I can have an
n1-standard-1 instance, I can create a node
pool of size three, and that will give me three
n1-standard-1 instances. Now, what's neat
about node pools is that it makes it really
easy to mix and match different instance types. So you can spin up node
pool A with the standard configuration, then you
can add a second node pool with things like
preemptable VMs or GPUs so that you can target specific
workloads to those node pools. Even better, node
pools make it really, really easy to add availability
to your application. So if I'm running in a single
zone, let's say US-Central1-B, and I want to add a
failover to US-Central1-A, all you have to do is create a
new node pool in that new zone and then attach it
back to the cluster. And this is just
a really nice way to group related instances
together and then scale out your application.
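A hedged sketch of adding node pools like that -- the pool names, machine types, and zone are placeholders, and the flags reflect gcloud at the time of this talk:

# A second pool with a different machine type, with node auto-upgrade
# and auto-repair turned on.
gcloud container node-pools create high-mem-pool \
    --cluster demo-cluster --zone us-central1-b \
    --machine-type n1-highmem-4 --num-nodes 3 \
    --enable-autoupgrade --enable-autorepair

# A preemptible pool you can target for batch-style workloads.
gcloud container node-pools create preemptible-pool \
    --cluster demo-cluster --zone us-central1-b \
    --machine-type n1-standard-1 --num-nodes 3 --preemptible

So let's actually walk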
through what happens when you upgrade a node. One of the features that
we offer to GKE customers is this feature called
Maintenance Windows. Now, Maintenance Windows give-- it's basically you telling us
a four-hour window when we are allowed to upgrade your nodes. So, if you have high
traffic during the day, and you want to make sure that
upgrades happen at midnight or whatever, you can
specify this window. We will only upgrade
your master or your nodes during that time frame. So, what happens is we
all go and drain a node. And what this does is it
removes traffic from that node. And then we do an
operation in Kubernetes called cordon, which makes
it so that new pods don't get scheduled to that node. At this point, we
spin up a new VM on the latest version
of Kubernetes, and then we attach it
back to your cluster. So we do this one by one, in a rolling upgrade fashion. There are a couple
other ways to upgrade that I'm going to cover
a little bit later. But that's generally
how most of our users keep their nodes
healthy and up to date. The other feature that we
offer for node management is node auto-repair. And node auto-repair is
basically a health checking service. There's a lot of reasons
why nodes can go down, whether it's a
kernel panics or disk errors, out-of-memory
errors, et cetera. Node auto-repair is
constantly monitoring the health of your cluster
for particular errors. And if the health check
fails, like it does here, and it goes into
a not ready state, it will go and do
the same operation that we did during upgrade
where we'll drain the node, we'll spin up a new one, and
then we'll reattach it back to the cluster. This makes it really easy to
make sure that your nodes are always healthy and
at the set amount that you want in your cluster. And as I said, all of this
is really the end journey to get you to
focus on the things that matter, which is your
actual application development, and let Google do the heavy
lifting for Kubernetes. Cool. With that, I'm going
to hand it back to Anthony, who's
going to go into much more depth on the build
and deploy lifecycle. ANTHONY BUSHONG: Awesome. Thanks Wes. So 18 minutes into
the session, and we finally have reached
the title of our talk. So we're going to talk about
getting started with containers in Kubernetes Engine. And these are the key concepts
that we'll be discussing today, all focused, again, around
the second narrative of actually using and
deploying things to Kubernetes. So let's start with building
and deploying containers. So when it comes to
this space, there are two key areas
we want to consider. One is enabling
developers and two is automating deployment
pipelines into environments like staging or production. And both are important. But let's start
with the developers. So for developers who
are new to technologies, like containers and
Kubernetes, this is often what the workflow
feels like initially. There's a lot of unknowns. The magic in the middle could
be Docker Run, kubectl apply, bash scripts, makefiles, a whole
bunch of different approaches. And this can be a bad time
for developers, right? What you end up with is a
bunch of differing workflows with long feedback loops
between code change and actually being
able to interact with their application running
in its runtime environment. And so there's overhead of
maintaining this workflow. There's no consistency
across what should be a standard set of steps. This means that each
developer has their own method of running locally. And this makes reusing
and replicating things in the development
lifecycle really hard. And so, if portability is one
of the key tenets of Kubernetes and containers, then
this is a problem that needs to be solved. And so that's where
skaffold comes into play. So if you remember, skaffold
was the command that we-- skaffold was a tool that we used
to get our quiz application up and running on GKE. Now, skaffold is perfect for
standardizing and enabling iterative development with
containers in Kubernetes. So with skaffold,
all developers have to do is focus on step one,
working with their code. Skaffold takes care of the
rest of the steps in the middle and will not only give that end
point back to the developer, but will actually
stream out logs to the developer to be able
to allow them to debug or make changes.
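To give a feel for it, here's a minimal sketch of the kind of skaffold.yaml that drives this loop. The apiVersion, image names, and manifest paths are illustrative, and the schema has changed across skaffold releases, so treat this as a shape rather than a spec:

apiVersion: skaffold/v1alpha2
kind: Config
build:
  artifacts:                 # which images to build, and from where
  - imageName: gcr.io/my-gcp-project/quiz-ui
    workspace: ./ui
  - imageName: gcr.io/my-gcp-project/quiz-api
    workspace: ./api
deploy:
  kubectl:
    manifests:               # which manifests to apply on every change
    - k8s/*.yaml

With a file like that in place, the whole watch-build-deploy-stream loop is just the one command, skaffold dev.

So, let's actually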
switch back into the demo and revisit our
quiz application. So here I am with-- back in the terminal. And what we see here
is that skaffold dev-- this is a command
that will actually continue to run once our
application has been deployed. And you'll see here that
it's actually streaming back logs for my application. Now, what's pretty cool is
that what it's also doing is watching for changes. So if I actually want to
change that quiz application, we're not having
to re kind of go through the process of
building our containers and then applying those
Kubernetes manifest. All we have to do is
focus on the code. So let's actually go
into the application. Let's see here. So we'll see here we're back
at our application directory. So I'm actually going to go
into the index HTML file. I'm going to make
some serious changes. And there's nothing
more nerve racking than using VM in front
of 500 engineers. But what I'm going to do is-- let's say that-- you will
see here I initially wanted to quiz Wes, but maybe I want to
quiz everyone in the audience. So what we're going to do here
is change this and say, hello, NEXT '18 attendees. Oh, spelling is hard. All right, cool. So we're going to
change that file. And great, I've changed my code. And notice what I haven't done. I haven't gone into
any Docker file. I haven't gone
into kubectl. But what I have is this
skaffold dev process that's continuously watching
for changes in my code. And now, we've already
completed our deploy and build process
within, oh, 12 seconds. And so our application is
now up and running again. And so if we revisit it, we
see that our code has actually changed behind
that same endpoint. Is anyone brave enough
to take this one? It's the one on
the left, I'll do-- I'll-- oh, my goodness,
it's the one on the right. OK, well-- [CROWD LAUGHS] --that was not planned. OK, great. So let's actually cut
back to the slides. So skaffold-- let's see, sorry. So skaffold is actually
a pluggable architecture. So one thing to note about that
is that while you saw me deploy into a Kubernetes Engine cluster
that was running in the cloud, we actually use skaffold
with Kubernetes environments that are running
locally in our laptop. So that's things like minikube. This would result in a much
faster iterative deployment process. But it all depends on the
culture of your development teams, and if you're
doing remote development or local development. What we can also do is not only
use our local Docker daemon. So to build these
Docker containers, I actually was using
Docker on my laptop. But what you can do is
actually begin to outsource and push some of that heavy
lifting, save our battery, with tools like Google
Cloud Build, which is our managed build service. So skaffold is a
really useful tool that makes for very happy
and productive developers. Skaffold can also work well
for our automated deployment pipelines, but
we'll get into that. First, we want to
dive into a few tips. So throughout the
session, we'll just be sprinkling in,
if you are just getting started, what
are some things that you have to keep in mind? And so our first tips are around
actually building containers. We have a tip on performance
and a tip on security. So first, around performance,
make sure that you slim down your container images. So if you're coming from the
land of running everything in virtual machines, it can be
tempting to stuff everything that we want into containers
and treat them like VMs. That also makes for
a bad time because it makes for slower pull times. And if containers--
if their value prop is really around increasing
deployment velocity, slow image pull times
will only hamper teams, and especially if you're
working at the scale of hundreds if not thousands of
containers and microservices. So an approach to slim
down container images is to actually use
multistage build. This is basically an approach
to build your binaries in one container and then
copy that final binary to another container
without all of the build tools, SDKs, or
other dependencies that were needed for the build-- to build that artifact. Once you do this,
then you can actually deploy things faster as we have
a more lightweight container image.
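A hedged example for the Golang API -- the base image tags, paths, and binary name are illustrative:

# Stage 1: build the binary with the full Go toolchain.
FROM golang:1.10 AS builder
WORKDIR /go/src/app
COPY . .
RUN CGO_ENABLED=0 go build -o /quiz-api .

# Stage 2: copy only the binary into a small runtime image,
# leaving the SDK and build tools behind.
FROM alpine:3.7
COPY --from=builder /quiz-api /quiz-api
ENTRYPOINT ["/quiz-api"]

The second tip here is scan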
your images for vulnerabilities. So Google's Hosted
Container Registry service provides vulnerability scanning. This is an alpha feature. But basically what this
does is that anything that you push to a Docker-- to the Container
Registry in Google will actually be scanned
with this feature enabled. And so you can also
enable it to send you notifications should there be
a vulnerability via Pub/Sub. Along those same lines
when getting started, there are a swath of
public images out there-- an example is public
Docker Hub repos-- that are awesome for
learning and awesome for just experimenting with. But they introduce inherent
risk if we start to take some of these public containers--
they're untrusted and unverified-- and actually implement them
into our application stack. So avoid doing that. So now that we have
containers, we're still-- now that we have a container
workflow for our developers, we're still missing how
we deploy containers into production at
scale in Kubernetes. So if the angle of
containers in Kubernetes is to get teams to move
faster, than a large part of this actual magic starts
happening when we have automated build and
deployment pipelines across environments for
staging or production. Now, let me preface
this with the opinion that there's no one right way
to do continuous integration and continuous delivery,
also known as CI/CD, but we can point to a few
patterns with an example pipeline here. So if a developer is
pushing to a feature branch or maybe tagging a release, we
should expect the developer's workflow to end there. It should end up in a Kubernetes
environment with a provided end point assuming tests pass. And the way that this-- the way that it does this should
be an implementation detail and actually hidden away
from the developers. So in this scenario, we're
using Cloud Build again. In this case, Cloud
Build has built triggers that can kick off
a build with multiple steps. Those steps can actually
include building the binaries, maybe running unit tests,
building the container image, and finally pushing
that out to a registry. Google Cloud Build is awesome
because it is able to provision and manage multiple steps either
sequentially or in parallel. And so each of
these steps actually is executed within
a Docker container. And then across
all of these steps, we have a shared workspace.
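A minimal sketch of a cloudbuild.yaml along those lines -- the image name and steps are illustrative, and a real pipeline would add its own test and build steps:

steps:
# (Earlier steps here could compile binaries and run unit tests, each in
# its own builder container, all sharing the same workspace.)
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/quiz-api:$COMMIT_SHA', './api']
# Images listed here are pushed to Container Registry when the build
# succeeds ($COMMIT_SHA is available when the build is triggered from a repo).
images:
- 'gcr.io/$PROJECT_ID/quiz-api:$COMMIT_SHA'

With that being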
said, there are a lot of other tools in the ecosystem
that can fulfill this CI need. And some of these may already
be present in your existing workflows. Jenkins is a popular one,
for example, that we see. So once we have this newly
tagged image in our registry, we can now have that trigger
a deployment pipeline for a continuous delivery. So Cloud Build,
Jenkins, these tools can also perform
this functionality. But in this example,
we're using a tool called Spinnaker to perform
our continuous delivery. It's an open source
tool from Netflix that Google is
heavily invested in. Any CD tools should be able
to have a pipeline triggered via a newly detected image
and kickoff additional steps to check out Kubernetes
manifest, maybe deploy them in two different
environments like staging, production. Spinnaker here gives
us additional features like execution windows. Maybe we only want to automate
deployment at a specific time. Or maybe we will want to
enforce manual approvals from our operations team before
releasing into production. There's a whole bunch of
things that Spinnaker enables, safe rollbacks, being
able to have a nice UI around your actual pipeline. And so it's important
to, again, when looking at these
different tools, make sure that you're choosing
a tool not because of the hype but because it actually
fulfills the use cases that you are looking to achieve. Because like I said,
Cloud Build can also do this continuous delivery
and deployment phase. So a couple of tips when
deploying into production. One, very simple, limit
access to production. You'd be surprised. I've seen a handful
of customers that will give kubectl
access to their developers in production. And what we actually
want to do is use some of GCP's native
isolation primitives. And so in this kind
of diagram, we're using a GCP project,
which owns IAM hierarchy, to actually limit access
to the Kubernetes cluster and running in production. So maybe we'll just give this
access to cluster operators and then give this access
to a service account for our CI/CD pipelines.
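A hedged sketch of that kind of binding -- the project, service account, and role here are placeholders; check the current Kubernetes Engine IAM roles for what actually fits your teams:

# Give the CI/CD service account permission to deploy to clusters in the
# production project, instead of handing kubectl to individual developers.
gcloud projects add-iam-policy-binding prod-project \
    --member serviceAccount:deployer@prod-project.iam.gserviceaccount.com \
    --role roles/container.developer

And then second tip is to keep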
your Kubernetes configuration as code. This is pretty key
because we'll want to, one, store the existing state. Maybe we are templatizing
our Kubernetes manifest and want to customize that
depending on each pipeline. But we also want to track a
history of applied manifests. And Git and source
control management is a great way to do that. So now that we've covered
building containers, we have to expose them. To be able to
interact with them, how do we actually expose them
to internal or external users? So let's actually
jump back to the demo to take a look at how
Kubernetes networking works. Actually I'm going
to close this. So I'm just ending the
skaffold dev process that was running earlier. And I actually want
that to be redeployed. So this will
actually not keep it in that iterative
development state. We're going to run and
deploy that application without the feedback loop. But in the meantime, let's
actually switch our clusters in our Kubernetes config. So, we're actually going to use the other Kubernetes cluster, not this one. OK. So, Kubernetes networking, regardless of whether it's on GKE or elsewhere, requires that every single pod gets
its own IP address. So let's look at that. It's kind of nice-- if you look at other
orchestration tools, these might actually require you
to do port mapping or something complex which at scale
can be really difficult. But you'll see here, we have-- I'm actually
running-- I'm running the same application in a
different Kubernetes cluster. And I have three UI
pods and a database pod, and they all have
their own IP address. So to achieve this,
Kubernetes actually-- the open source Kubernetes
doesn't enforce any opinions. And so there are
ways to achieve this. One is to hard code
routes per node. Another is to use
a network overlay in place of routing tables. But ultimately, this can run
into limitations or overhead to be able to just implement
basic networking features. But in GKE, we actually
get a VPC native way of allocating IP
addresses to pods. So we can, within
our Google VPC, create a known range for
these pod IP addresses. These are known as IP aliases. And then a slash
24 from that range will be assigned to each
node to give to pods that are running on that node. So we can actually look at that. So I also have three
nodes in this cluster. And we'll look at the pod CIDR
given to each of these nodes. And so you'll see here
that, again, what's important to note about this
is that these are natively implemented in GCP's SDN. We are not having to say-- creating a route per node
to actually implement this and say, if you're looking
for an address in this range here, go to this node. Instead, our VPC actually
handles that natively and is aware of them. And so what's really
great about that is that this is useful when
we're bumping up against quotas for routes in large
clusters or for clusters with hybrid connectivity
where we can actually advertise these ranges. But one thing that's core to
know about Kubernetes pods is that they can come
and go, which means that these IP addresses
that you see here will actually also do the same. So Kubernetes gives
us a stable way of fronting these pods with
a concept called services. So let's actually look at
some of our services here. So let's probably start
with our UI service. Service types build
upon each other. You'll see here that we have
a base cluster IP service. And what this means is that
we have a stable virtual IP internal to our
Kubernetes cluster that fronts pods with this key
value-- these key value pairs. So we can actually
provide these key value pairs, otherwise
known as labels, in our Kubernetes manifest. And so this service is actually
associated with a dynamic map called Endpoints. So if we get Endpoints,
we'll actually see here that I have a
map of all of the pod IPs that are running my UI. And this Endpoint
map is actually associated with the UI service.
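A hedged sketch of the commands behind what's on screen -- the service and deployment names are illustrative:

# The Service's label selector and stable virtual IP, plus the Endpoints
# map of pod IPs it currently fronts.
kubectl get service quiz-ui -o wide
kubectl get endpoints quiz-ui

# Scaling the Deployment; the Endpoints map updates on its own as the
# new pods come up.
kubectl scale deployment quiz-ui --replicas=4

So if we scale that UI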
deployment up and add more pods-- how many replicas? Anyone out there? Six? WESTON HUTCHINS: Six. ANTHONY BUSHONG: I think we can do that. Actually, I didn't-- I don't think I enabled cluster autoscaling in this cluster, so let's keep it at four
and not go too wild. So if we actually look
at Endpoints again, we'll actually see some
of the control loops that Wes alluded to in action. And so we see that our
Endpoints object has dynamically been updated to incorporate
this new pod that was created. And so as we add more
and more pod replicas, we can expect this
map to be updated. And there's actually
some magic under the hood where we have a daemon
running on every single node, configuring IP tables on
all of our Kubernetes nodes to forward these
stable VPs to this-- one of these pod IPs
in the Endpoints map. And so this is really good. We don't have to think about
how we configure load balancing. Kubernetes makes this so just
by declaring our service object. What I also want to call
out is that, like I said, services build upon each other. So we have an external IP
address for our UI pods as well. And so if we actually
go into the UI, we'll see that we'll
have network load balancers provisioned for us. And then these will actually
route to all of the nodes in our cluster-- so each
cluster has three nodes-- and then we're back to
the IP tables forwarding to our actual pod traffic. So we've actually talked
a lot about how we expose an individual service. But what happens when we
have multiple services, which I'm sure you're bound
to have if you're running Kubernetes engine? How do we have a single
entry point to route to those services? So can we actually cut
back to the slides? So that's where the Kubernetes
Ingress object comes into play. So in GKE, the
default Ingress object is actually implemented as
our global HTTPS load balancer. And so if you create
an Ingress object, you'll see here that I have a
Service Foo and a Service Bar, we're actually able
to route users. One method is route by path. So if we're going
to path Foo, we're routed to our Service Foo. And then if we
route to path Bar, we're routed to our Service Bar.
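A hedged sketch of such an Ingress -- service names and ports are placeholders, and the apiVersion reflects the Kubernetes releases current at the time of this talk:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: quiz-ingress
spec:
  rules:
  - http:
      paths:
      - path: /foo/*            # requests to /foo/* go to Service foo
        backend:
          serviceName: foo
          servicePort: 80
      - path: /bar/*            # requests to /bar/* go to Service bar
        backend:
          serviceName: bar
          servicePort: 80

And so we also expect this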
to do a lot of basic things at layer 7, like
TLS termination. But what's really cool
about GKE is that now we get to, again, like Wes
mentioned, integrate some of the great GCP
services with our Kubernetes applications. So if we're using the
global HTTPS load balancer, we're actually able to use
things like Cloud Armor to protect against DOS attacks. Or we're able to use Identity
Aware Proxy, which is basically
access control in front of services, Cloud CDN. And then I want to call
out one thing that's really cool and very unique
to Google Cloud Platform, and that's Kubemci. So what we can actually
do, I think as you-- this might not be a
getting started topic, but certainly a lot of customers
have directed there or headed in that direction-- is
running multiple clusters in different geo regions. And so with Kubemci,
we can actually configure our load
balancer not only to front multiple
services, but to actually front multiple different
clusters across the globe. And so just a
quick demo of that. You'll see here that I have a-- can we switch to the demo? So you'll see here that I have
two Kubernetes clusters, one running in US-West and
one running in Europe-West. And so we also have our
global load balancer that has a single IP address. So we have a single
entry point that is fronting both of our
Kubernetes Engine clusters. So if I go to this, because we
are in wonderful San Francisco, we should actually be routed
to the US-West1-A cluster. And what's really cool
is that if we actually go into a Europe-West1-B virtual
machine and actually hit that same endpoint, we're going
to be-- we should, assuming, I don't want to jinx the demo-- we should be routed to
our Europe-West cluster. And what's really great about
this is that previously, a lot of people
were trying to think about how to implement
this at the DNS layer. But with Kubernetes Engine
and our global load balancer, we can actually think about
this again at layer 7. So if we ping this,
we'll actually see that we hit that
same IP address. And we were served from
Europe-West1-B. So no tricks up my sleeves. This is something that
is just really cool and, again, built into our
actual global load balancer using the anycast protocol. Great. So going back to the
slides, the last thing I want to wrap up with
as far as load balancing is really making containers
a first class citizen. So as I mentioned
before with IP aliases, we're able to front IP addresses
with known IP addresses to the VPC. And what we can actually do with
that now is implement something called network Endpoint groups. So as I mentioned
before, there is a lot of IP tables magic in
traditional load balancing in Kubernetes, which could lead
to a bunch of extra network hops and really a
suboptimal path. And I think if you
look on the right here, we'll see because we are
aware of pod IP addresses, we're actually able to configure
them as back ends to our load balancers in Google
Cloud Platform. And so what this makes for
is for an experience that makes load balancing much like
the way we do it across VMs. We're able to health check
against the individual pods. We're able to route directly
to them without being rerouted through IP tables rules. And it really, again,
demonstrates unique properties of Google Kubernetes Engine. This is a feature in Alpha, but
certainly keep an eye on it. So the last tip I
have around networking is sometimes we need private
and internal applications to run on Kubernetes. And so Google
Kubernetes Engine also helps you to get enabled with
that via private clusters and internal load balancers. So private clusters means that
all of our virtual machines running our containers
are actually running in a private
VPC without any access to the public internet, so we
can really harden our clusters using private clusters. And if we have
other applications in the same VPC that are not
running in Kubernetes Engine, we can actually also have
network load balancers that are internal to a VPC.
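A hedged sketch of an internal load balancer Service on GKE -- the annotation is the GKE one at the time of this talk, and the names and ports are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: quiz-api-internal
  annotations:
    # Ask GKE for a load balancer that is only reachable inside the VPC.
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: quiz-api
  ports:
  - port: 8080
    targetPort: 8080

So moving on to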
the next section. Now that we have our
app in containers, and we have a way for
people to access them, how do we handle a scenario
when a lot more people want to access them? And that's where
autoscaling comes into play. So before we talk
about autoscaling, it's paramount to introduce a
concept of requests and limits. The first tip here is use them. They define a floor and
ceiling for containers in pods around resources
like CPU and memory. And they actually inform
how our control plane schedules pods across nodes. And you'll see that they
actually inform autoscaling. A quick note here is that if
you define limits for CPU, your containers will be
throttled at that limit. But for memory,
it's incompressible, and so they will actually
be killed and restarted.
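A hedged sketch of what that looks like on a container spec -- the numbers are placeholders you'd tune to your observed usage:

# Fragment of a pod template's container entry.
resources:
  requests:            # the floor: informs where the pod can be scheduled
    cpu: 100m
    memory: 128Mi
  limits:              # the ceiling: CPU is throttled, memory overuse is killed
    cpu: 500m
    memory: 256Mi

So really, again,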
it does require a lot of understanding of
what are acceptable resource utilization of your containers. So the two most well-known
autoscaling paradigms are horizontal pod autoscaling
and cluster autoscaling. So for horizontal
pod autoscaling, we can actually target
groups of pods and scale on metrics like CPU. Or in GKE, we can actually
provide horizontal pod autoscaling based
on custom metrics. So if you're doing
something like a task queue, and you want to track
that and scale more pods to handle, let's
say, more work, you can configure HPA to do that. We also have the concept
of cluster autoscaling. So eventually if you're scaling
a numerous amount of pods, you'll eventually need to
add more physical-- or not physical-- virtual resources
to your Kubernetes cluster to run those pods. And that's where cluster
autoscaling comes into play. Pending pods that
need resources-- maybe they're not
able to be scheduled-- will trigger cluster
autoscaling to spin up more machines in our node pools.
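A hedged sketch of turning both of these on -- the deployment, cluster, and node pool names plus the thresholds are placeholders:

# Horizontal pod autoscaling: keep average CPU around 70%, between 2 and
# 10 replicas.
kubectl autoscale deployment quiz-ui --cpu-percent=70 --min=2 --max=10

# Cluster autoscaling: let the node pool grow and shrink between 3 and 10
# nodes as pending pods need room.
gcloud container clusters update demo-cluster --zone us-central1-b \
    --enable-autoscaling --node-pool default-pool \
    --min-nodes 3 --max-nodes 10

And so these two autoscaling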
paradigms are relatively new, but I do think it's
important to call out. One is being able to scale
workloads vertically. So if we want-- if we think of
a compute intensive application, maybe like a database, adding
more horizontal replicas won't necessarily
solve our issue. And so maybe we actually
need to scale up the amount of resources that
that individual pod has. So with vertical
pod autoscaling, you'll actually
get recommendations that can be either
suggested or automatically applied to resize the
requests of your pods. And then on the right-- or
on the-- yeah, on the right here, you'll see that
we also have the ability to scale infrastructure
dynamically. And so what this means
is that if we have a certain set of node pools in our Kubernetes cluster, but we don't have any resources to accommodate, let's say, a 16-CPU request-- you can scale out as many one-CPU machines as you want, it's not going to fix that-- we could actually use something like node auto-provisioning to deploy a new node pool to handle that request. So node auto
provisioning really helps customers that are
operating at scale and can't necessarily keep
track of every single one of their hundreds of
microservices and requests and to have node pools that
fit each of those workloads. Awesome. So high availability
is important. And I'm now going
to kick it over to Wes to tell us more about it. WESTON HUTCHINS:
Awesome So I'll try to go through this relatively
quickly because we're running a little low on time. So HA, we added a
number of new features, actually in the last
couple of months, around high availability
applications. The first one is we added
regional persistent disk support to GKE. Quite simply what
this does is it replicates your persistent disks
across two zones in a region. Now, why is this nice? Well, you don't have
to handle replication at the application layer
and worry about distributed databases. You can just store
it all to a PD, and Google Cloud Platform will
automatically replicate those to different zones. So the abstraction happens
at the storage layer, which you get out of
the box automatically. The other concept
around high availability is what we call
regional clusters. This is a multi-node
control plane that spreads both the
Kubernetes API server and all of the control plane
components along with etcd across three zones in a region. Now, the really nice thing
about regional clusters is that it increases the
uptime of the cluster from 2.5 nines to 3.5 nines and
you get zero downtime upgrades.
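A hedged sketch of creating one -- the name and region are placeholders, and --num-nodes here is per zone, so you get one node in each zone the cluster spans:

# A regional cluster: replicated masters plus nodes spread across the
# region's zones.
gcloud container clusters create prod-cluster \
    --region us-central1 --num-nodes 1

So in the zonal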
cluster case where you have a single node
running your control plane, it will go down temporarily
when we upgrade the Kubernetes version. But if you are using regional
clusters for your production workloads, we'll do
a one-by-one upgrade. And we'll always
have two masters ready to serve any workloads
that come into the cluster. The best part about
this is it's available at no additional charge
over the zonal clusters. We really want everyone
to be able to use multi-master, high-visibility
clusters for their production apps. So a few quick tips on
upgrades just so people understand how
this works on GKE. For the control plane,
Google engineers will constantly keep
this up to date. We will upgrade behind
the scenes automatically. This is not something you
can control outside of a few maintenance window settings
that tell us when we can or cannot upgrade your cluster. Now, we almost never upgrade
to the latest version of Kubernetes. We'll upgrade you to a
more stable version that's had a bunch of patch fixes
and keep you on that, and slowly roll
you up to the new-- to the next version once
another version of Kubernetes comes out. You can also trigger
this manually if you really want the
latest version of Kubernetes by going to our Cloud
console and clicking upgrade. Now, the upgrade for the control
plane happens automatically. For nodes, if you
enable auto-upgrade, it'll work for that as well. However, if you don't
enable auto-upgrade, you have much more control over
when your Kubernetes nodes get upgraded to the next version. And the thing I
wanted to call out is that your cluster can
work in this mixed mode state via Kubernetes
backwards compatibility. So we can have nodes running on
1.8 and a master that's running on 1.10. And we actually won't get to
the next version of Kubernetes until we go and
upgrade the nodes. Now, when you're
upgrading your nodes, there's a few things
to think about. You can do the rolling upgrade
that just happens by default. And this is just by
clicking the node upgrade button in the window. Or you can do something called
migration with node pools. There's a blog post about this
on the Google Cloud Platform blog that goes into
much more detail. But the short of it
is, you can actually create another node pool on the
newer version of Kubernetes, split traffic over
to that node pool using kubectl drain
and cordon, and then you can test a little bit of
the traffic on the new version.
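A hedged sketch of that migration -- cluster and pool names are placeholders, and new pools come up at the cluster's current node version by default:

# 1. Create a new pool (it comes up on the newer node version).
gcloud container node-pools create new-pool \
    --cluster demo-cluster --zone us-central1-b --num-nodes 3

# 2. Stop scheduling onto the old pool's nodes, then evict their pods so
#    they reschedule onto the new pool.
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o name); do
  kubectl cordon "$node"
  kubectl drain "$node" --ignore-daemonsets --delete-local-data
done

# 3. Once the new version looks good, delete the old pool.
gcloud container node-pools delete default-pool \
    --cluster demo-cluster --zone us-central1-b

If you see your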
app is misbehaving, you still have the
other node pool that you can route back
to the previous version while you debug. Once the new version looks
good, you scale that one up, and you scale the
other node pool down. And we actually see a lot
of production users doing this who want to have a much
more testable way of rolling to a new version of Kubernetes. And the last section is
logging and monitoring. So I wanted to call
out that at KubeCon in Europe this
year in Copenhagen, we announced Stackdriver
for Kubernetes. What this allows you to do is
take open source Prometheus instrumentation and inject
that into Stackdriver's UI. And the really cool
thing about this is it's multi-cluster
aggregated, and you get to use all of the
greatest Stackdriver features like the pre-built
dashboards and the ability to do things like set
alerts so that you're notified when something
goes wrong in your cluster. The other feature that we have
on GKE is GKE audit logging. So if people are
changing your cluster, or you want to figure out who
was the last person to deploy, you can go into our logging
UI under Stackdriver and see both pod-level and
deployment-level information along with information
about the resources as well. So if somebody added a
node to your cluster, this is the screen
that you would go to in order to figure that out. Now, there are a ton
more topics that we just didn't have enough time to
cover in 50 minutes, things like security, identity,
policy management. There's a lot of
sessions that are going on that cover these
topics in much more depth. Anthony and I will be outside
to help answer any lingering questions you have. With that, try out GKE. And thank you very much. [MUSIC PLAYING]