Getting Started with Containers and Google Kubernetes Engine (Cloud Next '18)

Captions
[MUSIC PLAYING]
WESTON HUTCHINS: Hello, Google Cloud NEXT. How's everyone doing? Good? All right, lunch session. Everyone's eating while we talk. This is good, brown bag lunch. So welcome to getting started with containers and Google Kubernetes Engine. We have 50 minutes to give as many details as we can about GKE to make you all experts. Before we begin, quick show of hands. How many people have ever deployed an app to Kubernetes before? Not that many. That's exactly what this session's all about then. So my name-- real quick intro, my name's Weston Hutchins. I'm a product manager on GKE.
ANTHONY BUSHONG: And I'm Anthony Bushong, a customer engineer specialist focusing on AppDev and Kubernetes.
WESTON HUTCHINS: So, everyone said they hadn't deployed. We're going to go straight into a demo, no slides. Show you how easy it is to get started. Take it away, Anthony.
ANTHONY BUSHONG: [CHUCKLES] All right. Thanks, Wes. So today, we're going to demonstrate, again, how easy it is to get a few containers up and running on Google Kubernetes Engine. So let's just dive right into it. What you're looking at here is the Google Cloud console, which is highlighting a Kubernetes Engine cluster. I've already created this. If you haven't created a Kubernetes Engine cluster before, it just takes a few clicks, or a simple gcloud command, or even a simple Terraform resource. It's actually quite simple. But since we already have this-- how do we use this cluster? So I'm going to click on this Connect button. And I'll actually get an auto-generated gcloud command to fetch credentials to be able to interact with this cluster via kubectl. So gcloud is our SDK. You'll see here that I have a GKE cluster. But that's not the one we want to use. We want to use this one. So it's this. Great. So now let's test that we've actually connected to our Kubernetes cluster by getting the nodes that comprise our cluster. And there we go. So, as you can see, here are the three nodes that comprise the cluster that I showed you earlier. At the end of the day, these are just virtual machines. What's different is that Kubernetes allows us to interact with these VMs in a better, cleaner way. So we're not going to SSH into these nodes. We're not going to write imperative instructions to actually install our application. Instead, once our app is built in containers, all we have to do is write some declarative Kubernetes manifests. And these containers that comprise our application will be scheduled by Kubernetes somewhere on these three nodes. Right. So the application we'll be working with today is comprised of an Angular UI, a Golang API, and a PostgreSQL database. As containers, they allow us to package and isolate these components independently, which really enables the concept of microservices. And so within that design, each of these components has its own code and its own Dockerfile, which specifies how to build the image into a container. And so as I mentioned, once these containers are built, all we need are Kubernetes manifests to implement our desired state in the cluster. So let's actually look at what that looks like. So in this YAML file is our Kubernetes manifest. And this is for the UI aspect of our application. So what you'll see here is that we're telling Kubernetes that we want a load balancer to actually front what is called a Kubernetes pod that is based on this template. Now, a Kubernetes pod is a grouping of one or more containers.
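For reference, the auto-generated connect command and the node check described above typically look something like the following sketch; the cluster name, zone, and project ID are placeholders, not the ones from the demo.

```sh
# Fetch credentials so kubectl can talk to the cluster (values are illustrative).
gcloud container clusters get-credentials my-cluster \
    --zone us-central1-b --project my-project

# Verify the connection by listing the nodes that make up the cluster.
kubectl get nodes
```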
So in this case, our UI pod is going to run our UI and our API containers. And we want two replicas of that pod. Cool. The magic of Kubernetes is that once we provide this desired state, Kubernetes will, A, implement it and, B, take action at all times to reconcile the observed state with the desired state that we have here. So let's get this application up and running. We have to build a container image. We have to apply the manifests to Kubernetes. And we're going to use skaffold, which is a tool that was built in open source by the amazing container tools team here at Google. We're going to do all of it with a single command. So let's look at that. And that command is skaffold dev. Great. So we'll see that all those steps that I had mentioned are now being kicked off by this tool. We'll dive into the details of skaffold a little bit later in the talk. But while it's doing its thing, Wes, I wanted to ask you, how familiar are you with our GCP logos?
WESTON HUTCHINS: Not really at all.
ANTHONY BUSHONG: OK. Well, it's a good thing then that our demo application is actually a quiz. So we have a pop quiz for you to see if you can match the product or use case with its appropriate GCP hexagon. So now we see that this application is actually deployed. Before we get into the app, let's look at the Kubernetes resources that were created. So I think we mentioned before that we want pods, so let's look at those. Great. So we have two replicas of the UI pod and then one replica of the database pod. I didn't show that manifest, but it was also recursively applied, as it was in the Kubernetes manifests folder. We can also look at the load balancers. So right now, GKE in the back end is actually provisioning a network load balancer to front our UI pods. And so you'll see that it's also provisioning an external IP address. Again, this is really the magic of Kubernetes. All we have to do is write these declarative files, and we get our desired state. So let's look at that again. And we'll see that we now have an external IP address to access our application. So, let's throw that into our browser. And Wes, this is made just for you. So, you're on the spot now.
WESTON HUTCHINS: I swear we didn't practice this at all. And I failed when we ran through it earlier. A hosted key management service? I'm going to go with the one on the left for that one.
ANTHONY BUSHONG: OK. I don't know. Are you sure it's not the right one?
WESTON HUTCHINS: No.
ANTHONY BUSHONG: All right, that's correct. Let's run it again.
WESTON HUTCHINS: And Cloud Tools for Android Studio? I don't think it's either of those. But probably the one on the left as well.
ANTHONY BUSHONG: So this is the placeholder for ones that actually don't have a hexagon.
WESTON HUTCHINS: [LAUGHS] OK, all right.
ANTHONY BUSHONG: One more time.
WESTON HUTCHINS: Let's do one more.
ANTHONY BUSHONG: Yeah, let's find a different one. This one also does not have a logo. There we go. It's a debugger.
WESTON HUTCHINS: Debugger's going to be the one on the right.
ANTHONY BUSHONG: Awesome. Hey, how about a round of applause for Wes.
WESTON HUTCHINS: I passed, yeah. Thank you. [APPLAUSE] You have to drill on these quite a bit.
ANTHONY BUSHONG: So it's clear that Wes knows GCP. But what's also clear is that it can be really easy to get up and running with containers and Kubernetes Engine. So let's quickly recap what this demo did. We had a GKE cluster. Again, this is really easy to create with a single click. We've connected to this cluster.
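As a reference for the kind of manifest Anthony describes above, a minimal sketch of a Deployment with two replicas fronted by a load-balancer Service might look like the following; the names, labels, and image are illustrative, not the actual demo files.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: quiz-ui                 # illustrative name
spec:
  replicas: 2                   # two replicas of the UI pod
  selector:
    matchLabels:
      app: quiz-ui
  template:
    metadata:
      labels:
        app: quiz-ui
    spec:
      containers:
        - name: ui
          image: gcr.io/my-project/quiz-ui:v1   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: quiz-ui
spec:
  type: LoadBalancer            # provisions a network load balancer on GKE
  selector:
    app: quiz-ui
  ports:
    - port: 80
      targetPort: 8080
```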
We built our applications in containers. We've declaratively deployed them to Kubernetes. And we've exposed those containers behind a load balancer so that we could all go through the demo today. And Wes, again, aced the GCP quiz. So we did this all with skaffold and Google Kubernetes Engine in just a few commands. And so today, we'll show you how to get started really easily with a lot of the things that you'll need to do to be able to have a demo like this. So back to you, Wes.
WESTON HUTCHINS: Cool. Thank you. So I know that was quick-- you probably saw a lot of things on screen that you didn't quite understand. We're going to go into a bunch of the YAML file definitions and how we built this out a little bit later. Can we jump back to the slides? Cool. So, a brief history of containers and Kubernetes. Containers have been around for a long time. They've been a part of Linux for quite a while. We really started to see container adoption pick up with the introduction of Docker in about 2013. And that was a key moment where public adoption of containers really started to take off. Docker offered a lightweight container runtime and an amazing packaging and deployment model where you could bundle your app and all its dependencies and deploy that consistently across different environments. Now, when Docker came out, they were really focused on that dev workflow and solving packaging and deployment for a single container on a single node. And we all realized that it wasn't the full solution. The eventual goal would be to manage multiple containers across many different nodes. And that's when the seeds of Kubernetes started to get planted at Google, around the 2013 time frame. So Kubernetes at its heart is an orchestration tool. Now, what does that mean? It really means that you can package and deploy multiple containers across multiple hosts, and Kubernetes takes care of most of the heavy lifting for you with things like scheduling, application health checking, and more of those types of things. But in my opinion, Kubernetes is really two powerful concepts. The first is that it's an abstraction over infrastructure. And this is really, really important for your developers. We wanted to make it so that most developers didn't really have to retrain their skills depending on whatever cloud they were using, or if they wanted to move from something on-prem over to a cloud platform as well. Kubernetes gives you that consistent base layer that allows the operators to worry about infrastructure and the developers to get a consistent tooling platform. And the second really powerful concept of Kubernetes is that it's a declarative API. Everything in Kubernetes is driven off this model of declare your state, declare what you want to have happen, and then Kubernetes has built-in feedback loops that will make that state reality. And they're constantly observing. We can even write custom controllers to handle a bunch of different state reconciliations for new objects that you want to create. And the best part about Kubernetes is that it tries to encapsulate a bunch of best practices around application deployment and scheduling, so things like: where does a particular container need to get scheduled, and on which node? Is that node healthy? Can I restart that node if it goes down? How do I provision storage?
We have storage providers for a number of different platforms built into Kubernetes through the ecosystem. Logging and monitoring is also integrated. And we've seen a lot of extensibility with tools like Prometheus and Grafana that work on top of the Kubernetes system, even going so far as things like identity and authorization. There's a ton of work that happens in the community every month on adding more and more features to the platform. It's actually quite astounding how fast Kubernetes iterates. And it's really important to always make sure you're keeping up with the latest developments. But, with great power comes great responsibility. Kubernetes can be a bit overwhelming. There's a lot of power under the hood here. And there is inherent complexity with a system as powerful as Kubernetes. And when we talk to users, we typically break people up into two camps. One is people that are operating the cluster, cluster operators. And the second one is application developers. And from a cluster operator perspective, installing Kubernetes is relatively straightforward. The real pain comes when you have to manage the day-two operations. This is things like setting up security and TLS between my nodes, encrypting etcd and making sure that no one can access it, doing disaster recovery, like backup and restore, and even things like node lifecycle management if a node goes down. These are all things that cluster operators don't always like to handle themselves. And we see a lot of questions in forum posts on how to operate Kubernetes at scale. One of the key missions of GKE when it first came out was to take the heavy lifting off of the cluster operators and give it to Google. And that's really the big power of Google Kubernetes Engine: we wanted to offer this fully managed solution where Google runs your clusters and you as the developers get to focus on the applications. So GKE is our hosted, cloud-managed Kubernetes solution. We went GA in August of 2015, so we've been running a number of production clusters for three years now. One of the nice things about GKE is that we take open source Kubernetes and we try to keep as close to upstream as possible. And we're always releasing new versions of GKE as new versions of Kubernetes come out. But we wanted to add deep integration with GCP, so we make it really easy to use other features like logging, and monitoring, and identity right out of the box with Kubernetes and GKE. So, let's look at the architecture of how GKE is deployed on top of GCP. So, there are two major parts to Kubernetes. We have the control plane, and then we have the nodes that make up your cluster. And on GKE, the control plane is a fully managed control plane controlled by Google. We run this in our own project. And we have a team of site reliability engineers that are constantly doing the things that I described as day-two operations. They're scaling the clusters. They're health checking them. They're backing them up. They're making sure etcd is always upgraded with the latest and greatest security patches. This is really the operational burden taken off of the end user and put into the hands of the Google engineers. There are two ways to spin up the control plane for GKE. We have the classic way, which is zonal, where we spin up a single node.
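As a rough sketch, creating a zonal cluster like the ones described here is a single gcloud command; the name, zone, and node count are placeholders.

```sh
# Create a zonal GKE cluster with a single-zone control plane
# (name, zone, and node count are illustrative).
gcloud container clusters create my-cluster \
    --zone us-central1-b \
    --num-nodes 3
```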
And just recently we announced general availability of regional clusters, which allow you to spin up a multi-master, high-availability control plane, which I'll cover in a couple of the later slides. Now, the second piece is the nodes. And these run in your project. These are where you're actually going to deploy your containers. And we call these the workers of the Kubernetes cluster. Now, on GKE, we have a concept of node pools, which is a group of homogeneous machines with the same configuration. I'll talk about that on the next slide. For these nodes, we are the only cloud platform to offer a fully managed node experience. And what that means is that we provide the base image, we constantly keep it updated, and we have features like node auto-repair and node auto-upgrade that allow your cluster to keep revving with the latest and greatest upgrades without you having to do anything at all. So, node pools are also a GKE concept. And there are a lot of advantages to building out things with node pools. So, node pools are really just a group of related machines. It's the same configuration. If I want an n1-standard-1 instance, I can create a node pool of size three, and that will give me three n1-standard-1 instances. Now, what's neat about node pools is that it makes it really easy to mix and match different instance types. So you can spin up node pool A with the standard configuration, then you can add a second node pool with things like preemptible VMs or GPUs so that you can target specific workloads to those node pools. Even better, node pools make it really, really easy to add availability to your application. So if I'm running in a single zone, let's say us-central1-b, and I want to add a failover to us-central1-a, all you have to do is create a new node pool in that new zone and then attach it back to the cluster. And this is just a really nice way to group related instances together and then scale out your application. So let's actually walk through what happens when you upgrade a node. One of the features that we offer to GKE customers is called Maintenance Windows. A Maintenance Window is basically you telling us a four-hour window when we are allowed to upgrade your nodes. So, if you have high traffic during the day, and you want to make sure that upgrades happen at midnight or whatever, you can specify this window. We will only upgrade your master or your nodes during that time frame. So, what happens is we'll go and drain a node. What this does is remove traffic from that node. And then we do an operation in Kubernetes called cordon, which makes it so that new pods don't get scheduled to that node. At this point, we spin up a new VM on the latest version of Kubernetes, and then we attach it back to your cluster. So we do this one by one, in a rolling upgrade fashion. There are a couple other ways to upgrade that I'm going to cover a little bit later. But that's generally how most of our users keep their nodes healthy and up to date. The other feature that we offer for node management is node auto-repair. And node auto-repair is basically a health checking service. There are a lot of reasons why nodes can go down, whether it's kernel panics, or disk errors, or out-of-memory errors, et cetera. Node auto-repair is constantly monitoring the health of your cluster for particular errors.
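A hedged sketch of adding a second node pool with a different machine configuration, along the lines described above; the pool, cluster, and zone names and machine type are placeholders.

```sh
# Add a node pool of preemptible VMs to an existing cluster
# (names, zone, and machine type are illustrative).
gcloud container node-pools create preemptible-pool \
    --cluster my-cluster \
    --zone us-central1-b \
    --machine-type n1-standard-4 \
    --preemptible \
    --num-nodes 3
```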
And if the health check fails, like it does here, and the node goes into a NotReady state, it will go and do the same operation that we did during upgrade, where we'll drain the node, we'll spin up a new one, and then we'll reattach it back to the cluster. This makes it really easy to make sure that your nodes are always healthy and at the set amount that you want in your cluster. And as I said, all of this is really the end journey to get you to focus on the things that matter, which is your actual application development, and let Google do the heavy lifting for Kubernetes. Cool. With that, I'm going to hand it back to Anthony, who's going to go into much more depth on the build and deploy lifecycle.
ANTHONY BUSHONG: Awesome. Thanks, Wes. So, 18 minutes into the session, and we've finally reached the title of our talk. So we're going to talk about getting started with containers in Kubernetes Engine. And these are the key concepts that we'll be discussing today, all focused, again, around the second narrative of actually using and deploying things to Kubernetes. So let's start with building and deploying containers. So when it comes to this space, there are two key areas we want to consider. One is enabling developers, and two is automating deployment pipelines into environments like staging or production. And both are important. But let's start with the developers. So for developers who are new to technologies like containers and Kubernetes, this is often what the workflow feels like initially. There are a lot of unknowns. The magic in the middle could be docker run, kubectl apply, bash scripts, makefiles, a whole bunch of different approaches. And this can be a bad time for developers, right? What you end up with is a bunch of differing workflows with long feedback loops between a code change and actually being able to interact with the application running in its runtime environment. And so there's overhead of maintaining this workflow. There's no consistency across what should be a standard set of steps. This means that each developer has their own method of running locally. And this makes reusing and replicating things in the development lifecycle really hard. And so, if portability is one of the key tenets of Kubernetes and containers, then this is a problem that needs to be solved. And that's where skaffold comes into play. So if you remember, skaffold was the tool that we used to get our quiz application up and running on GKE. Now, skaffold is perfect for standardizing and enabling iterative development with containers in Kubernetes. So with skaffold, all developers have to do is focus on step one, working with their code. Skaffold takes care of the rest of the steps in the middle and will not only give that endpoint back to the developer, but will actually stream out logs to the developer to allow them to debug or make changes. So, let's actually switch back into the demo and revisit our quiz application. So here I am, back in the terminal. And what we see here is that skaffold dev-- this is a command that will actually continue to run once our application has been deployed. And you'll see here that it's actually streaming back logs for my application. Now, what's pretty cool is that it's also watching for changes. So if I actually want to change that quiz application, we don't have to go back through the process of building our containers and then applying those Kubernetes manifests.
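For context, skaffold is driven by a skaffold.yaml at the root of the repo that tells it what to build and which manifests to deploy. A minimal sketch might look like the following; the exact schema version and field names vary by skaffold release, and the image name and paths here are placeholders, not the demo's files.

```yaml
apiVersion: skaffold/v2beta29   # schema version depends on your skaffold release
kind: Config
build:
  artifacts:
    - image: gcr.io/my-project/quiz-ui   # placeholder image name
      context: ui                        # directory containing the Dockerfile
deploy:
  kubectl:
    manifests:
      - k8s/*.yaml                       # the declarative Kubernetes manifests
```

Running `skaffold dev` against a config like this is what builds, deploys, watches for code changes, and streams logs, as shown in the demo.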
All we have to do is focus on the code. So let's actually go into the application. Let's see here. So we'll see here we're back at our application directory. So I'm actually going to go into the index HTML file. I'm going to make some serious changes. And there's nothing more nerve-racking than using vim in front of 500 engineers. But what I'm going to do is-- let's say that-- you'll see here I initially wanted to quiz Wes, but maybe I want to quiz everyone in the audience. So what we're going to do here is change this and say, hello, IO124 attendees. Oh, spelling is hard. All right, cool. So we're going to change that file. And great, I've changed my code. And notice what I haven't done. I haven't gone into any Dockerfile. I haven't gone into kubectl. But what I have is this skaffold dev process that's continuously watching for changes in my code. And now, we've already completed our build and deploy process within, oh, 12 seconds. And so our application is now up and running again. And so if we revisit it, we see that our code has actually changed behind that same endpoint. Is anyone brave enough to take this one? It's the one on the left, I'll do-- I'll-- oh, my goodness, it's the one on the right. OK, well--
[CROWD LAUGHS]
--that was not planned. OK, great. So let's actually cut back to the slides. So skaffold-- let's see, sorry. So skaffold actually has a pluggable architecture. So one thing to note about that is that while you saw me deploy into a Kubernetes Engine cluster that was running in the cloud, we can actually use skaffold with Kubernetes environments that are running locally on our laptop. So that's things like minikube. This would result in a much faster iterative deployment process. But it all depends on the culture of your development teams, and whether you're doing remote development or local development. What we can also do is not only use our local Docker daemon-- so to build these Docker containers, I actually was using Docker on my laptop-- but we can actually begin to outsource and push some of that heavy lifting, and save our battery, with tools like Google Cloud Build, which is our managed build service. So skaffold is a really useful tool that makes for very happy and productive developers. Skaffold can also work well for our automated deployment pipelines, but we'll get into that. First, we want to dive into a few tips. So throughout the session, we'll just be sprinkling in, if you are just getting started, what are some things that you have to keep in mind? And so our first tips are around actually building containers. We have a tip on performance and a tip on security. So first, around performance, make sure that you slim down your container images. So if you're coming from the land of running everything in virtual machines, it can be tempting to stuff everything that we want into containers and treat them like VMs. That also makes for a bad time because it makes for slower pull times. And if the value prop of containers is really around increasing deployment velocity, slow image pull times will only hamper teams, especially if you're working at the scale of hundreds if not thousands of containers and microservices. So an approach to slimming down container images is to actually use multi-stage builds. This is basically an approach where you build your binaries in one container and then copy the final binary to another container without all of the build tools, SDKs, or other dependencies that were needed to build that artifact.
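A minimal sketch of the multi-stage build pattern for a Go API like the one in the demo; the base images, paths, and binary name are assumptions for illustration.

```dockerfile
# Build stage: compile the binary with the full Go toolchain.
FROM golang:1.10 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /quiz-api .

# Runtime stage: copy only the static binary into a tiny base image.
FROM alpine:3.8
COPY --from=builder /quiz-api /quiz-api
ENTRYPOINT ["/quiz-api"]
```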
Once you do this, you can actually deploy things faster, as we have a more lightweight container image. The second tip here is to scan your images for vulnerabilities. So Google's hosted Container Registry service provides vulnerability scanning. This is an alpha feature. But basically what this does is that anything that you push to the Container Registry in Google will actually be scanned with this feature enabled. And you can also have it send you notifications via Pub/Sub should there be a vulnerability. Along those same lines, when getting started, there are a swath of public images out there-- an example is public Docker Hub repos-- that are awesome for learning and awesome for just experimenting with. But they introduce inherent risk if we start to take some of these public containers-- they're untrusted and unverified-- and actually implement them into our application stack. So avoid doing that. So now that we have a container workflow for our developers, we're still missing how we deploy containers into production at scale in Kubernetes. So if the angle of containers and Kubernetes is to get teams to move faster, then a large part of this actual magic starts happening when we have automated build and deployment pipelines across environments like staging or production. Now, let me preface this with the opinion that there's no one right way to do continuous integration and continuous delivery, also known as CI/CD, but we can point to a few patterns with an example pipeline here. So if a developer is pushing to a feature branch or maybe tagging a release, we should expect the developer's workflow to end there. It should end up in a Kubernetes environment with a provided endpoint, assuming tests pass. And the way that it does this should be an implementation detail and actually hidden away from the developers. So in this scenario, we're using Cloud Build again. In this case, Cloud Build has build triggers that can kick off a build with multiple steps. Those steps can include building the binaries, maybe running unit tests, building the container image, and finally pushing that out to a registry. Google Cloud Build is awesome because it is able to provision and manage multiple steps either sequentially or in parallel. And each of these steps is executed within a Docker container. And then across all of these steps, we have a shared workspace. With that being said, there are a lot of other tools in the ecosystem that can fulfill this CI need. And some of these may already be present in your existing workflows. Jenkins is a popular one, for example, that we see. So once we have this newly tagged image in our registry, we can have that trigger a deployment pipeline for continuous delivery. So Cloud Build and Jenkins can also perform this functionality. But in this example, we're using a tool called Spinnaker to perform our continuous delivery. It's an open source tool from Netflix that Google is heavily invested in. Any CD tool should be able to have a pipeline triggered by a newly detected image and kick off additional steps to check out Kubernetes manifests and maybe deploy them into different environments like staging and production. Spinnaker here gives us additional features like execution windows. Maybe we only want to automate deployment at a specific time. Or maybe we want to enforce manual approvals from our operations team before releasing into production.
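As a sketch of the CI half of such a pipeline, a Cloud Build configuration with a couple of sequential steps might look like this; the image name and test step are placeholders, not the pipeline from the talk.

```yaml
# cloudbuild.yaml (illustrative): run tests, build the image, push it to the registry.
steps:
  - name: 'golang:1.10'
    args: ['go', 'test', './...']          # placeholder unit-test step
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/quiz-api:$COMMIT_SHA', '.']
images:
  - 'gcr.io/$PROJECT_ID/quiz-api:$COMMIT_SHA'   # pushed to Container Registry on success
```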
There's a whole bunch of things that Spinnaker enables: safe rollbacks, being able to have a nice UI around your actual pipeline. And so it's important, again, when looking at these different tools, to make sure that you're choosing a tool not because of the hype but because it actually fulfills the use cases that you are looking to achieve. Because, like I said, Cloud Build can also do this continuous delivery and deployment phase. So a couple of tips when deploying into production. One, very simple, limit access to production. You'd be surprised. I've seen a handful of customers that will give kubectl access to their developers in production. What we actually want to do is use some of GCP's native isolation primitives. And so in this diagram, we're using a GCP project, which owns an IAM hierarchy, to actually limit access to the Kubernetes cluster running in production. So maybe we'll just give this access to cluster operators and then give this access to a service account for our CI/CD pipelines. And the second tip is to keep your Kubernetes configuration as code. This is pretty key because we'll want to, one, store the existing state. Maybe we are templatizing our Kubernetes manifests and want to customize them depending on each pipeline. But we also want to track a history of applied manifests. And Git and source control management is a great way to do that. So now that we've covered building containers, we have to expose them. To be able to interact with them, how do we actually expose them to internal or external users? So let's actually jump back to the demo to take a look at how Kubernetes networking works. Actually, I'm going to close this. So I'm just ending the skaffold dev process that was running earlier. And I actually want that to be redeployed. So this will not keep it in that iterative development state. We're going to run and deploy that application without the feedback loop. But in the meantime, let's actually switch our clusters via our kubeconfig. So, we're actually going to use the other Kubernetes cluster, the one we skipped earlier. OK. So, Kubernetes networking, regardless of whether it's GKE or elsewhere, requires that every single pod gets its own IP address. So let's look at that. It's kind of nice-- if you look at other orchestration tools, these might actually require you to do port mapping or something complex, which at scale can be really difficult. But you'll see here-- I'm running the same application in a different Kubernetes cluster. And I have three UI pods and a database pod, and they all have their own IP address. So to achieve this, open source Kubernetes doesn't actually enforce any opinions. And there are a few ways to achieve it. One is to hard code routes per node. Another is to use a network overlay in place of routing tables. But ultimately, this can run into limitations or overhead just to implement basic networking features. But in GKE, we actually get a VPC-native way of allocating IP addresses to pods. So we can, within our Google VPC, create a known range for these pod IP addresses. These are known as IP aliases. And then a /24 from that range will be assigned to each node to give to pods that are running on that node. So we can actually look at that. So I also have three nodes in this cluster. And we'll look at the pod CIDR given to each of these nodes.
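To sketch what that looks like in practice (flags and names are illustrative): a VPC-native cluster is created with IP aliases enabled, and each node's pod range can then be read straight off the node objects.

```sh
# Create a VPC-native (IP alias) cluster -- name and zone are placeholders.
gcloud container clusters create my-cluster \
    --zone us-central1-b \
    --enable-ip-alias

# Show the /24 pod CIDR assigned to each node.
kubectl get nodes -o custom-columns=NAME:.metadata.name,POD_CIDR:.spec.podCIDR
```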
And so you'll see here that, again, what's important to note is that these are natively implemented in GCP's SDN. We are not creating a route per node to actually implement this and say, if you're looking for an address in this range here, go to this node. Instead, our VPC actually handles that natively and is aware of them. And so what's really great about that is that this is useful when we're bumping up against quotas for routes in large clusters, or for clusters with hybrid connectivity where we can actually advertise these ranges. But one thing that's core to know about Kubernetes pods is that they can come and go, which means that these IP addresses that you see here will also do the same. So Kubernetes gives us a stable way of fronting these pods with a concept called services. So let's actually look at some of our services here. Let's start with our UI service. Service types build upon each other. You'll see here that we have a base ClusterIP service. And what this means is that we have a stable virtual IP, internal to our Kubernetes cluster, that fronts pods with these key-value pairs. So we can actually provide these key-value pairs, otherwise known as labels, in our Kubernetes manifests. And so this service is actually associated with a dynamic map called Endpoints. So if we get Endpoints, we'll actually see here that I have a map of all of the pod IPs that are running my UI. And this Endpoints map is actually associated with the UI service. So if we scale that UI deployment up and add pods-- how many replicas? Anyone out there? Six?
WESTON HUTCHINS: Six.
ANTHONY BUSHONG: I think we can do that. Actually, I don't think I enabled cluster autoscaling in this cluster, so let's keep it at four and not go too wild. So if we actually look at Endpoints again, we'll actually see some of the control loops that Wes alluded to in action. And so we see that our Endpoints object has dynamically been updated to incorporate this new pod that was created. And so as we add more and more pod replicas, we can expect this map to be updated. And there's actually some magic under the hood where we have a daemon running on every single node, configuring iptables on all of our Kubernetes nodes to forward these stable VIPs to one of the pod IPs in the Endpoints map. And so this is really good. We don't have to think about how we configure load balancing. Kubernetes makes this so just by declaring our service object. What I also want to call out is that, like I said, services build upon each other. So we have an external IP address for our UI pods as well. And so if we actually go into the UI, we'll see that we have network load balancers provisioned for us. And then these will actually route to all of the nodes in our cluster-- so each cluster has three nodes-- and then we're back to the iptables forwarding our actual pod traffic. So we've talked a lot about how we expose an individual service. But what happens when we have multiple services, which I'm sure you're bound to have if you're running Kubernetes Engine? How do we have a single entry point to route to those services? So can we actually cut back to the slides? So that's where the Kubernetes Ingress object comes into play. So in GKE, the default Ingress object is actually implemented as our global HTTPS load balancer. And so if you create an Ingress object-- you'll see here that I have a Service Foo and a Service Bar-- we're actually able to route users.
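The commands behind that part of the demo are roughly the following (the deployment and service names are placeholders); scaling the deployment and re-reading the Endpoints object shows the control loop keeping the map of pod IPs in sync.

```sh
# Inspect the service and its dynamically maintained Endpoints map.
kubectl get service quiz-ui
kubectl get endpoints quiz-ui

# Scale the deployment and watch the Endpoints map pick up the new pod IPs.
kubectl scale deployment quiz-ui --replicas=4
kubectl get endpoints quiz-ui
```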
One method is routing by path. So if we go to path Foo, we're routed to our Service Foo. And if we go to path Bar, we're routed to our Service Bar. And we also expect this to do a lot of basic things at layer 7, like TLS termination. But what's really cool about GKE is that now we get to, again, like Wes mentioned, integrate some of the great GCP services with our Kubernetes applications. So if we're using the global HTTPS load balancer, we're actually able to use things like Cloud Armor to protect against DDoS attacks. Or we're able to use Identity-Aware Proxy, which basically enforces identity-based access control in front of services, or Cloud CDN. And then I want to call out one thing that's really cool and very unique to Google Cloud Platform, and that's kubemci. So what we can actually do-- this might not be a getting-started topic, but certainly a lot of customers have headed in that direction-- is run multiple clusters in different geo regions. And so with kubemci, we can actually configure our load balancer not only to front multiple services, but to actually front multiple different clusters across the globe. And so just a quick demo of that. Can we switch to the demo? So you'll see here that I have two Kubernetes clusters, one running in us-west and one running in europe-west. And we also have our global load balancer that has a single IP address. So we have a single entry point that is fronting both of our Kubernetes Engine clusters. So if I go to this, because we are in wonderful San Francisco, we should actually be routed to the us-west1-a cluster. And what's really cool is that if we actually go into a europe-west1-b virtual machine and hit that same endpoint, we should-- assuming, I don't want to jinx the demo-- we should be routed to our europe-west cluster. And what's really great about this is that previously, a lot of people were trying to think about how to implement this at the DNS layer. But with Kubernetes Engine and our global load balancer, we can actually think about this again at layer 7. So if we ping this, we'll actually see that we hit that same IP address. And we were served from europe-west1-b. So no tricks up my sleeves. This is something that is just really cool and, again, built into our actual global load balancer using anycast. Great. So going back to the slides, the last thing I want to wrap up with as far as load balancing is really making containers a first-class citizen. So as I mentioned before with IP aliases, we're able to give pods IP addresses that are natively known to the VPC. And what we can actually do with that now is implement something called network endpoint groups. So as I mentioned before, there is a lot of iptables magic in traditional load balancing in Kubernetes, which could lead to a bunch of extra network hops and really a suboptimal path. And if you look on the right here, you'll see that because we are aware of pod IP addresses, we're actually able to configure them as backends to our load balancers in Google Cloud Platform. And so what this makes for is an experience that makes load balancing much like the way we do it across VMs. We're able to health check against the individual pods. We're able to route directly to them without being rerouted through iptables rules. And it really, again, demonstrates unique properties of Google Kubernetes Engine.
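A minimal sketch of the path-based Ingress just described, using the API version current at the time of the talk (newer clusters use networking.k8s.io/v1 with a slightly different backend syntax); the service names and ports are placeholders.

```yaml
apiVersion: extensions/v1beta1   # networking.k8s.io/v1 on newer clusters
kind: Ingress
metadata:
  name: demo-ingress
spec:
  rules:
    - http:
        paths:
          - path: /foo
            backend:
              serviceName: service-foo   # placeholder service
              servicePort: 80
          - path: /bar
            backend:
              serviceName: service-bar   # placeholder service
              servicePort: 80
```

On GKE, creating an Ingress like this is what provisions the global HTTP(S) load balancer that fronts both services behind a single entry point.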
This is a feature in alpha, but certainly keep an eye on it. So the last tip I have around networking is that sometimes we need private and internal applications to run on Kubernetes. And Google Kubernetes Engine also helps you get enabled with that via private clusters and internal load balancers. So private clusters mean that all of the virtual machines running our containers are actually running in a private VPC without any access to the public internet, so we can really harden our clusters using private clusters. And if we have other applications in the same VPC that are not running in Kubernetes Engine, we can also have load balancers that are internal to the VPC. So, moving on to the next section. Now that we have our app in containers, and we have a way for people to access them, how do we handle a scenario where a lot more people want to access them? And that's where autoscaling comes into play. So before we talk about autoscaling, it's paramount to introduce the concept of requests and limits. The first tip here is: use them. They define a floor and a ceiling for containers in pods around resources like CPU and memory. And they actually inform how our control plane schedules pods across nodes. And you'll see that they also inform autoscaling. A quick note here is that if you define limits for CPU, your containers will be throttled at that limit. But memory is incompressible, so containers that exceed their memory limit will actually be killed and restarted. So really, again, it does require a lot of understanding of what acceptable resource utilization is for your containers. So the two most well-known autoscaling paradigms are horizontal pod autoscaling and cluster autoscaling. So for horizontal pod autoscaling, we can actually target groups of pods and scale on metrics like CPU. Or in GKE, we can actually drive horizontal pod autoscaling with custom metrics. So if you're doing something like a task queue, and you want to track that and scale more pods to handle, let's say, more work, you can configure HPA to do that. We also have the concept of cluster autoscaling. So if you're scaling up a large number of pods, you'll eventually need to add more physical-- or not physical-- virtual resources to your Kubernetes cluster to run those pods. And that's where cluster autoscaling comes into play. Pending pods that need resources-- maybe they're not able to be scheduled-- will trigger cluster autoscaling to spin up more machines in our node pools. The next two autoscaling paradigms are relatively new, but I do think it's important to call them out. One is being able to scale workloads vertically. So if we think of a compute-intensive application, maybe like a database, adding more horizontal replicas won't necessarily solve our issue. And so maybe we actually need to scale up the amount of resources that that individual pod has. So with vertical pod autoscaling, you'll actually get recommendations that can be either suggested or automatically applied to resize the requests of your pods. And then on the right here, you'll see that we also have the ability to scale infrastructure dynamically.
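A sketch of what requests and limits look like on a container spec; the names and values here are illustrative, and the right numbers depend on profiling your own workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quiz-api              # illustrative name
spec:
  containers:
    - name: api
      image: gcr.io/my-project/quiz-api:v1   # placeholder image
      resources:
        requests:             # floor: used by the scheduler to place the pod
          cpu: 100m
          memory: 128Mi
        limits:               # ceiling: CPU is throttled, memory overuse gets the container killed
          cpu: 500m
          memory: 256Mi
```

With requests in place, something like `kubectl autoscale deployment quiz-api --cpu-percent=70 --min=2 --max=10` (names again illustrative) is one way to set up a basic CPU-based horizontal pod autoscaler.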
And so what this means is that if we have a certain set of node pools in our Kubernetes cluster, but we don't have any resources to accommodate, let's say, a 16-CPU request-- you can scale out as many one-CPU machines as you want, it's not going to fix that-- we could actually use something like node auto-provisioning to deploy a new node pool to handle that request. So node auto-provisioning really helps customers that are operating at scale and can't necessarily keep track of every single one of their hundreds of microservices and requests, and have node pools that fit each of those workloads. Awesome. So high availability is important. And I'm now going to kick it over to Wes to tell us more about it.
WESTON HUTCHINS: Awesome. So I'll try to go through this relatively quickly because we're running a little low on time. So, HA. We added a number of new features, actually in the last couple of months, around high-availability applications. The first one is we added regional persistent disk support to GKE. Quite simply, what this does is replicate your persistent disks across two zones in a region. Now, why is this nice? Well, you don't have to handle replication at the application layer and worry about distributed databases. You can just store it all to a PD, and Google Cloud Platform will automatically replicate it to different zones. So the abstraction happens at the storage layer, which you get out of the box automatically. The other concept around high availability is what we call regional clusters. This is a multi-node control plane that spreads both the Kubernetes API server and all of the control plane components, along with etcd, across three zones in a region. Now, the really nice thing about regional clusters is that it increases the uptime of the cluster from 2.5 nines to 3.5 nines, and you get zero-downtime upgrades. So in the zonal cluster case, where you have a single node running your control plane, it will go down temporarily when we upgrade the Kubernetes version. But if you are using regional clusters for your production workloads, we'll do a one-by-one upgrade. And we'll always have two masters ready to serve any workloads that come into the cluster. The best part about this is it's available at no additional charge over zonal clusters. We really want everyone to be able to use multi-master, high-availability clusters for their production apps. So a few quick tips on upgrades, just so people understand how this works on GKE. For the control plane, Google engineers will constantly keep this up to date. We will upgrade it behind the scenes automatically. This is not something you can control, outside of a few Maintenance Window settings that tell us when we can or cannot upgrade your cluster. Now, we almost never upgrade you to the latest version of Kubernetes. We'll upgrade you to a more stable version that's had a bunch of patch fixes and keep you on that, and slowly roll you up to the next version once another version of Kubernetes comes out. You can also trigger this manually, if you really want the latest version of Kubernetes, by going to our Cloud console and clicking upgrade. Now, the upgrade for the control plane happens automatically. For nodes, if you enable auto-upgrade, it'll work that way as well. However, if you don't enable auto-upgrade, you have much more control over when your Kubernetes nodes get upgraded to the next version.
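A hedged sketch of creating a regional, multi-master cluster with node auto-upgrade and auto-repair turned on; the name, region, and node count are placeholders.

```sh
# Regional cluster: the control plane and nodes are spread across three zones
# in the region (one node per zone here). Name and region are illustrative.
gcloud container clusters create my-regional-cluster \
    --region us-central1 \
    --num-nodes 1 \
    --enable-autoupgrade \
    --enable-autorepair
```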
And the thing I wanted to call out is that your cluster can work in this mixed-mode state via Kubernetes backwards compatibility. So we can have nodes running on 1.8 and a master that's running on 1.10. And we actually won't get to the next version of Kubernetes until we go and upgrade the nodes. Now, when you're upgrading your nodes, there are a few things to think about. You can do the rolling upgrade that just happens by default. And this is just by clicking the node upgrade button in the window. Or you can do something called migration with node pools. There's a blog post about this on the Google Cloud Platform blog that goes into much more detail. But the short of it is, you can actually create another node pool on the newer version of Kubernetes, split traffic over to that node pool using kubectl drain and cordon, and then you can test a little bit of the traffic on the new version. If you see your app is misbehaving, you still have the other node pool that you can route back to on the previous version while you debug. Once the new version looks good, you scale that one up, and you scale the other node pool down. And we actually see a lot of production users doing this who want a much more testable way of rolling to a new version of Kubernetes. And the last section is logging and monitoring. So I wanted to call out that at KubeCon Europe this year in Copenhagen, we announced Stackdriver for Kubernetes. What this allows you to do is take open source Prometheus instrumentation and inject that into Stackdriver's UI. And the really cool thing about this is it's multi-cluster aggregated, and you get to use all of the greatest Stackdriver features, like the pre-built dashboards and the ability to do things like set alerts so that you're notified when something goes wrong in your cluster. The other feature that we have on GKE is GKE audit logging. So if people are changing your cluster, or you want to figure out who was the last person to deploy, you can go into our logging UI under Stackdriver and see both pod-level and deployment-level information, along with information about the resources as well. So if somebody added a node to your cluster, this is the screen that you would go to in order to figure that out. Now, there are a ton more topics that we just didn't have enough time to cover in 50 minutes, things like security, identity, and policy management. There are a lot of sessions going on that cover these topics in much more depth. Anthony and I will be outside to help answer any lingering questions you have. With that, try out GKE. And thank you very much.
[MUSIC PLAYING]
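A rough sketch of the node-pool migration flow described above; the cluster, zone, and pool names, along with the GKE node-pool label, are illustrative, and drain flag names vary slightly across kubectl versions.

```sh
# 1. Create a new node pool (it comes up on the cluster's current node version;
#    names and zone are placeholders).
gcloud container node-pools create new-pool \
    --cluster my-cluster --zone us-central1-b

# 2. Cordon and drain the old pool's nodes so workloads reschedule onto the new pool.
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool \
    -o jsonpath='{.items[*].metadata.name}'); do
  kubectl cordon "$node"
  kubectl drain "$node" --ignore-daemonsets --delete-local-data
done

# 3. Once the new version looks good, delete the old pool.
gcloud container node-pools delete default-pool \
    --cluster my-cluster --zone us-central1-b
```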
Info
Channel: Google Cloud Tech
Views: 58,296
Rating: 4.8523078 out of 5
Keywords: type: Conference Talk (Full production); pr_pr: Google Cloud Next; purpose: Educate
Id: znhnDHAPCZE
Length: 48min 20sec (2900 seconds)
Published: Wed Jul 25 2018