Building high performance microservices with Kubernetes, Go, and gRPC (Google Cloud Next '17)

Captions
ANDREW JESSUP: Thank you, everyone, for coming down today. First of all, my name is Andrew. I've been working in Google Cloud for about four years now. I think this is the third time we've run a GCP Next event in San Francisco, and every year we've run it, there's been a step-change increase in the number of partners, the number of attendees, the number of launches, and, more importantly, the energy that we see at these conferences. It's really gratifying to see it grow so much every year. So truly, thank you for coming down today, and thank you for being a part of that energy and a part of this conference. Particularly, thank you for coming at this, which is probably the danger hour for a tech conference -- right after lunch on a Friday. So hopefully we'll be able to keep you entertained.

What we're talking about today is building high-performance microservices with Go, Kubernetes, and gRPC. When we were working through the structure of this talk, we started with a title, and we realized that you could pick almost any single word in that sentence and quite comfortably make an hour-long talk out of it. And in fact, a number of my colleagues are doing so. Right after this session, Tim Hockin is going to be diving into the details of Kubernetes networking and the latest there. Dan Ciruli and Sep are going to be talking about a thing called Cloud Endpoints -- which I'll talk about a little bit, too, but they'll be going deep on that. And across the week we've had a bunch of sessions on these topics that have usually picked individual slices of this and gone deep.

So we were trying to think of what we could do to make this interesting, and we realized that what we could do, as a complement to some of these deep-dive sessions, was something a little broader and a bit more of an overview. What we want to do today is show how you can use these technologies. Hopefully, you've heard of some or all of them. But we want to show how you can use them in a real-world, practical, grounded application. So that's what we're going to do today. We're going to build and deploy an application to the Cloud Platform that uses all of these different services. And we're going to show you a little bit about that process, and some techniques and strategies you can use to turn it into a robust workflow. We're going to talk a little bit about some of these technologies as well, and how they help us get there.

Before we get into the app itself, it's worth recapping a little bit of motivation. Microservices is a term you've probably heard before; it's not a new concept for the most part. Fundamentally, it's about taking apart what might previously have been a large monolithic codebase and breaking it out into discrete components that can be deployed independently. There are a number of advantages to doing this, but it's particularly advantageous when you're building an application that has a large number of developers on it. Being able to independently deploy, manage, and implement each of these different components decouples the development teams from each other, so that they can work independently. It can be an enormous productivity boost. At Google, we're a very heavily microservice-driven company, or service-driven company.
When you interact with a site like Gmail, for example -- when you go and view mail.google.com -- you're triggering a cascade of RPCs across Google's data centers in order to retrieve the information and serve you that page. There's a whole ton of work going on behind the scenes, and as you do that, your RPC is passing down through code paths that dozens and, in many cases, hundreds of engineers have worked on. So for us, being able to deploy these things independently is hugely powerful.

With cloud, there's been something of a resurgence in this design pattern. For those who were around in the '90s, service-oriented architecture was another way of saying much the same thing. The general design pattern is not new, but a combination of elastic workload provisioning and some of these technologies has helped accelerate it and push it a little further back into the mainstream. So it's becoming a very popular and very useful design pattern, especially in large organizations. But it comes with a couple of important challenges.

Perhaps the most immediate and obvious challenge: when you take apart a codebase where previously you were passing objects around in memory, you're now, as often as not, passing them across the wire. So to send a message, we have a bunch of work to do: take a request and serialize it into a form whose bits we can actually send down the wire. That request then needs to get to a network card and be translated into packets. Those packets need to be sent -- and again, there are many hours of discussion you could have about that alone. All of this then needs to happen in reverse on the other end. The message needs to be deserialized and turned into something an application developer can use -- some kind of in-memory object they can actually work with. Doing all of that can impose a significant performance penalty. It's obviously a lot more expensive, from a compute-resource point of view, to do all of this work every time you send a message. And particularly when you look at modern microservice architectures -- which may be composed of many tiers and many different operations, like fan-out -- as you push on this design pattern, this cost of serialization becomes more and more acute.

In concert with that is another problem, which is network contention. Part of this is simply the number of bytes and bits you have to send over the wire. When you're dealing with significant volumes of scale, with low-latency requirements, or with poor-reliability networks, network contention itself can be a problem. And if you're not careful, not having a good model of your network architecture can impede your ability to take up a design pattern like this.

Finally, as you break these things down into multiple components, you're usually breaking them down into multiple processes -- multiple running programs. And when you take traditional models of application isolation and application containment and apply them to microservices, they often carry a degree of overhead that may be acceptable for something that requires a whole machine, or several machines, to run. But when you have a small service that takes up only a couple of megabytes of memory and maybe needs only a small amount of spindle usage, some of the traditional models of managing the compute resources these services need aren't as robust or as elastic as we would like them to be.
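Stepping back to the serialization point for a moment: as a rough illustration of that marshal/unmarshal round trip -- not code from the talk, just a minimal Go sketch using the current protobuf module's well-known Struct type, with illustrative field names -- the work a service performs on every message looks like this:

```go
package main

import (
	"fmt"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// In a monolith, this object would simply be passed around in memory.
	msg, err := structpb.NewStruct(map[string]interface{}{
		"caption": "hello next17",
		"mascot":  "go",
	})
	if err != nil {
		panic(err)
	}

	// Serialize: turn the object into bytes that can be sent down the wire.
	wire, err := proto.Marshal(msg)
	if err != nil {
		panic(err)
	}

	// Deserialize: the receiving service reverses the process to get a
	// usable in-memory object back. Both steps cost CPU on every message.
	var decoded structpb.Struct
	if err := proto.Unmarshal(wire, &decoded); err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes on the wire, caption=%q\n",
		len(wire), decoded.Fields["caption"].GetStringValue())
}
```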
And just to emphasize that last point about resource management, it's worth mentioning bin packing. This is a somewhat simplified view of the problem, but in a traditional model of building applications, we may have separated our architecture out into a few services, yet we will typically have at least one machine -- or maybe a whole cluster -- dedicated to each service. And again, that's fine when the services are quite large; they can often justify several machines. But when you have a large number of small services, this model becomes problematic. It's much more convenient if we can do two things: firstly, run services across multiple machines, and secondly, run multiple services within a single machine.

At Google, this is a real problem, and something we started encountering a long time ago, because we have a large, shared pool of compute resources and a large number of developers deploying services into it. We wanted a way to efficiently manage how services were allocated to compute resources, without having to think in terms of dedicating individual machines to individual services. So this fine-grained bin packing, if you can do it, turns out to be a real advantage when deploying these kinds of systems. Again, it's not strictly required that you couple this kind of deployment to microservices, but it definitely helps if you can.

Since we've been doing so much of this at Google, and we've been doing it for a while, we've spent a number of engineering cycles building out technologies that can help in this process. We're going to talk about three of them today. In many cases, what you're looking at are the second, third, or fourth generation attempts at solving these kinds of problems.

The first, and one of the most exciting, is a thing we open-sourced reasonably recently called gRPC. It stands for Google Remote Procedure Call. gRPC is actually a collection of libraries and tools that allow you to create APIs -- clients and servers -- in a number of different languages. It relies on Protobufs, which are a strongly typed and very efficient binary mechanism for serializing and deserializing messages. It's also built on HTTP/2 as a transport, so it's able to take advantage of things like bidirectional streaming. And then, on top of that, is a robust library for things like flow control, deadline management, and retries -- all of the work that ends up being necessary to build a robust, powerful client, but that usually takes a few attempts and a few mistakes to get right. The other thing about it is that it's all open-source.

The second thing we're going to use is Go. Go is the language that absolutely has the cutest mascot -- I'm sorry, smallest runtime footprint. And this, again, turns out to be important, especially when your services are small and you have a lot of them. It's not strictly necessary to have a small runtime footprint, but when you can run your process as efficiently as possible, with as little memory and as little system overhead as possible, you can pack more and more capacity into a finite amount of compute, which saves you money. The other nice thing about Go, when it comes to microservices -- it has a ton of interesting features in the language, but the one that matters here -- is that it has a modern and efficient networking library.
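To give a flavor of that -- again, a hedged sketch rather than the demo's code -- the standard library alone gets you a concurrent network server in a few lines, since net/http serves each request on its own goroutine:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// A trivial health-check endpoint; the path and port are illustrative.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	// ListenAndServe handles each incoming connection concurrently.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```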
Go is a relatively new language, which means, again, it's been able to learn some of the lessons of the past in terms of how networking can be performed and abstracted. It's been used very, very heavily at Google for networking applications, so a lot of time and a lot of energy have been spent making these libraries as efficient and as optimal as possible. They're also nice and easy to work with. And again, Go is open-source, so there's a community out there that you can engage with -- you don't just get it for free.

I don't want to make this too academic a discussion, or too academic a talk, but it is worth talking a little bit about performance. The graph here is from a public benchmark that the gRPC team publishes; they rerun it with every release of gRPC. It's a thing called the ping-pong test: you have two VMs, and you take a message, serialize it in a client, send it to a server, deserialize it, serialize it again, send it back, and repeat -- I think it's about 100 times -- measuring the latency of that process. So it's a really good measure of the performance of the language and the runtime at dealing with this message passing. And you can see Go does really well here. It's approaching the kind of performance you get out of C++, and also, although it's not on this list, Java. So while gRPC is very much a language-neutral technology, it works really well with Go, which is why we like it.

The last key piece in this is Kubernetes. Again, there are a lot of talks right now on Kubernetes, so hopefully you've heard something about it. Kubernetes is the spiritual successor, in many ways, to a technology we've been working on at Google for a long time, called Borg. Borg is our hosting platform that allows us to, again, solve the bin packing problem -- taking the large number of tasks that large numbers of Googlers are submitting to the system, and effectively acting as a giant solver, figuring out where the best place in the fleet is to run each of them at any given time. And when these workloads are not just being deployed all the time, but are also dynamic in terms of the resources they need access to, having a dynamic, automatic system that can handle this allocation is hugely powerful. With Kubernetes, we've taken that exact same philosophy, and many of the same ideas, and packaged them up into an open-source project. Interestingly, it is itself written in Go, although, like gRPC, it works perfectly well with any language -- it's very much a language-agnostic piece of infrastructure. It's also open-source, with a very thriving community of people building around it. And I believe more contributions to Kubernetes -- more commits -- have now come from outside Google than inside.

So we want to build an app and show how all of these things can fit together. If we can cut to my laptop quickly -- which I may need to wake up. So we're going to build an app. The app does this: it creates 3D animated GIFs. Super practical. I hope you can see how this applies to your day-to-day lives already. So it seems practical. It's one of those apps that is actually a little deceptive -- very simple to use, very simple to show, but there's a lot going on in order to make this happen. I'll try this live and do my ritual prep for the demo gods. Well, let's see what it actually looks like to use this app. So, a very simple form. I'm going to put in a name. I'm going to choose a mascot.
You've seen the Go one -- again, cutest, right? I'll pick something else; say, gRPC. And we go create a GIF. So we've now submitted a job. We're using a technique called path tracing to actually generate that 3D GIF, which is a particularly computationally intensive exercise. And so, in order to make all of this happen, we've actually had to do a ton of work. The first thing we had to do was create a series of scene assets to render that image -- things like materials files, object files, and a bunch of metadata about light sources and camera placement. And we actually generate those dynamically for every single GIF that gets requested. So we've gone and created a bunch of those assets. The next thing we've done is create a task to render each and every frame of that GIF -- there are about 10 of them -- and we've submitted those. And we're done. It's actually faster than it was when I practiced this. So we had to submit a task for every frame of this animation, and then dispatch that out to our cluster. Our cluster has then gone and run that rendering process. It's a pretty slow library that we use, so it takes about 15 to 20 seconds to render each frame. So it's doing that in the background, then it pulls the frames all together and synthesizes them into an animated GIF. So there's quite a bit of work, just to handle all of that overhead.

And if we switch back to the slides, we can dive into the architecture of this thing and take a look at how we actually built it using these technologies. What we're seeing here is, basically, the set of services that we needed in order to build this out. Would you believe five services are needed to build 3D animated GIFs? So let's break this down a little bit.

Let's start with the first service, the bottom-most service, which is the Renderer. The Renderer is actually a very general-purpose service, which is nice if we ever want to reuse it for anything other than animated GIF creation. What the Renderer service does is take a request that includes references to all of those assets we talked about -- so, again, objects, materials, light sources, and so forth -- and perform the task of actually generating an image. It will store that image in Google Cloud Storage, and then it will return, in the response back to any upstream caller, the path to where that single frame, that single image, lives in Google Cloud Storage.

On the other side of this, we have the front-end service. The front-end service is the one we were actually interacting with directly in the browser there. This is, essentially, the HTTP server. It's handling the form and the form submission; if we were doing user authentication, it would probably do that; and it does all the kinds of things you would normally expect a web front-end to do, like handling and serving static assets. There's not a lot of logic in the front-end, though. We really wanted to keep the logic out of the front-end as much as possible, because we never know if we'll need to build front-ends for other things. Maybe there's a kiosk we want to build, or a mobile app, that doesn't necessarily need to talk to an HTTP API. So we really wanted to encapsulate as much of the business logic, if you will, of the GIF creator into some other service.

So we called this, of course, the GIF Creator service. GIF Creator does everything in between our web front-end and our Renderer. GIF Creator has an API of its own.
It takes an API request that looks a lot like the form you saw before: there's a caption, you can pick a mascot, and there are a couple of other little bits of metadata. And when it receives that, it will do all of the work we talked about. It will create the scene assets and store them in Google Cloud Storage. It will also be responsible for calling out to the Renderer -- so it will call the Renderer's own API and perform the fan-out to make that happen. We want to do that robustly, so we've actually split this into a server and a worker, and we use Redis as our state-management service to track the status of all of these jobs. GIF Creator handles all of that. In its RPC response, it doesn't send back the GIF, because we expect this process to take a while. Instead, we send back a job status, which the upstream front-end can poll, as necessary, to see where the work is at. So GIF Creator is really the workhorse of all of this. Of course, GIF Creator also does the synthesis -- the compositing of the set of frames, once they've all been rendered, into a final image. So there's a lot of work going on here.

Let's zoom in on the APIs for a second. There are really two APIs here, for the two different services: the Renderer API and the GIF Creator API. We've talked a little bit about what those APIs do, but let's jump into a little bit of code. We wanted to write these APIs in gRPC. Again, we like the fast serialization and the efficiency on the wire, and we like all of the retry and flow control that gRPC gives you.

gRPC APIs typically start with a document that looks a bit like the one you've got on screen here. This is called a Protofile. A Protofile is a language-agnostic, domain-specific language for describing an API. It describes two things: the RPCs -- which endpoints you can call -- and, most crucially, the messages that get passed in the request and the response. And the takeaway, really, from this screen is that this document is quite strongly typed. You can see we've got things like enums here, and we can describe quite complex data structures. Although we don't have them here, we can have things like repeated fields and arrays. So it's a robust language for describing message passing. And as I said, this is language-agnostic -- although it looks a bit like Go code, it isn't.

What it can be used for is two things. One, it's a fairly human-readable contract for describing what the API should be. That in itself is actually pretty useful: two teams who need to talk via an API can use this document to effectively negotiate which RPCs and which information they need from each other. Especially when they work in different languages, having a language-neutral document like this can be a really nice idea. The second thing we get out of this document is an input for a tool called Protoc. Protoc is the protocol buffer compiler. What Protoc does is take this document, or a document like it, and turn it into client and server libraries in your language of choice -- I think, from memory, we support about seven different languages at this point, and certainly Go is one of them. I won't demo it here, but when I run Protoc over a file like this, I get a generated library, which I can then use in my code. So this is some actual code from the GIF Creator service -- I've cut a bit out.
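The slides themselves aren't reproduced in this transcript, so as a stand-in, here is a minimal sketch of what a Protofile along the lines described might look like. All names here are hypothetical, not the talk's actual API:

```proto
syntax = "proto3";

package gifcreator;

// Hypothetical mascot choices, showing the enum support mentioned above.
enum ProductType {
  GRPC = 0;
  KUBERNETES = 1;
  GO = 2;
}

message CreateGifRequest {
  string name = 1;         // the caption typed into the form
  ProductType mascot = 2;  // which mascot to render
}

message CreateGifResponse {
  string job_id = 1;       // a job-status handle the front-end can poll
}

service GifCreator {
  rpc CreateGif (CreateGifRequest) returns (CreateGifResponse);
}
```

Continuing the sketch, serving an API like that from Go looks roughly like this, where pb stands for the package Protoc would generate from the file above (again, hypothetical names, not the code on the slide):

```go
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"

	pb "example.com/gifcreator/proto" // hypothetical generated package
)

type gifCreatorServer struct {
	// Newer protoc-gen-go-grpc output expects this embedded for
	// forward compatibility.
	pb.UnimplementedGifCreatorServer
}

// CreateGif receives an already-deserialized request as a plain Go struct.
func (s *gifCreatorServer) CreateGif(ctx context.Context, req *pb.CreateGifRequest) (*pb.CreateGifResponse, error) {
	jobID := enqueueRenderJob(req.Name, req.Mascot) // hypothetical helper
	return &pb.CreateGifResponse{JobId: jobID}, nil
}

func main() {
	lis, err := net.Listen("tcp", ":8081")
	if err != nil {
		log.Fatal(err)
	}
	s := grpc.NewServer()
	pb.RegisterGifCreatorServer(s, &gifCreatorServer{}) // generated by Protoc
	log.Fatal(s.Serve(lis))
}
```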
This gives you an idea of how the generated code from a Protofile can be used in a service. And if you've worked with really any kind of client-server interface, it will look fairly familiar. There are a lot of options and things you can play around with that we don't necessarily need by default -- you can certainly get into things like injecting middleware, or supporting different forms of authentication -- but fundamentally, particularly in Go, it's a very simple construct to work with. And, crucially, you can see that the objects I'm dealing with in my Go code are Go objects. I have Go structs, Go arrays, Go strings. So I'm working in a form that's pretty idiomatic and pretty natural. I haven't had to deal with any of the serialization and deserialization -- gRPC has handled that for me.

So, we'll get back into the code in a second. But one of the things that really trips people up -- it tripped me up a lot when I first started working with these tools -- is actually building the pipeline that gets us from code, like you saw before, to something that's actually running inside Kubernetes. It's complex because, frankly, there are quite a few different steps involved, and there are many different ways of accomplishing those steps -- there's not necessarily one single blessed path that everybody uses. So we wanted to talk about that a little bit: not to show the canonical way, necessarily, but a way, where we can get from a code base to something that actually runs.

As I said, there are a lot of steps. The two high-level stages, if you will, are the build-and-packaging phase and deployment. The first is about taking code and turning it into an image of binary assets. Typically, that's a Docker image, but you can pick your packaging format of choice, and you'll usually store it in an asset registry somewhere -- maybe Artifactory or Docker Hub, or, as we're going to show, Google Container Registry, which comes with every Google Cloud project. The second stage is deployment: I've built this asset that describes everything I need to run; I then need to get it into Kubernetes and deploy it.

So let's double-click on that first stage. This is what the build pipeline looks like -- going from source to an image -- for the app we just showed you. There are actually quite a few different steps in there, and this is especially true for compiled languages like Go or C or Java. It's a lot more than just getting some code up and running on a server; we really rely on these pipelines in order to be able to do it. The first step is getting the dependencies in. Pretty straightforward -- for those of you who are Go developers, you'll know there are plenty of dependency-management tools. In this case, we used Glide, although I encourage folks to check out Dep, which is gaining a lot of traction. But you use a package-management tool of some kind to bring in all the dependencies your application has -- anything you haven't checked in. That's particularly important if this is on, say, a CI server, or it's a clean-room build of some kind.

The next step is to actually build your binaries -- you need to run go build. Now, in our case, we have three services that we're building ourselves, so we have three binaries, one for each of them. In our case, all of these binaries come from the same code base, and we're going to package them into the same image.
We're just going to invoke that image differently for each service. Now, you don't have to do it this way. Again, you could have separate binaries, separate images, even separate languages for every different service here. But for convenience, we've packaged it into one. And we're at Google; we like the monorepo thing, I guess. So, you have three binaries, and then you need to package them into an image. This image is a hermetically sealed unit that contains everything we need in order to be able to run our app. So obviously we need the binaries in there. We don't need the source code, and we don't need the build tools, but we do need a few other assets. The web server, for example, needs images, CSS files, templates -- a bit of extra stuff in order for it to work. The GIF Creator service needs canonical copies of all of its asset files. So there are a few resources in there, in addition to the binaries, that are necessary for this thing to run. And then, once we've packaged it, we need to get it into a registry somehow.

Now, when most people start off with this, usually what they end up doing is writing some kind of shell script, or some other tool, to coordinate all of these steps and bring them together. And that works pretty well, although you do run into some challenges with that model -- one of them being that you need copies of all of these tools running wherever you run that script, in order to run the build process. We found that a lot of teams at Google have this problem, and a lot of our customers have it too. So we've come up with a tool that helps. It's called Container Builder. We shipped it a couple of days ago, actually. Container Builder does exactly what it says on the tin: it takes source code and packages it into a Docker image. And it supports a language for describing the kinds of pipelines we've talked about -- a YAML syntax, in a file usually called cloudbuild.yaml.

And again, I won't dive into all the details here, except to say that we have a cloudbuild.yaml file for this project, checked into the same repository as all of our other code, and it describes the three steps necessary to build our app. Everything is included in this file. The way you describe steps in Container Builder is actually with Docker images themselves. So the nice thing about this file is that it can be taken and run by anyone else on my team, even if they haven't necessarily installed tools like Glide and Go and Docker before. So it's a pretty neat way of describing a build process, and it's pretty general-purpose. Again, we chose Glide, go build, and docker build to package everything, but you could pick other tools if you wanted to.

Practically, how do you actually trigger a build like this? You've got two ways. One, of course, is a gcloud command, which can be run from any source tree sitting on your laptop. But we also have -- and again, this was just launched a couple of days ago -- an extension to Container Registry, called Build Triggers, which allows you to watch any GitHub, Bitbucket, or Google Source Repository you might have set up. And every time you commit to a particular branch, or a particular tag, or one that follows a particular pattern, you can automatically trigger a new build.
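For reference, a cloudbuild.yaml along the lines described -- a hedged sketch, not the repository's actual file; the builder image names and paths are assumptions -- might look like this:

```yaml
steps:
  # 1. Fetch dependencies with Glide.
  - name: 'gcr.io/cloud-builders-community/glide'  # assumed builder image
    args: ['install']
  # 2. Compile the Go binaries.
  - name: 'gcr.io/cloud-builders/go'
    args: ['build', '-o', 'frontend', './frontend']
  # 3. Package into a Docker image, tagged with the build's UUID.
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/frontend:$BUILD_ID', '.']
# Push the tagged image to Container Registry when the build succeeds.
images: ['gcr.io/$PROJECT_ID/frontend:$BUILD_ID']
```

Because each step runs inside the named Docker image, teammates can reproduce the same build without installing Glide, Go, or Docker locally.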
So I'll give you a quick demo of that now. Hopefully, you can see this -- I'm looking at Google Source Repositories here. This is the source code of all three of those services. Let's jump into front-end, just to give you a sense of it. This is the Go code that produces the binary for the front-end, some static assets, some templates, and a few other little things. If we jump back out and look at the root folder here -- we're looking at the master branch -- we've also got some pretty familiar-looking files that help us with the build process. So here's our Glide file, which describes all of our dependencies. Here's the Cloud Build file we showed you a couple of seconds ago. Let's also look at the Dockerfile here, too. I mentioned there was a packaging step; we're using Docker to perform that packaging, and so it'll pull from this Dockerfile. If you're not familiar with Dockerfiles, this probably won't mean much to you. But if you are, the interesting takeaway from this Dockerfile is that there's really not that much going on. All it's doing is copying in the binaries we built upstream, adding the templates and a few other things, and pulling in some root certificates. But otherwise, it's really not doing very much -- it's a last-mile packaging step. And this is really nice. It allows us to keep all of our build tools out of our final image, and it allows us to create something that really only has the bare minimum we need in order to run it.
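The Dockerfile itself isn't shown in the captions; a hypothetical last-mile Dockerfile in that spirit -- prebuilt binaries, runtime assets, and root certificates, with no build tooling -- might look like:

```dockerfile
# Hypothetical sketch: a small base image, the binaries built upstream,
# and only the assets those binaries need at runtime.
FROM alpine:3.5
RUN apk add --no-cache ca-certificates   # root certs for outbound TLS calls
COPY frontend gifcreator renderer /
COPY templates/ /templates/
COPY static/ /static/
ENTRYPOINT ["/frontend"]                 # overridden per service at deploy time
```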
So, down here in Cloud Shell -- hopefully, you can all see this -- this is a local VM where I've checked out that same repository. So I'm going to try to make a quick change here, and for the sake of this demo, I'm going to make a boring change. I'm going to jump into the front-end, and I'm going to change the color of the form that you fill out to create a GIF, just to prove this deployment flow to us. So if we look at that, it's form.html. OK, we have a header here, so I'm going to do something really trivial -- I'm just going to add something like that. So I've made the change. Simple. Let's commit that back to our repo. OK, so we've pushed that.

So now, if we jump over to Container Registry -- let me give you some room -- you see here, we have this list of Build Triggers. We've already set up a trigger to watch that repository, and every time we push to the master branch, it's going to automatically trigger a build for us. If we jump into Build History here, we can see that build is actually kicking off. And I won't bore you with too much of the detail here, except to say: here are the logs of the process. Right now, it's running Glide to bring in all of the dependencies, and then it'll continue on with the build and the remaining steps. Something interesting to note here is that every build is given a Build ID -- this is just a UUID. And in our Cloud Build file -- actually, let me show you that really quick -- we've configured it to tag our Docker image based on that Build ID. So we've asked it to insert the Build ID into the final image tag. So let's jump back. That might be done by now -- and it is. So there's our build. If we jump into Container Registry and look, we have a new image. Great. And just a quick look at the size of that image: it's only 15 megabytes. If you're used to dealing with Docker images built from Ubuntu that have a ton of tooling in them, 15 megabytes is nice and small.

So now, we have an image. Next, we want to get it into production -- so now, we want to talk about deployment. Before we do that, though: you'll notice, again, this thing has a tag based on its Build ID. For Kubernetes, this can be really useful, because we're going to use these tags in our deployment manifests. Now, we could use the UUID that got created here, but that's a lot to remember and a lot to type, so I'm going to add a second tag called demo-fix, which we can use a little later when we're doing the deployment.

So, if we can jump back to the slides, let's talk about what this deployment actually looks like inside Kubernetes. Kubernetes has a number of constructs that are really useful in this process -- and again, you could easily fill an hour just describing the different moving parts of Kubernetes that come into play when you're running a real service. We're really only going to talk about two of them.

The first is Deployments. A Deployment is a composite object in Kubernetes that, in turn, describes a number of other underlying constructs -- a replica set, pods, and pod configurations -- and it provides a really nice declarative way of describing how all of these things are constructed. Fundamentally, what a Deployment is doing is telling Kubernetes how to run this image, how to run this code. And we actually have an example of what a Deployment looks like -- this is a slightly simplified Deployment spec. What this Deployment is describing is: take an image we have sitting in a registry somewhere; run that image using this command -- so, for the front-end service here, we want to call that front-end binary we built before; and describe any metadata this thing needs. What ports does it need? What environment variables does it need? You can describe other constraints, like the minimum amount of CPU it needs, or how much memory it needs. And how many replicas do we need -- how many copies of this process do we want running across our cluster? But fundamentally, pretty simple.

The other construct is Services. So it's going to get a little meta, but we're going to build services for services. A Kubernetes Service is a way of providing a simple, static endpoint that we can use when talking to all of these different components. With Deployments, we might be describing a number of different processes running all over our cluster; we need a way of being able to discover where they are and get traffic to them. So a Service in Kubernetes helps us do that. The way you use a Service really depends on your deployment -- it's a very flexible descriptor. In our case, though, the three services we're building all work basically the same way. We want a stable, internal DNS name for each of them, so that if we need to refer to them from another service inside Kubernetes, we can just look them up by DNS name. And then, what we want is a load balancer that can dynamically route traffic to any of the pods -- i.e., the running instances of this process -- at any one time. So this manifest file describes everything we need in order to be able to do that, and we have one of these for each of our services, including the Renderer service that we're running inside Kubernetes as well.
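A simplified pair of manifests along the lines of the slide -- sketched here with assumed names and current API versions, not the demo's exact files -- looks like this:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2                     # how many copies of the process to run
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: gcr.io/my-project/frontend:demo-fix  # the tag set at build time
          command: ["/frontend"]  # the binary built earlier
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: frontend                  # becomes a stable internal DNS name
spec:
  type: LoadBalancer              # routes external traffic across the pods
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 8080
```

Internal-only services like the Renderer would typically use the default ClusterIP type instead of LoadBalancer, keeping the same stable DNS name inside the cluster.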
So now, we're going to run another quick demo. So if we can switch -- great. So we've built this image; let's get that image into Kubernetes. In our source tree, we have this directory, k8s, and in that k8s directory, we have a set of YAML files that describe these deployments -- basically the same thing you just saw on screen. Now, we have three services, and each of those has a deployment manifest and a service manifest. In our case, what we want to do is update the front-end. And really, all we want to do is tell that front-end service: instead of pointing to the old image, we now want you to be running the new image. So if I enlarge this a little bit and jump into that file, you can see it here. I like environment variables, so I've created a bunch of them for this project. And this here is the image that's being used to run the front-end. Incidentally, you can run multiple different images inside a single pod, in Kubernetes parlance, but in our case, we're just going to run one. And we're going to pull from a different tag than the one we did previously. And what was it again? Demo-fix? Yeah, demo-fix. OK, so we've updated that file. Now we need to tell Kubernetes to deploy it. So let's go do that. OK -- deployment replaced.

Now, it's worth unpacking that last step a little bit, because a lot just happened. We didn't tell Kubernetes that we wanted to replace an image. What we did was tell Kubernetes that we would like the state of the system to change to match this new file -- and one of the things that's different in the new file is that the image has changed. It's kind of important that we described it that way. We didn't say, make a change. We said, we have a new idea of what we want the state of our system to be; Kubernetes, go and figure out how to make that happen. Now, we can provide hints as to how Kubernetes should make that happen, like deployment policies. But, at the end of the day, it's Kubernetes's responsibility to figure out how to make it work. And this is, again, a really powerful idea in Kubernetes: you can declaratively describe the state of your system and have Kubernetes figure out all the fiddly details of how it should actually work.

So let's see -- a moment of truth here -- if this actually worked. I'm going to jump into our Kubernetes dashboard. You can see, again, all of the deployments. This is our running cluster -- it's running across 10 nodes, which might be a little excessive for a GIF creator. We have a front-end deployment here, and we can see that, a minute ago, it received a new replica set. So what Kubernetes did in the background was create a new replica set that describes the set of running pods we need. This new replica set has the new pod configuration in it -- i.e., the new image -- and Kubernetes will gradually increase the size of the new replica set while spinning down the old one. And so what you get is a graceful deployment as we shift from one state of the system to another. And that actually happened. In our case, it's a pretty simple system, and we're not really doing much in the way of health checking, or anything else that could potentially slow this down, so it happened in a second or so. But we now have a new replica set, and you can see here that it has two pods in it. The pods aren't doing very much yet, but if we jump into one of them, we can see that it's now running with the new image. And it looks like it started perfectly well.
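The exact commands behind that step aren't captured in the captions; assuming file names like the sketch above, the declarative flow boils down to something like:

```
kubectl replace -f k8s/frontend-deployment.yaml   # declare the new desired state
kubectl rollout status deployment/frontend        # watch the rolling update converge
kubectl get pods -l app=frontend                  # inspect the new replica set's pods
```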
So now, if we go back to our GIF creator -- back to that entry form -- our text is in red. So our change went through. [APPLAUSE] That is, hands down, the biggest applause I've ever gotten for changing a bit of CSS. So thank you for that.

So now, we have a running app, running in a cluster. We've only got a little bit of time left, and I want some for Q&A, so I'm going to do a whirlwind tour of some of the things you get when you run this stuff inside Google Cloud. Everything I've shown you up until now -- for the most part, with the exception of Google Cloud Storage, perhaps -- you can run basically anywhere. In fact, most of it you can run on your laptop, if you want to. And again, all of this stuff is open-source. That's pretty nice: it means you don't really need to pick Google in order to get started with this stuff. But if you do happen to be running it on the platform, you get a couple of interesting advantages.

The first, of course, is Google Container Engine, which takes away a lot of the headaches of actually running a Kubernetes cluster. It can do a lot of things for you, including automatically patching nodes, automatically standing up, say, Google Cloud Platform load balancers to make your service run, dynamic autoscaling, and a bunch of other nice plug-in features. And it takes a couple of clicks to use. But there are a couple of other features of the Cloud Platform that turn out to be pretty powerful and are worth touring through briefly. We saw Container Builder; we saw Source Repositories; we saw Container Registry. We also have a built-in tracing system, which works really well with Kubernetes and gRPC.

I won't bore you with a full tour of this, but it allows us to do distributed tracing across a cluster. Distributed tracing is actually really, really useful in situations like this. One of the first things I wanted to do once I deployed this app was figure out why it took so long to render a bunch of frames. By jumping into tracing, you're able to understand, even as a job progresses, where the time is being taken. Is it being taken up in rendering? In passing data to Cloud Storage? Or in a Go binary? Distributed tracing allows you to instrument your code, and -- the nice artifact of it being distributed -- you can pass trace context across multiple running processes. gRPC, in particular, makes it really easy to do that. By doing so, you can actually trace a request all the way from when the user made it on our front-end, down through the fanned-out tasks being run in the Renderer, and all the way back up to the final job. I won't demo that now, but it would make for a good follow-up talk.

Incidentally, what you're looking at right now is the breakdown of latency over time. You can see there are a few experimental tasks there which took a really long time to run, and generally you've got a clustering that looks a little bit like what you'd expect. Some of these dots represent our front-end service, which is really fast -- it responds to requests very quickly and terminates very quickly. Some represent spans of longer jobs -- e.g., the entire process of rendering an image end to end, or just calling the Renderer service, which takes a bit more time, too. So we have tracing. And, of course, logging is centralized.
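As a small sketch of that context propagation -- continuing the hypothetical types from the earlier sketch, and assuming a tracing library has already attached span data to the inbound request's context -- the key point is simply that the same ctx is reused on the outbound RPC:

```go
// CreateGif handles an inbound RPC and fans out to the Renderer.
// Reusing ctx on the outbound call is what carries the trace context
// across process boundaries, so the Renderer's span shows up as a
// child of this request's span in the tracing dashboard.
func (s *gifCreatorServer) CreateGif(ctx context.Context, req *pb.CreateGifRequest) (*pb.CreateGifResponse, error) {
	resp, err := s.renderClient.RenderFrame(ctx, &pb.RenderFrameRequest{
		SceneName: req.Name, // hypothetical field
	})
	if err != nil {
		return nil, err
	}
	jobID := recordFramePath(resp.ImagePath) // hypothetical helper (e.g., Redis)
	return &pb.CreateGifResponse{JobId: jobID}, nil
}
```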
If we jump back to the slides, I'll cover a couple of other quick, nice features of the Cloud Platform. One is the live debugger: if you want to do real-time introspection of application state, Stackdriver Debugger now allows you to do that, and it works really well with Go, and it works with Kubernetes. The other is Google Cloud Endpoints. If you haven't heard of Cloud Endpoints, and you're interested in this, Dan Ciruli is giving a talk on it right afterwards. What Cloud Endpoints allows you to do is add some additional robustness, automation, and centralization around your APIs -- and it happens to work really well with gRPC. It gives you features like a dashboard for detailed logging of RPC traffic, detailed monitoring of your RPCs and latencies -- again, across the system, if necessary -- and things like centralized access control. So if you ever get to the point where you're thinking about publishing one of these APIs, to make it consumable by the rest of the world, there are things you'll need to think about, like user-based quotas and ACL management, and Cloud Endpoints greatly simplifies that process. And again, it works really well with gRPC. [MUSIC PLAYING]
Info
Channel: Google Cloud Tech
Views: 87,009
Keywords: micro-service, Go, Google Container Engine, Container Engine, container, Google Stackdriver, Stackdriver, monitor, instrument, trace, debug, production service, realtime, Cloud NEXT, Google Cloud, GCP, Cloud, #GoogleNext17
Id: YiNt4kUnnIM
Length: 44min 32sec (2672 seconds)
Published: Fri Mar 10 2017