Caching Made Easy, with Cloud Memorystore (Cloud Next '18)

Captions
[MUSIC PLAYING] GOPAL ASHOK: Welcome to the last day, last show. We were just saving the best for last. My name is Gopal Ashok. I'm a product manager for Cloud Memorystore. I also have here with me Karthi Thyagarajan, who is a Cloud Architect on Google Cloud Platform. Today we're going to talk about Cloud Memorystore and about caching in general on Google Cloud Platform. First of all, thank you for being here. At the end of the conference, after three days, I'm pretty sure it's pretty tiring, but we really appreciate you all being here. Hopefully we can make this a productive session for you. One of the things I'd like to do is keep the session interactive. So if you have questions, please feel free to raise your hand, and if appropriate, we can take the questions as we go along. Please use a microphone when asking your question so that we can record the questions, too. All right. Let's get started. Before I jump into the topics, I just want to lay the framework for the talk. What I'm going to talk about today is Cloud Memorystore as a product and what we are offering. We'll do a deep dive into some of the features of the managed service, and finally we will bring it all together by showing how you can build, deploy, and, more importantly, monitor your applications when using Cloud Memorystore. Before I jump into the product itself, I just want to take a quick step back. For me, this chart is a little bit interesting. A lot of you may be familiar with DB-Engines. DB-Engines is a website that tracks the popularity of the different databases that are currently in use. When you look at this chart, you can make a lot of different inferences, depending on how you look at it. But for me, the key thing here is the number of different databases that are currently popular and the number of different types of databases that are currently being used to build applications. It's striking. When you look at the different kinds of use cases that have evolved over the last 10 years, it's very clear that using a single relational database was not going to cut it. The reason is that different application use cases have different access patterns and different query patterns. So using just a relational database, even if that relational database supported different data models, was not optimal. So over a period of time, different kinds of databases came into the market. And one of the key trends that we are seeing is this notion of polyglot persistence. I just like the word polyglot. I'm like, OK, I have to use this. But more importantly, what we see is the shift towards more adoption of microservices-based architectures. And when you use a microservices-based architecture, one of the key design patterns is that you have different, purpose-built databases for whatever each specific microservice needs. So quick question. How many of you use just one database in the applications that you currently have? What about, let's say, three? So the key point here is that with a microservices architecture-- or even without one-- the management of databases starts becoming a lot more complex. When you use different kinds of databases, you now need new skill sets. Even if you can standardize some of the automation, or deployment, et cetera, there are still nuances across all these different databases.
What you want to be able to do is spend more time building applications, and less time managing the infrastructure. So that's where, in the context of this talk, managed services and managed database services help when you move onto a cloud platform. The key thing about managed services is that they help you focus on the logical side of things in terms of building applications, and we take care of the physical aspects of managing the database. In that context, Google Cloud provides a wide variety of managed database services. And since this is the last session, I'm pretty sure you've attended a lot of the database talks, and you probably have seen this slide. But the point here is that regardless of what database you are using, in Google Cloud we offer managed services either provided by Google or through our partner services. Taking a quick look at the different databases that we offer: on the non-relational side, we have Cloud Datastore, which is a document database, and Bigtable, which is a wide-column store. On the relational side, we have Cloud SQL, which is essentially PostgreSQL and MySQL; and Spanner, which is a highly scalable, globally distributed database. Today what I want to do is focus on Cloud Memorystore. So what is Cloud Memorystore? Cloud Memorystore, which is currently in beta-- we released the product back in May, and we are currently in beta-- is a fully managed in-memory data store service for Redis. So it's a fully managed offering that supports Redis. So why is this important? If you ask anybody who builds large-scale applications, in-memory caching is a fundamental piece of that application architecture. I don't think we need to spend a lot of time evaluating the benefit of caching. But if you look at where caching is applicable, there's a wide variety of use cases where caching is truly beneficial, from a very simple web application to something like ad tech, where you're serving in real time, or running real-time bidding infrastructure. So across the board, we know that an in-memory store that provides extremely low-latency query processing is extremely beneficial. In fact, it's practically a requirement these days to have a cache in place. Now what are the different types of caches that people normally use? There are two types of caches that are very popular today. The most popular one is Redis. And then there's Memcache. Memcache has been around for a long time. We actually have Brad here, who worked on the initial versions of Memcache. But Redis has become a lot more popular. The reason Redis has become a lot more popular is that, apart from the performance it gives because it's an in-memory store, it is often referred to as a data structure server. Memcache essentially allows you to store key-value pairs, and provides you quick access to static data. But Redis provides built-in data structures, so it actually expands the set of use cases it can serve. For example, sorted sets are very popular for building gaming applications-- if you're building leaderboards, sorted sets become extremely useful. And there are lists. And there is [INAUDIBLE]. So there's a wide variety of data structures that make it a very powerful in-memory store. But apart from that, I think one of the issues with Memcache was that if you really relied on a cache, and you wanted durability of keys, Memcache was not really the best solution for it. With Redis you have persistence. You also have replication.
So if you really want to build a highly resilient cache, or in-memory store, Redis is actually a good choice for that. Apart from that, it also supports Pub/Sub. So if you want to build notification systems, et cetera, Redis gives you that capability. And also scripting. And all this with really, really fast performance. So over a period of time, Redis has become extremely popular. So a quick question. How many of you actually use Memcache? Wow. Quite a bit. What about Redis? So it's interesting for us, when we look at whether we should do Redis, or Memcache, or provide both. And it's interesting to see that it's an equal proportion of both Redis and Memcache. And I think there are reasons why Memcache sometimes is much more useful for certain use cases compared to Redis. So with Cloud Memorystore, we provide Redis as the engine. The key thing that we've been trying to do with Cloud Memorystore, like I said, is that with managed services, we try to take away the burden of managing Redis yourself. And if you ask anybody who has deployed Redis, there are a lot of things that go into managing something on your own, even simple things like patching, for example. So with our managed service, we take care of, essentially, deploying Redis; it gives you an endpoint, and all you need to do is worry about writing applications against it. The key thing I want to say here is that we are built on open source Redis. We are not building anything Google proprietary. We are simply taking the open source Redis and exposing it as a managed service. The other key thing that we wanted to focus on was how to provide a highly reliable and highly available service. So in the offering, which I'll get into in a little bit more detail, we provide replication. We provide fast, automatic failover. And we will also provide an availability SLA. That's something that normally you don't see, but we will be having that once we go GA. With respect to replication and automatic failover, Sentinel is a common way to set up replicated Redis. But there's a lot of complexity in terms of deploying Sentinel and managing it. So with Memorystore, you don't have to worry about all those things. We basically take care of them-- in terms of how we do health checking, everything is done for you. And finally, scalability and security. With Memorystore, again, one of our focuses is how we can make things easier for you. So we have put in a lot of work to make scaling super easy. Because some of the use cases that we see with customers essentially involve spikes. For example, we have a retail customer where, during Black Friday or any of their peak retail or shopping days, demand is very unpredictable. So it's great for us to be able to ensure that they can actually scale up as needed without having to go through a lot of pain. That's provided in the service. The other key thing is that Redis, in a way, is not very secure out of the box. So the recommendation is always: don't put Redis out there on the public internet. So what we do is we deploy Redis on a private IP, so it's not really accessible from anywhere other than from within the Google network. We use network-based authentication, or network-based security, to limit access to the Redis instance. So there's an additional level of security there. And we also provide role-based access control for the administrative side of things.
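Redis Pub/Sub, mentioned above as the building block for notification systems, works as-is against a Memorystore endpoint. Here is a minimal sketch using the open source Jedis client; the host IP, channel name, and message are placeholders rather than anything from the talk.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class NotificationExample {
    // Placeholder: the private IP of your Memorystore instance, on the same VPC network.
    private static final String REDIS_HOST = "10.0.0.4";
    private static final int REDIS_PORT = 6379;

    public static void main(String[] args) throws InterruptedException {
        JedisPubSub listener = new JedisPubSub() {
            @Override
            public void onMessage(String channel, String message) {
                System.out.println("Received on " + channel + ": " + message);
            }
        };

        // subscribe() blocks, so the subscriber gets its own connection and thread.
        Thread subscriber = new Thread(() -> {
            try (Jedis jedis = new Jedis(REDIS_HOST, REDIS_PORT)) {
                jedis.subscribe(listener, "notifications");
            }
        });
        subscriber.start();
        Thread.sleep(1000); // Give the subscriber a moment to connect.

        // Any client on the same authorized network can publish.
        try (Jedis publisher = new Jedis(REDIS_HOST, REDIS_PORT)) {
            publisher.publish("notifications", "cache warmed");
        }

        Thread.sleep(1000);      // Let the message arrive.
        listener.unsubscribe();  // Unblock the subscriber thread so the JVM can exit.
        subscriber.join();
    }
}
```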
So the bottom line is it's a fully managed service that allows you to easily deploy Redis and use it. And we have customers currently using it, and it's seen pretty good adoption. That wasn't surprising, because everybody wants a cache, and Redis is a very popular engine that customers use. So we've been getting pretty good responses from customers. So let's dive a little bit into what the offering itself is and what we have right now in beta. One thing I want to clarify is that whatever I'm talking about right now is what is available today in the beta release. And this is our v1 release. So the way we expose the service is through two tiers. We have a basic tier and a standard tier. The basic tier is a single Redis instance. And we see a lot of customers just using the basic tier, because in a lot of the use cases it's a very simple cache. The key thing you want to be aware of with a basic instance, just like any other single instance of Redis, is that if the instance goes down, it's a full cache flush. If for any reason something happens behind the scenes, we will make sure that it comes back up, but you will experience a full cache flush. There are still some advantages, though, because we do the health check, and when we bring the instance back, we preserve the IP address. In some cases, when you are deploying your own, those are some of the challenges that you run into-- whether you can actually keep the same IP address when an instance comes back up. So all of that is taken care of. We don't provide an SLA for the basic tier. But in the standard tier, we provide a replicated version of Redis. We have cross-zone replication, and we provide automatic failover. The other key thing is that you connect to a single endpoint, and we make sure that the traffic is directed to the right instance, or right location. In both cases, for both the basic and standard tier, the key thing to remember is that it's a single master. So even in the standard tier, there's one master. It is not a scale-out model-- yet. So that's something to keep in mind when you use the service. And like I said, we will be providing three nines of availability once we go GA. Right now we are in beta, so the SLAs don't apply. But when we go GA, we will provide three nines of availability. So, provisioning. One of the things we try to do is make things really simple. But we also understand that the UI is not the only way folks want to provision their services. Nowadays, with infrastructure as code becoming more and more of a standard way to manage your infrastructure, there are different ways to do it. So with Cloud Memorystore, we obviously support the UI, the SDK, et cetera. But we also support Cloud Deployment Manager and Terraform. So if you are using Terraform to manage your infrastructure, you can use that to manage Memorystore. For the beta version, we are supporting Redis 3.2.11. In talking about the usefulness of managed services, one of the things that I forgot to mention is that-- I'm not sure if you're aware-- Redis had a security vulnerability that was announced, I think, last month. The key thing is that we actually applied the patch even before the vulnerability became public. So those are some of the things that we get to do behind the scenes so that you don't have to worry about them.
So the 3.2.11 that we have today is a Redis version that is fully patched with the latest fix. Going to the next level, in terms of how we expose the service, we allow you to provision any size from 1 GB to 300 GB, in 1 GB increments. But the way the service is tiered is that the higher the memory size, the better the throughput you get. The M1 through M5 tiering essentially not only captures the network performance, but also relates to how we price the service. So in this particular table, what you see are the ranges that we provide, and the corresponding network throughput. If you need higher throughput, you basically have to provision more memory based on your needs. But Redis being single-threaded, we provision enough CPUs in the background that, for certain workloads, the network may become the bottleneck. So what we recommend is: profile your workload, run the benchmarks, and see what works best for you. But the way we have exposed the service is essentially using capacity tiers. And from a pricing perspective, we price the service in gigabyte-hours. For the different ranges, we have fixed gigabyte-hour pricing. When I say gigabyte-hour to somebody, the question is, what is the pricing per hour? The pricing per hour is essentially the gigabyte-hour price times the number of gigabytes you're provisioning for that tier. So for example, if you're provisioning 4 gigs of memory, it'll be 4 times 4.9 cents, or if you're provisioning 100 gigs, it's 100 times 1.9 cents. And we have a pricing calculator that will easily let you figure out what the per-hour pricing is. But the key thing here is that we give you the flexibility, depending on what you want from a cache-- whether it's a single instance, a highly available instance, different throughput characteristics, et cetera. So what I want to do now is go to the demo and walk you through the provisioning process. I just want to dive into some of the details of the managed service, and some of the things that you may want to know about when you're designing applications on top of Memorystore. Memorystore is a storage product, and it can be found under the Storage menu in the Google Cloud Console. What you see here are a few instances that we have provisioned. Let me walk you through creation of an instance, and I want to talk through some of the nuances of the service. So I'm going to provision an instance here-- let's say instance 4. The instance ID is essentially the identifier of the instance. Like I mentioned earlier, the version that we expose is 3.2.11. Immediately we get the question, are you going to support 4.0? The answer is yes. We're going to add 4.0 pretty soon. But right now, if you're provisioning Memorystore, you get 3.2. You get to choose a tier-- like I mentioned, basic or standard. As for locations, the service is available in five regions today. We are well aware of the fact that there are more regions, and we need to be there. So that's something that we are working on-- getting more regions added very quickly. What you see here when you select the region is that Memorystore is a regional service. What that means is that when you deploy in a particular region, applications can connect from any zone within that same region. We don't support cross-region access today.
So if your application is trying to connect from one region to Memorystore in a different region, we don't allow that. That is not enabled yet. We do give you the option to deploy in any specific zone. One thing I want to note about pricing is that we don't charge for network; Memorystore doesn't have a network charge. But if, for example, your application is in zone A, Memorystore is in zone B, and you're connecting from GCE, GCE has a cross-zone networking charge. So if that traffic exists, then you will be charged on the GCE side for the cross-zone network traffic. That's something to keep in mind. And here, again, like I said, you can provision anywhere from 1 to 300 GB. Depending on what size you select, you can see that the network throughput changes-- how much network throughput you can get changes with the size. So let's go ahead and pick a size of 30 gig. And finally, the network. The way we authenticate access to Memorystore is using the VPC. You pick a specific network that you're going to deploy Memorystore in, and we restrict access to the Memorystore instance to that particular network. What that means is that any application, or any VMs, on that specific network can access this particular instance. And finally, from a Redis configuration perspective, we haven't exposed too many parameters yet. We have only exposed two. Maxmemory policy is one of them. I believe [INAUDIBLE] lru is our default. So this is something that you may want to think about configuring when you actually deploy Memorystore, based on what you want to do. And that's it. Once you do that, you basically click Create, and it goes ahead and creates the instance for you. Like I said, when you create the instance, what you get is an IP. It's a private IP, on port 6379. And I mentioned that Redis is fully protocol compatible. So I can use redis-cli-- oh, I hope we can see it-- and run commands just like you would on any Redis instance. It's open source Redis, so all of that is compatible. I would say, though, that from a managed service perspective, we do block some commands. We have documented what those commands are, but they are primarily admin commands. From the application side of things, we support Lua scripting and we support Pub/Sub. So all of that is fully available when you use Memorystore. Let's go back to the slides. From a best practice perspective, like I said, you have options. If you're using it as a simple cache, the basic tier will definitely work for you. One thing I would note is that with the basic tier we do have regular maintenance that happens on a quarterly basis, or whenever we have to apply any critical patches. If you're using the basic tier, you can experience a full cache flush in events like that. With the standard tier, what we do is a rolling upgrade, so you have higher availability in events like that, because we just apply the patch in a rolling manner. The other key thing that I want to point out is persistence. In the beta release, we haven't enabled persistence. When I say persistence, I'm talking about AOF, and I'm also talking about RDB. We haven't enabled either one of them yet on the basic or standard tier. We are working on the ability to import and export data to a GCS bucket. That will be coming soon. We are also looking at AOF persistence, which is probably further down the road.
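To make the connection step concrete: a client in the same authorized VPC network simply connects to the instance's private IP on port 6379 and issues normal Redis commands. The sketch below uses the open source Jedis client and exercises a sorted set, the data structure called out earlier for leaderboards; the IP address, key, and player names are placeholder values, not anything from the demo.

```java
import java.util.Set;
import redis.clients.jedis.Jedis;

public class LeaderboardExample {
    public static void main(String[] args) {
        // Placeholder: the private IP reported by your Memorystore instance.
        try (Jedis jedis = new Jedis("10.0.0.4", 6379)) {
            // Works like any open source Redis server: plain commands over the wire.
            System.out.println("PING -> " + jedis.ping());

            // Sorted set as a game leaderboard: member = player, score = points.
            String key = "leaderboard";
            jedis.zadd(key, 120, "alice");
            jedis.zadd(key, 95, "bob");
            jedis.zadd(key, 210, "carol");

            // Top 3 players, highest score first.
            Set<String> top = jedis.zrevrange(key, 0, 2);
            System.out.println("Top players: " + top);

            // Rank and score for one player (rank is 0-based).
            System.out.println("alice rank: " + jedis.zrevrank(key, "alice")
                    + ", score: " + jedis.zscore(key, "alice"));
        }
    }
}
```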
From an instance sizing perspective, like I said, there are various benchmarks that are currently available. I'm sure if you're already using Redis, you have an idea of the workload pattern that you have. Always run the benchmarks to see what size fits you. Like I said, the higher the size, the better the throughput that you get. So you may want to do that. I talked about the eviction policy a little bit, and also the configuration. We have documented all the default configurations that we use. Like I said, we don't allow you to change many of the configurations, so it's good to know what configurations we set by default for these Memorystore instances. How many of you actually scale your instances often? Nobody. So scaling is something that we have enabled in beta. It's a very simple way to scale a system-- you basically change the size. How does the behavior work? In the basic tier, it is a full flush of the cache. In the standard tier, we basically do a rolling upgrade-- when we do the scaling, we do it in a highly available manner. From an application perspective, what that means is that the only time the application is unavailable, if you're using the standard tier, is when we do the failover. And the failover that we do is actually very quick-- it's probably in the 30-second range. And we have some heuristics in the back end for replication. Having said that, talking about replication, we use Redis replication under the covers. That means it's asynchronous replication. So when you do a scaling operation, it's quite possible that there could be some unreplicated data, or some data that is stale, because the changes were not replicated when the failover happened. So that is instance scaling. I just wanted to take a quick moment and see if folks have any questions about the things that I've talked about so far. Do you mind using the mic? AUDIENCE: So I think one of the themes that's been running through a lot of the talks I've been to at this conference has been don't trust the network, make sure you authenticate your peers, the whole service mesh thing. But this service seems to, from what I can see, mostly have VPC-level security. Are there any plans to tie this into IAM or TLS or something so we can authenticate the clients to the managed server? GOPAL ASHOK: Right. So you're right that currently there's VPC security. And one of the things that we are working on is initially, at least, enabling AUTH. But again, that's just a Redis password. Beyond that, yes, we are looking at figuring out how to do encrypted connections between the client and server. So that's definitely on the roadmap. AUDIENCE: Thanks. AUDIENCE: [INAUDIBLE] GOPAL ASHOK: On-premises? What do you mean? AUDIENCE: [INAUDIBLE] GOPAL ASHOK: So are you talking about connectivity, or are you talking about whether-- AUDIENCE: [INAUDIBLE] the cache [INAUDIBLE] applications. GOPAL ASHOK: So you want to be able to connect to Memorystore from on-prem using, let's say, a VPN, or something like that. Can you tell me a little bit more? Because from a caching perspective, latency is a critical factor. So I'm curious whether that hybrid case is more for migration, or what the use case is there. AUDIENCE: [INAUDIBLE] Some of them are only on-premises. [INAUDIBLE] GOPAL ASHOK: Is it Redis or Memcache? AUDIENCE: Right now, we use the [INAUDIBLE] caching [INAUDIBLE]. GOPAL ASHOK: OK.
So there are two questions there-- two parts to the question. One is how do you connect from on-prem to Cloud Memorystore. Again, that's something that we haven't enabled yet. We are trying to figure out how many customers actually need it, but we do see the use case for it. In terms of a terabyte-size cache, like I mentioned, today we have the basic tier, which is a single instance, and the standard tier, which is a replicated instance-- again, a single master. You're right, it's 300 GB. From a terabyte cache perspective, the next thing we are working on is clustering, so you can do scale-out Redis clusters. When we have scale-out Redis clusters, you should be able to provision terabyte-scale caches. So that's something that we're working on. But you're right that currently it is limited to 300 GB. We may increase that number a little bit more, but we are still constrained by that single-VM deployment. AUDIENCE: [INAUDIBLE] monitoring? GOPAL ASHOK: Yes, that was my next topic. One more question, please. AUDIENCE: OK. So am I to understand that, because you're using asynchronous replication, in a failover event in general-- not just in a scaling event-- there could be a small amount of data loss? GOPAL ASHOK: That's correct. So yes, any time there is a failover, you can have that. One of the things that we are planning to provide-- today we don't provide a failover API so that you can manually fail over. Once we have the API, if you're doing manual failovers, you can control whether the instances are in sync, and then fail over. In that case, obviously, there won't be any data loss. But unplanned, yes, you can expect unreplicated data. AUDIENCE: And somewhat relatedly, how long would a failover event be expected to last? Here, when it's planned, it's about 30 seconds, apparently. But if it's unplanned, about how long would that period of strangeness and possible unavailability be? GOPAL ASHOK: So, different scenarios. If it is a complete failure-- if it's unplanned-- we expect it to be less than a minute, because essentially we are not waiting for transactions, or commands, to be replicated. In the case of scaling, we actually have a heuristic where we wait for the replica to get caught up, and then do the failover. But the failover time is still less than a minute. AUDIENCE: Great. Thanks. GOPAL ASHOK: There was a question on monitoring. So let me quickly go back to the slides and talk about monitoring a little bit. We do have integrated monitoring. What that means is that we are integrated with Stackdriver-- we export all our metrics from the instances into Stackdriver, and you can use Stackdriver to do the server-side monitoring. So let me quickly show you what we have today from a Stackdriver perspective. Switching to the demo. If you go into Stackdriver-- it's a little bit zoomed; I'm going to zoom in a little bit. If you go into Metrics Explorer in Stackdriver and look for Memorystore, you'll see the Cloud Memorystore Redis instance resource. And underneath that, we have exposed a whole bunch of metrics that you can use to monitor your instances. We don't yet have an integrated dashboard, but you can go ahead and create a dashboard yourself. This is a dashboard that I've created. So as you can see, you can build your own dashboards and monitor the Cloud Memorystore instance from Stackdriver. So we have that integrated monitoring.
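If you'd rather pull those server-side metrics programmatically instead of through the console, the Stackdriver Monitoring client library can list the same time series that Metrics Explorer shows. Here is a minimal sketch; the project ID is a placeholder, and the metric type string is an assumption based on Memorystore's metric naming, so verify the exact name in Metrics Explorer before relying on it.

```java
import com.google.cloud.monitoring.v3.MetricServiceClient;
import com.google.monitoring.v3.ListTimeSeriesRequest;
import com.google.monitoring.v3.Point;
import com.google.monitoring.v3.ProjectName;
import com.google.monitoring.v3.TimeInterval;
import com.google.monitoring.v3.TimeSeries;
import com.google.protobuf.util.Timestamps;

public class MemorystoreMetricsExample {
    public static void main(String[] args) throws Exception {
        String projectId = "my-project"; // Placeholder project ID.
        long now = System.currentTimeMillis();

        // Look at the last hour of data.
        TimeInterval interval = TimeInterval.newBuilder()
                .setStartTime(Timestamps.fromMillis(now - 3_600_000L))
                .setEndTime(Timestamps.fromMillis(now))
                .build();

        // Assumed metric type for the memory usage ratio discussed above;
        // confirm the exact string in Metrics Explorer.
        String filter = "metric.type=\"redis.googleapis.com/stats/memory/usage_ratio\"";

        try (MetricServiceClient client = MetricServiceClient.create()) {
            ListTimeSeriesRequest request = ListTimeSeriesRequest.newBuilder()
                    .setName(ProjectName.of(projectId).toString())
                    .setFilter(filter)
                    .setInterval(interval)
                    .build();

            for (TimeSeries series : client.listTimeSeries(request).iterateAll()) {
                String instance = series.getResource().getLabelsOrDefault("instance_id", "unknown");
                for (Point point : series.getPointsList()) {
                    System.out.println(instance + " usage ratio: "
                            + point.getValue().getDoubleValue());
                }
            }
        }
    }
}
```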
And the other thing is that if you are using your own tools, you should be able to get the metrics out of Memorystore, because we are fully protocol compliant. One thing you may want to test is whether any of the commands we block breaks any of your tools-- if so, we would be happy to take a look and see if we need to do something there. But we haven't had any cases yet of blocked commands preventing the use of third-party tools. Like I said, with Stackdriver monitoring, you get all the metrics from the Redis side, like network throughput and CPU utilization. One of the interesting things you can also do is create alerts for specific metrics. A good example is the memory usage ratio. If you want to be sure that your instance is not running out of memory, you can actually set an alert and do alert-based monitoring. Karthi is actually going to demo this in his section, I believe. So going back to the slides, the server-side monitoring is interesting. It gives you a bunch of data. But it still really doesn't-- can we go back to the slide, please? Oh, we're back on the slide. It still doesn't give you end-to-end visibility in terms of what's happening from an application perspective. A lot of the time you struggle when you have a latency problem on the application side. You come in, look at the server-side metrics, but you are still not sure what's going on. Unless there is something obvious going on on the server side, it becomes extremely hard to troubleshoot. So the interesting thing-- one of the things that Google has been working on, and has actually open sourced-- is OpenCensus. How many of you use OpenCensus today? OpenCensus is essentially a distribution of libraries that allows you to collect traces and metrics from your application. It's a super powerful framework, or set of libraries, that we have open sourced. And one of the things that we've been doing internally is to instrument some of the Redis clients that are out there and see how it all works. But at a high level, what OpenCensus does is allow you to use the libraries to define and export your own metrics from your application's perspective. So depending on what your application does, you can define what matters-- is it latency? Is it transactions per second? Whatever it is, but at the application level. And then OpenCensus exports it into some of the popular monitoring and tracing tools, one of them being Stackdriver. So now you have end-to-end visibility, not just from the server side, but also from the application side. This makes troubleshooting much easier, and provides end-to-end visibility compared to just having server-side metrics. So having said that, what we want to do right now is show you how you can actually bring all these things together. Karthi is going to walk you through how you can build, deploy, and, importantly, monitor applications using OpenCensus. [APPLAUSE] KARTHI THYAGARAJAN: Thanks, Gopal. Can everyone hear me? Great. So let me actually move to the next slide. As Gopal mentioned, the engineering team responsible for Memorystore has done a great job surfacing all the metrics that you just saw. You're able to see latency, errors, things of that nature on the server side.
So what we're going to see next is how we can instrument our own application, with an emphasis on using Memorystore. I'll talk about a simple three-tier application architecture that uses Java. We're also going to use a traditional database-- in this case, MySQL on top of Cloud SQL. And we'll also get into deployment and provisioning. So I have a quick, informal survey here for provisioning. We're going to be using Terraform and Kubernetes as the topology. How many of you are using Terraform and Kubernetes across your applications? Great. So it'll be relevant to a good number of people. Happy to see that. This is the simple three-tier application that I talked about. As you can see, we have, in this case, a single-page application that uses Vue.js. It's going to talk through our global HTTP load balancer to a Java Spring API hosted on top of GKE. That application is, in turn, going to talk to Cloud SQL-- MySQL, in this case. And essentially this application, which I'll demo in a little bit, will look up employees in our organization and cache those results into Memorystore for Redis. And the hope is you'll get really low latency after the results have been cached, because you're using an in-memory cache. And a little bit about infrastructure as code. As Gopal mentioned, we'll be using infrastructure as code. I mean, this is all the rage these days. It makes DevOps a whole lot easier. We can do cool things like configuration management, keep track of what our resources are, all that good stuff. Is everybody familiar-- or are most people familiar-- with VPC, and the whole notion of networking on GCP? So assuming that, what we're going to do is deploy a GKE cluster into our VPC. This is a separate VPC set aside for our application. And into the same VPC, we're going to deploy a Memorystore instance. This instance, as Gopal mentioned, has to be in the same VPC so that we isolate traffic to that instance from any other resources that may be trying to access it. We also have a whole bunch of other GCP resources that we'll be using, and I'll get into those in a little bit. On the Terraform side, this is something that our product teams have been really good at. Every time we release a product, we ensure that there's Deployment Manager support, as well as Terraform support. In this case, it's fairly straightforward to declaratively define our Redis instance, or Memorystore instance. The two things that I'll call out here are, as you can see, that there's an authorized network specified there. That network is the same one used by the GKE cluster that I'm going to show you in a little bit. The other thing that I'll call out is the fact that there's a reserved IP range. This keeps things clean: we're documenting the IP range we're going to be using for our Redis instance, as well as for our pods in the GKE cluster. And here's our GKE cluster. The only thing I'll call out here is the fact that I'm specifying an IP allocation policy. The reason for doing that is that, because we're using GKE, we have to enable IP aliasing in order for our pods to be able to talk to Memorystore. This is an extra step that you have to do. If you were deploying a simple VM and having that talk to your Memorystore instance, there's not much else to do there. But in this case, you have to enable IP aliasing. The steps for this might look a little different if you're provisioning these resources using, let's say, gcloud, or Deployment Manager.
And I also want to call out the fact that-- I'm going to go back a couple of slides-- all the resources that you see here can be deployed with Google Cloud Deployment Manager as well. I'm choosing to use Terraform here. With that said, let's get into the demo. Thankfully, my machine has not locked. Awesome. So actually, the first step is that I'm going to show you that I can have a pod on my GKE cluster talk to the Memorystore instance. As Gopal showed you earlier-- I'm just going to refresh your memory on this-- we have three instances. These are all for the purposes of our demo, and they're all deployed into different VPCs. So even if you see some of those instances having the same IP address, don't get confused. They're all in different VPCs; that's why they can have the same IP address. So we know that we want to be able to reach 10.0.0.4 from our pod. What I'm going to do is connect to the jump pod, where I have redis-cli deployed. It happens to be called mysqlclient, because that's also where the MySQL client is. I connect to it, and let's spin up redis-cli. That's the IP address, as I mentioned. And I can list the characteristics of that instance. So all good. And this is what you would do in your application as well. And once there, let me flush the cache so that I can get the demo to work as expected. Cool. So now let me go to the demo and show you. It's pretty simple. It's an application that, once again, as I said, allows you to look up the employees in your organization. I am going to start typing in Gopal. And there's one step that I have to do. We have something called BeyondCorp that interferes with my demo. I'm going to turn that off. And now let me type in Gopal. What this demo shows is, in addition to all the names that it brings back, how long that request took. Hopefully, all of you can see this. In this case it took 539 milliseconds. Now the hope is it's been cached. I type in G-O-P again. And let me refresh this just for good measure. It took much less time. And I know that's not very convincing. So what we're going to do is, as Gopal mentioned, use OpenCensus, which I'll show you the code for. We're going to use OpenCensus to see what happened along that call path as the call progressed. For this, I will switch to my Google Cloud console. Let me pull up the console here. Just so you can go to this console on your own, this is under the Stackdriver Trace menu. And I'm going to look at the trace list. Let me make this a little bit smaller, because it's hard to see. This was my initial call. I typed in G-O-P for Gopal. And you can see that this call started at the getEmployees method. It actually shows you how long that method took-- 538 milliseconds or so. But the cool thing here is you can tell that this call actually looked up, or tried to look up, that keyword in Redis. We're using the Jedis library here, and I'll talk a little bit about how that library's instrumented. But it didn't find that keyword, so then it had to go to MySQL. And it also shows you how long that MySQL call took. You can tell the bulk of the time was in the MySQL call-- execute query, et cetera. It found the entry, and then it put the results of that request into our cache. So something straightforward that most people do in a caching-type environment. So you can see what happened. This is the happy path. This is what you expect to happen.
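The flow in that trace is a classic cache-aside lookup: try Redis first, fall back to MySQL on a miss, and write the result back into the cache, with an OpenCensus span around the whole thing so it shows up in Stackdriver Trace. Below is a minimal sketch of that pattern under stated assumptions: the class name, key format, host addresses, and SQL schema are illustrative rather than the actual demo code, and the nested Jedis spans seen in the trace come from the instrumented client library, not from anything written here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import io.opencensus.common.Scope;
import io.opencensus.trace.Tracer;
import io.opencensus.trace.Tracing;
import redis.clients.jedis.Jedis;

public class EmployeeLookup {
    private static final Tracer tracer = Tracing.getTracer();

    // Placeholders: Memorystore private IP and Cloud SQL JDBC URL.
    private static final String REDIS_HOST = "10.0.0.4";
    private static final String JDBC_URL =
            "jdbc:mysql://127.0.0.1:3306/hr?user=app&password=secret";

    public String getEmployees(String keyword) throws Exception {
        // Span around the whole lookup; nested Redis/JDBC spans come from
        // instrumented client libraries, if you use them.
        try (Scope scope = tracer.spanBuilder("getEmployees").startScopedSpan()) {
            tracer.getCurrentSpan().addAnnotation("keyword=" + keyword);

            try (Jedis jedis = new Jedis(REDIS_HOST, 6379)) {
                String cached = jedis.get("employees:" + keyword);
                if (cached != null) {
                    return cached; // Cache hit: no MySQL round trip.
                }
                tracer.getCurrentSpan().addAnnotation("cache miss");

                // Fall back to MySQL on a miss.
                String result = queryMysql(keyword);

                // Write back into the cache with a TTL so stale entries expire.
                jedis.setex("employees:" + keyword, 300, result);
                return result;
            }
        }
    }

    private String queryMysql(String keyword) throws Exception {
        try (Connection conn = DriverManager.getConnection(JDBC_URL);
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT name FROM employees WHERE name LIKE ?")) {
            stmt.setString(1, keyword + "%");
            StringBuilder names = new StringBuilder();
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    names.append(rs.getString("name")).append(',');
                }
            }
            return names.toString();
        }
    }
}
```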
So now let's go back to our trace list and just confirm that, once it's been cached and I type in the same keyword-- in this case, hey, it pulled it out of the cache. It took a lot less time, and you can see that it didn't have to go to MySQL. So that's the happy path. Oh, before I go back and introduce an error, let me also show you that, in addition to tracing, the cool thing about OpenCensus is that it also lets you surface metrics. I had a whole bunch of metrics shown here, but unfortunately a whole bunch of time has elapsed. So this is what happened for my most recent query. You can see things like a heat map, so you get a sense of what your queries are doing in aggregate. You can also see a linear view of what your queries are doing. In addition to that, you can surface errors from whatever Redis library you're using. In this case I've instrumented Jedis, and I'll show you how to surface those errors as well. Before I do that, let me show you the code real quick. Can everybody see this OK? Is it too small? Hopefully, it's big enough. This is the getEmployees call that I mentioned. It's my Spring Boot controller, and I'm importing the appropriate OpenCensus libraries. Gopal showed you where to get that library. I'm using something called the Tracer class, and I set up a span. In that span I can put in annotations that show me which particular method was called and any data that was passed into that method-- in this case, the keyword that was supplied. And I can trace methods that are internal to that call as well. In this case, I've also instrumented getFromCache. I've added an annotation that says "cache miss," in case there is a cache miss. Let me show you that real quick. Go back to this. And you can see that annotation. So it shows up in my Stackdriver console. So it's pretty straightforward to annotate your code with OpenCensus. The other thing I'll call out-- going back to this real quick-- is the fact that you're seeing all this stuff from the Jedis library, like jedis.protocol.bulk, process bulk reply, redis.clients.jedis.close, all that stuff. I didn't instrument that in my code. That came from the instrumentation that is in the library that I'm using. So out of the box, obviously, the Jedis library is not going to give you this instrumentation, and I'll show you where to get that instrumented library. But for your own needs, if you're using a Redis library for whatever language you're using, look for it in this place. If it's not there, you can fairly easily instrument it and contribute it back to the community. Let me see how much time we have. We have a little bit more time to show the error condition that I described. So how do you surface errors that you may see? For that, what I'm going to do is trigger a change that essentially has my application talk to a non-existent Redis host. So now I'm going to type in Gopal. Even though Gopal was cached, you can see that it's still taking a lot more time. Let me switch back to my trace window here and see what happened. I'm going to quickly refresh this Stackdriver console. So this is my trace from just now. Click on the trace and see what happened. You can see that it attempted to connect to Redis, and it failed, and it went to MySQL. So something happened that caused it to close the connection. And I can also see in my Stackdriver console that errors are starting to show up. And I can trigger alerts based on this error as well.
What I've done here is set up an alert that says, if there is more than one error for this particular metric that I'm surfacing, send me an email. You can also connect to PagerDuty, or a whole slew of other integration points. So hopefully that gives you a sense of what you can do with OpenCensus. There's a lot of power here. Let's actually switch back to the slides, since we don't have a lot of time. I'm going to wrap things up. As I mentioned, if you go to this link, you can find a whole bunch of libraries that are already instrumented. Jedis is one of them. There are a couple of Go libraries in there. And please do add OpenCensus integrations or instrumentations of your own, and contribute them back. Yeah, there's a whole lot more to come with Memorystore, as Gopal mentioned. I'll hand it over to him real quick. [APPLAUSE] GOPAL ASHOK: So just to quickly wrap up, what we discussed here is an overview of Cloud Memorystore. There are a lot of questions in terms of the feature set that you want to see. That's something, like I said, we are actively working on. This is the first version of the product. If you really, really care about something, we do have something called the Issue Tracker, where you can actually log your request. We actively look at all the issues and requests that come in, in terms of feature requests. So I really encourage you to please do that if there are things that you feel are super important for you to use the service. Just in closing, thank you all for being here. Thank you for coming to Google Cloud Next. We really appreciate you taking the time. If you have any more questions, we have the engineering team here as well. So we'll hang around, and we are happy to take those. Thank you very much. [MUSIC PLAYING]
Info
Channel: Google Cloud Tech
Views: 4,909
Rating: 5 out of 5
Keywords: type: Conference Talk (Full production); pr_pr: Google Cloud Next; purpose: Educate
Id: YyA_kPxOgg4
Length: 50min 11sec (3011 seconds)
Published: Thu Jul 26 2018