[MUSIC PLAYING] GOPAL ASHOK: Welcome to
the last day, last show. We were just saving
the best for last. My name is Gopal Ashok. I'm a product manager
for Cloud Memorystore. I also have here with
me Karthi Thyagarajan, who is a Cloud Architect
in Google Cloud Platform. Today we're going to talk
about Cloud Memorystore and talk about caching in
general in Google Cloud Platform. First of all, thank
you for being here. At the end of the
conference, after three days, I'm pretty sure
it's pretty tiring, but we really appreciate
you all being here. Hopefully we can make this a
productive session for you. One of the things that I'd
like to do as I do the session is to keep the
session interactive. So if you have questions, please
feel free to raise your hand. And if appropriate, we can take
the questions as we go along. Please use a microphone when asking your questions so that we can record them, too. All right. Let's get started. Before I jump into
the topics, I just want to kind of lay the
framework for the talk. What I'm going to talk about today is Cloud Memorystore, the product we are offering. We'll do a deep dive into some of the features of the managed service, and finally we will bring it all together by showing how you can build, deploy, and, more importantly, monitor your applications when using Cloud Memorystore. So before I jump into
the product itself, I just want to take
a quick step back. For me, this chart is a
little bit interesting. A lot of you may be
familiar with DB-Engines. DB-Engines is a
website that basically tracks the popularity of
the different databases that are currently in use. When you look at this
chart, you can actually make lot of different
inferences in the chart, depending on which eye you
use to look at the chart. But for me, the key
thing here is the number of different databases
that are currently popular and the number of different
types of databases that are currently being
used to build applications. It's surprising. When you look at the
different kinds of use cases that have evolved over
the last 10 years, it's very clear that using
a single relational database was not going to cut it. And the reason being different
application use cases have different access patterns,
different query patterns. So using just
relational database, even if the relational database
had different data models, was not optimal. So over a period of time,
different kinds of databases came into the market. And one of the key
trends that we are seeing is this notion of
polyglot persistence. I just like the word polyglots. I'm like, OK, I
have to use this. But more importantly,
what we see is the shift towards
more adoption of microservices-based
architecture. And what that means
is when you use microservices-based
architecture, one of the key design
patterns is that you basically have different, purpose-built databases for each specific microservice. So quick question. How many of you use just one
database in the applications that you currently have? What about, let's say three? So the key point here is
that with microservices architecture-- or not;
even if you are not-- the management of databases starts becoming a lot more complex. When you use different kinds of databases, you now need new skill sets. Even if you can standardize
some of the automation, or deployment, et
cetera, there are still nuances across all these
different databases. What you want to be able to
do is just spend time building applications, and less time
managing the infrastructure. So that's where, in the
context of this talk, where managed services and managed
database services helps when you move into a cloud platform. So the key thing
about managed services is that it helps you focus
on the logical side of things in terms of building
applications. And we take care of
the physical aspects of managing the database. So in that context, Google
Cloud provides a wide variety of managed database services. And since this is
the last session, I'm pretty sure you've attended
a lot of the database talks, and you probably
have seen this slide. But the point here is that
regardless of what database you are using, in
Google Cloud we offer managed services
either provided by Google or through our partner services. So if you look at just a quick
look at the different databases that we offer on the
non-relational side, we have Cloud Datastore, which is a document DB. Bigtable is a wide-column store. And on the relational side, we have Cloud SQL, which is essentially PostgreSQL and MySQL; and Spanner, which is a highly scalable, globally
distributed database. Today what I want to do is
focus on Cloud Memorystore. So what is Cloud Memorystore? So Cloud Memorystore, which
is currently in beta-- we released the
product back in May, and we are currently in beta-- is a fully managed in-memory
datastore service for Redis. So it's a fully managed
offering that supports Redis. So why is this important? So if you ask anybody who
builds large-scale applications, in-memory caching is a fundamental piece of that application architecture. I don't think we need to spend
a lot of time kind of evaluating the benefit of caching. But if you look at where
caching is applicable, there's a wide
variety of use cases where caching is
truly beneficial, from a very simple web
application to anything like Ad Tech, where
you're serving real time, or in the real-time
bidding infrastructure. So across the board, we know that an in-memory store that provides extremely low-latency query processing is extremely beneficial. In fact, it's actually a requirement these days to have a cache in place. Now what are the different types
of caches that people normally use? There are two types of caches
that are very popular today. The most popular one is Redis. And then there's Memcache. Memcache has been
around for a long time. We actually have Brad here, who
worked on the initial versions of Memcache. But Redis has become
a lot more popular. The reason Redis has become a lot more popular is that, apart from the performance it gives because it's an in-memory store, it is mostly referred to as a data structure server. Memcache essentially allows you to store key-value pairs and provides quick access to static data. But Redis provides built-in data structures, which expands the set of use cases it can address. And some other things,
like sorted sets, are very popular for
building gaming applications. If you're building
leaderboards, sorted sets become extremely useful; see the sketch below. And there are lists. And there is [INAUDIBLE]. So there's a wide variety of data structures that make it a very powerful in-memory store.
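As an illustration of why sorted sets map so naturally onto leaderboards, here is a minimal sketch using the open source Jedis client. The instance IP, key name, players, and scores are all hypothetical, and the exact collection types returned vary slightly across Jedis versions.

```java
import java.util.Set;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Tuple;

public class LeaderboardSketch {
    public static void main(String[] args) {
        // Hypothetical Memorystore private IP; 6379 is the default Redis port.
        try (Jedis jedis = new Jedis("10.0.0.4", 6379)) {
            // ZADD keeps members ordered by score, so score updates stay O(log N).
            jedis.zadd("leaderboard", 3200, "alice");
            jedis.zadd("leaderboard", 4100, "bob");
            jedis.zadd("leaderboard", 2800, "carol");

            // Top 10 players, highest score first.
            Set<Tuple> top = jedis.zrevrangeWithScores("leaderboard", 0, 9);
            for (Tuple entry : top) {
                System.out.printf("%s: %.0f%n", entry.getElement(), entry.getScore());
            }
        }
    }
}
```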
But apart from that, one of the issues with Memcache was that if you really relied on a cache and wanted durability of keys, Memcache was not really the best solution. So with Redis you
have persistence. You also have replication. So if you really want to build
a highly resilient cache, or in-memory store, Redis is actually a good choice for that. Apart from that, it also supports Pub/Sub. So if you want to build notification systems, et cetera, Redis gives you that capability; a small sketch follows after this paragraph. It also supports scripting. And all of this comes with really, really fast performance. So over a period of time, Redis has become extremely popular.
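Here is the small Pub/Sub sketch mentioned above, again using the Jedis client against a hypothetical endpoint; the channel name and message are made up.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class NotificationSketch {
    public static void main(String[] args) throws InterruptedException {
        // Subscriber: subscribe() blocks, so run it on its own thread.
        new Thread(() -> {
            try (Jedis subscriber = new Jedis("10.0.0.4", 6379)) {  // hypothetical IP
                subscriber.subscribe(new JedisPubSub() {
                    @Override
                    public void onMessage(String channel, String message) {
                        System.out.println("Received on " + channel + ": " + message);
                    }
                }, "notifications");
            }
        }).start();

        Thread.sleep(500);  // crude wait so the subscriber is listening first

        // Publisher: fans a message out to every subscriber of the channel.
        try (Jedis publisher = new Jedis("10.0.0.4", 6379)) {
            publisher.publish("notifications", "cache refreshed");
        }
    }
}
```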
actually use Memcache? Wow. Quite a bit. What about Redis? So it's interesting
for us, in terms of when we look at
should we do Redis, or should we do Memcache,
or should we provide both? But it's interesting to see
that it's an equal proportion of both Redis and Memcache. And I think there's reasons
why Memcache sometimes is much more useful for certain
use cases compared to Redis. So with Cloud Memorystore, we
provide Redis as the engine. The key thing we've been trying to do with Cloud Memorystore, like I said, is to take on the burden of managing Redis for you. If you ask anybody who has deployed Redis, there are a lot of things that go into managing something on your own, even things as simple as patching, for example. So with our managed service, we take care of deploying Redis and give you an endpoint, and all you need to do is worry about writing applications against it. The key thing I want
to say here is that we are built on open source Redis. We are not building
anything Google proprietary. We are simply taking
the open source Redis and exposing it as
a managed service. The other key thing we wanted to focus on was how to provide a highly reliable and highly available service. So in the offering, which I'll get into in a little more detail, we provide replication. We provide fast, automatic failover. And we also provide an availability SLA. So that's something that normally you don't see. But we will be having that once we go GA.
and automatic failover, Sentinel is a common way
to set up replicated Redis. But there's a lot of complexity
in terms of deploying Sentinel and managing it. So with Memorystore,
you don't have to worry about all those things. We basically take
care of things. In terms of how we
do health checking, everything is done for you. And finally scalability
and security. With Memorystore, again, one of our focuses is how we can make things easier for you. So we have put in a lot of work to make scaling super easy, because some of the use cases that we see with customers are essentially spikes. For example, we have a retail customer for whom Black Friday, or any of their peak shopping days, is very unpredictable. So it's great for us to be able to make sure that they can scale up as needed without having to go through a lot of pain. So that's kind of
provided in the service. The other key thing is that Redis is not very secure out of the box. So the recommendation is always not to put Redis out in public. What we do is deploy Redis on a private IP, so it's not accessible from anywhere other than from within the Google network. We use network-based authentication, or network-based security, to limit access to the Redis instance. So there's an additional
level of security there. And also we provide
role-based access control for the administrative
side of things. So bottom line is it's
a fully managed service that allows you to just easily
deploy Redis and use it. And we have customers
currently using it. And Memorystore has seen pretty good adoption. And it wasn't surprising,
because everybody wants a cache, and Redis
is a very popular engine that customers use. So we've been actually
getting pretty good responses from customers. So let's dive a little bit into what the offering itself is and what we
have right now in beta. So one thing I
want to clarify is that whatever I'm
talking about right now is what is available
today in the beta release. And this is our v1 release. So the way we expose the
service is through two tiers. We have a basic tier
and a standard tier. So the basic tier is a
single Redis instance. And we see a lot of customers
just using the basic tier, because in a lot of the use
cases it's a very simple cache. The key thing to be aware of with a basic instance, just like any other single instance of Redis, is that if the instance goes down, it's a full cache flush. If for any reason something happens behind the scenes, we will make sure that it comes back up, but you will experience a full cache flush. But you still get some advantages, because we do the health checking. And when we bring back the instance, we preserve the IP address. In some cases, when you are deploying your own, one of the challenges you run into is whether you can actually reserve the same IP address when an instance comes back up. So all of that is taken care of. We don't provide an SLA for the basic tier. But in the standard tier, we
provide a replicated version of Redis. We have cross-zone replication. And we provide
automatic failover. The other key thing is that you
connect to a single endpoint. And we make sure
that the traffic is directed to the right
instance, or right location. So in both the cases for
both basic and standard tier, the key thing to remember is
that it's a single master. So even in the standard
tier, there's one master. It is not a scale-out
model-- yet. So that's something to keep in mind when you use the service. And like I said, we will
be providing three nines of availability once we go GA. Right now we are in beta,
so the SLAs don't apply. But when we go GA,
we will provide three nines of availability. So provisioning. One of the things
that we do is to try to make things really simple. But we also
understand that using a UI is not the only way folks want to provision their services. Nowadays, with infrastructure as code becoming more and more the standard way to manage infrastructure, there are different ways to do it. So with Cloud
Memorystore, we obviously support the UI, SDK, et cetera. But we also support Cloud Deployment Manager, and also Terraform. So if you are using Terraform
to manage your infrastructure, you can use that to
manage Memorystore. For the beta version, we
are supporting Redis 3.2.11. Talking about the usefulness of managed services, one thing that I forgot to mention is that recently, I'm not sure if you're aware, Redis had a security vulnerability that was announced, I think, last month. So the key thing is that we
actually applied the patch even before the
vulnerability became public. So those are some of
the things that we get to do behind the scenes
so that you don't have to worry about those things. So the 3.2.11 that we have
today is a Redis version that is fully patched
with the latest fix. So going to the next level, in terms of how we expose the service, we allow you to provision any size from 1 GB to 300 GB, in 1 GB increments. But the way the service is tiered, the higher the memory size, the better the throughput you get. And the M1 through M5 tiering essentially not only captures the network performance, but also relates to how we price the service. So in this particular
table, what you see are the ranges that we provide,
and the corresponding network throughput. So if you need
higher throughput, you basically have to
provision more memory based on your needs. But Redis being single-threaded, we provision enough CPU in the background that, for certain workloads, the network may become the bottleneck. So what we recommend is that you profile your workload, run the benchmarks, and see
what works best for you. But the way we have
exposed the service is essentially using
capacity tiers. And from a pricing
perspective, we price the service
in gigabyte hours. For the different
ranges, we have fixed gigabyte hour pricing. And so when I say
gigabyte hour to somebody, and it's like, what is
the pricing per hour? So the pricing per
hour is essentially a multiplier of
the gigabyte hour price times the
amount of gigabyte you're provisioning
for that tier. So for example, if
you have provisioning like 4 gigs of memory, it'll
be four times 4.9 cents, or if you're provisioning 100
gigs, it's 100 times 1.9 cents. And we have a pricing
calculator that will easily let you figure out what the
pricing is in terms of the per hour pricing. But the key thing here is that
we give you the flexibility-- depending on what you
want from a cache-- we give you the
flexibility, whether it's a single instance, highly
available instance, different throughput
characteristics, et cetera. So what I want to do
now is go to the demo and walk you through the
provisioning process. And I just want
to dive into some of the details of the managed
service, and some of the things that you may want
to know about when you're designing applications
on top of Memorystore. So Memorystore is
a storage product. And it can be found
under the Storage menu in the Google Cloud Console. So what you see here
is a few instances that we have provisioned. So let me walk you through
creation of an instance. And I want to kind of talk
through some of the nuances about the service. So I'm going to provision
an instance here. Let's say 4. The instance ID is essentially
the identifier of the instance. So like I mentioned earlier, the version that we expose is 3.2.11. Immediately we get the question,
are you going to support 4.0? The answer is yes. We're going to add
4.0 pretty soon. But right now if you're
provisioning Memorystore, you get 3.2. You get to choose a tier-- like I mentioned,
basic or standard tier. And the locations that we
have currently exposed, we have the services available
in five regions today. We are well aware of the fact
that there are more regions, and we need to be there. So that's something
that we are working on, getting more regions
added very quickly. So what you see here when you select the region, essentially, is that Memorystore is a regional service. So what that means
is that when you deploy in a particular
region, applications can connect from any zone. It's a regional service. So you can connect from
within that same region. We today don't support
cross-region access. So if your application is trying to connect from one region to Memorystore in a different region, we don't allow you to. That is not enabled yet. So we give you the option
to deploy your application in any specific zone. One thing I want to
note about pricing is that we don't charge for network. Memorystore doesn't have a network charge. But, for example, if your application is in zone A, Memorystore is in zone B, and you're connecting from GCE, then GCE has a cross-zone networking charge. So if that traffic exists, you will be charged on the GCE side for the cross-zone network traffic. So that's something
to keep in mind. And here, like I said, you can provision anywhere from 1 to 300 GB. Depending on what size you select, you can see that the network throughput changes; how much network throughput you get changes with the size. So let's go ahead and pick a number, 30 gig. And finally, the network. So the way we authenticate
access to the Memorystore is using VPC. You pick a specific
network that you're going to deploy Memorystore in. And we restrict access to Memorystore to that particular network. So what that means is that
any application, or any VMs on that specific
network can access this particular instance. And finally, from a Redis
configuration perspective, we haven't exposed too
many parameters yet. We have only exposed two. Max memory policy
is one of them. I believe [INAUDIBLE]
lru is our default. So this is something
that you may want to think about
configuring when you actually deploy Memorystore based
on what you want to do. And that's it. So once you do that, you
basically click Create. And it goes ahead and creates the instance for you. So like I said, when you create
the instance, what you get is an IP. It's a private IP that
you get on the 6379 port. And I mentioned that Redis
is fully protocol compatible. So I can use redis-cli. Oh, I hope we can see it. And run commands just like you would in any Redis instance. So it's open source Redis, and all of that is compatible.
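From application code, the connection looks just like it would against any self-managed Redis. Here is a minimal sketch with the Jedis client, assuming a hypothetical private IP of 10.0.0.4:

```java
import redis.clients.jedis.Jedis;

public class ConnectSketch {
    public static void main(String[] args) {
        // The host is the private IP the service hands back at creation time.
        try (Jedis jedis = new Jedis("10.0.0.4", 6379)) {
            jedis.set("greeting", "hello from Memorystore");
            System.out.println(jedis.get("greeting"));
        }
    }
}
```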
I would say, though, that from a managed services perspective, we do block some commands. So we have documented
what those commands are. But those commands are primarily
some of the admin commands that we have blocked. From the application
side of things, we support Lua scripting. We support Pub/Sub. So all that is fully available
when you use Memorystore. Let's go back to the slides. So from a best practice
perspective, like I said, you have the option. So if you're using
it as a simple cache, basic tier will
definitely work for you. One thing I would note
is that with basic tier we do have regular
maintenance that happens on a quarterly
basis, or whenever we have to apply any critical patches. If you're using
a basic tier, you can experience a full cache
flush in events like that. With standard tier, we do a rolling upgrade, so you have higher availability in events like that, because we just apply the
patch in a rolling manner. The other key thing that I want
to point out is persistence. So in the beta release, we
haven't enabled persistence. When I say persistence,
I'm talking about AOF, and I'm also talking about RDB. We haven't enabled
either one of them yet on the basic
or standard tier. We are working on the ability
to import and export data to a GCS bucket. That will be coming soon. We are also looking at
AOF persistence, which is probably further down the road. From an instance sizing
perspective, like I said, there are various benchmarks
that are currently available. I'm sure if you're
already using Redis, you have an idea of the
workload pattern that you have. Always run the benchmarks to see what size fits you. Like I said, the higher the
size, the better the throughput that you get. So you may want to do that. Talked about the eviction
policy a little bit, and also the configuration. So we have documented all
the default configurations that we use. Like I said, we don't
allow you to change a lot of the configurations,
or many of the configurations. So it's good to know what configurations we set by default for these Memorystore instances. How many of you actually scale your instances often? Nobody. So scaling is something that
we have enabled in beta. It's a very simple
way to scale a system. You basically change the size. How does the behavior work? In basic tier, it is a
full flush of the cache. And in standard tier, we basically do a rolling upgrade; when we do the scaling, we do it in a highly available manner. So from an application perspective, what that means is that the only time the application is unavailable, if you're using standard tier,
is when we do the failover. And the failover that we
do is actually very quick. It's probably in the 30-second range, and we have some heuristics in the back for replication. On the client side, a simple retry wrapper can usually ride through a window like that; a hedged sketch follows.
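This is a minimal sketch of such a retry wrapper, in Java with the Jedis client. The endpoint, attempt count, and backoff numbers are all made-up tuning choices, not values from the service.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.exceptions.JedisConnectionException;

public class RetryingGet {
    // Retry a read with exponential backoff to ride through a failover window.
    public static String getWithRetry(String key) throws InterruptedException {
        long backoffMillis = 500;
        for (int attempt = 0; attempt < 8; attempt++) {
            try (Jedis jedis = new Jedis("10.0.0.4", 6379)) {  // hypothetical endpoint
                return jedis.get(key);
            } catch (JedisConnectionException e) {
                // Connection refused or dropped: wait and retry against the same IP,
                // which the service preserves across a failover.
                Thread.sleep(backoffMillis);
                backoffMillis = Math.min(backoffMillis * 2, 8_000);
            }
        }
        return null;  // give up and treat it as a cache miss
    }
}
```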
But having said that, talking about replication, we use Redis replication under the covers. So that means it's asynchronous replication. So when you do a scaling operation, it's quite possible that there could be some unreplicated data, or some data that is stale, because the changes were not replicated when the failover happened. So that is instance scaling. And I just wanted to
take a quick moment and see if folks have any
questions about the things that I talked about till now. Do you mind using the mic? AUDIENCE: So I think
one of the themes that's been running through a lot
of the talks I've been to at this conference has been
don't trust the network, make sure you authenticate
your peers, the whole service mesh thing. But this service seems to, from what I can see, mostly have VPC-level security. Are there any plans to tie this into IAM or TLS or something, so we can authenticate the clients in the managed service? GOPAL ASHOK: Right.
there's VPC security. And one of the things
that we are working on is, initially at least, enabling AUTH. But again, that's just a Redis password. But beyond that,
yes, we are looking at at least figuring out how
to do encrypted connections between the client and server. So that's something
definitely on the road map. AUDIENCE: Thanks. AUDIENCE: [INAUDIBLE] GOPAL ASHOK: On-premise? What do you mean? AUDIENCE: [INAUDIBLE] GOPAL ASHOK: So are you
talking about connectivity, or are you talking
about whether-- AUDIENCE: [INAUDIBLE] the
cache [INAUDIBLE] applications. GOPAL ASHOK: So you want
to be able to connect to Memorystore from
on-prem using, let's say, a VPN, or something like that. Can you tell me a
little bit more about-- because from a
caching perspective, latency is a critical factor. So I'm curious whether that hybrid case is more for migration, or what the use case is there? AUDIENCE: [INAUDIBLE] Some
of them are only on-premise. [INAUDIBLE] GOPAL ASHOK: Is it
Redis or Memcache? AUDIENCE: Right now, we
use the [INAUDIBLE] caching [INAUDIBLE]. GOPAL ASHOK: OK. So there's two questions there-- I think two parts
of the question. One is how do you connect from
on-prem onto Cloud Memorystore. Again, that's something
that we haven't enabled yet. We are trying to figure out,
see how many customers actually need it. But we do see the
use case for it. In terms of the terabyte size
cache, like I mentioned today, we have the basic
and standard tier, which is a single instance; and
the replicated instance, again, single master. You're right, it's 300 GB. From a terabyte
cache perspective, the next thing we are
working on is clustering, so you can do scale-out Redis clusters. When we have scale-out Redis clusters, you should be able to provision terabytes of cache. So that's something that we're working on. But you're right that currently it is limited to 300 GB. We may increase that
number a little bit more. But we are still constrained
by that single VM deployment. AUDIENCE: [INAUDIBLE]
monitoring? GOPAL ASHOK: Yes that
was my next topic. One more question, please. AUDIENCE: OK. So am I to understand
that because you're using asynchronous replication
in a failover event in general, not just in a scaling,
that there could be a small amount of data loss? GOPAL ASHOK: That's correct. So yes, any time there is a
failover, you can have that. One of the things
that we are planning to provide-- today we don't provide a failover API with which you can manually fail over. So if you're doing manual failovers, once you have the API, then you can control whether the instances are in sync, and then fail over. In that case, obviously
expect unreplicated data. AUDIENCE: And
somewhat relatedly, how long would a failover
event be expected to last? Here, when it's planned, it's
about 30 seconds, apparently. But if it's unplanned,
about how long would that period of strangeness and
possible unavailability be? GOPAL ASHOK: So
different scenarios. If it is a complete failure, an unplanned one, essentially we expect it to be less than a minute, because essentially we are not waiting for transactions or commands to be replicated. In the case of
scaling, we actually have a heuristic where
we wait for the replica to get caught up, and
then do the failover. But then the failover time
is still less than a minute. AUDIENCE: Great. Thanks. GOPAL ASHOK: There was a
question on monitoring. So let me quickly go
back to the slides and talk about
monitoring a little bit. So we do have
integrated monitoring. So what that means
is that we are integrated with Stackdriver. So we export all our
metrics from the instances into Stackdriver. You can use Stackdriver
to basically do the service-side monitoring. So let me quickly show
you what we have today from a Stackdriver perspective. Switching to the demo. So if you go into Stackdriver--
it's a little bit zoomed. I'm going to zoom
in a little bit. So if you go into Metrics
Explorer in Stackdriver, and if you look for Memorystore, you'll see your Cloud Memorystore Redis instance. And underneath that,
we have exposed a whole bunch of
metrics that you can use to monitor your instances. Now, we don't yet have an integrated dashboard. But you can go ahead and
create a dashboard yourself. And this is a dashboard
that I've created. So as you can see, you basically
can build your own dashboards, and monitor the Cloud
Memorystore instance from Stackdriver. So we have that
integrated monitoring. And the other thing is that if
you are using your own tools, you should be able to
get the metrics out of Memorystore, because we
are fully protocol compliant. One thing you may want to test is whether any of the commands we block break any of your tools; if so, we would be happy to take a look at that and see if we need to do something there. But we haven't had any cases yet of those blocked commands breaking third-party tools. But like I said, with
Stackdriver monitoring, you get all the metrics
from the Redis side, like network throughput and CPU utilization. One of the interesting things that you can do also is create alerts for specific metrics that basically notify you. One of the good examples is memory usage ratio. So if you want to be sure that your instance is not running out of memory, you can actually set an alert. And Karthi's actually going to demo this in a later section, I believe, and also do alert-based monitoring. So going back to the slides,
the service-side monitoring is interesting. So it gives you a bunch of data. But it still really doesn't-- can we go back to
the slide, please? Oh, we're back on the slide. So it still doesn't give
you end-to-end visibility, in terms of what's happening
from an application perspective. A lot of the times
you struggle when you have a latency problem
on the application side. You come in, look at
the server-side metrics. But you are still not
sure what's going on. Unless there is
something obvious that's going on on the server
side, it becomes extremely hard to troubleshoot. So the interesting thing-- and one of the things that
Google has been trying to do, and Google has actually
open sourced is OpenCensus. How many of you use
OpenCensus today? So OpenCensus is
essentially a distribution of libraries that
basically allows you to collect traces and
metrics from your application. It's a super powerful
framework, or a set of libraries that we have open sourced. And one of the things that
we've been doing internally is to base instruments,
some of the Redis clients that are out there, and
see how it all works. But at a high level,
what OpenCensus does is basically allow you to use the libraries to collect and export your own metrics from your application's perspective. So depending on what your application does, you can define what matters. Is it latency? Is it transactions per second? Whatever it is, but at the application level. And then OpenCensus exports it into some of the popular monitoring and tracing tools, one of them being Stackdriver; a minimal sketch of wiring that up is below.
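For illustration, registering the Stackdriver trace exporter from a Java application looks roughly like this. The project ID is hypothetical, and credentials are assumed to come from the environment.

```java
import java.io.IOException;

import io.opencensus.exporter.trace.stackdriver.StackdriverTraceConfiguration;
import io.opencensus.exporter.trace.stackdriver.StackdriverTraceExporter;

public class TracingSetup {
    public static void initTracing() throws IOException {
        // Registers a process-wide exporter that ships sampled spans to
        // Stackdriver Trace; credentials come from the environment
        // (for example, GOOGLE_APPLICATION_CREDENTIALS).
        StackdriverTraceExporter.createAndRegister(
            StackdriverTraceConfiguration.builder()
                .setProjectId("my-gcp-project")  // hypothetical project ID
                .build());
    }
}
```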
So now you have end-to-end visibility not just from the
server side, but also from the application side. So this makes troubleshooting much, much easier, and provides end-to-end visibility compared to just having server-side metrics. So having said that, what
we want to do right now is to show you how
you can actually bring all these things together. So Karthi is going
to kind of walk you through how you can build, deploy, and actually monitor applications, also using OpenCensus. [APPLAUSE] KARTHI THYAGARAJAN:
Thanks, Gopal. Can everyone hear me? Great. So let me actually
move to the next slide. As Gopal mentioned, the engineering team responsible for Memorystore has done a great job surfacing all the metrics that you just saw. You guys are able to
see latency, errors, things of that nature
on the server side. So what we're going to see next is how we can instrument our own application, with an emphasis on using Memorystore. So I'll talk about
a simple application architecture that's a three-tier
application that uses Java. And we're also going to use
a traditional database-- in this case MySQL
on top of Cloud SQL. And we'll also get into
deployment and provisioning. So I have a quick, informal
survey here for provisioning. We're going to be using
Terraform and Kubernetes as the topology. So how many of you are using
Terraform and Kubernetes across your applications? Great. So it'll be relevant to
a good number of people. Happy to see that. This is the simple
three-tier application that I talked about. As you can see, in this case it's a single-page application that uses Vue.js. It's going to talk through our global HTTP load balancer into a Java Spring API hosted on top of GKE. That application
is, in turn, going to talk to Cloud SQL,
MySQL in this case. And essentially
this application, which I'll demo in
a little bit, will look up employees
in our organization, and cache those results
into Memorystore for Redis. And the hope is you'll
get really low latency after it's been
cached, because you're using an in-memory cache. And a little bit about
infrastructure as code. As Gopal mentioned, we'll be
using infrastructure as code. I mean, this is all
the rage these days. It makes DevOps a
whole lot easier. We can do cool things like
configuration management, keep track of what our resources
are, all that good stuff. Is everybody familiar--
or are most people familiar-- with VPC,
and the whole notion of networking on GCP? So assuming that,
what we're going to do is we're going to deploy a
GKE cluster into our VPC. This is a separate VPC set
aside for our application. And into the same VPC, we're
going to deploy a Memorystore instance. This instance, as
Gopal mentioned, has to be in the same VPC
so that we isolate traffic to that instance from
any other resources that may be trying to access it. We also have a whole bunch
of other GCP resources that we'll be using. And I'll get into
it in a little bit. On the Terraform side,
this is something that our product teams
have been really good at. Every time we
release a product, we ensure that there's deployment
manager support, as well as Terraform support. In this case, it's
fairly straightforward to declaratively define
our Redis instance, or Memorystore instance. The two things that I'll call
out here are, as you can see, there's an authorized
network specified there. That network is the
same as the GKE cluster that I'm going to show
you in a little bit. The other thing
that I'll call out is the fact that there's
a reserved IP range. This keeps things clean. We're documenting
the IP range we're going to be using for
our Redis instance, as well as our pods
in the GKE cluster. And here's our GKE cluster. The only thing
I'll call out here is the fact that I'm specifying
an IP allocation policy. And the reason for doing that
is because we're using GKE, we have to enable IP aliasing
in order for our pods to be able to talk
to Memorystore. This is an extra
step that you have to do, as opposed to if you
were deploying a simple VM, and having that talk to
your Memorystore instance, not much else to do there. But in this case, you have
to enable IP aliasing. The steps for this might look a little different if you're provisioning these resources using, let's say, gcloud, or Deployment Manager. And I also want to call
out the fact that-- I'm going to go back
a couple of slides-- all the resources
that you see here can be deployed with Google
Deployment Manager as well. I'm choosing to
use Terraform here. With that said, let's
get into the demo. Thankfully my machine
has not locked. Awesome. So actually, the first
step that I'm going to do is I'm going to
show you that I can have a pod on my
GKE cluster talk to the Memorystore instance. As Gopal showed you earlier-- I'm just going to refresh
your memory on this-- we have three instances. These are all for the
purposes of our demo. And they're all deployed
into different VPCs. And even if you see some
of those instances having the same IP address,
don't get confused. They're all in different VPCs. That's why they have
the same IP address. So we know that we want to
be able to ping 10.0.0.4 from our pod. So what I'm going to do is connect to the jump pod, where I have redis-cli deployed. It happens to be called mysqlclient, because that's also where
the MySQL client is. And I connect to it. And let's spin up Redis CLI. That's the IP address,
as I mentioned. And I can list the
characteristics of that instance. So all good. And this is what you would do
in your application as well. And once there, let
me flush the cache so that I can get the
demo to work as expected. Cool. So now let me go to the
demo and kind of show you. It's pretty simple. It's an application that,
once again, as I said, it allows you to look up the
employees in your organization. I am going to start
typing in Gopal. And there's one step
that I have to do. We have something
called BeyondCorp that interferes with my demo. I'm going to turn that off. And now let me type in Gopal. And what this demo has is,
in addition to all the names that it brings back,
it also shows you how long that request took. Hopefully, all of
you can see this. In this case it took
539 milliseconds. Now the hope is
it's been cached. I type in G-O-P again. And let me refresh this
just for good measure. It took much less time. And I know that's
not very convincing. What we're going to do
is, as Gopal mentioned, we're going to use
OpenCensus, which I'll show you the code for. We're going to use
OpenCensus to see what happened during that call
path as that call progressed. For this, I will switch to
my Google Cloud console. Let me pull up the console here. So just so you guys can go
to this console on your own, this is under the
Stackdriver Trace menu. And I'm going to look
at the trace list. And let me make this
a little bit smaller, because it's hard to see. This was my initial call. I typed in G-O-P for Gopal. And you can see that
this call started at the getEmployees method. And it actually shows you
how long that method took-- 538 milliseconds or so. But the cool thing
here is you can tell that this call actually
looked up, or tried to look up that keyword in Redis. We're using the
Jedis library here, and I'll talk a
little bit about how that library's instrumented. But it didn't find that keyword. So then it had to go to MySQL. And it also shows you how
long that MySQL call took. And you can tell the bulk of
the time was in the MySQL call. Execute query, et cetera. It found the entry. And then it put the results of
that request into our cache. So it's something straightforward that most people do in a caching-type environment; a minimal sketch of this cache-aside pattern follows below. So you can kind of see what happened. This is the happy path. This is what you expect to happen.
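Here is a minimal sketch of that cache-aside flow in Java, with Jedis for Redis and plain JDBC for MySQL. The endpoints, credentials, schema, and the one-hour TTL are all made-up stand-ins for the demo's actual configuration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import redis.clients.jedis.Jedis;

public class EmployeeLookupSketch {
    private static final String REDIS_HOST = "10.0.0.4";  // hypothetical Memorystore IP
    private static final String JDBC_URL =                // hypothetical Cloud SQL endpoint
        "jdbc:mysql://10.0.0.5:3306/hr?user=app&password=secret";

    public String getEmployees(String keyword) throws Exception {
        String cacheKey = "employees:" + keyword;
        try (Jedis jedis = new Jedis(REDIS_HOST, 6379)) {
            // 1. Try the cache first.
            String cached = jedis.get(cacheKey);
            if (cached != null) {
                return cached;  // cache hit: no database round trip
            }
            // 2. Cache miss: fall back to MySQL.
            StringBuilder names = new StringBuilder();
            try (Connection conn = DriverManager.getConnection(JDBC_URL);
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT name FROM employees WHERE name LIKE ?")) {
                ps.setString(1, keyword + "%");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        names.append(rs.getString("name")).append('\n');
                    }
                }
            }
            // 3. Populate the cache with a TTL so entries eventually expire.
            jedis.setex(cacheKey, 3600, names.toString());
            return names.toString();
        }
    }
}
```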
So now let's go back to our trace list, and just confirm that once it's been cached, I can type in the same keyword. In this case, hey, it
pulled it out of the cache. And it took a lot less time. And you can see that it
didn't have to go to MySQL. So that's the happy path. Oh, before I go back
and introduce an error, let me also show you, in addition to tracing, the cool thing that OpenCensus does: it also lets you surface metrics. A small sketch of recording such a metric follows.
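This sketch records an application-level metric with the OpenCensus stats API. The measure and view names are hypothetical, and actually exporting the view to Stackdriver additionally requires registering the Stackdriver stats exporter.

```java
import java.util.Collections;

import io.opencensus.stats.Aggregation;
import io.opencensus.stats.Measure.MeasureLong;
import io.opencensus.stats.Stats;
import io.opencensus.stats.StatsRecorder;
import io.opencensus.stats.View;

public class RedisMetricsSketch {
    // Hypothetical application-level measure: count of failed Redis calls.
    private static final MeasureLong REDIS_ERRORS =
        MeasureLong.create("my_app/redis_errors", "Count of Redis call failures", "1");

    private static final StatsRecorder RECORDER = Stats.getStatsRecorder();

    public static void registerViews() {
        // A view aggregates raw measurements; here we simply count them.
        View errorCount = View.create(
            View.Name.create("my_app/redis_error_count"),
            "Total Redis call failures",
            REDIS_ERRORS,
            Aggregation.Count.create(),
            Collections.emptyList());
        Stats.getViewManager().registerView(errorCount);
    }

    public static void recordError() {
        RECORDER.newMeasureMap().put(REDIS_ERRORS, 1).record();
    }
}
```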
I had a whole bunch of metrics shown here. But unfortunately a whole bunch of time has elapsed. So this is what happened
for my most recent query. You can see things
like heat map. So you get a sense of
what your queries are doing in an aggregate. You can also see a linear view
of what your queries are doing. In addition to that,
you can surface errors from whatever Redis
library you're using. In this case I've
instrumented Jedis. And I'll show you how to
surface those errors as well. Before I do that, let me
show you the code real quick. Can everybody see this OK? Is it too small? Hopefully, it's big enough. This is the getEmployees
call that I mentioned. It's my Spring Boot controller. And I'm importing the
appropriate OpenCensus libraries. Gopal showed you where
to get that library. And I'm using something
called the tracer class. And I set up a span. And in that span I
can put in annotations that show me which
particular method was called, any data that was passed into
that method-- in this case, the keyword that was supplied. And I can trace methods that are
internal to that call as well. In this case, I've also instrumented getFromCache, and I've added an annotation that says cache miss, in case there is a cache miss. Let me show you that real quick. Go back to this. And you can see that annotation. So it shows up in my Stackdriver console. So it's pretty straightforward to annotate your code with OpenCensus; a condensed sketch of the pattern is below.
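Here is a condensed sketch of that span-and-annotation pattern with the OpenCensus tracing API. The method bodies are stubs, not the demo's actual implementation.

```java
import io.opencensus.common.Scope;
import io.opencensus.trace.Tracer;
import io.opencensus.trace.Tracing;

public class EmployeeControllerSketch {
    private static final Tracer tracer = Tracing.getTracer();

    public String getEmployees(String keyword) {
        // Each scoped span shows up as one row in the Stackdriver trace view.
        try (Scope scope = tracer.spanBuilder("getEmployees").startScopedSpan()) {
            tracer.getCurrentSpan().addAnnotation("keyword: " + keyword);
            String cached = getFromCache(keyword);
            return cached != null ? cached : queryDatabase(keyword);
        }
    }

    private String getFromCache(String keyword) {
        // Child spans nest under the caller's span in the trace.
        try (Scope scope = tracer.spanBuilder("getFromCache").startScopedSpan()) {
            String value = null;  // ... Redis lookup would go here ...
            if (value == null) {
                tracer.getCurrentSpan().addAnnotation("cache miss");
            }
            return value;
        }
    }

    private String queryDatabase(String keyword) {
        return "...";  // placeholder for the MySQL lookup
    }
}
```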
The other thing I'll call out, going back to this real quick, is that you're seeing all this stuff
from the Jedis library, like jedis.protocol.bulk,
process bulk reply, redis.clients.jedis.close. All that stuff. I didn't instrument
that in my code. That came from the
instrumentation that is in the library
that I'm using. So out of the box,
obviously the Jedis library is not going to give you
this instrumentation. And I'll show you where to
get that instrumented library. But for your own needs, if you're using a Redis library for whatever language you work in, look for it in this place. If it's not there, you can
obviously fairly easily instrument it and contribute
it back to the community. Let me see how
much time we have. We have a little bit more
time to show the error condition that I described. So how do you surface
errors that you may see. So for that, what
I'm going to do is I'm going to trigger
this thing that essentially has my application talk to
a non-existent Redis host. So now I'm going
to type in Gopal. Even though Gopal
was cached, you can see that it's still
taking a lot more time. Let me switch back
to my trace window here, and see what happened. I'm going to quickly refresh
this Stackdriver console. So this is my trace
from just now. And click on the trace
and see what happened. You can see that it had
attempted to connect to Redis. And it failed. And it went to MySQL. So something
happened that caused it to close the connection. And I can also see in
my Stackdriver console that errors are
starting to show up. And I can trigger alerts
based on this error as well. What I've done
here is I've set up an alert that says if there
is more than one error for this particular
metric that I'm surfacing, send me an email. You can also connect to PagerDuty, or a whole slew of other integration points. So hopefully that
gives you a sense of what you can do
with OpenCensus. There's a lot of power here. Let's actually switch
back to the slides, since we don't
have a lot of time. I'm going to wrap things up. As I mentioned, if
you go to this link, you can find a whole
bunch of libraries that are already instrumented. Jedis is one of them. There are a couple of
Go libraries in there. And please do add OpenCensus integrations or instrumentations of your own, and contribute them back. Yeah, there's a whole lot
more to come with Memorystore, as Gopal mentioned. I'll hand it over
to him real quick. [APPLAUSE] GOPAL ASHOK: So just
to quickly wrap up, so what we discussed
here and talked about is the overview of
Cloud Memorystore. There are a lot of questions
in terms of feature sets that you want to see. That's something, like I said,
we are actively working on. This is the first
version of the product. If you really, really
care about something, we do have something called
the Issue Tracker, where you can actually log your request. We actively take a look at all the issues and requests that come in in terms of feature requests. So I really, really
encourage you to please do that if there are
things that you feel are super important for
you to use the service. But just in closing, thank
you all for being here. Thank you for coming
to Google Cloud Next. We really appreciate
you taking the time. But if you have
any more questions, we have the engineering
team here, also. So we'll hang around, and if
you have any more questions, we are happy to take those. Thank you very much. [MUSIC PLAYING]