Scaling Redis To 1M Ops/Sec

Captions
Hi, my name is Jane Peck, I'm the Solutions Architect manager for Redis Labs. We are sponsoring this event, and they have asked me to talk about a topic that is relevant to the type of discussions that we have with our customers. Just to give you some background: as Solutions Architects, our team talks to numerous different customers about different types of use cases, where they want to understand how you use Redis and what they want to do with Redis, and a lot of times the conversation comes to, well, I could do this with a single open source Redis database, but how do I scale it, and what does that concept even mean, to scale Redis? So I thought, okay, let's talk about scaling Redis to a million ops per second. It may sound like a lot, but one of my co-workers totally trumped me: he did a presentation yesterday about how to scale Redis to 50 million ops per second at less than one millisecond latency, so if you get a chance, you can visit his training video from yesterday about how he goes about scaling that.

Okay, so that's the topic for today. I have 20 minutes. I'll go through the agenda, but just to level set: how many of you have used Redis? So why are you in this class? Okay, how many of you have used clustered Redis? A handful of you. How many of you have actually configured a clustered Redis versus using it via a hosted provider, so who has set up clustered Redis manually before? Oh, so maybe you should be up here teaching this class. Joking aside, what I'm going to cover is: what is Redis, what is a shard, how do you describe the Redis database itself, and then taking it up a level in terms of the concepts around clustering and partitioning. If you take a look at a lot of the presentations here, they talk about the Redis open source cluster API, but there are many different ways in which you could cluster Redis, and I'll discuss some of those strategies. We'll also get into some patterns and anti-patterns for scaling: some things that work very well on a single Redis instance or server process do not work well in a Redis cluster. Then we'll get into tools for benchmarking, memtier_benchmark and redis-benchmark. There were some pretty cool announcements this morning with the keynote, and I'll try to do a demo of my one million ops per second on a three-node cluster that I have deployed on EC2, and then some resources available for you to review on just getting to know Redis. Hopefully I can cover all of that in 20 minutes.

All right, so let's level set the conversation about Redis, because I hear this time and time again: there are a lot of misconceptions about what is a Redis instance versus what is a Redis server process versus what is a shard. In general, the three are equivalent. When you talk about a Redis instance, it's that one server process that you could run in your Docker container, or on your MacBook, or on your huge EC2 instance or your GCP instance. That is a single Redis server process, and you could also call that a shard. Now, the concept of a shard doesn't really apply for a single-process Redis database, but it is considered a shard. So when we have these conversations about Redis and sharding, what we're really discussing is taking that concept of a Redis database, with the entire key space that it has (I think it's 2 to the 32, around 4 billion keys that you could house in a single Redis server process), and asking how you scale that to handle even more keys, leverage more CPU, and probably get a lot better latency and a lot better throughput.

Okay, so in general, Redis is 10 years old now. You have a bunch of clients, be it running in Kubernetes or your JVMs or wherever; they all point to a hostname and a port, and you can talk to it. It has a protocol, and you're using your client libraries, your Python client library or C or so on; those client libraries, you pass them the hostname and the port, and they talk to a backend Redis server process. And so that's a shard. So in this example I have one Redis server process running on a node, your application talks to Redis, and it's represented by a hostname and a port.
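As an illustration (not part of the talk), here is a minimal redis-py sketch of talking to that single server process, assuming a local instance listening on the default port 6379:

    import redis  # pip install redis

    # A single Redis server process is just a hostname and a port.
    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    r.set("greeting", "hello")   # simple string key
    print(r.get("greeting"))     # -> "hello"
    print(r.dbsize())            # number of keys held by this one shard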
Now, what can you do in Redis? Does anyone know how many data types or data structures you can currently deploy in Redis? Okay, quick quiz: how many of you have used a HyperLogLog in Redis? Just a couple, actually, I think I see three. HyperLogLog, to me, is one of the coolest data structures in Redis: twelve kilobytes of memory, but incredibly fast and efficient at giving you the cardinality of the different elements that you're counting. So that's one data structure. How many of you have used any of the geo indexing in Redis? Right, a couple more people. So geospatial indexing allows you to just pass it a latitude and longitude and something into the index, and you can start asking radial questions: I'm here, find me who's closest to me. If I recall correctly, Lyft uses that algorithm, where as the drivers drive around they update their locations, and I'm in San Jose at the airport and I'm like, hey, tell me the closest driver around me that can pick me up. It's a classic use case, right? Or tell me where the nearest pharmacy is.

So there are a lot of different data structures that you can use even within that one single Redis server process, and these are the gamut of different things that you could use within Redis. Very popular are things like strings, and you can use counters, but also sets, sorted sets, hashes, lists, and the new one that's come out is Streams. So Redis, that one single process, can do a lot, and you can also load modules, like we discussed this morning: how do you make it into a search database, how do you make it into a graph database, or make it into an AI or ML module, the ability to process these things.

So you got familiar with Redis, you love using it, and you realize: I need to scale. And sometimes that decision may require you to completely re-architect your application, because depending on how you've written your application (as a startup company you could go with a small little Redis server process and it works great, but then you need to scale), if you don't understand how you scale Redis, it can have a dramatic impact on the way your application has to get redeveloped.

So in terms of what you can do with Redis, let's play around with the typical web architecture space. We talked about the different data structures: you could put things in Redis and use it as a string for all of your caching needs, session caching, image caching, profile caching, data caching; you can cache almost anything in Redis. You could then go into user sessions with hashes, with individual keys to identify all the different subfields. You could use the lists and the sets and the pub/sub elements for a lot of your job queues, and actually Streams is a great play in this space. You could use a search module to make it into a search engine, super fast. You could extend Redis beyond RAM with some of the flash extensions from Redis Enterprise, and you could also do Streams and pub/sub, JSON storage, numerous different things that you could do with the multi-model property of Redis.
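As an illustration (again, not code from the talk), a short redis-py sketch of two of the data structures mentioned above, HyperLogLog and geo indexing; the key names and coordinates are made up:

    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    # HyperLogLog: ~12 KB of memory, approximate distinct-count of elements.
    r.pfadd("visitors", "user:1", "user:2", "user:3", "user:1")
    print(r.pfcount("visitors"))   # -> 3 (approximate cardinality)

    # Geo index: record driver positions, then ask "who is closest to me?"
    r.execute_command("GEOADD", "drivers", -121.929, 37.363, "driver:42")
    r.execute_command("GEOADD", "drivers", -121.888, 37.338, "driver:7")
    closest = r.execute_command(
        "GEORADIUS", "drivers", -121.929, 37.362, 5, "km", "COUNT", 1, "ASC"
    )
    print(closest)                 # -> ["driver:42"]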
So now you're ready to scale, right? How do you scale Redis? It's a pretty broad question, and you'll notice there's a consistent pattern across a lot of the presentations here about the different ways in which people decided to approach that scaling, and a lot of it is still dependent on how you're using Redis. If you're using it as a key-value store, very basic key and string with no transactional requirements at all, then it's actually pretty easy to scale Redis by sharding. If your application uses a lot of MULTI, EXEC, WATCH, or you're accessing certain keys and they need to be accessed at the same time, those types of requirements mean you need to rethink how your application is designed.

Having said that, here are all my clients, those are the blue ones, and they need to access my Redis server processes. In this example I've sharded into four different shards, so four server processes, and just for simplicity's sake I've divided them across four different servers. So how do you choose which one of these servers you should point your client to? There are different options that you can take.

One example is to let the client logic decide which one of these nodes, which one of these shards, you talk to. In your client application you write the Python or Java code where you define a lookup table that says: for keys in the range A to G I go to shard 1 or node 1, for keys from H to O I go to node 2, and so on. So your client has to be smart enough to know exactly which key belongs in which shard, and you access it. The great thing is there's no network overhead: you just do a quick lookup and you know exactly where to find your key. The trouble comes in when you start having to move your keys around, so you have to re-shard, and once you start doing that, something has to update the lookup table and it becomes cumbersome; and once you start getting into high availability, the master/slave concept, and knowing which ones to point to, it gets a little bit tricky. Having said that, I've seen this as an excellent strategy for people who want to scale pub/sub. How many of you have used pub/sub before? A lot of you. So a single Redis server process, once you start throwing tons of requests and updates and so on at it, gets constrained, and to be honest, even the Redis clustering architecture, I can't say it's great here; I could almost say it's actually potentially worse at handling pub/sub. So by leveraging a client-based sharding algorithm to partition your different pub/sub channels, you can really scale out pub/sub with this mechanism. So it works great on this front.

The alternative way is still client-side sharding, so the client still has to know which shard it needs to talk to, but the logic behind which shard it goes to is spread out and managed by hashing the keys. The most popular one that a lot of people talk about is the concept of the Redis Cluster API, and the concept is: you take a key, you apply a CRC16 hash, basically creating roughly 16,000 buckets (16,384 hash slots), and assign the buckets, going through a mod of whatever number, in this case mod 4, to determine which one of these nodes or shards owns which buckets, which slots. So that's what's behind the hash slot routing.
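To make the hash-slot idea concrete, here is a small sketch (not from the talk) of how a Redis Cluster key slot is computed: CRC16 of the key, modulo 16,384 slots, with each node owning a range of slots:

    # Redis Cluster's key -> hash slot mapping (CRC16/XMODEM mod 16384).
    # A real client would first look for a {hash tag} inside the key.
    def crc16(data: bytes) -> int:
        crc = 0
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
                crc &= 0xFFFF
        return crc

    def key_slot(key: str) -> int:
        return crc16(key.encode()) % 16384

    print(key_slot("a"))   # -> 15495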
You take a key; so in this example the client says GET a, that's the command it executes, and the first thing it needs to do is figure out which hash slot that key "a" belongs to. For that hash slot, if you run a CLUSTER KEYSLOT command, you'll find that it maps to slot number 15,495. Then you figure out, okay, if I do a mod 4, which node it belongs to, and then you connect to node 4 to get key "a". The great thing is, even though it is client-side sharding, you don't have to write all of this logic: a lot of the client libraries already know how to support the Redis Cluster API, so when you're working with a clustered database, you enable it for the Cluster API and it now knows, oh, you're asking for key "a", I'll route you to node 4; and if you're asking for key "a" and all of a sudden the key has moved, it will actually tell you that it's moved to node 2, and the client library will go to node 2 to pick up that data. The nice part about this design, by the way, is that you're still cutting out the middleman: even though you calculate, for this key, for this slot, which node you have to go to, that gets processed very quickly in your application layer, and then it knows directly which node to go to for your key. So if you want speed, this is still the fastest way to get access to the Redis data: clustered API, go directly to the server, to the shard that houses the data.
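For comparison, here is a sketch (not from the talk) of letting a cluster-aware client do that routing for you, assuming redis-py 4.1 or newer and a cluster node reachable on port 7000:

    from redis.cluster import RedisCluster

    # The client fetches the slot map once, routes each key to the shard
    # that owns its slot, and follows MOVED redirects after a reshard.
    rc = RedisCluster(host="localhost", port=7000, decode_responses=True)

    rc.set("a", "1")       # lands on whichever node owns slot 15495
    print(rc.get("a"))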
The third option is to go through proxy-based sharding, and what that does is: you have the clients and they talk to a proxy, and from the client's view the Redis database still looks like a single Redis server process, it's a host, it's a port, and it grabs the keys that it wants. The proxy handles the expertise of knowing which nodes have which slots, so the client just says GET a, the proxy does the hashing, knows which one of these shards it needs to go to, gets the data, and passes it back to the client. Does this look familiar? This is some of the new initiatives that are coming out with Redis.

Some of the advantages the proxy layer provides: what if these nodes move, what if you have to do maintenance, what if you have to change the IP address, what if you have to add a node or remove a node, all the stuff that you have to do from a maintenance standpoint? It's great from an application view that, hey, it's just a database, but underneath, somebody on the operational side has to consider how to patch it for security, how to patch it for all the OS updates, what to do when a VM dies and you have to replace something else. So the advantage of the proxy is that it gives you a little bit of separation from the client to the underlying infrastructure or the database, so that you have the maneuverability for operational efficiency. The one thing, though, is that you can immediately see the proxy could introduce additional latency. Depending on the type of deployment that you have, if you have the proxy as a whole separate layer, a whole separate server, then you're introducing client to proxy server and then proxy server to the underlying shard, so there are a couple of additional network hops that get added. You could potentially house the proxies on the same nodes as where the shards live, and that could minimize some of the overhead, but ultimately if you have these shards spread across multiple nodes, there are still hops that need to happen for the proxy to talk to all the shards and then return the result set back to the client. Make sense?

All right, so in terms of the Redis Enterprise deployment, in general we use a proxy because it allows us the operational efficiency to balance things, but we also get the advantage of doing something called multi-tenancy on the distributed database deployment. With multiple nodes in a three-node configuration, I could have a logical Redis database, database one, that has two master shards and two slave shards, so it allows for failover; I could create a separate database that's just a single-master database; and I could create a third database which is just a single Redis server process with one slave, one replica. So you can play around with the way your database or your application shards are placed, and as long as you specify a port and change the port for the different shards, you'll be able to configure it in this fashion. But you have to manage the deployment carefully if you don't have a proxy layer that accommodates that on top.

All right, so normally I pause and say: any questions? Nope. Okay, so here are a couple of things that we tend to see in terms of scaling, things that help you scale and things that really don't.

Do's: use pipelines. Pipelines allow you to send a bunch of commands to Redis; Redis processes them and then returns them back to you, but it does not lock the Redis server process for the entire time that you're sending a thousand GETs to Redis, and you're avoiding some of the network calls that are involved if you have to do a single GET, SET, GET, SET back and forth to Redis. So pipelining is a great way to optimize the calls that you make into Redis, as long as you're not looking for individual transactional updates.

Use UNLINK versus DEL. How many of you are familiar with UNLINK? Okay, great. For me it's been very effective when my customers ask me, hey Jane, our deletes are taking forever, or part of our process is that we flush the database at 12 o'clock midnight and that FLUSHALL takes 5, 6, 10 minutes. If you do something like a FLUSHALL ASYNC, it basically says, okay, we're going to disconnect the key names from the key values, so from the application standpoint my Redis is flushed, I don't have any keys, but in the background, if you actually monitor the Redis memory usage, you'll see it slowly taper off. So that helps too, in terms of management, and the same thing applies to DEL versus UNLINK: UNLINK breaks the link between the key name and the key value, so it's a much faster operation, especially if you have large keys. And by the way, having a key that's 250 gigs in Redis is not a good idea. I have seen it; it is not a good idea. And if you try to delete a key that's 250 gigs in Redis, it's going to take a very long time.
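As a hedged illustration of those do's (not code from the talk), using a reasonably recent redis-py against a single instance:

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # Pipeline: batch many commands into one round trip instead of one
    # network call per command (transaction=False means no MULTI/EXEC).
    pipe = r.pipeline(transaction=False)
    for i in range(1000):
        pipe.set(f"user:{i}", i)
    pipe.execute()

    # UNLINK instead of DEL: the key name is detached right away and the
    # value's memory is reclaimed in the background.
    r.unlink("user:0", "user:1")

    # Same idea database-wide: FLUSHALL ASYNC returns quickly and frees
    # memory lazily (careful: this empties the whole database).
    r.flushall(asynchronous=True)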
The other thing is, Redis works well when people keep their keys small. The definition of small is kind of loosey-goosey: some places say small is less than 10 kilobytes, others say small is less than 100 kilobytes, some customers have keys that are 1 megabyte. It's whatever works for you, for the type of throughput that you want, but don't expect super bleeding-fast performance if you have keys that are 30 megs and you want super high throughput through Redis at the latency you want.

And then the other one is: understand the time complexity of the commands you're issuing. How many of you have gone to the redis.io commands page? Thank you. And how many of you noticed the time complexity section when you look at the commands? Great. Every Redis command has a time complexity section that will tell you whether it is an expensive call or not. The best one is O(1), right? Great, we know exactly where the key is. But once you start getting into things like the sorted sets, the ZRANGE of "give me all the hundred million entries that you have in your sorted set", that's a very, very expensive call; you're not going to get great benchmark results out of that one. So just be careful, look at the time complexity of the command you're issuing: even with a huge sorted set, if you just pop the topmost-scoring element it's fast, whereas if you want the topmost hundred thousand it could be slow.

If you need to have keys that belong together, make sure you use the curly braces, your custom hash tag, so that they all land on the same shard. This doesn't really impact scaling per se, but it will definitely impact how your application behaves: you're just going to get errors if you're trying to do a MULTI/EXEC on a bunch of keys that are supposed to be together but are actually spread across multiple shards, because multi-key commands in Redis still apply at the shard level, and for Redis Cluster at the slot level, so if they're not in the same slot then you're going to get errors.

And the other thing is, monitor the slow log as you're benchmarking Redis, especially if you're benchmarking your own applications. Take a look at the slow log to see if things come up; by default Redis will write commands that take longer than 10 milliseconds into the slow log. In the grand scheme of things, 10 milliseconds is pretty fast, but at Redis scale it's slow, so take a look at the slow log and see if there are things that automatically get caught. I've worked with many customers where we look at the slow log as part of their deployment or QA testing, and something that pops up a lot is KEYS *. So, switching to the "be careful" section: be careful of running KEYS * in production or in your application in general; it's a very, very expensive call if you have lots and lots of keys. It's very easy to do in development, but really you should be using SCAN. Be careful, as I think I mentioned already, of the sorted set queries. Some of the persistence choices that you have, RDB versus AOF, especially if you're persisting from the master shard, could be expensive. And then caching keys without TTLs: you have two choices on how you manage memory in Redis. You can define a TTL for a key, where you say, hey, for this key, let's call the key name "name", I'm going to set the TTL to 30 seconds, and after 30 seconds it's going to expire, great, and then there are some background processes which are going to slowly clean up the memory for the keys that have expired. Well, what if you don't set the TTL for the key, and you've defined your Redis size to be 10 gigs, and it grows and grows and grows, and everything's great until, uh-oh, you hit the out-of-memory error, or you set an eviction policy and all of a sudden it starts to evict keys, and that eviction process is more expensive than doing a TTL. So these are some of the things that you should consider. There's so much more on how you tune and benchmark Redis, but these are some of the key things that we end up dealing with when we work with our customers.
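A few of those points as a minimal sketch (again, not from the talk): hash tags to keep related keys in one slot, SCAN instead of KEYS *, and a TTL set at write time:

    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    # Curly-brace hash tag: only "user:1000" is hashed, so both keys land
    # in the same slot and can be used together in MULTI/EXEC on a cluster.
    r.set("{user:1000}:profile", "...")
    r.set("{user:1000}:cart", "...")

    # SCAN walks the keyspace in small batches; KEYS * blocks the server
    # while it scans everything and should stay out of production code.
    for key in r.scan_iter(match="{user:1000}:*", count=100):
        print(key)

    # Give cache entries a TTL up front instead of relying on eviction.
    r.set("session:abc123", "payload", ex=30)   # expires after 30 seconds
    print(r.ttl("session:abc123"))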
So, top benchmarking tools. memtier_benchmark was created by Redis Labs; you can download it, and if you want to take a picture, you can see where to actually download it. What I like about it is that it does GETs and SETs, it's super fast, it's a quick and dirty way of assessing the health of Redis, and it's something we've been using a lot; I'll be doing a demo using this one. The other one, part of the Redis distribution, is called redis-benchmark, and I actually like this one when I start working with customers to better profile their environments, and the reason is this: it has the traditional stuff, like you specify the host and port and password and things like that, and it does do testing for PING, SET, GET, INCR, lists, sets, hashes, MGET and MSET and so on, but the part that I really like about redis-benchmark is that you can define the exact type of calls that it's going to run as part of your benchmark. That was hugely helpful for us when we worked with our customers, because one customer had an MGET with a thousand elements in it, and they wanted to benchmark, if they ran that X number of times, what type of throughput and performance they should see from Redis. And some of the cooler stuff that's coming out, hopefully with Redis 6, is the fact that it's going to be able to support the open source cluster. By the way, memtier_benchmark and redis-benchmark, as they stand today, can only benchmark against a single endpoint, a single Redis server process; they don't let you benchmark the cluster. So the new version that's coming out is going to be really nice for actually benchmarking the Redis cluster.
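If you want to profile one specific call pattern, like that customer's MGET of a thousand elements, a rough hand-rolled sketch is shown below; it is not memtier_benchmark or redis-benchmark, and the key count and iteration count are arbitrary:

    import time
    import redis

    r = redis.Redis(host="localhost", port=6379)

    keys = [f"bench:{i}" for i in range(1000)]
    pipe = r.pipeline(transaction=False)
    for k in keys:
        pipe.set(k, "x")
    pipe.execute()                    # seed the keys once

    iterations = 1000
    start = time.perf_counter()
    for _ in range(iterations):
        r.mget(keys)                  # the exact call you care about
    elapsed = time.perf_counter() - start

    print(f"{iterations / elapsed:.0f} MGET calls/sec, "
          f"{1000 * elapsed / iterations:.2f} ms per call")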
All right, so let me do a quick demo. I'm over my time; are you okay with a couple more minutes? So let me switch over to my browser. I have three instances running, you see the instances here, and in this example I have an open source Redis database running. Oh, can I switch over? You're wrapping up? Can I turn my computer? Okay, thank you, this won't take long.

All right, so let me add a Redis database, and I'll just call it LGB. We'll create a 10 gig database, I'm not going to enable a slave, I'll make the port 10,000, and let me activate and stay on this page. Okay, so I created a Redis database, here's the endpoint, and I'm going to launch load against it. Launched, great. So in this example, and this is what's pretty impressive about Redis, this is a single Redis server process, and I'm literally pushing six hundred thousand operations per second, but its latency is around three milliseconds, and you can see that we're literally pegging the CPU at 100%; I'm running memtier_benchmark against it in this example. So you could say, well, hypothetically, if I did shard this database, if I wanted to cluster it, what type of performance would I get? I'm going to change this so that I cluster it and divide that one shard up into four different shards, and update. So in the background we're taking a copy of the original shard, making four different copies of it, figuring out what the hash slots are going to be, partitioning it, and then starting to redirect the traffic from the proxy. So now: 1.4 million ops a second at less than one millisecond latency. So this is the advantage of clustering or sharding: you get much better throughput and much lower latency. How you go about doing your clustering and sharding is up to you.

So from a resource standpoint, again, my name is Jane, and you can go to the Redis Labs home page, slash get started; that's going to give you lots of resources on downloadable content, the Redis reference, redis.io, and the last thing I really want to push is that we have new classes starting on April 16th: introductory classes on Redis data structures, Redis for Java developers, RediSearch, and Redis Streams. Okay, thank you for coming, really appreciate your time. [Applause]
Info
Channel: Redis
Views: 5,890
Keywords: redis, redis labs, redislabs, redisconf, redisconf19, redisenterprise, benchmarking, nosql, ops/sec, scaling, clustering, tech, cloud, data, databases, shard, redis cluster, devops, architecture
Id: 55TFuBMFWns
Length: 29min 52sec (1792 seconds)
Published: Fri May 03 2019