Using Hashtags in a Redis Cluster

Captions

Hey, how's it going? Justin here with Redis. I just want to take a little bit of your time to go over hash tags. I talked about this briefly in a talk on the Redis monthly live back in September with Guy Royse; if you want to see that original talk, go ahead and check it out on YouTube on the Redis channel. But I wanted to go over it a little more formally, dive in a little deeper, and have you watch me code using hash tags to keep data within the same shard in a cluster. Okay, so let's start.

What I want to start off with is exploring the differences between a clustered node and a non-clustered node of Redis. A non-clustered node is just a single instance, more often than not on port 6379, sitting in memory on, say, your computer or on Redis Cloud. When entering data into a regular single instance of Redis, we simply set our keys, and they are set to a value within our memory space. With a clustered instance of Redis, when we set a key, Redis needs to determine the hash slot.

So what is a hash slot? A hash slot is a namespace, or location, created by Redis that is numbered from 0 to 16,383. This allows us to store and retrieve your data at specific numerical locations, no matter how large or small the number of shards is.

Let me look at this example. Here we have a clustered and a non-clustered node. The non-clustered node is just another term for your local Redis instance, either connected to Redis Cloud or running locally on your computer, usually on port 6379. The data that you write goes directly into your memory, and it's all together. With a clustered node, data is written across multiple shards, if you have more than one shard, so data isn't always going to be in the same location as other data. On your local computer, if you write two different sets or two different lists, they
will be in the same location in your computer's memory. With a clustered setup, they might not be in the same shard. If you have two different keys in two different shards and you want to work with them using a multi-key command like RPOPLPUSH, ZINTER, or SDIFF, you're going to run into some problems.

Okay, so let's go ahead and check this out. When I'm clustering, I'm using hash slots. What is a hash slot? A hash slot is simply a numerically determined location for storing your information. Instead of hashing with the shard count to determine the location of your data, we use a hash slot. Here we're using the CRC16 hashing algorithm on the key and then taking the result modulo 16,384. That's how we determine the hash slot.

Let's go to an example. Let's say I have slot 11,940. In a cluster of just one single shard, which I'm labeling here as shard 0, it's going to be approximately in that location in the shard. The beauty of using hash slots in our cluster is that hash slots can move between shards, but a key's hash slot number does not change.

Let's check out this new view. Now I have two shards. Previously, shard 0 held all the hash slots, from 0 to 16,383. In this next setup, where I have two shards, shard 0 now has 0 to 8,190 and shard 1 has 8,191 to 16,383. We still know the location of my little example slot; it's just now in shard 1. Let's expand this and add another shard. Now we have one going from 0 to 5,460, the second one going up to 10,923, and the third one taking it all the way up to 16,383.
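The slot calculation described above can be sketched in Python. This is a minimal illustration, assuming the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0) described in the Redis Cluster specification. The function names `crc16_xmodem` and `key_slot` are hypothetical, and this sketch hashes the whole key (hash tags are not handled here).

```python
HASH_SLOTS = 16384  # total number of hash slots in a Redis Cluster


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash the entire key and reduce it modulo the number of hash slots."""
    return crc16_xmodem(key.encode("utf-8")) % HASH_SLOTS


if __name__ == "__main__":
    # Reference check value for this CRC variant.
    print(hex(crc16_xmodem(b"123456789")))  # 0x31c3
    # The slot depends only on the key, never on how many shards exist.
    print(key_slot("user:512:following"))
```

Because the shard count never enters the calculation, adding or removing shards only changes which shard owns a slot, not the slot a key maps to.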
If you see my little blue example there, it's still there; it's just being moved based on the hash slot count and the hash slot location. Lastly, this reflects the "creating a cluster in Redis" example that I released not too long ago. If you followed along there, you have this exact same setup: we have four shards, and the slots are evenly distributed: 0 to 4,095, 4,096 to 8,191, 8,190 (oh, that should be 8,192) to 12,287, and then 12,288 to 16,383. So now they're all evenly distributed among four shards, and again you can see my little blue example is exactly where it should be. Its position is based on the hash slot number, not the shard number, because we don't care about the number of shards. Those can increase and decrease based on traffic, time of day, our budget, what have you. What we really care about is the hash slot location, which is computed with that CRC16 hash algorithm, modulo the total number of hash slots, which again is 16,384.

Okay, so I'm going to go to my actual cluster setup and call CLUSTER SLOTS. This gives me a breakdown of all the hash slots that exist within my cluster. I see here that hash slots 0 to 4,095 are assigned to port 7000, which is the primary, and this is the node ID of my primary, which ends in 68f7a. The replica that is copying all the information from my primary is at port 7004, and here are the last five characters of its ID. So this gives me a breakdown of all of my shards. This information is really helpful, but I can do one better. Let's take a look at what that command returns if we add some visualization to it. So again,
node 7000, which is a primary, has the node ID ending in 68f7a, a replica ID of 2c476, and slots 0 through 4,095. And here's the replica, with its replica ID and the primary ID, and it's replicating all the slots 0 through 4,095. Cool.

I really just want to focus on the primary shards for now. Since we're dealing with hash slots, I'm going to assume that replication is working in the background. So keep this mental image in mind: we have four different shards, and the 16,384 hash slots are distributed equally among all four.

Now let's go to an actual coding example. Say I want to create a social network of some sort. I'm going to create a set with the key name user:512:following, and this user is following four other users: user:271, user:973, user:114, and user:056. I entered that into Redis, and Redis redirected me to slot 7578, because when we're clustering we store our data in hash slots. I'm glad it told me that. You can see that it switched me: my command-line interface was on node 7000, and now I've switched over to 7001. If I'm ever curious which hash slot my key is in, I can call CLUSTER KEYSLOT, type in user:512:following, and it will tell me the hash slot where the key is located: 7578. That means it would be in here, on port 7001, and, correct, it moved me to 7001.
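The redirect from one node to another can be pictured as a lookup from slot number to the primary that owns the slot's range. The sketch below is only an illustration: ports 7000 and 7001 are the ones mentioned in the walkthrough, while ports 7002 and 7003 and the exact range held by each port are assumptions based on the even four-shard split. `SLOT_RANGES` and `primary_for_slot` are hypothetical names.

```python
from bisect import bisect_right

# (first_slot, last_slot, primary_port)
# Ports 7000 and 7001 appear in the walkthrough; 7002 and 7003 are assumed.
SLOT_RANGES = [
    (0, 4095, 7000),
    (4096, 8191, 7001),
    (8192, 12287, 7002),
    (12288, 16383, 7003),
]
_RANGE_STARTS = [first for first, _, _ in SLOT_RANGES]


def primary_for_slot(slot: int) -> int:
    """Return the port of the primary whose slot range contains `slot`."""
    if not 0 <= slot <= 16383:
        raise ValueError(f"slot out of range: {slot}")
    # Find the last range whose first slot is <= the requested slot.
    index = bisect_right(_RANGE_STARTS, slot) - 1
    _, _, port = SLOT_RANGES[index]
    return port


if __name__ == "__main__":
    print(primary_for_slot(7578))  # 7001
    print(primary_for_slot(3322))  # 7000
```

A cluster-aware client keeps a table like this (built from CLUSTER SLOTS) so it can send each command to the right node without waiting for a redirect.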
Now let's create another set. I'm following these users, so why don't I create a set of users that are following me? I'm going to call it "followed by": SADD user:512:followed_by, and I'm followed by user:271, user:197, and user:661. I'll enter that. It redirected me to slot 3322, because when we run our key through the algorithm, it gives us 3322 as the hash slot. I'm going to verify again (this is just me being a little paranoid; I want to make sure that everything's right) with CLUSTER KEYSLOT on the followed-by key, and yes, 3322 is the location of this data. Cool, we have two pieces of information in our database, really quite easily.

Now let's go over here. I'm looking at the diagram: user:512:following is stored at slot 7578 and user:512:followed_by at slot 3322. So we have two different keys in two different shards in my cluster. Again, this is for people who are visual learners; this helped me when I was starting with clustering.

Now let's do something fun. I want to find the intersection of the two: basically, who are the mutual follows between the users that user 512 follows and the users that follow user 512. To do that I'll use SINTER, which is an intersection. I'll start with user:512:following and then user:512:followed_by. Again: who is user 512 following, and who is following user 512? The intersection of those two will give us the mutual follows. Let's check it out.

Oh, an error: "CROSSSLOT Keys in request don't hash to the same slot". What's happening is that we're trying to look at two different shards, at two different locations, to run an intersection, and that's not going to work. Redis is fast, very fast, and retrieving data from two shards would not guarantee a fast intersection of the data. So it's just not going to work.
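The rule behind that error can be sketched as a check that every key in a multi-key command maps to one hash slot. This is a simplified illustration, not Redis's implementation: `CrossSlotError` and `require_same_slot` are hypothetical names, the sketch hashes whole keys only, and it assumes that Python's `binascii.crc_hqx` with an initial value of 0 matches the CRC16 variant Redis uses.

```python
import binascii

HASH_SLOTS = 16384


class CrossSlotError(Exception):
    """Raised when the keys of a multi-key command span several slots."""


def key_slot(key: str) -> int:
    # Whole-key hashing only; hash tags are ignored in this sketch.
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


def require_same_slot(*keys: str) -> int:
    """Return the shared slot, or raise if the keys map to different slots."""
    if not keys:
        raise ValueError("at least one key is required")
    slots = {key_slot(key) for key in keys}
    if len(slots) != 1:
        raise CrossSlotError(
            "CROSSSLOT Keys in request don't hash to the same slot"
        )
    return slots.pop()


if __name__ == "__main__":
    try:
        require_same_slot("user:512:following", "user:512:followed_by")
    except CrossSlotError as error:
        print(error)
```

The key spellings in the usage lines follow the walkthrough as closely as the captions allow; the exact spelling of the followed-by key is an assumption.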
Multi-key commands will not work across shards; they will only work within the same hash slot. Again, this is to protect the speed and integrity of Redis. It just would not make sense to compare and work with multiple keys across multiple shards.

There's a way around this, and it's pretty cut and dried: we use a pair of curly braces. I'll see if I can paste this in without it automatically entering. I'm using SADD and then user, and I'm putting the unique user number, 512, within curly braces. Normally, the hashing algorithm that determines the location hashes the entire key. Sometimes I don't want to hash the entire key; I just want to hash a specific substring of the key that is common to other keys. Here I'm telling it to hash only the number 512, and that will determine the hash slot number. I'm going to enter the exact same data as before: user:271, user:973, user:114, and user:056, all people that user 512 is following. I'll enter that, not a problem.

Out of curiosity, let's see what hash slot that hashed to. I'll do CLUSTER KEYSLOT and then user:{512}:following, and it goes to 3808. Cool. As you can see, I haven't switched nodes; I'm still on my shard 0.

Now let me add my followed-by set, again using the curly braces. I'll paste that in: SADD user:{512}:followed_by. Again, I'm telling the hashing algorithm to hash only this value here, and it's going to get the exact same hash slot as the key above, because I told it to hash only this section. This guarantees that we hash only that one specific substring in both keys, and we're going to
be using that same hash slot. Let me check to make sure. Okay, cool: I just called CLUSTER KEYSLOT on my followed-by key, and now I see that both are in slot 3808. And for us visual learners, you can see (oh, I'm sorry, I should have put that in curly braces) that both of these hash to 3808.

Now I'll run my intersection command. Let's do ZINTER, and then numkeys, which is going to be two, and then my first key. Let's save everybody a little bit of time and paste it: this is the first set that I want to look at, and I want to intersect it with this set. This will give me the intersection of those two keys, because they're in the exact same hash slot, in the exact same shard, which happens to be shard 0. Oh, why did I call ZINTER? That's not right, sorry. Let me call SINTER.

[A few moments later]

So now when I call SINTER for the intersection between my two keys (let me just copy them over to save everybody a little bit of time), following and followed by, I have them in the exact same shard, in the exact same hash slot. And there we go: user:973 and user:271 are both following user 512, and user 512 is following user:973 and user:271.

Okay, hopefully that was helpful. The main takeaway from this quick little talk: if you're using multi-key commands on your data types (some use cases for strings, lists, sets, and sorted sets), make sure to use these little curly braces, which are called hash tags, to enforce that the keys are in the exact same hash slot, and thus in the exact same shard. That way you get the speed that Redis demands of itself and of our applications.

If you have any questions, as always, please don't hesitate to reach out on our Discord channel or leave a comment on the video. All right, thank you, have a good day.
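The hash tag behavior summarized above can be sketched as follows. This is a minimal illustration, assuming the rule from the Redis Cluster specification: only the text between the first `{` and the first `}` after it is hashed, and only when that text is non-empty. `hash_tag_part` and `key_slot` are hypothetical names, the spelling of the followed-by key is an assumption, and `binascii.crc_hqx` with an initial value of 0 is assumed to match Redis's CRC16.

```python
import binascii

HASH_SLOTS = 16384


def hash_tag_part(key: str) -> str:
    """Return the substring that gets hashed for `key`."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # Use the tag only if there is at least one character inside it.
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key  # no usable hash tag: hash the whole key


def key_slot(key: str) -> int:
    part = hash_tag_part(key).encode("utf-8")
    return binascii.crc_hqx(part, 0) % HASH_SLOTS


if __name__ == "__main__":
    following = key_slot("user:{512}:following")
    followed_by = key_slot("user:{512}:followed_by")
    # Both keys hash only the tag "512", so they share a slot.
    print(following == followed_by)  # True
```

Tagging on the user ID keeps all of one user's keys together, at the cost of placing that user's entire data set on a single shard.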

Info

Channel: Redis

Views: 4,112

Keywords: Redis, Redis University, Cluster, Clustering, Shards, Networks, Database, NoSQL, Hash Tags, Hashslots, modulo, Replica, Primary, Hashtags, Hash Slots, High Availability, High Durability, Distributed Systems

Id: YF3wj5d_tkc

Length: 18min 50sec (1130 seconds)

Published: Thu Nov 11 2021