Amazon Interview question: Learn hashing and consistent hash ring

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
everyone my name is Lorraine and in this session I'm going to talk about hashing and consistent hashing often times why we are coding we need to save an item and retrieve it as fast as possible the real question is how to speed up searching of an item consider a case where I have given you lists of items and you need to search an item whether it is present or not if the list is sorted then it's much easier already to do is use binary search algorithm and such for that item is present or not what if the list of items which have given you is not sorted in this case the only option left is you need to consequently go through each and every items in the list and then find whether it is present or not cases like this can be easily accomplished using map this is something similar to your phone directory where you know the location of the name of a person and then you retrieve the phone number by its name instead computer also same thing happens something called as hash map our hash table where the goal is to use the key and save our value against the key and to do it as fast as possible is in the same key now to achieve this what are the components to meaning so all we need is first one is a key value and then you need a hash function here the hash function and then you need a hash table let's talk about each component in brief so the first one is can I do obviously you know the key and you know the one the second one is hash function consider that hash function is something like black box are you can think of as hash function when you give some value to it or when you need put a value it emits a large number like this okay this is what it does so you go in a different but you get a different large number and the third component is hash table so hash table is exactly like a table which you guys know and here the columns are key and a bucket okay it's to follow any how many rows in it so the keys starts from something like 0 1 2 3 are different values but for simplicity let's say so you want before and so far so let's take already yeah up to 9 it can be something like up to 9 and these buckets are where your value is going to be stored so let's use the listed components and build the complete flow diagram okay so the first content which we knew was key value pair let's take an example dog value is max the second one is okay so now the second component we have was hash function okay and the third component we had was the hash table No okay so we have a total of 10 different locations in our hash table they have hash table we have hash function we have key value pair let's try to save this key value pair in hash table what happens is as I already explained you have to give or save the key to hash function first and it is going to generate some large number so how does it look say for now it look like 1 9 3 4 3 so this is a large number and since ok now use this hash and find a row in the table and then still that value listed it's that simple so how many to do is go and find this particular row in this table 194 3 ok where is it it's not there because we only have 10 different locations out of a little here so this is an example but think of it our computer have a finite memory right so we can't have infinite memory since hash function can generate up to large number we can't directly map in hash the value in the big hash table so instead we do the trick what is the type so we know the total number of rows are done the locations available here is 10 what do we do now is on the hash value just do a modulo operator or you get 3 okay now go to the location 3 that is 0 1 2 3 and still the value that simple ok now we need to is good free and save the revenue that's it done okay so let's do one more so now let's try a sale catch so now you treat care to the hash function now it will emit one more big number of sales on device nine three four to five it's a big number and what it what we need to do is to the number of allocations do a marvelous operation so what do we get we get fine so now take this meow value to five and then sale there that's done so the next thing is similarly you have to say or add weight to it and then we process it generate some number like 5 6 3 4 do i positive and to get 4 okay see you man good in the location that it's that simple so this looks good right now there is a problem now this hash function I explained can give same number for different keys how now what happens now the problem is degree this function can't a written same number for two different inputs say something like I input flag and dog in both of the cases what if it returns same number say something like I'll run to one more example say I'm trying to say hey and the value is high ok so now let's create this hey to this function now the output it returned turned out to be 5 6 3 4 ok that is something civil what we got for rat also when we do a certain time we also get for now I will go to the location map or and then try to say hi equalizer over right but we know that the average of the whamming is already present okay this is something like collision is happening exactly the term is far less hash collision so when two different keys gives out the same hash that's when collision happens how do we resolve this kind of collision now we can't save both girl and hi over here the whole we know what value for watching that will be a typical situation we go so instead let's do a simple trick again and you guys know linked list just leave the list and Stroke directly say the pan you use linked lists here do more link and then sale okay but still the problem is there so how do you know what value for what instead of just say the value you try to save keys also for this is high and in case of good it's right that's it so when there is a collision you go there if there is a linked list height right over the linkage and check for the key and the value and retrieve it as fast as possible done now that you learn how to save key value in hash table the tripping is also something easier okay now let's say you're trying to return your value for dog now what take this key you don't know the value give it to a hash function it will obviously devote the same hash value like the previous side like you get 193 for 3 do a modulus to get location 3 go to the table or television and you will see fine just retrieve it rich it's faster right so now all good but still our computer systems have finite memory sometimes these locations might not be available or sometimes we may have distributed systems some computers may might fail some computers might be added dynamically without restarting the whole system so in this case let's see what happens so consider this system ok consider these two locations - computer - and these locations they are compatible now if this computer goes down these two locations will not be available obvious right so now the hash table remains same but the total count instruct then it is aged right so let's write the new card eight now let's not the same example now I want to retrieve dog and it's where you are so seeded value now if I give you the system function hash function now it will obviously written 1 I 3 4 3 ready to pass until 10 now you won't do personal time because the updated total locations I will release till 8 because the system got when dog so now is 10 we need to do now instead of 3 we get something different in this case we get 7 what was supposed to get for us 3 now we got 7 we got rubbish in 7 we have nothing over here what if we had something obviously we get wrong workers and this is not acceptable so now in this case we need to remap all of our data in this case we have to through each and every key and dual remapping of data okay and that is tedious task what if we have 1,000 are 10,000 of these keys it takes a lot of time right to solve these kind of problems Dave is a better solution instead of going for a conventional hash table and this is something new okay this is far as consistent hashing or consistent hash ring let's see how this works okay so the consistent hashing works on a simple similar idea but little twist in it so earlier in a normal hashing we used to know the all the available memory locations in our computer and then we used to consecutively name the keys in a hash table with consecutive number right so the key used to be so the key used to be continued numbered from 0 1 2 3 until they available in member of memory locations but the twist here is instead of knowing memory available memory location we are going to assign these keys randomly totally random ok now consider we took some random numbers and I said so what do we as I said let's are saying some random numbers over here so the random was being I say something like big random numbers okay say your 9001 and this is say 9000 and so we have assigned some random numbers over here so now what happens let's say let's try to save this demon payers so now if we want to save he-hey and hi right so for hey we pass the speedo and function V god okay some hash number so consider that big R as two zero seven one what you need to do is take this hash number unlike earlier you don't need to do a modulus or division operation data people and find the location so if you can't find it here we won't find it what what a minute to do is find a pocket or key which is greater than answer so in this case instead of searching for exactly the roof at the key to 0-7 find a row which is greater than that so in this case the great one is not 1 0 1 1 the next immediate critical thing is 3 0 3 now just go to distillation and savory I knew hi that's it okay now you might be thinking what if one more key also get lesser than trees road freezes so consider we asked dog into the hash function and that resulted say 2005 now what now also it will try to find the next thing exactly this is not this is what it is 3 so 3 0 and obviously save it here max and dual interest that's ladies so that happens to us all of these piece of value say if you want to say Blanche consider regard something like nine thousand fifteen what we need to do this case go on fine okay it's a lot yet this is what it's so save Giada so now let's say okay to me also so in this case what happens has they'll be God value something like five fifteen okay so this is watch research nowhere it's fitting so the next number is not a little in this case go back to the very first row and save it there so this case has value in y'all you save in okay so you can see there is something like a ring happening so this is how consistence hash that's it consistent hashing doesn't solve the problem of looking things up instead it solves the problem of looking up for the key and its location in this hash table you can see that there are multiple values are saved against each memory location you know unavailability of particular location doesn't disturb the whole table unlike the previous hashing algorithm where and availability of a particular location used to deserve everything and we needed to remap all the values against their hash keys but in this case that there this particular location is not available still on this home key value and the hash map he is still valid unlike previous algorithm it was like the whole table will be invalid but in this case only this particular location will not be available but the rest of the location is still available what if we want to save some value here since this location is not available because of her ring it goes to the next available location that is one zero one one okay that's how but that's why the consistent hashing is really important in computer science the examples are the applications of computer sorry the consistent hashing varies from load balancing caching are some of the great applications like or some of the great databases like Cassandra uses consistent hash string to save the data in different nodes block maintains large distributed system uses consistent hashing to locate different nodes or the workers are tasks and etc I am NOT going to talk a lot about the application of consistent hash string instead I leave you lot of links and documents which talks about consistent hashing applications feel free to read this document and understand better if you like this video please subscribe and hit like button and share with your friends thank you
Info
Channel: Tech Dummies Narendra L
Views: 124,771
Rating: 4.794558 out of 5
Keywords: interview, interview preperation, software interview question, learn hashing, hashing, consistent hashing, hash ring, consistent hash ring, python interview question, java interview question, how dictionary works in python, how hashmap works, learn software development, hashmap, hashtable
Id: bBK_So1u9ew
Channel Id: undefined
Length: 19min 26sec (1166 seconds)
Published: Wed Jul 18 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.