HashTable vs HashMap vs Concurrent HashMap all kinds of Map implementations

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
in this video we are going to explore the various map implementations available in the java.util package so this is kind of a complete continuation from my previous video regarding hash map internals also this I will try to show some examples where the concurrency can cause troubles in hash map even when we use concurrent hash map and what we can do to work around it so to start with in java.util map implementations you have so many map implementations I can say as you can see and let's go and inspect the first class where it all began so the first implementation that we know of is Java you need odd hash table this is the legacy associative array implementation where the first Java in release had only hash table this is the time when the collections package was not even existing so at that time hash table was the only implementation of an associative array in Java so later on when collections framework came in this became like a legacy class because collection framework had hash map implementation and so hash table was literally throw fitted by implementing map interface so hash table has a unique advantage in terms of thread safety all the public API is on a hash table has synchronized keyword on it meaning it is thread safe that means if thread is currently writing into the map suppose we are calling put from one thread another thread which is trying to get something has to wait until the put is completed so that means it's the whole entire map is locked so um it's actually you know appears like a good thing but it's not really a good thing because it is an overhead in an environment where we write the map contents many times in the beginning and later on we do only read see most of the use cases of maps what happens is initially we load a lot of key value pairs into the map and later on we just read out of them we don't really write to them so in that case every read will talk any other read that will happen so that is the price you would pay for making each and every method synchronized so that's all about hash table and any given day you don't really have to use hash table in your application programming because you can safely know this class from now on going on to the most popular implementation of map interface which is java roger green door hash map so this hash map implementation historically or otherwise satisfies most of the requirements for map use cases and hash map implementation is not thread safe meaning you know concurrent threads updating hash map suppose we have a thread continually putting content and another thread continually reading from that map there may be chances where the the reading thread can see you know data that was not completely written or things like that so this map is not thread safe the calls doesn't block each other also another interesting aspect of hash map is that the iteration of a hash map it's not guaranteed in the order of you know putting the keys suppose we put three key values into a hash map later on there is no guarantee that when we read or iterate over the key set we get the same order of insertion and another point regarding hash map is that it allows you to insert null keys and there are not too many use cases where you would want to use them but that is one of the features that is out there available on a hash map and then if you want to really use hash map in a concurrent environment only way is to use the all concurrent primitives that you are available of aware of like synchronize keywords or locks and things like that most of the enterprise applications populate the map once and when they read many times from the many many threads so given this type of utility hash map is a very good candidate for your application where you throw in a bunch of keys initially key value pairs initially and then later only read from the map so let's move on to the next interesting implementation of a map interface this is Java dot util dot linked hash map it's very similar to hash map in the implementation but the interesting aspect is that linked hash map the iteration is guaranteed in insertion order so that means if we insert keys and values into the hash map later when you try to eat rate through the key set of the hash map you're guaranteed to get the same order in which those keys where a pair were inserted in the map back to you so this can be used in scenarios where insertion order is some of some significance to you for example suppose you want to load a bunch of employee names from database and then store some information against those names in the in the map so suppose you already have sorted your output based on some DB logic that you want to reflect the front-end so it's always safe to place them onto a linked hash map so that from wherever you return the list you get the list in the sense in the list of keys you get the same order of insertion so that means the ordering is preserved so how that is maintained is internally by using a doubly linked list of all entries in the in the hash map so that is an internal extra double linked list is used inside linked hash map meaning some extra additional space concerns are there but you know given the benefits that we get those can be ignored so now moving on to the next type of map implementation which is tree map which actually implements the sordin map as well as the navigable map interfaces so sorted map and navigable map interfaces are basically interfaces that guarantees or provides you contact methods based on the sorted order of the keys that were inserted in the map so a navigable map also provides you very convenient methods which gives you approximately lookup like you can seal our floor based on the key suppose you want to get a key which is nearest to the key that you are looking for you know that type of operations can be achieved using the navigable map interface and three map is the default implementation in the JDK for sorted map and navigable map so the main benefit of tree map apart from the convenience method that comes from navigable map interface is that iteration is guaranteed in natural sorted order of keys so that means you keep on inserting keys into the database let's assume we are inserting a Java line string keys into the tree map so the so you keep on inserting keys and value pairs later we can see that the keys are automatically sorted for him in the way it was it you wanted in natural ordering so that is a very beneficial scenario where you want to suppose you have a map between the country and capitals of the countries and you eventually want to show the country names in the sorted order so you don't have to really worry about achieving that sorting instead what you can do is use anything sorry three hash map tree map what I mean a tree map and insert the country versus capital in whatever order you like but when you retrieve them the keys the key set you get the country names in sorted order so that is a benefit of tree map so internally it uses a red-black tree implementation which provides you the sorted appearance of the keys now going on to another interesting hash map that we generally don't use in our applications which is Java dot util dot identity hash map so here in identity hash map the most interesting aspect is that it uses the identity to check the key equality that means it's checks if key one double equals key two instead of calling the equals method on the keys so this kind of is used in deep copying or a class object is used as a key of things like that so memory footprint of an identity hash map is comparatively smaller than a hash map and it's it has seven benefits like that so suppose you are constructing a map with interned strings that means strings that are already out there in the string pool you could use a hard in the identity hash map and save some space so next map implementation that we should know about is Tara Dooley done in a map so in a map into the inner map class definition says in a map angle bracket K extents enum of K and V close angle bracket so this is a map that takes enum s key and the enum type that you mentioned should must come from a single innum type that is specified explicitly or implicitly when the map is created so another point that we should know is that null Keys are not permitted in in a map so for that matter null Keys are not permitted in hash table also it throws a null pointer error so in a map is not synchronized neither was tree map or link hash map iterator does not fail fast what does that mean so there is a point called if you have programmed in Java you would have come across an error called concurrent modification exception so that what that essentially means is that when you are trying to read from a certain list or a map some other thread or this current thread itself is trying to modify that list or map so in that kind of a scenario concurrent modification exception is the wrong so the point this you should remember is that concurrent modification exception has nothing to do with multiple threads it can happen in single thread also so how that happens in single thread let's have a quick quick click using some code and try to simulate a concurrent modification exception so here I am going to create a little class called concurrent exception so here what I'm going to do is I'm going to create a main method and here I'm going to populate a map of string and integer as something like score of course equal to new hash map and then what we are going to do is you're going to play some scores into the Maps course dot put have a key called say user 1 as a score of 5 sorry 6 now I'm going to put a bunch of other users into the map 2 3 4 5 6 7 8 so I put 8 user data into this course map now I'm going to get an iterator for the key set so scores dot key set dot iterator returns me an iterator of strain user iterator user iterator call the score star key set dot each wait now I'm going to iterate over the user iterator you so it has next I'm going to just bring that scores dot get user eater dot next so I'm getting the next key and printing the corresponding value so when I do this suppose I try to add something to map in the same loop scores dot put user 9 and say 6 and run the application now here you can see that I am first iterating through the key set and at the same time I am trying to modify the map so this behavior even in a single thread should cause a concurrent modification exception so let's run this example and see so here you go I got a camera you read concurrent modification exception exactly at the line exactly at the line where I was you know trying to insert the item and iterate over the map key set so that is what concurrent modification exception is but there are certain map implementations that kind of doesn't fail like this so so in a map iterator does not fail fast that means if you create an enum map and add some enumeration key value pairs into the inner map and then it do the same exercise you iterate all the keys and then try to put something into the map in between it does not fail like that so that that means iterator does not fail fast now let's proceed on to another kind of map that is available in Java which is Java dot util dot weak hash map so this is a something that we should use when you're when we are creating a cache or something like that because we cache map units is used to store the keys in the currents so we should know what exactly is a weak reference so a weak reference is a reference type in Java which tends to come the garbage collector that if the only reference to the weak reference type whatever it's contained in the weak reference if the only reference is as a hashmap key the garbage collector is free to collect that object you don't have to have the object hanging around so that is the concept behind weak reference so in a weak hashmap that means the elements in the hashmap can be reclaimed by the garbage collector if there are no other strong reference to the object a strong reference is when you say object something like employee e equal to new employee and you put hashmap put e comma some value right so that e is a strong reference because it's an object that you explicitly created and associated so at the same time weak references is that a reference that can be garbage collected if only the reference occur is not not in the form of a strong reference that means suppose you insert an employee into the V cash map and then somehow the employee goes out of scope in your application right so that means the garbage collector is free to recollect or reclaim the space used by that element that you just insert in the hash map so the keys that you insert into a weak hat map gets wrapped in Java dot lanky F dot V crescents so this is useful only if the side lifetime of cash the entries is determined by external reference to the key not the value so the value will also get removed from the map but more important that it is focused on the key the key is availability if it is a strong reference or a weak reference so in Java there is yet another type of reference available which is called soft reference and soft reference is very much like the weak reference there in envy occurrence it is the the garbage collector can garbage collect the vehicle offence anytime if the only reference is of the weak type but soft means it is kind of a weak reference type but the garbage collection will wait until the anti that is really need to do garbage collection like there if there is a garbage collection pressure in the in the heap so this weak reference soft reference and garbage collection I'll discuss in my upcoming video but for now we should know that there is a weak hash map and the keys are wrapped in weak reference and hence if there are no strong references to your keys the garbage collector is free to reclaim that space and hence this big this is very useful when you're creating cache of values and that is one of the biggest use of weak hash map and you are guaranteed some you know cleanliness in your overall application so in Java there's no soft hash map we talked about soft reference but there are no soft hash map in Java so we don't have to worry about it so now let's go and take a look at an example of V cache map how the V cache map would reclaim space sorry the garbage collector would reclaim space from a weak hash map so I'm going to create an example weak each map example so here I'm going to create a main method and here map a key value string integer say orders equal to nu V cache map so I've created a weak hash map now I'm going to put some values orders also I'm going to put an order class here so that we can class order and suppose this has something like int order ID string details and let's create a constructor public order in an ID string details and then we associated to the class for a class fix order ID equals n ID and details equals 30 tails so here I created an order just to you know demonstrate the weak reference and its benefits so in orders I am going to put orders got put new order of one and some details and some order values suppose the total value we want to mention as hundred dollars or something it's a map of order so I I actually typed in correct so let's go back and inspect the example once more so the example says map of order comma integer orders equal to a new we cache map and the key is of type class order that we just created so I'm putting something into the orders map with new order one and some value to it so I'm creating yet another order key it's just order two and say I put another value here so I put two elements into the hashmap now what you should know is that see I created two objects here one and two these two objects does not have any strong reference because I do not have any reference objects from any other place other than the hash map keys so that means these two are currently weak references so the garbage collector is free to garbage collect them so what I'm going to do is I'm going to try and call system dot G C and C and then print the size of the map and see what is the size of the map orders dot size before GC I'm going to print the order start size so here it should print two and if there is a successful garbage collection the week references should go away and we should be at a zero as this size so let's see how that week reference worked in this example so if get this output see first it printed - that means two orders are in the map now we initiated one system GC so you should be aware that there is no guarantee that garbage collector would run but it looks like it ran in this instance so when we called system GC garbage collection happened and since this only reference for this order objects were on the s form of the map keys those elements got collected and removed from the map and when we provided the order stored size again we got 0 so that means whatever we cache map contact says has worked so now let's see what happens if I add a strong reference so I am creating a strong reference order order 3 equal to new order of three and some text so I created a strong reference here because even after I put this reference orders dot put order 3 comma some amount so here I added it to the V cache map but still even after system GC you should know that our main method is still referring to the oh 3 object in a head reference hard reference way so that means it cannot be garbage collected so if everything works correctly in this example when we run what should happen is first orders note shy a size should print 3 because we have one two three objects added s key and then I call a system GC the weak references should be removed and the strong reference created here which is still needed after the GC call happens should appear in the order dot size as one so now let us run this example and see the output as I said it showed three references two of them were weak reference got garbage collected and then we are printing only one reference which is the strong reference so that is the significance of V cache map and V cache map is used as is used where we create hash hash hash map of cash values that's DV cash map usage so now let's go to the synchronized world or the concurrent world of maps so first a way to get us an instance of a synchronized map in Java is using collections dot synchronized map and pass a reference to already constructed map to it that automatically converts your current map instance into a synchronized map and returns to you so this is like a decorator pattern example where you're adding additional behavior to the object that's being passed to it and this map is a fully synchronized map means the return returned synchronized map instance that you get in your hand has synchronized in all its public methods that means all the reads and writes would block the threads that are you know doing some other operations so it is provided as a convenience it's in theory similar to hash table that we have seen the synchronized map and you can use or you can not use it and most of the time there is no need to really use synchronized map and a collection start synchronized map this synchronized map API call returns you a synchronized map internal class instance and wraps around passed map instance aside that I've talked about and marks all the API says synchronized so that gives you complete thread safety but you know if you are very serious about sync concurrent programming this is not the kind of third safety that you would like to have so you can try to find out more details on the other type of maps like concurrent maps etc and then think of using those kind of Maps instead of a synchronized map so speaking of which Java dot util dot concurrent has a lot of utility classes that came with Java dot Java 5 they're the new concurrent modules were added so Java dot util dot concurrent concurrent hash map supports full concurrency during retrieval that means retrieval operations does not block at all reads can happen fast when the writes require a lock the writes does not require a lock on the entire map table instead there is something called segments that are created in concurrent hash map and each segment contains a set of buckets in it so that means a write lock may be required only on one segment at any given time and suppose you're another thread that is writing into the map is writing to a different segment it can acquire happily acquired that log-log and work so that means even writing has a very less amount of locking involved so by default the concurrent hash map is divided into sixteen segments and you can add sixteen threads to write into the hash map at the given time and iterations on a concurrent hash map does not throw concurrent modification exception and even within this same thread so it can be used well in cases where a lot of concurrent addition happens followed by a lot of concurrent reads in a concurrent hash map null Keys are not allowed so why it is not allowed why not keys are not allowed is because if the map dot gets null returns a null object it is not sure if a null is not mapped or if null is mapped to a null value so in income in a concur map in a non concurrent map world we could check with the contains API but in a concurrent environment values can change between API calls as the API s-- itself are not atomic that means even when you are using a concurrent hash map you would have to use a certain locking or some other techniques to make sure when multiple threads operate on the concurrent hash map like you know suppose incrementing a count or something like that that kind of use case have some good practices in place like synchronizing it or some other couple of techniques that I'm going to show in place so now let's look at some examples of a behavior of concurrent hash map so for that what I'm going to do is I'm going to create an example class called a concurrent map example so here what we are going to do is we are going to create a map of certain when location versus some orders coming from that location and then in from multiple threads we will update that order count concurrently and see how it behaves so now let's go and create a static reference to a map which a string of location and long order count and call it as orders and let's create a new concurrent ashmac so we created a concurrent hash map instance and let's create a main method but we are going to add some initial setup to the orders so let's put a key called Bombay and no orders have come from Bombay yet so we are putting a 0 long and then let's create another location called Beijing and that also has 0 orders now let's create another location called London and 0 orders now another location New York and 0 orders from that location now let's create a through two threads and keep on imprinting incrementing the order count in those two threads and see what happens so here I'm going to create an executor service executors dot new fixed threat role of - sighs - I'm going to create two threats and I'm going to say service dot submit I'm going to say process order method I'm going to add a static method here concurrent hash map example static process orders and I'm going to create submit two tasks that means I'm running two threads in parallel and then I'm waiting for my termination for a second say time unit one second and then I'm going to shut down my service so here what we are doing is we are going to create two threads that'll eventually add or increase the orders in our order map so let me provide the process orders API which is going to be static void process orders this is going to iterate over the keys so I'm going to say for stream key or city in orders dot key set I'm going to add 50 orders so I equal to 0 I less than 50 i plus plus I'm going to get city get sorry map dot get so I'm going to say long old order equal to MA orders dot get the city name and then I'm going to orders dot put city comma old order plus one so I'm going to increment the orders in a for loop and I'm going to do it 50 times so for each city I am going to place 50 orders per thread and I'm running two threads and at the end of running two threads I should have 100 orders against each city name if everything runs correctly so now let's say add system.out.println orders at the end so now let's run this example and see what happens in this when we use concurrent hash map and we are running two operations on the hash map and we are updating the values using two threads so now let us example and see what happens so at the end of one second I waited for one second to complete everything so I see that hundred orders are pretend in this map as expected but at the same time this is multiple thread environment and there are chances that this can go wrong so what I'm going to do is I'm going to continually run this example I'm going to go to my command line and using a script I am going to continually run this example and observe my output okay so I'm going to go to my example project out and look at the class which is in scratch I guess let me find out find out my name this class okay so that's it so let's go to that path and invoke a command line continuously so and see the output now I go to the documents workspace example example out production example scratch sorry I gotta go back here and I'm going to say well sure do Java scratch dot this name confirm the hashmap example and then so this humming we're calling infinitely and observing the output so I'm running see first I got 100 hundred looking good so far suddenly see what happened here's the surprise I got 52 on Beijing and New York now then I again got a 86 here and some times so that means the map is behaving strange or our application rather is behaving very strange under multiple threads so no matter if we use concurrent hash map or any other you know such things you know your application can go and behave real weird way like this okay so why is that because these operations orders got get and orders got put these are not atomic operations it has to you know this so it can be easily made atomic if we just synchronize this whole thing here and the calls will wait before one thread can update and at the end we will see the same value so what is the fun in that let's try to do something else right so what I'm going to do is I'm going to to address this problem where we are getting inconsistent output I am going to use one way to fix this is to use atomic law so in Java I think as of Java 5 these atomic types were implemented this atomic type are guaranteed is as per the Java memory model specification atomic types writes all its change before any other read happens in the memory so given that anatomic lung should add this our concern here so here what I'm going to do is I'm going to insert a new atomic long and do the same for all the other cities I an atomic long and I cannot do this so what we can do here is ordered orders dot get ki sorry it's a city and it returns an atomic long on which we can say get an increment so this is an atomic operation on the atomic long and all the changes the variable will be published to the memory and whatever threads try to do this again we will see they're the most current value of the atomic long and it should have consistent output as hundred in all the cities when I run multiple times so now let's just do a test run and as you can see in the test run we are getting 100 as the output now let's go and run it in the infinite loop and observe our output so I'm going to keep an eye on this output for a while and as you can see it's consistently printing 100 hundred hundred hundred which is very good so this is one of the ways to make such operations safe when you use multiple threads modifying some values of the map after getting it so that is using atomic long also you could use the replace API on the on the map which replaces which returns a true only when it can do a the values passed on the parameters matches the current value in the hashmap so that is the another API called a replace that that helps in 2d in this kind of situations now let's continue our discussion on concurrent hash map so what we have seen is the operations on concurrent hash map are not atomic and you should still be concerned about areas where multiple threads for modify your map after checking some keys or pulling some key and then putting some value back into the map those kind of scenarios you should judiciously check or use atomic variables as possible as what we have seen now let's go ahead and see another yet another implementation in Java you reckon current is the concurrent skip list map in Java concurrent what this is a scalable concurrent navigable map and sort of map implementation so this is the equivalent of tree map in the concurrent world so that's the important point of about concurrent skip list map so it behaves like the tree map which gives you all the navigable map interface API is like approximate lookups and things like that also sort of map interface implementation where the keys will be sorted in the natural ordering so a plus it brings the concurrent behavior just like the concurrent hash map where the parallel rights are very rarely blocked and the reads are never blocked also concurrent skip lists map guarantees average order of log n performance on a wide variety of operations concurrent hash map does not guarantee operation time as part of its contract so one difference between concurrent Skip list map and concurrent the hash map is that there is no performance contract in the concurrent hash map also concurrent skip list a map has a performance contract of order of log n in most of the operations so that kind of concludes the various maps in our java jdk as of today so in coming weeks i am going to publish more videos in the concurrency or multi-threaded programming and some other topics such as garbage collection which will be of interest so please stay tuned subscribe if you like like the content please leave a comment if you liked it or if you have any questions or criticism please leave a comment on that too thank you
Info
Channel: Ranjith ramachandran
Views: 190,719
Rating: 4.927928 out of 5
Keywords: java, hash, map, concurrent, weak reference, multi threading
Id: APL28XpFP0c
Channel Id: undefined
Length: 41min 25sec (2485 seconds)
Published: Sun Jul 26 2015
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.