LEARN BITMAP INDEXES

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
I'm sure you guys know about b-tree B plus tree and AVL tree data structure which are used in database indexing but do you know about bitmap indexing in this video we are going to learn exactly that so what are bitmaps a big Maps is also called as bit array is a data structure which basically uses sequence of weights to store information they usually consumes a lot less memory so why are we talking about bitmap indexes the question is our bitmap indexes are much efficient so that we can totally stop using a B+ tree or AVL tree the answer is not really straightforward there are no data structure which are like Silver Bullet where you can use one data structure for all different purposes even it goes the same for the database indexing as well there are cases where V plus tree and a male tree performs much better when we are indexing the database but there are situations where bitmap indexing will outperform B plus tree and B tree so let's understand these concepts by taking an example of product database suppose I have a product database which has ID name category stock and status if you closely look at this data there is somewhat pattern over here so the pattern is in case of name column the data here is not duplicated at all so usually the names in any kind of products or even in case of username where it is combination of first name and last name most likely the data in that particular column is not really repeated but in case of category if you see the data is repeated but not too many times but in case of stock and status if you see the data is very limited to very few number of kind of data so in this case the only possible kind of data is true or false in this case the only possible kind of data is a pro pending or rejected but in this case the category could be in thousands of number but in this case of the name it will be usually equal to the total number of products which we have in our product table in case of Amazon or Flipkart or any of the e-commerce it will be usually equal the total number of products which they have in the inventory which will be like in terms of millions right so there is a word called as cardinality what is cardinality is basically it tells you how unique the data is so to take an example the cardinality in this particular array is okay so the data is 1 1 2 2 3 4 in this case the cardinality is 4 so how do we compute this is it's basically the total number of elements which are a unique in this particular given data set so in this case the unique items are 1 2 3 & 4 the 1 & 2 are repeated but we only consider the unique basically we do the set of the whole data and the count of that is basically the cardinality why do we need cardinality is the when we understand the cardinality of the data we can decide what kind of index and we can choose so in this case the cardinality is very high because the names can never be repeated even if it is repeated but most likely it will be that the unique items are almost equal to the total number of you know the the products which we have in this particular table but in case of category the cardinality will not be equal to the total number of records because there could be hundreds of products which are belonging to a same category in this case these two products belongs to same category like sure these two these three basically belongs to books and this one belongs to twice so if you look at this cardinality over here it is actually 3 so in this case it was 6 which is equal to the total number of items we had in this case of idea as well all the IDS are unique so we have cardinality of 6 so in this case we have cardinality of true because the only data which we can have in this column is either true or false so it can be only 2 in this case we can have the cardinality of 3 the simple idea of using bitmap indexing is wherever there is high cardinality it's not really advise to use Big Mac wherever there is low cardinality it is always good to use bitmap indexing there are so many databases or products which actually uses bitmaps like Oracle database and there are so many data warehousing products actually uses bitmap indexing to make their queries faster and to basically improve the performance of the system so in this case we're which are the columns we can actually use what kind of indexing so this is the kind of column so in case of ID we should be going for B plus 3 so here we should be going for B plus 3 and here we can say the cardinality when we consider too much of data the cardinality will not be very less so it will not be in single digit it will be definitely in thousand side it is okay to use B plus 3 or you know the bitmap indexing also performs really well even in case of this kind of data but it's ok we can use either B plus 3 or bitmap indexing here so in this case it is definitely worth it to use bitmap indexing and in this case as well because the cardinality is very low so the next question is what are the factors which actually makes B plus trees are not really good for the data which has very less cardinality the problem is B plus tree actually builds a big tree right in that tree it doesn't really matter what kind of data we have or what kind of cardinality we have in that particular column it's still going to build a big tree whenever we index this specific column so that is kind of waste of memory it actually consumes hell our memory to build this particular index using B plus tree but if you use bitmap indexing we are going to consume very fraction amount of memory which B or B plus tree uses and also in case of performance it actually performs the same way we're like how B plus tree or B 3 programs so why wasting memory why don't we just simply use the bitmap indexing in case of the columns which has less cardinality a lot of databases which we actually use has the capability to use nixing it's just that we don't really use it say for example first west also has a support for a bitmap indexing oracle has the support for bitmap indexing so let's understand how to build bitmap indexes for one of the column in this particular table so let's take this particular column and see how to build bitmap indexing so we need three different bitmaps to create a bitmap indexing for this particular column so the number of bitmaps we need for any particular column will be equal to the cardinality so in this case the cardiology is 3 because we need if you said if you get the set of set on this data we get approved pending and rejected basically it's three so we need three different bitmaps to create index for this so one bitmap basically for up row and one for pending and one for rejected so now let's create bitmap for each different data for this particular column so bitmap we said we initialize it to 0 like this so each bitmap in the respective positions are basically responsible to hold the data to the respect to you ID written over here so basically this bit is responsible for first row and the second wait for second row and the third bit for third third row and and goes on not and similarly for pending as well we need a bitmap which is set to 0 with something like this ok I mean surely will have all zeros now we need to set these bitmaps based on the data which we have so in the first weight the status is a probe that means that they approve the first bit which is responsible for the first row with ID 1 will be set to 1 ok the second bit is approved so let's set it to 1 third bit is spending so this is not really responsible let's let it be 0 for bit is approved we need to set it to 1 and v ID it is rejected so v bit will stay 0 6 row it is rejected so let's state 0 so now let's look for pending the first row is not pending so we don't need to optic update over here second row it is a probe we don't need to update over here the third row it is spending will need to update this bed and the fourth row approved no need to do anything fifth row rejected sixth row rejected we don't need to do anything so pending is done rejected one how do we update is first row approved no need to do anything second row up row no need to do anything third row pending no what no fifth one is rejected and six one is rated so we need to update respective weights by one so this board will become one and one so this is basically a pic map index four column status so now how this particular data helps when we query for any given data let us understand how queries work suppose if I want to find all the rows which has status approved what do I have to do all I have to do is go through the approved bitmap and then such for all the bits which has the big value one so those are the rows which has the status approved in this case the bit with the first bit is set to one in that case the first row will be equally approved the second bit is set to one that means that the second row is a row in this case third bit is set to zero that means that we know that third bit he still is not approved a saying if we don't need to know what it is but we know that it is not approached similarly for this approach fifth is not a product not approve so if I want to just find all the rows which are approved all I have to do is just go through this get all the ideas which has approved so those are the rows in this particular table we need to written in the for that query to get the answer so in this case one the the first row and the second row and fourth row is the other columns which are status upper so that's all let's take a little bit complex query let's take one more quick example suppose if I want to find all the rows in this table which has status set to either approved or pending what I have to do is I will have to use our operation because in the query also we are writing something like either approved or pending are the query could be like status in approved comma pending that means that basically we are doing our operation in this case as well we just need to do bitwise our operation in that case we will have to look at two different bitmaps that is approved on pending do the our operation on different pits and you will basically get the output out of this and this right here all of this and this is one that means that the big the row with a ID one has is meeting that but a specific condition this one is meeting this one is meeting and this one is made basically the first the output when you actually have ID one two three and four fifth and six are R is 0 so it basically doesn't have that means that the answer for that particular query is one two three four so those there's actually reflecting over here okay now if I want to do a complex query which span across multiple column suppose it can have bitmap index computed for stock as well obviously we I would have had a no bitmap for those as well over here and if the queries are something like if the stock is true and state-assisted to upload find all the rows in this table how do I have to find is basically you do and operation on all of that and then the output actually gives the answer for that query and once you know the IDS if you have a index built for this using b3 or B+ tree it is very faster to find those respective rows as I told just by using bitmap indexing we cannot really get all the output of we need or all the answers which we need we basically use you know bitmap our indexing and in conjunction with other indexing data structure to make all the queries much faster so all in all bitmap indexing basically consumes very less memory and CPU and provides really high performance just by using bitmap we can't really achieve a lot we in the real time scenarios we basically use bitmap indexing and also know B 3 or B plus 3 in conjunction to get better performances I hope you learned something I definitely learned something by reading about bitmaps if you liked this video please like share and so I've to this channel thanks a lot
Info
Channel: Tech Dummies Narendra L
Views: 70,180
Rating: undefined out of 5
Keywords: Amazon interview question, interview questions, interview preparations, algo and ds interview question, software interview preparation, developer interview questions, Facebook interview question, google interview question, Technical interview question, software architecture, system design, learn System design, bitmaps, bitmapindexing, b+trees, databaseindexing
Id: 5-JYVeM3IQg
Channel Id: undefined
Length: 13min 19sec (799 seconds)
Published: Thu Apr 23 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.