MongoDB Indexes - The Recipe behind Fast Query - How to Create Indexes and the B-Tree Data Structure

Video Statistics and Information

Captions
Hello everyone, welcome to the video series on MongoDB. Today we are going to talk about something that impacts the performance of your MongoDB database: yes, today we are going to talk about MongoDB indexes. This is a very important topic. You cannot afford to use your MongoDB database without understanding how indexes work, and it will take me more than one video to explain indexes properly because of their importance. In this video I am going to introduce you to MongoDB indexes: how they work, how you can create them, and why you must take care of them. Let's go ahead and start.

So in this particular video we will understand what indexing is. We will create a collection, insert data, and index that data. We will understand the B-tree, the data structure behind indexing. This is extremely important, so do not skip it. Then we will look into what we must take care of while creating indexes. And we are going to do all of this without writing any code; we are going to see all these things in Compass. In my next video I will show you how to create indexes in your code, but in this video I am deliberately not using any code, because my intention is to make sure that you understand indexes. OK, so let's proceed further.

Before understanding indexes, let me answer this question: why use indexes? There is one and only one reason for using indexes: faster queries. There is no other reason. So if you are using a MongoDB database as some kind of backup database that you are rarely going to query, indexes are not for you. They are useful, and important, if you want to access data from MongoDB and the speed of accessing that data matters to you. If it does not, MongoDB indexes are not for you.

Now, to access data faster, what can we do? Think about a day-to-day example. If we want to understand the meaning of a word, we look in a dictionary, right? And how can we find our desired word in the dictionary? Because
everything in that dictionary is stored in sorted order. The same thing happens with indexes: an index helps us get a faster query because it arranges the data in sorted order, and that's the reason it is fast. But there are some pitfalls with indexes. Since the data is arranged in sorted order, a new insertion needs to find its place in that order, so insertion will take more time. That's precisely why you must know what happens when you insert data into MongoDB with indexes.

OK, so let's go ahead and first try to understand the problem an index is trying to solve, and see it ourselves. For this I have created a small piece of code in Python using PyMongo, the Python driver, and I am inserting some data. So here is my code, and I am importing PyMongo. If you do not know PyMongo, don't worry about it; you just need to follow how I insert the data. I'm creating a database called learn, and inside that I am creating a collection called index learn. Inside that, I generate a random number to come up with a salary range for multiple people. This random number gives me a number anywhere from 0.1 to 0.9, and I multiply it by 100,000 to get the salary, so every time, you can see, it gives me a different number. This is what I am inserting in my database. I am writing the name as name 1 through name 10,000, and with every name I am including a salary. So I am inserting ten thousand documents.

Let me run that. It is working. In the meantime, let me go to Compass and refresh. You can see the learn database, and you can see index learn now has six thousand documents. If I refresh, it should have ten thousand any time now. So it now has ten thousand documents. OK, so now I have my own data with ten thousand records, or documents.
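The script itself is not shown in the captions, so the following is only a minimal sketch of what such a loader might look like. The collection name `index_learn`, the connection URI, the helper names, and the use of `random.random()` (which returns values from 0 up to but not including 1, rather than exactly 0.1 to 0.9) are all assumptions, not the speaker's code.

```python
import random


def make_documents(count=10_000, seed=None):
    """Build documents shaped like {"name": "name1", "salary": 35268}.

    Hypothetical helper: names run from name1 to name<count>, and each
    salary is a random fraction multiplied by 100,000, as described above.
    """
    rng = random.Random(seed)
    return [
        {"name": f"name{i}", "salary": int(rng.random() * 100_000)}
        for i in range(1, count + 1)
    ]


def load_into_mongodb(uri="mongodb://localhost:27017"):
    """Insert the generated documents into learn.index_learn.

    Assumes a MongoDB server is reachable at `uri` and that the
    collection is named index_learn; both are guesses.
    """
    # Imported here so make_documents() can be tried without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    collection = client["learn"]["index_learn"]
    result = collection.insert_many(make_documents())
    return len(result.inserted_ids)
```

Calling `load_into_mongodb()` against a local server should leave ten thousand documents in the collection, which can then be inspected in Compass as in the video.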
You can see that name and salary are there. Let me see if you can see it; I'll just rearrange it. So you can see the data now, right? You have name and salary, and the salary is actually random; there is no order in the salaries. Now, as I told you, indexing is done only and only to make sure that you can fetch data fast. Let's see how that happens.

Let's assume that I want to find a salary, so this is my query string. Let's assume that I want to find this salary, 35268. Let me find it. I found one record with salary 35268. Now, the reason I am doing this in Compass is that Compass has a tool called Explain Plan. In it you can run your query and see the performance aspects of that query. So if I click Explain, you can see Documents Returned is one, which is correct, Documents Examined is 10,000, and the execution time is a millisecond. So to get one salary, ten thousand documents are examined, and ten thousand is the total number of records in our collection. That means that to get even a single salary we are examining all 10,000 records, which is not good.

So let's go back and find some other salary, say 18762. Cool, let me find it. I found one record. Let me explain it again. Again we retrieved one document, and it still examined ten thousand documents. So to find one record, each and every document in the collection is searched. This is bad.

Now, you know that every record has an ObjectId. Let's try to search on this ObjectId. Let me see if I can copy it. Yes. So instead of salary I'll say _id, with this ObjectId as the value. So now, instead of searching for salary 18762, I am searching for the ObjectId, and if I click Find it will find the record. But if
we go to Explain Plan, you can see that Documents Examined is only one, and the query used the following index: _id. Now, this is the same document, with three fields: _id, name, and salary. If I search by salary, it has to search all 10,000 records, but if I search by _id, it examines only one record to find the one that matches the search criteria. That's awesome. You can see it took 0 milliseconds. How does this happen? You can see the difference: the query used an index. By default, every collection created in MongoDB is indexed on the _id field, which means the _id values are always kept in sorted order, ascending or descending, and you can verify that by looking at the index.

Now you understand that because _id has an index, searching on it is way faster than searching on salary. So let me create an index for salary. Here, under Indexes, I will click Create Index. Let me name it salary index, select the field name salary, and index it in ascending order. Here are the options. I generally prefer this one, but it will not work if the items are not unique, so I will leave it as it is for now. I will talk about the rest of the options in a later video. Then I click Create Index. Now you can see that the salary index is created on the salary field.

If I go back and search salary again with 18762, click Find, and go to Explain, you can see that it is now using the salary index and examining only one document to fetch the desired document. Earlier, without indexing, you were searching ten thousand documents, but in this case you are searching only one. The same is true for any salary. Let's find some other salary, say 70726. Find, Explain: you can see that it examines only one record to get the salary. Isn't it cool? Isn't this the search performance we want?
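The video does all of this in Compass, but the same before-and-after comparison can be sketched with PyMongo. This is an illustration under assumptions, not the speaker's code: the helper names and the index name `salary_index` are made up, and the exact shape of the explain output varies between server versions, so the sketch reads only a few commonly present fields.

```python
def explain_summary(plan):
    """Pull the numbers Compass shows out of an explain() result.

    Assumes the plan includes an "executionStats" section, as it does when
    PyMongo's Cursor.explain() runs with its default verbosity.
    """
    stats = plan["executionStats"]
    winning = plan["queryPlanner"]["winningPlan"]
    # Some newer servers nest the stage tree under "queryPlan".
    stage = winning.get("queryPlan", winning)
    stages = []
    while stage is not None:
        stages.append(stage.get("stage"))
        stage = stage.get("inputStage")
    return {
        "returned": stats["nReturned"],
        "examined": stats["totalDocsExamined"],
        "stages": stages,  # e.g. ["COLLSCAN"] or ["FETCH", "IXSCAN"]
    }


def compare_before_and_after_index(collection, salary):
    """Explain the same salary query before and after indexing salary.

    `collection` is expected to be a pymongo Collection; 1 means ascending.
    """
    query = {"salary": salary}
    before = explain_summary(collection.find(query).explain())
    collection.create_index([("salary", 1)], name="salary_index")
    after = explain_summary(collection.find(query).explain())
    return before, after
```

With a collection object such as `MongoClient()["learn"]["index_learn"]` (names assumed), the first summary would be expected to report a collection scan that examines every document, and the second an index scan that examines only the matching one.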
This is the query performance we are looking for. The next thing you need to understand is why and how this happens, because that is very, very important. As I explained at the beginning of this video, if you store data in sorted order, any insertion will take time. Let's go ahead and see.

As I told you at the beginning, indexes are stored in the form of a data structure called a B-tree. But before we understand the B-tree, let's talk about the binary search tree. Let's assume that we have these documents, holding some numerical values in random order. If we create a binary search tree out of the values, we get this tree. You can see that if you do an in-order traversal, which is left, root, right, you get things in sorted order. Here you can see 3, 4, 7, 11, then again 11, 14, 33, 42, 62, 100.

Now, this is the binary search tree representation, but what is the B-tree representation? Depending on the order, or degree, of the B-tree, there can be multiple elements inside one node, so the B-tree might look like this. I have given a link over here, because I copied the tree representations from it, and we are going to open it. It will help us understand how performance is impacted by using a binary search tree or a B-tree. Let's go and see these two links.

OK, so let me open this link and reduce it a bit. These are the elements I have with me. This is a link from the University of San Francisco with a very nice visualization, which I want you to see, because it will make you understand what happens when you arrange data in sorted order. So I insert 11, and 11 is inserted. Then 4, then 42. Keep watching the animation. Then 100, and then 33, then 14, 7, 11. It's taking time, but there are just two more elements to go: 62 and 3. OK, so if you are not using
indexes, a document can simply be stored on its own. But if you are using indexes, every time I write something, it has to find a particular place in a particular node, or near that node. This is how a binary search tree works. But MongoDB indexes use a B-tree, so let's go ahead and see that. Let's copy this, open the link again, and see the B-tree for the same values. I am keeping the max degree as 3. So, inserting 11, 4, 42, 100, 33, 14. Only four left. 7. Just keep an eye on how things are actually being inserted, and you'll get to know why a B-tree is used here. Then 11, OK, 62, and 3.

So what is the difference between the binary search tree and the B-tree? Here you get more nodes; here you get fewer nodes. Effectively, if we increase the degree, you get even fewer nodes, and that's why searching is faster. A degree of three generally means every node can have two elements, which means it can have three children: one for values less than 3, one for values between 3 and 4, and one for values more than 4, that kind of thing. So this is how the B-tree works, and this is the reason why, every time you create an index in MongoDB, it stores the index in the form of a B+ tree.

Now, the indexed field is part of your document, right? Salary is part of a document. So when we create an index, what happens to the documents? Are the documents rearranged in sorted order? No. MongoDB creates a separate index, on salary in our case. It is totally separate from the actual documents and is updated as and when documents are created. What it holds is a pointer to the document's location, and that's why the query is able to find the document. So indexes are stored separately, in sorted order, with the index field value and a pointer to the actual document, and they are stored in the form of a B-tree. For example, here, for a document with salary 3, we also have a pointer to the document where all
the fields, _id, name, and salary, are stored.

So that's all I wanted to talk to you about today on MongoDB indexes. Of course, the topic doesn't end here. We will talk more about indexes in the next video, including compound indexes. Thanks a lot, people. Thanks for watching. Until the next time we meet, goodbye, and take care.
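To make the tree walkthrough concrete, here is a small sketch of a binary search tree used as an index: each node holds an indexed value plus a stand-in pointer to its document. It is a plain binary search tree, not the B-tree MongoDB uses, and the values and their insertion order are reconstructed from the captions, so treat them as assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int                      # indexed field value, e.g. a salary
    doc_id: int                   # stand-in for the pointer to the document
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, doc_id):
    """Walk down from the root to find the new key's place (duplicates go right)."""
    if root is None:
        return Node(key, doc_id)
    if key < root.key:
        root.left = insert(root.left, key, doc_id)
    else:
        root.right = insert(root.right, key, doc_id)
    return root


def in_order(root):
    """Left, root, right traversal: yields the keys in sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


def search(root, key):
    """Return (doc_id, nodes_visited); doc_id is None when the key is absent."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            return node.doc_id, visited
        node = node.left if key < node.key else node.right
    return None, visited


# Values in the order they appear to be inserted in the video (assumed).
values = [11, 4, 42, 100, 33, 14, 7, 11, 62, 3]
root = None
for doc_id, value in enumerate(values):
    root = insert(root, value, doc_id)

sorted_keys = in_order(root)      # sorted, even though inserts were unordered
found = search(root, 62)          # visits 4 nodes instead of scanning all 10
```

The trade-off described in the captions shows up directly: `search` touches only the nodes along one path, while every `insert` has to walk down the tree to find its place first. A B-tree applies the same idea with several keys per node, so the tree has fewer nodes and fewer levels.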
Info
Channel: Cognitive Programmer
Views: 55,962
Keywords: Mongodb Indexes, mongodb compound indexes, compound index in MongoDB, mongodb tutorial, compound index, database indexing, mongodb aggregation, mongodb performance tuning, mongodb performance, B-Tree data structures, MongoDB query, MongoDB compass, An Insightful Techie, Insightful Techie, Insightful Daksh, codesports.ai, Codesports, Programming, Database, ai, ml, tech insights, tech and travel, Software Development, codesbay
Id: IHQeDEn38BQ
Length: 16min 52sec (1012 seconds)
Published: Sun Jul 05 2020