In-memory database internals for system design interviews

Video Statistics and Information

Captions
Hello everyone, my name is Narendra, and in this session let's understand real-time databases, also called in-memory databases. We'll talk about the different use cases where you would use one, and why exactly we can't just use a relational database. You'll also learn about the internals of in-memory databases, including indexing and data structures like the AVL tree; I'm also going to talk about the B+ tree in an RDBMS, how crash recovery works in main-memory databases, and later we'll discuss performance characteristics and the advantages of in-memory databases. You might be wondering why you need to know these internals at all: these clever design patterns will help you design your own systems at your job or in interviews, and along the way you'll learn when exactly to use an in-memory database in your own systems. So let's go to the whiteboard and understand the internals.

A real-time database, also called a main-memory or in-memory database, uses RAM to store all of its data; an RDBMS, by contrast, stores all of its data on disk. The RDBMS was introduced in the 1980s, during the evolution of databases, and people doubted whether it would really work for their use cases — yet here we are, still using RDBMSes heavily today. The way it was initially designed, there was a SQL engine and a hard disk: all of the data sat on the hard disk, the SQL engine queried the disk directly (forget about the buffer for now), and the data was fetched back through the SQL engine to the user. As RAM prices dropped and disks also got cheaper, people started to think: why not put RAM in between as a buffer pool to make the RDBMS much faster, while the database still lives on disk? So a buffer pool was introduced, in which frequently accessed data is effectively cached. When a query reaches the SQL engine, it first checks the buffer pool; if the data is present there, it is returned from memory, and when updates, deletions, or insertions happen, they are written in both places — the disk and the buffer pool. It worked that way for a while, but in the next generation people thought: why not use RAM as the main place where we store the data, instead of merely caching in it? So they removed the disk from the picture entirely — not really, but let's remove it for now for understanding's sake — and used RAM as the main memory for all of the data.

That is the basic architecture of a real-time database: when a query comes in, the same SQL engine is still there, but it fetches the data it needs from main memory, where everything sits in RAM. And if you know the difference in performance between RAM and a hard disk or SSD, it is huge. Here is a comparison, with some figures from the internet: reading from a hard disk runs at about 63 MB per second; reading the same data from a solid-state disk (SSD) — which is what we usually get for our virtual machines, such as EC2 instances, because it is faster than a spinning disk — runs at about 457 MB per second; and reading the same data from RAM runs at about 4,600 MB per second. Compare these: reading from SSD is about 7 times faster than from a hard disk, while reading from DRAM is almost 80 times faster than a hard disk and about 10 times faster than an SSD. A 10x gain is a lot, and that is what we are talking about with in-memory performance: the simple change in architecture, from hard disks to RAM, gives on the order of a hundred times the performance for the database. To give an example: if a query takes about 100 milliseconds in an RDBMS, the same query on the same data takes about one millisecond in a real-time database — a 100x gain on read and write queries.

There are many such databases available — Aerospike, HANA from SAP, VoltDB, and other variants you can use. Redis and Memcached also fall broadly into the same category: they don't have SQL support — they are mostly key-value stores — but underneath they share many of the same characteristics. Now, remember that we removed the disk from the design entirely; in practice, real-time databases still use the disk, mostly to make the system more reliable. What they do is store a WAL file on the disk, so that if the system crashes or restarts, we can leverage what is on disk to repopulate all of the data. RAM is volatile: if the system crashes or restarts, it forgets everything it held, and all of the data in the database would be gone. So we still need the disk to hold a copy of the database, plus all of the recent operations stored in the WAL file, and we load it all back once the system starts. To see this more clearly, let's look at the differences between the two.
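As a quick sanity check on the throughput figures just quoted, the speedups work out like this (a minimal sketch; the 63 / 457 / 4,600 MB/s numbers are the rough figures from the talk, not measurements of any particular hardware):

```python
# Rough sequential-read throughputs quoted above, in MB/s.
HDD_MBPS = 63
SSD_MBPS = 457
DRAM_MBPS = 4_600

# Relative speedups between the storage tiers.
ssd_vs_hdd = SSD_MBPS / HDD_MBPS    # ~7x
dram_vs_hdd = DRAM_MBPS / HDD_MBPS  # ~73x, i.e. "almost 80 times"
dram_vs_ssd = DRAM_MBPS / SSD_MBPS  # ~10x

print(f"SSD vs HDD:  {ssd_vs_hdd:.0f}x")
print(f"DRAM vs HDD: {dram_vs_hdd:.0f}x")
print(f"DRAM vs SSD: {dram_vs_ssd:.0f}x")
```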
First, cost: it is higher in a real-time database, because all of the data lives in RAM and RAM is costlier, so the operational cost of running one is also higher. An RDBMS might use HDDs or SSDs, whose prices are comparatively much cheaper. Second, volatility: since all of the data sits in RAM, which is volatile, if we switch the system off and come back, the whole database is gone — though thanks to the disks we can still recover all of the data. There is also ongoing research into NVRAM, non-volatile RAM, which uses supercapacitors or battery-backed memory and still remembers what it held; with that we might not need a disk for reliability at all, but NVRAM is not really available in the market yet. The RDBMS, for its part, is durable: once data is written to disk, it survives restarts — unless the disk itself crashes. On performance, the real-time database is around a hundred times faster, as I mentioned, and we saw the read speeds of the different storage types; the disk-based system is comparatively low-performance. For recovery, the real-time database uses the WAL file to rebuild its data; the RDBMS doesn't need to do anything extra like that. Finally, no serialization is required in a real-time database. Whenever you put data into storage you need to serialize it first; but since we stay in RAM, where all of the data is accessed through pointers rather than saved to a hard disk, we don't need to serialize at all, which saves a lot of computation. In an RDBMS we must serialize everything — numbers, JSON, whatever it is — and store it in the database, spending computation on that as well.

Now, indexing. We still need indexes even in RAM. You might ask why, since all of the data can be accessed directly: IDs can be variables, the data can live in the variable itself, and for "give me the record with ID 2" you simply access that data in memory, which is fast. On a hard disk we can't do that: all of the data is inside a file, and you would have to scan the whole file to reach a record — that is why we need indexing on disk. But in the in-memory case we also need indexing, to support range queries. So we index in both cases; however, we don't use a B+ tree here — we use an AVL tree instead, for better performance, because of the structure of these trees and the way they behave. Let's understand how indexing works in a real-time database; but before that, let's understand how the B+ tree works in an RDBMS, which gives a good idea of how indexing works in general. We know that enabling an index on a table makes queries faster, because it provides a faster path to the data by spending some extra data structures and memory. A B+ tree looks like this: every node can have one or more children, each node contains multiple keys, and each key represents the range of values held in its subtree.
The leaf nodes hold the data — or, in some cases, a pointer or reference back to the file position where the actual row for that indexed column is present. This tree structure is what is called a B+ tree. Here, 1, 11, and 25 are the indexed values in the root: the key 1 represents the subtree holding the indexed information for the range 1 to 11; under the next subtree, all of the data from 11 to 25 is held; and under the last subtree — which I haven't drawn, because it would be large — is everything with an ID of 25 or greater. Suppose I want to search for the row with ID 9; how do we use the B+ tree for that? We go to the root node and check which range 9 falls into: 9 falls between 1 and 11 — it is certainly not between 11 and 25, nor greater than 25 — so we go to that subtree. Having found that 9 lies between 1 and 11, we know that everything needed to access that row is under this child, so we descend and do the same thing again: is 9 between 1 and 5? No. Between 5 and 9? No. Is it in the range of 9 and above? Yes — so we go there, and that leaf node holds the data, or the reference — the position or offset — for accessing that value. For 9, the offset is stored here, and we go to that offset in the file itself: this looks like a table in the representation, but the database itself is a file, so we seek to whatever offset is recorded and read the row with ID 9. That is how the B+ tree works; indexing with an AVL tree in a real-time database works in a similar way.
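The descent just described can be sketched in a few lines. This is an illustrative toy, not a real B+ tree implementation (no inserts, splits, or balancing), and the file offsets are invented for the example:

```python
# Minimal B+ tree lookup for the example above: internal nodes hold sorted
# separator keys and children; leaves map indexed IDs to file offsets.
import bisect

class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # e.g. [1, 11]: child i covers keys[i]..keys[i+1]
        self.children = children

class Leaf:
    def __init__(self, entries):
        self.entries = entries    # {id: file offset of the row}

def search(node, key):
    # Descend until we reach the leaf whose range covers `key`.
    while isinstance(node, Internal):
        i = bisect.bisect_right(node.keys, key) - 1
        node = node.children[max(i, 0)]
    return node.entries.get(key)  # offset, or None if the id is absent

# The subtree for the range 1..11, as walked through in the text.
subtree = Internal([1, 5, 9], [Leaf({1: 100, 3: 140}),
                               Leaf({5: 180, 7: 220}),
                               Leaf({9: 260, 10: 300})])
root = Internal([1, 11], [subtree, Leaf({11: 340, 20: 380})])

print(search(root, 9))   # follows the 1..11 child, then the >=9 child -> 260
```

Real B+ trees also keep the leaves linked together, which is what makes disk-based range scans cheap.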
If you know the AVL tree: the heights of the left and right subtrees of any node must always differ by at most one, and if they differ by more, we do an adjustment — rotating values — to restore the property. It also has all of the properties of a binary search tree: every value greater than a node's value lies on its right side, and every value less than it lies on its left. For real-time database indexing, a modified AVL tree is used. In a plain binary or AVL tree, each node has only one value; here, each node contains one or more elements — say three values, 10, 11, and 12, which are among the values we want to index. Each node still has only two children, as the binary property requires. On the left side of a node sits all the data less than the leftmost value in the node — here, all IDs less than 10 — and on the right side, everything greater than the rightmost value, 12. Suppose I want to find the row with ID 9. First we go to the root node and decide whether to go left or right: we inspect the first value, 10 — is 9 less than 10? Yes, so all of the indexed information for it is in the left subtree, and we go left. There we do the same again: is 9 less than 6? No. Is 9 greater than 8? Yes — so it lies in the right subtree. When we reach the leaf node, it contains the pointer information for the record holding 9 — it could be pointer 4, pointing at the tuple with 9 and all of its related data. The differences from the B+ tree, then: this structure keeps the binary-tree properties, which the B+ tree does not have; in a B+ tree, every node may have one or more subtrees, while here each node has exactly two, a left and a right. One more difference is that every node — not just the leaves — can contain pointers. If I am looking for the row with ID 10, this root node itself contains the pointer to that row: pointer 1, say, pointing somewhere in memory. If I'm looking for 18: go to the root — is 18 less than 10? No. Is 18 greater than 12? Yes, so go right; check there — does this node's range cover 18? Yes, so the node itself holds the pointer to it. One more interesting property to observe: the structure retains the sorted order of the index. Read it off and you see 3, 4, 6, 7, 8, 9, 10, 11, 12 — the structure holds the data in sorted order — and searching is always fast, in O(log n) time complexity.
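The multi-valued AVL search above can be sketched similarly — in the database literature this node layout is known as a T-tree. This is an illustrative sketch with invented pointer names; balancing and rotations are omitted:

```python
# Lookup in a multi-valued AVL (T-tree style) node layout: each node keeps a
# small sorted set of (id, pointer) entries; keys below its minimum go left,
# keys above its maximum go right, and keys in between belong to the node.
class Node:
    def __init__(self, entries, left=None, right=None):
        self.entries = dict(entries)  # {id: pointer into memory}
        self.lo = min(self.entries)
        self.hi = max(self.entries)
        self.left, self.right = left, right

def lookup(node, key):
    while node is not None:
        if key < node.lo:
            node = node.left          # everything below the min is on the left
        elif key > node.hi:
            node = node.right         # everything above the max is on the right
        else:
            return node.entries.get(key)  # key is bounded by this node
    return None

# The example tree from the talk: the root holds 10, 11, 12.
left = Node({6: "p2", 7: "p3", 8: "p4"},
            left=Node({3: "p5", 4: "p6"}),
            right=Node({9: "p7"}))
root = Node({10: "p1", 11: "p8", 12: "p9"},
            left=left, right=Node({18: "p10"}))

print(lookup(root, 9))   # left of 10, right of 8 -> "p7"
print(lookup(root, 10))  # found in the root node itself -> "p1"
```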
So I think I've covered all of that. Now you might be thinking: if we are dealing with memory anyway, why don't we just use a big hash table and find the location of each record that way? If I want to search for the 9th record, I compute hash(9) and find the pointer to where the data lives. That works well — but we can't do range queries with it. For range queries we need the index; this tree structure is exactly what makes range queries work. If we want just one specific element, we can hash the ID we're looking for and get the pointer directly. But for a range query — SELECT * FROM the table with ID between this and that, say between 10 and 20 — all we do is go to the root node holding 10, 11, 12: we know all the data from here on qualifies; then we look for 20 — 20 is greater than 12, so we go right, check again, and collect all of the data from these nodes. That is much faster, and on a hash data structure you simply can't do a range query at all. That is why we need indexing in a real-time database as well. And it isn't only the AVL tree — you can use many different data structures to build indexes that make your access patterns faster. You can use arrays, which take minimal space and whose total index size is predictable, though an update to the index costs O(n). There are other schemes such as chained bucket hashing, extendible hashing, linear hashing, and modified linear hashing, but I won't explain all of those, since they are out of scope right now.
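The hash-versus-range-query point can be made concrete with a toy example. Here a plain dict plays the hash table, and a sorted key list with binary search stands in for the tree index (the rows and names are made up):

```python
# Point lookups are O(1) on a hash table, but a BETWEEN query needs an
# ordered structure: a sorted key list + bisect stands in for the tree index.
import bisect

rows = {id_: {"id": id_, "name": f"user{id_}"}
        for id_ in [3, 7, 10, 12, 15, 20, 25]}

# Hash-style point lookup: fine for "WHERE id = 7".
def get(id_):
    return rows.get(id_)

# Ordered index over the same ids: enables "WHERE id BETWEEN lo AND hi".
index = sorted(rows)

def range_query(lo, hi):
    i = bisect.bisect_left(index, lo)    # first id >= lo
    j = bisect.bisect_right(index, hi)   # one past the last id <= hi
    return [rows[k] for k in index[i:j]]

print([r["id"] for r in range_query(10, 20)])  # [10, 12, 15, 20]
```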
Now let's understand how durability is handled in real-time databases. Durability means that once you write something to a system, the data should always be there. In the process of making queries faster, we sacrificed something: RAM is volatile, so whatever is in it vanishes when the system crashes or restarts. How, then, do we make the system durable? For this we consider an approach called the write-ahead log, also known as the WAL file. The same process is used in RDBMSes to provide durability, and the same design is used in real-time databases. One more thing you need to understand about hard disks: appends are much faster. Appending to a file is fast for one reason — in a hard disk the platter spins and the head keeps moving in the same direction, and the sectors are laid out sequentially, so anything you append to a file is written sequentially; the head never has to move around randomly, so the I/O is much faster. Let's take that as leverage and design the system around it. When a query is made from the client to the real-time database, it arrives at the query engine, which has two things to do. First, it writes the data into whatever data structures we have — the indexing structure, plus the main data structure, which you can think of as a hash table holding a pointer and the values as a tuple, say ID, name, and age. So the query engine writes the data into these structures and updates the indexes. Second, it is responsible for writing the query into a file — DML and DDL queries only; we don't need to record SELECT queries, which are just reads. This can happen on a background thread; the query engine could do it inline, but it is usually better to hand it off to the background. So all of the queries — updates, table creations, and so on — are also sent to the WAL file, and those instructions keep getting appended to it; we don't need to do much, just keep appending every query made so far. There is one more piece to the approach: we also periodically take a backup of the whole data structure into a backup file, so two things are happening here. This, too, can be a background thread tied to a specific interval — Redis has a setting for the snapshotting interval, which you can set to, say, ten minutes, after which the database is snapshotted into a backup file on disk. The disk side is the cheap one; the RAM side is what makes things fast, and costly. So we keep dumping the state of the database to disk, and if in the future you want it even more durable, people take this backup and put it into S3 as well — but that's beyond our topic for now. To recap, two things are happening: all the DML and DDL queries are written into one file, and we periodically snapshot the in-memory structure into a backup — and both of these live on the disk. You might ask why we need to keep storing these queries in a file. The advantage is that if the system crashes or restarts, everything inside RAM — the whole database — is empty, and we need to recreate it all. What we do, simply, is replay all of the queries from the beginning of the file, and we end up in exactly the state the system was in when it crashed. That is, if you had created a table, updated some rows, inserted some, and deleted some, you replay the same sequence in order, and all of the data is reproduced in the structures. That is how we get the data back: when the system crashes, we lose only the in-memory data, while the WAL survives, because it is persisted on disk.
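The append-and-replay cycle described above can be sketched with a toy key-value store. The file name and record format here are invented for illustration — this is not any real database's WAL format:

```python
# Minimal WAL sketch over a toy key-value "database" held in a dict: every
# write is appended to the log first, and after a crash the state is rebuilt
# by replaying the log from the beginning.
import json, os

WAL_PATH = "toy.wal"  # hypothetical file name

def apply(db, op):
    if op["kind"] == "set":
        db[op["key"]] = op["value"]
    elif op["kind"] == "delete":
        db.pop(op["key"], None)

def write(db, op):
    with open(WAL_PATH, "a") as f:
        f.write(json.dumps(op) + "\n")  # append-only: sequential disk I/O
        f.flush()
        os.fsync(f.fileno())            # force it to disk before acknowledging
    apply(db, op)                       # then update the in-memory structures

def recover():
    db = {}
    if os.path.exists(WAL_PATH):
        with open(WAL_PATH) as f:
            for line in f:              # replay every operation in log order
                apply(db, json.loads(line))
    return db
```

A crash is simulated by simply discarding the in-memory dict: `recover()` rebuilds the identical state from the file alone.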
But now you may be asking: what if we are making billions of queries? This file will definitely get overloaded and the disk will fill up, right? And one more interesting thing: we might update the same row a million times. For example, if you have a video you keep updating its view count; if you have an Instagram post you keep recording the number of likes on it; or you keep updating the status of some record — created, started, processed, finished, shipped, delivered — so you have five or ten different update statements for the same record, and the end state is just the last query. So we need some optimization to keep this file small, and the process is called compaction. When we run compaction on the WAL file, it becomes much smaller: instead of keeping all of the earlier statements, we keep only the last one. For example, say we execute a query setting the status to 'created', then the next sets it to 'started', then 'processed', then 'shipped', then 'finished' — five different queries. There is no point replaying all of them against the database, because the end state is only the last one; that is what the database will finally hold as the state of the data. The compaction process identifies these kinds of queries, deletes the earlier ones, and preserves only the final one — so in this example the log file shrinks by five times. With that kind of optimization, we will never run out of disk.

You might say that even after compaction the file could still be big. Here is one more optimization: we know when we took the snapshot of the database. If we snapshot at some point in time T, we can delete all of the log entries older than T — there is no point replaying them, because the snapshot already captures the state as of T. So every log entry older than T can be deleted; we already have the state up to then. That way we keep trimming the file — or we can simply start a new WAL file every time a snapshot is taken and delete the old one outright. These are the ways we make the system more durable.

What about high availability? It is handled much as in an RDBMS: we keep a hot-standby backup machine that also has the latest state of all the data on the master — like master and slave — and once the master goes down, we switch over to the slave real-time database. It works in a similar way to an RDBMS. Apart from that, if you look at real-time database versus RDBMS performance, the difference in transactions per second is huge — and, as I mentioned at the start, the query-time difference is about 100 times. That is the kind of performance you get from using a real-time database in your systems.
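Going back to compaction for a moment, the last-statement-wins rewrite described above can be sketched like this (a toy model; real engines such as Redis rewrite their append-only file in the background rather than edit it in place):

```python
# Compaction sketch: collapse the log so only the last statement per record
# survives. Entries are (record_id, statement) pairs mirroring the status
# example above; this is illustrative, not a real WAL format.
def compact(log):
    last = {}                      # record_id -> latest statement wins
    for record_id, stmt in log:
        last[record_id] = stmt
    # dicts preserve first-insertion order, so records keep their log order
    return list(last.items())

log = [
    ("order:7", "SET status = 'created'"),
    ("order:7", "SET status = 'started'"),
    ("order:7", "SET status = 'processed'"),
    ("order:7", "SET status = 'shipped'"),
    ("order:7", "SET status = 'finished'"),
    ("user:1",  "SET views = 10"),
]

print(compact(log))  # order:7 keeps only the 'finished' statement
```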
Apart from that, one more important thing to know is the limit on the RAM you can get in a server: as of now, we can put roughly 300 to 400 gigabytes of RAM in each server. What if we want to store more than that — say, two terabytes? Then we have to go for sharding; the principles remain the same as for an RDBMS. To understand all of this better, let's take one use case and see why exactly we need a real-time database rather than an RDBMS. The use case is called real-time bidding, or RTB, where bidding for advertisement impressions takes place. There is always supply and there is always demand, and wherever there are both you need an exchange to match them. Here, hundreds of companies are trying to show an ad to any given user; the system decides who the winner is, selects that company's advertisement, and shows it to the user. Look at the scale: billions of users are requesting ads on the pages they browse, and hundreds or thousands of advertisers want to show their ads. How does it work? Suppose I want to buy a car. I will obviously do a Google search, or maybe search on Facebook or Twitter, or read reviews on different websites — so Google will certainly know that I have been searching for cars recently and that I will most probably end up buying one. Before I do, the advertisers' job is to show me different companies' advertisements, to sway me toward a specific product. There could be hundreds of companies — or a handful — wanting to show car ads: Ford wants to show me an ad, Maruti Suzuki does, Kia does, BMW does. If I see an advertisement and I'm interested in that car, I might well end up buying it — so there is real competition to show a specific ad to a specific person. These real-time bidding platforms also have access to a lot of my information. Google's ad systems, for instance, know all about me from the information collected across Gmail, Hangouts, and Android phones: roughly how much I earn, my favorite color, my age, where I live — and from all of these attributes they know exactly what kind of ads to show. So when I load a page that has an ad slot, the advertisement request goes to the real-time bidding exchange, and the system, using everything it knows about me, selects an advertiser and shows an ad. The bidding itself works like this: the four companies — Ford, Maruti, Kia, BMW — have each set a bid price; whoever bids highest wins, and the exchange picks that company's advertisement and shows it in the slot. That is how bidding happens. But at this scale there are billions of requests coming in, lots of advertisements to compare and pay for, lots of transactions — a great deal of complicated processing, and tons of data stored in the database. So why do we really need a real-time database here — why can't we just use an RDBMS, or NoSQL?

When I browse the page, there is an advertisement plug-in that renders the ad. Say the load time for the whole page is 200 milliseconds: all of the resources have to download, and before the page starts to render we fire off a request for an ad. By then perhaps 150 milliseconds have already gone into downloading the CSS and JS for the site, so the ad request leaves the user at around the 150-millisecond mark. Now the exchange has to run the bidding and select a winner so the ad can be shown. The goal is to display the advertisement as soon as the page has loaded — before 200 milliseconds. You could show it later, but most likely I am already reading the content, or I've closed the page or scrolled elsewhere so the ad slot is hidden; the goal is to show it as early as possible to catch the user's attention. So you have only about 50 milliseconds left to decide which advertisement to show. That means the system must be extremely performant: pick the advertisement, do all of the transactions — increment counters, whatever else it must do — and respond to the page, all within those 50 milliseconds. That is the timeline. If you used an RDBMS here, most likely you could not achieve all of this in time — you might spend 500 milliseconds, or even a full second.
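The latency budget above is simple arithmetic, but it is worth writing out (the numbers are the ones used in the talk, not measurements):

```python
# Back-of-the-envelope latency budget for the ad-serving example above.
PAGE_LOAD_MS = 200                     # total time until the page has rendered
ASSETS_MS = 150                        # CSS/JS downloads before the ad request fires
BUDGET_MS = PAGE_LOAD_MS - ASSETS_MS   # 50 ms left for the whole auction

RDBMS_QUERY_MS = 100                   # example query time on a disk-based RDBMS
INMEM_QUERY_MS = 1                     # same query on an in-memory database

print(f"budget: {BUDGET_MS} ms")
print(f"RDBMS fits:     {RDBMS_QUERY_MS <= BUDGET_MS}")  # a single query blows the budget
print(f"in-memory fits: {INMEM_QUERY_MS <= BUDGET_MS}")  # fits, with room for more work
```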
To give a comparison: if one query takes 100 milliseconds in an RDBMS, the same query runs in about one millisecond in a real-time database — a hundred-times improvement just from using a real-time database over an RDBMS. These, then, are the use cases: when you have very limited time to process something, need to run queries, do a lot of work, and respond, you will most likely end up using a real-time database. I hope I have covered enough about real-time databases. If you liked this video, please like and share. Thanks a lot.
Info
Channel: Tech Dummies Narendra L
Views: 21,928
Rating: 4.9600573 out of 5
Keywords: Amazon interview question, interview questions, interview preparations, algo and ds interview question, software interview preparation, developer interview questions, Facebook interview question, google interview question, Technical interview question, software architecture, system design, learn System design, inmemeory db, realtime db, mainmemory db, avl tree indexing, wal
Id: zkACt4NYkU4
Channel Id: undefined
Length: 34min 59sec (2099 seconds)
Published: Tue Apr 14 2020