Google Drive System Design | Dropbox System Design | File Sharing Service System Design

Video Statistics and Information

Captions
Hello everyone. Today we are going to discuss the design of a cloud-based file storage service like Dropbox or Google Drive. If an interviewer asks you to design a file storage service, whether it is Google Drive, Microsoft OneDrive or Dropbox, the design will be almost the same. And of course the design is always based on the requirements that you discuss with the interviewer during the interview, so let's first discuss those requirements.

Before going into the requirement collection phase, you need to understand one thing about the file storage service system design: this interview question is very tricky. There are so many different things an interviewer can ask you during this question. If you haven't watched my previous mock system design interview video on designing the Dropbox service, you should watch it; you can find the link to the video here, and you will understand that this interview question is very tricky. An interviewer can go into a lot of depth if he wants to, and that is why you need to give due importance to this question and prepare well for it. Unfortunately, the other online sources that are available right now don't go into much detail on the design of a Dropbox service. They just discuss the design at a very high level and that's it. But that is not enough, especially if you are targeting L5 or L6 levels at Facebook, Google, etc.

So let's first discuss the functional requirements. The very first functional requirement is that the user needs an account in order to use the file sharing service. What it also means is that we can have two types of users: free users and premium users. We can assume that the free users are using our service in a limited capacity; for example, they might have limited storage compared to the storage space that is available to premium users. Similarly, the network bandwidth
that we provide to the free users might be way less than the network bandwidth that we provide to the premium users.

The second requirement is that the user can have multiple devices. He can specify a root folder on any device, and whatever files and folders he creates in that root folder get replicated to all of the user's other devices. Right now we are assuming that the maximum file size we are going to support is 1 GB.

Another very important requirement that the interviewer can ask you to design is that a user should be able to share files and folders with other users: file and folder sharing. By default, the original owner of the files and folders has read/write access. When a user shares files or folders with other users, he can either give them read/write access or just read-only access. The original owner is also the one who can share the files and folders with other users. A user can only share a file or folder that he owns; a user cannot share a file or folder with other users if he is not its original owner. So the owner has full access, and full access means he can read and write the files, he can delete the files, and he can share the files, while other users can have either read/write or just read-only access to the files and folders.

The other requirement is that the system should support storing files and folders up to a certain limit only. Once a user has reached his file storage limit, the system should not allow any further writes.

Another requirement is that the system should allow offline creation, update or deletion of files and folders. What it means is that while a device is offline, the user can modify a file, delete a file, or even create a new file, and when that device gets connected with the Dropbox
service, all those changes get replicated to all the other devices of the user.

There can be some other extended functional requirements. For example, the system should allow multiple versions of a file, and the user should be able to recover the file to a previous version if he wants. However, this requirement is very tricky, because it imposes several design constraints and questions, as follows. The very first question that comes to mind is how the versioning should be performed: whether it is from one update to another update, or a daily snapshot of all the updates. Another question is how it will affect the files and folders that are shared among various users. If a user with read/write access decides to revert some changes to a file, how will those changes be replicated to the devices of all the other users? The third question is how it will affect the storage capacity of a user: whether the space consumed by each version or snapshot is counted towards the total space consumed by the user or not, and what should be done once a user has reached his limit. In that case, should we start dropping older versions, or should we just stop creating new versions, etc.? You should clarify these types of requirements and constraints with the interviewer during the interview, and then design the system based on what you decide with the interviewer.

Another extended requirement is that the system should be able to keep track of all the analytics related to storage and network consumption.

Also, one of the extended requirements, which comes into the picture when you start allowing sharing of files and folders among users, is what will happen if two users try to update the same file, and if a conflict arises because of that, how you will resolve that conflict.

Another important requirement is data security. The risk of data being read during transmission can be mitigated through encryption technology. Encryption in transit protects data as it
is being transmitted from the device to the cloud or vice versa. Encryption at rest protects data that is stored in the cloud service.

Another requirement is to provide a facility so that a user is able to search his files and folders. So these are the different functional requirements.

Now we are going to discuss some non-functional requirements of the system. I think by now, if you have seen all my previous videos, you must already know which are the important non-functional requirements that we need to discuss. The very first non-functional requirement is that the system needs to be highly available and fault tolerant. The second non-functional requirement is that the system should be highly scalable; it should scale with increasing load, or increasing users and data. The third requirement is that file synchronization should use minimal network bandwidth. The fourth non-functional requirement is that file transfer should happen with minimal latency. These two non-functional requirements, about file synchronization requiring minimal bandwidth and minimal latency, are very important, because due to these two requirements we will decide to upload every file by first dividing it into small segments or chunks. This will enable us to upload or download only the modified chunks of a file, and also, if the upload or download of a chunk fails, we only need to retry that particular chunk instead of retrying the upload or download of the complete file.

The fifth non-functional requirement is that the Dropbox service guarantees ACID properties for the files that are stored with it. ACID stands for atomicity, consistency, isolation and durability. Let's suppose a file was changed from one version to another on one of your devices, and those changes get replicated to all the other devices. So now, if you go to the second device, the file that was changed on the first device is now getting all the changes
which will move it from the first version to the second version. It's totally possible that, in order to move from the first version to the second version, multiple chunks got modified, so all those chunks get uploaded separately, and then on the second device they get downloaded separately. Now the issue is that if you just apply one chunk at a time, it's totally possible that the user will see the file on that device in a transient state, where he will see the first chunk get applied, then the second chunk get applied, and so on. That would break our atomicity requirement. Atomicity means either all or nothing, so the user should not see the transient changes while we are moving a file from one version to another version. How we will achieve this atomicity requirement in the Dropbox service is that, in our client-side Dropbox application, we will first apply all the changes to a temporary file. That file could be stored in a temporary folder, or maybe in the same folder with a different name, and it could be hidden from the user. Once all the changes are applied to the temporary file, we can use an atomic file operation like link or rename to switch the temporary file with the original file atomically. However, if you check the online sources, none of them actually discuss this.

The C in the ACID requirement stands for consistency. Consistency in the ACID properties means that data moves from one correct state to another correct state, with no possibility that readers could view different values that don't make sense together. For example, if a user has deleted a file on one device, then other users or devices should not see that file partially deleted first. That is an inconsistent state that would cause errors if someone tried to read or write the partial file. This is different from replication consistency, which we are going to discuss later, where an update happens on a device but takes
some time to actually get replicated to other devices.

The I in the ACID properties stands for isolation. It means transactions that are happening concurrently will not affect each other. In the case of the Dropbox service it means two things. First, updates to different files are totally isolated from each other; they do not affect each other. The second is related to updates of a single file on multiple devices: if a file gets updated on multiple devices at the same time, then one of them will have to wait for the other write to complete first. However, this is really hard to enforce in a distributed system. Consider updates that are happening to a single file on two different devices, where both of those devices are offline. If we have one requirement that says we allow offline updates to files, then this isolation requirement, where we try to make sure that updates don't happen to the same file at the same time on different devices, imposes a lot of constraints on our system. Our system now has to handle the case where updates happen to a file on two different devices that are both offline; when those two devices come online, those updates will flow to each other, and that could cause conflicts. Then we need some way to resolve those conflicts, which is a very hard problem.

The D in the ACID properties is durability. It means that once a change has been made and uploaded, it will not be lost. For example, if a file has been created on a device and synced to the remote file storage, then that file should not be lost. This does not imply that another operation later on will not modify or delete the file; it just means that the changes are available for the next operation to work with as necessary. It also implies that any writes that happen to the Dropbox service should be replicated to multiple data centers, even across continents, to make sure that any large-scale disaster, for example an earthquake or
flood or fire, that can wipe out an entire data center does not cause data loss.

Here I'd like to add one more thing. If you check other online sources, you will see them say that using a relational database provides these ACID guarantees, and that if you are using a NoSQL database, then you have to provide these ACID guarantees programmatically. But this is where these other resources are wrong. Just using a relational database will not provide the ACID guarantees, because we need to understand what ACID means in the case of the Dropbox service. If the A, atomicity, means that when a file changes from one version to another on a device the user should not see any transient state, then just having a relational database in the back end will not be enough. So I'd like to point out here: don't just assume, and don't tell the interviewer, that it is okay to go with a relational database and it is going to fulfil the ACID requirements, because that will be wrong, and then the interviewer will ask you what the ACID requirement means. You can see the same thing in the previous mock interview: when the candidate discussed the ACID requirements, I specifically asked him what ACID means in the case of the Dropbox service.

Now the sixth requirement is replication consistency. Replication consistency here is different from the consistency in the ACID requirement. What it means here is that when an update happens on a device, it will take some time for it to get replicated to the file storage service and also to other devices, so the replication consistency will be eventual. We cannot have strong replication consistency here due to the nature of the problem. We are allowing offline updates to a file, which means a user can update a file on a device that is offline, and only when the device comes online will the changes get replicated to the file storage service, and from there they can go to
other devices, only when those devices are online as well. So the replication consistency will be eventual in the case of the Dropbox service, and please do not confuse replication consistency with the consistency in the ACID requirement. By the way, if you want more details on the requirements and the design of the Dropbox service, you should check out my course; the link is in the description below.

Now let's discuss some APIs that our service will expose. The very first API is upload file. It takes a user token, file metadata and file content; the file content could be a stream of the different chunks of the file. Upload file returns a file ID. There can be other APIs, like update file metadata, which takes a user token, a file ID and the file metadata. The file metadata could be things like file creation time, update time, file size, etc. Another API could be update file. It takes a user token, a file ID, the changed metadata and the changed segments, where the changed segments are the input stream for all the chunks or segments of the file that have changed. Similarly, we can have an API to delete a file, delete file, which takes a user token and a file ID. A user can also list files, so there could be a list files API, which takes a user token, root folder, page size and page token. Another API is share file or folder, which takes a user token, a file or folder ID, another user's ID and a set of permissions; the permissions can be read/write or read-only. Similarly, there can be an API to stop sharing, stop share, which takes a user token and a namespace ID. By the way, this namespace ID is the ID that is returned by the share file or folder function; I will discuss the namespace concept later. And of course a user can always list all the shared namespaces: list shared namespaces takes a user token as input and returns all the shared namespaces for the user.

Now let's discuss what a Dropbox namespace means.
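Before moving on, the API list above can be written down as a small interface. The following is only a minimal in-memory sketch to make the parameters and return values concrete, not the actual Dropbox API. The class name `InMemoryFileService`, the method names, the `"path"` metadata key, treating the user token as the user ID, passing changed segments as a dict of chunk index to bytes, and using a string offset as the page token are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

READ_ONLY = "read-only"
READ_WRITE = "read-write"


@dataclass
class FileRecord:
    owner: str
    metadata: Dict[str, object]
    chunks: Dict[int, bytes]  # chunk index -> chunk bytes


@dataclass
class SharedNamespace:
    owner: str
    item_id: str
    grantee: str
    permission: str


class InMemoryFileService:
    """Toy stand-in for the service. Assumption: the user token is the user ID."""

    def __init__(self) -> None:
        self.files: Dict[str, FileRecord] = {}
        self.namespaces: Dict[str, SharedNamespace] = {}

    def upload_file(self, user_token: str, file_metadata: dict,
                    file_content: Iterable[bytes]) -> str:
        file_id = uuid.uuid4().hex
        self.files[file_id] = FileRecord(
            user_token, dict(file_metadata), dict(enumerate(file_content)))
        return file_id

    def update_file_metadata(self, user_token: str, file_id: str,
                             file_metadata: dict) -> None:
        self.files[file_id].metadata.update(file_metadata)

    def update_file(self, user_token: str, file_id: str, changed_metadata: dict,
                    changed_segments: Dict[int, bytes]) -> None:
        record = self.files[file_id]
        record.metadata.update(changed_metadata)
        record.chunks.update(changed_segments)  # only modified chunks are sent

    def delete_file(self, user_token: str, file_id: str) -> None:
        del self.files[file_id]

    def list_files(self, user_token: str, root_folder: str, page_size: int,
                   page_token: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
        start = int(page_token or 0)
        matches = sorted(
            fid for fid, rec in self.files.items()
            if rec.owner == user_token
            and str(rec.metadata.get("path", "")).startswith(root_folder))
        end = start + page_size
        return matches[start:end], (str(end) if end < len(matches) else None)

    def share_file_or_folder(self, user_token: str, item_id: str,
                             other_user_id: str, permission: str) -> str:
        # Per the requirements, only the original owner may share.
        if self.files[item_id].owner != user_token:
            raise PermissionError("only the owner can share this item")
        namespace_id = uuid.uuid4().hex
        self.namespaces[namespace_id] = SharedNamespace(
            user_token, item_id, other_user_id, permission)
        return namespace_id

    def stop_share(self, user_token: str, namespace_id: str) -> None:
        if self.namespaces[namespace_id].owner != user_token:
            raise PermissionError("only the owner can stop sharing")
        del self.namespaces[namespace_id]

    def list_shared_namespaces(self, user_token: str) -> List[str]:
        return [nid for nid, ns in self.namespaces.items()
                if user_token in (ns.owner, ns.grantee)]
```

A real service would validate the token, enforce read/write versus read-only access on updates, and check storage quotas; those checks are left out here to keep the shape of the calls visible.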
In order to explain what a Dropbox namespace is, I'm going to consider an example. Suppose we have two users, user A and user B. User A has this root folder, where it has file 1 and file 4, and then a subfolder, folder 1, which has file 2 and file 3. User B also has some files: let's say file 11, and then a folder, folder 12, which has file 12 in it. Here, this collection of files and folders is called user A's root or home namespace. Similarly, this collection of files and folders is called user B's home namespace. What is a namespace? A namespace is nothing but a collection of files and folders with certain access permissions.

Now what happens is that user A shares file 4 with user B. Let's suppose file 4 gets shared here; I'm just giving it the name file 4 bar. At that time, when user A shares file 4 with user B, it moves from the home namespace to a shared namespace, and this file 4 bar here in user B is a proxy namespace that gets created when user A shares the file. So we have three types of namespaces: number one, the home namespace; number two, the shared namespace; and number three, the proxy namespace. So this was essentially the home namespace; when user A shares a file with user B, we create a shared namespace for user A and a proxy namespace for user B, and this proxy namespace points to the shared namespace. You will find all the details about this in the chapter on the Dropbox service design in my course.

Now let's discuss the overall high-level design of the Dropbox service. On my right side we have a client application, and the client application itself comprises five different components. We have a component called the watcher. The watcher component in the client application looks for all the file and folder changes that are happening under the root folder that the user has specified to be shared with all the other devices and with the Dropbox service. Now
we have a chunker. The chunker is a component that is responsible for breaking a file into multiple chunks, and it is also responsible for calculating the cryptographic hash of each chunk. The cryptographic hash function could be SHA-256 or SHA-512. Then we have the file indexer. The file indexer stores an index of all the files that are stored within the root folder the user has specified to the Dropbox service. The internal database is a file database stored in the client app, where the indexer stores all the metadata about each file and all its different chunks. The Dropbox client application can use any lightweight database, like SQLite or Berkeley DB, for the internal DB. Then the synchronizer component in the client application is the one responsible for syncing all the changes that happen on the device to the remote Dropbox service, or vice versa. You can find more details about these components and their design in my course.

But there are some questions I would ask you here. The very first question: we already decided that we are going to divide every file into small chunks or segments, and the reason we are doing that is that we have two non-functional requirements that cause us to upload or download a file in the form of chunks. The question is, what would be a suitable chunk size? Should we use 4 KB, 8 KB, 64 KB, 1 MB, 4 MB, 64 MB? What do you think would be the most suitable chunk size, and how would you calculate an optimal chunk size for a file?

So this was the client application, and as you can see, the client application communicates with the Dropbox service over the internet. Now, these are the different microservices within the Dropbox service: we have a gateway service, we have a synchronization service, we have a file and folder metadata service, we have a user and devices service, we have a notification service, we have an object storage
service, we have a block service or chunk service, and we have a billing service. I'm not going into the details of the design of the gateway service; if you see my previous video on designing Uber, you will find it there. The user and devices service stores information about the user and all the devices that the user has. The file and folder metadata service stores all the metadata about a file, and if a file is shared with multiple users, that sharing information is also stored in the file metadata service.

The synchronization service is the main component that talks with the synchronizer in the client application. Whenever changes happen on a device, the synchronizer informs the synchronization service of those changes, and from the synchronization service the changes are stored in the block service, the object storage service and the file metadata service for the file. Whenever a new file gets created, the information is stored here in the file metadata service. Similarly, when a file gets shared with another user, the information gets added here in the file and folder metadata service. And similarly, on the user's other devices, if they are running and online, the synchronizer component in the client app communicates with the synchronization service to receive all the changes that have been made to different files under the user's namespace or the shared namespaces that the device is subscribed to.

If you check the online sources, I think none of them discuss the internal design of these services or the database schemas used by these services. You can find that information in more detail in my course. If you go and check my course, you will find that there are multiple ways to store file metadata, and I have discussed
all those different approaches, along with their pros and cons. This is very, very important if you are applying for an L5, L6 or senior/principal-level engineering job. What the interviewer is looking for is different trade-offs. The interviewer doesn't want you to design a system, give just one design approach, and that's it. He actually loves it if you provide multiple approaches, discuss the pros and cons of the different approaches, and then, based on that, decide which approach is better.

For example, there are multiple ways to design the file metadata database schema. I'm not going to address them right now; you can either check them in my course or, if you want, you can google the different approaches to storing file metadata. The first approach is the adjacency list model using a parent reference. The second approach is the adjacency list model with a children list. The third approach is the materialized path. And the fourth approach is the nested sets model. You can get the details of those models either in my course or by googling them to see what they look like, but in my course I'm also defining the schemas based on these different approaches.

Now there are some questions here. If you use one approach to design the file metadata, how will file sharing actually happen with that approach, compared to a different approach to designing the metadata? For example, if a user wants to stop sharing files or folders with other users, what needs to be done, and what changes need to be made in the metadata service in that case?

As far as the block service is concerned, this is a block or chunk service that stores the metadata about each chunk. Of course, the chunks themselves are stored in the object storage, but we store the chunk metadata in the block service or chunk service. This block service
is also responsible for generating version information for the files, and in the course you will find the details of two approaches that we can use to generate versions or snapshots of each file.

In this video I haven't discussed the design of the synchronization service in detail, but let me tell you that it comprises some app servers, some distributed queues, for example Kafka, and some agents that listen on those queues. So what happens now: whenever a user creates a file, the synchronizer informs the synchronization service, or actually the app servers here, about the file creation. The request goes to one of the app servers, and the app server calls into the file metadata service to create an entry for that file with the status pending. After that, all the file chunks are uploaded, and whenever a chunk gets uploaded to object storage, the metadata for that chunk also gets added here in the block service. Once all the chunks are uploaded, we go and change the status from pending to uploaded in the file metadata for the file.

Similarly, when a file gets updated, the synchronizer in the app uploads the chunks that were modified; those are written into the object storage, and then the synchronization service calls into the block service to insert the new entries for those modified chunks.

Similarly, when a file gets deleted, the synchronizer informs the synchronization service, and then the synchronization service marks the file as deleted in the file metadata. And then of course we run a workflow in the background that goes and deletes all the chunks of that file from the block service and the object storage.

So the main point I'd like to convey in this video is that the very first important thing is to always come up with the functional and non-functional requirements. Once you have discussed the functional and non-functional requirements with the interviewer, then you have to
of course design the system based on those requirements, and it is always better to design the system in terms of different microservices. You see, I'm not putting any load balancer here, because right now these are just logical diagrams of the different microservices. What I would like to say is: do not try to design a system as a monolithic service in an interview. Always design the service in terms of different microservices, and you can go deep into the design of those components individually later on. That's why I am not showing a load balancer in this design right now, but of course each service will have some app servers behind a load balancer.

I hope you found some useful information in this video. If you found the video useful, please do like the video, please do comment below as well, and please do subscribe to my channel. There will be more videos coming soon. Thank you and take care.
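Two client-side ideas from the talk, the chunker that hashes each chunk and the atomic switch from one file version to the next, can be illustrated together. The code below is a rough sketch under stated assumptions rather than how the Dropbox client is actually written: it assumes fixed-size chunks, picks 4 MB as an arbitrary default from the sizes mentioned in the talk, assumes the target file already exists, and the function names `chunk_hashes`, `changed_chunks` and `apply_update_atomically` are made up for this example. The swap uses Python's `os.replace`, which renames the temporary file over the original in one step.

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 4 * 1024 * 1024  # assumed default; the talk leaves the size open


def chunk_hashes(path, chunk_size=CHUNK_SIZE):
    """Split a file into fixed-size chunks and return the SHA-256 of each."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes


def changed_chunks(old_hashes, new_hashes):
    """Indices of chunks in the new version that differ from the old version."""
    return [i for i, h in enumerate(new_hashes)
            if i >= len(old_hashes) or old_hashes[i] != h]


def apply_update_atomically(path, new_chunk_count, changed, chunk_size=CHUNK_SIZE):
    """Build the new version in a hidden temp file, then swap it in.

    `changed` maps chunk index -> downloaded bytes; unchanged chunks are
    copied from the current file. The user never sees a half-applied file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".sync-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(path, "rb") as current:
            for index in range(new_chunk_count):
                if index in changed:
                    tmp.write(changed[index])
                else:
                    current.seek(index * chunk_size)
                    tmp.write(current.read(chunk_size))
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic switch to the new version
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With fixed-size chunks, inserting a byte near the start of a file shifts every later chunk, so all of them hash differently; that is one of the trade-offs behind the chunk-size question raised in the talk.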
Info
Channel: Think Software
Views: 13,867
Keywords: Distributed Systems Design, Distributed System Design Interview, Think Software, Software Developer, Google Interview questions, Facebook interview questions, Amazon interview questions, System Design Interview, Detailed System Design Interview, distributed systems, file sharing system design, system design interview questions, grokking the system design interview, system design interview, grokking the system interview, google drive system design, dropbox system design
Id: 3RHjRXWAUvg
Length: 27min 53sec (1673 seconds)
Published: Fri Jan 08 2021