System Design Interview: Design Netflix

Captions
Interviewer: Hi everyone, we're here today for another Exponent mock interview with Andreas. For those of you who aren't familiar with Exponent: Exponent helps you get your dream tech career with our online courses, expert coaching, peer-to-peer mock interviewing platform, and interview question database. Check it out at tryexponent.com. All right, thanks so much for being here with us today, Andreas. Would you quickly introduce yourself to our viewers?

Andreas: Yeah, I'd be happy to. Hey everyone, my name is Andreas. I'm a software engineer currently at Modern Health, which is a mental health tech startup.

Interviewer: Awesome. All right, so let's get right into it. The question today is: design Netflix.

Andreas: So Netflix is composed of things common to a lot of tech platforms: you have users, they can find some sort of video content, in this case movies and TV shows, and then they want to watch that content. Given those things, are we designing the high-level system to include everything from subscriptions, users, search, and videos, or would you like me to focus only on a particular part?

Interviewer: Let's focus on the core product for now, so that would be the videos and the users that you described, and you can ignore other features like search or subscriptions.

Andreas: Okay, awesome. I'll write that down: users and videos. When you say users, do we want to worry about authentication, or just users in relation to the videos?

Interviewer: You specifically need to focus on the user activity data, which is the data that would go into the video recommendation engine. You'll need to find a way to both aggregate and process the user activity data on the website.

Andreas: Okay, great. And given those things, will I need to come up with some sort of algorithm for this recommendation engine?

Interviewer: Not in this case, that'll be outside the scope of this question, but I'd like you to focus on how the
data will be gathered and processed for those recommendations.

Andreas: Okay, great. So we have this video service and we have our recommendation engine, and for the video streaming service, I'm guessing it'll be important that we have high availability, even with a trade-off of decent rather than strong consistency, because we'll want fast, low latencies around the world. Is that correct?

Interviewer: Yes, that's correct. You should definitely account for that in your design.

Andreas: Okay, awesome. So: low latency, and we'll also say global. Great. Now that we have all these things, for the recommendation engine, do you think we can run that as a background asynchronous job?

Interviewer: Yeah, that sounds great.

Andreas: Okay, awesome, so I'm building a list of requirements here. A larger question, as we transition to scale: how many users can we assume Netflix has?

Interviewer: Maybe about 200 million.

Andreas: Okay, cool, 200 million users. Based off of this number, I'd like to take us through some data storage estimations, and then we'll come organically to some of the architectural decisions along the way, if that sounds good to you.

Interviewer: Sure.

Andreas: Perfect. Okay, awesome. So we have a few different types of data that we're going to need. First, video content, and static content tied to the video: things like the names of the movies, descriptions, thumbnails, and maybe a cast list, since we're doing movies and shows. Then we'll also need user metadata, such as the last-watched timestamp if they have watched a show, so we can return them to the place in the movie where they were watching, and maybe likes on a movie, since we can thumbs-up or thumbs-down a movie on Netflix. Then the last bit that we talked about: so we have this user metadata, the video static content, the
video content itself, and then these activity logs. Although the activity logs might overlap with some of the user metadata, I think we can track all sorts of events like clicks, impressions, and scrolls, and we can fine-tune that a little more based on what we're looking to optimize for in the recommendations. But since that is probably a lot larger in scope, for now I'll keep it out of scope, if that's okay with you, and if we have extra time we can go into it.

Interviewer: Yeah, that makes sense.

Andreas: Great. In terms of the size of some of these, we can start with the video content. We have 200 million users; how many movies and shows can we estimate that Netflix has?

Interviewer: Let's say about ten thousand.

Andreas: Okay, great, so ten thousand videos. Given that movies are on average two hours and each episode of a show is maybe 20 to 40 minutes, can we say a video is on average an hour long overall?

Interviewer: That would be assuming it's strictly about half movies and half TV shows, right?

Andreas: Yes, is that all right?

Interviewer: That's an okay assumption.

Andreas: Okay, cool. Netflix also supports different levels of definition, so we have a standard-definition version and a high-definition version. Let's say that standard definition is 10 gigabytes an hour and high definition is 20 gigabytes an hour. So with an hour average per video and 30 gigabytes per hour of video across both versions, we can estimate around 300,000 gigabytes, or 300 terabytes.

Interviewer: Awesome.

Andreas: This is actually a lot smaller than many other video stores, luckily, because it's not user-generated content like YouTube or Facebook video, so we have a ceiling on the amount of video being hosted. Given that assumption, we could
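The back-of-envelope storage estimate above can be sketched as a quick script. The figures are the assumptions agreed on in the interview, not real Netflix numbers:

```python
# Storage estimate for the video catalog, using the interview's assumptions.
NUM_VIDEOS = 10_000          # assumed catalog size
AVG_HOURS_PER_VIDEO = 1      # half 2-hour movies, half ~30-minute episodes
SD_GB_PER_HOUR = 10          # assumed standard-definition size
HD_GB_PER_HOUR = 20          # assumed high-definition size

# Both versions of every video are stored, so add the per-hour sizes.
gb_per_hour_both = SD_GB_PER_HOUR + HD_GB_PER_HOUR
total_gb = NUM_VIDEOS * AVG_HOURS_PER_VIDEO * gb_per_hour_both
print(total_gb)  # 300000 GB, i.e. ~300 TB
```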
probably use something like blob storage, like Amazon S3 or Google Cloud Storage, and just store and replicate the content there. Does that sound okay?

Interviewer: Yeah. So why is blob storage in particular better for this type of data?

Andreas: In this case, rather than using an RDBMS, for example, I think it makes sense to put the videos in blob stores. If we had ever-increasing amounts of data, ever-growing blob stores would not be great to sort through, find things in, and index, but since we have a fixed ceiling on the amount of content, I think it's okay to search across all of it. We also don't want to store the videos inline in a database, because they're pretty large files individually.

Interviewer: Okay, yeah, that makes sense.

Andreas: Awesome. I'll start sketching some of the design as we're going through this. We have a blob store here, and we can assume we're adding videos to it somehow, either through an API or a client used to upload videos into the blob store. If that's all right, we won't go too deep into that; we can just assume we can upload videos into our system on our side.

Interviewer: Sure.

Andreas: Okay, cool. So now we have this blob store. Next, we'll want to look at the static content, since we have these videos stored. For the video static content, we'll need to store the things we talked about: titles, descriptions, cast lists, and so on. That data will be correlated with the number of videos, because we're storing this metadata per video. In that case, we can probably store it in a standard relational database, or I guess we could also use a document store,
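A blob store is essentially a key-value store of large immutable objects. As a toy sketch of the idea, the in-memory class below stands in for S3 or GCS, and the `<video_id>/<quality>.mp4` key layout is a hypothetical convention for this design, not a real S3 scheme:

```python
# Minimal in-memory stand-in for a blob store (e.g. S3 / GCS), used only to
# illustrate a possible key layout for storing both quality levels per video.
class BlobStore:
    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

def video_key(video_id: str, quality: str) -> str:
    # Hypothetical key scheme: one object per (video, quality) pair.
    return f"{video_id}/{quality}.mp4"

store = BlobStore()
store.put(video_key("vid123", "sd"), b"...sd bytes...")
store.put(video_key("vid123", "hd"), b"...hd bytes...")
print(video_key("vid123", "hd"))  # vid123/hd.mp4
```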
but in this case I think I would go with something like Postgres. Then we'll probably want to add a cache to it, just so we can make reads faster for the client, since it's not too much data. Does that sound okay to you?

Interviewer: Sure. A brief question about this: what would you be caching?

Andreas: We'd cache the most frequently accessed content for a given user. Let's say they access a certain series a lot, like a show they've been watching; we'd cache that metadata as they go to watch it, so on subsequent views we don't have to re-fetch all of it.

Interviewer: Okay, yeah, that makes sense, so it'll make their user experience faster.

Andreas: Yes, 100 percent; that's the latency we were talking about, trying to increase the speed of all of these parts of the service. Cool, I'll add that to the diagram. So we'll have Postgres here, I'll call this the static content, and then we'll have a bunch of API services. We can assume the users are making calls to the APIs from the client, and because we're creating a distributed system, we'll assume horizontal scaling: we can just add more of these services as we go. We'll also have a cache, with a standard write-through policy, so as they make the call we'll update the cache and then go to Postgres. Great, so we have that. Our final type of data: we have the video content right here, we have the video static content, and then finally the user metadata. This user metadata will be all the things we talked about, like the last-watched timestamp and likes. Since the metadata is tied to the number of videos as well, that will be the lower
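A possible relational schema for the video static content might look like the one below. It's a sketch under the interview's assumptions; `sqlite3` stands in for Postgres, and the table and column names are illustrative:

```python
import sqlite3

# Illustrative schema for video static content (titles, descriptions, cast).
# sqlite3 is used here only as an in-memory stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE videos (
    video_id     TEXT PRIMARY KEY,
    title        TEXT NOT NULL,
    description  TEXT,
    duration_sec INTEGER
);
CREATE TABLE cast_members (
    video_id TEXT REFERENCES videos(video_id),
    actor    TEXT,
    PRIMARY KEY (video_id, actor)
);
""")
conn.execute("INSERT INTO videos VALUES (?, ?, ?, ?)",
             ("vid123", "Example Show S1E1", "A sample description.", 3600))
conn.execute("INSERT INTO cast_members VALUES (?, ?)",
             ("vid123", "Example Actor"))

title = conn.execute("SELECT title FROM videos WHERE video_id = ?",
                     ("vid123",)).fetchone()[0]
print(title)  # Example Show S1E1
```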
amount; it's tied to both the number of videos and the number of users, but the number of users is going to keep increasing at a higher rate than the number of videos, or at least that's what we hope for our service. Since we have 200 million users, which is a lot more than 10,000 videos, we can assume maybe a thousand videos watched per user over the lifetime they're on our application, which is about 10 percent of the total content we assumed earlier. And per video, let's say the information we store, the watch data, likes, and things like that, is maybe a hundred bytes. Do those assumptions seem okay so far?

Interviewer: Okay, great.

Andreas: That puts us at 100 kilobytes per user, times 200 million users, which is about 20 terabytes. So we have 20 terabytes of data here, and our video content is actually more than this, which is great; it means we don't have to do anything too crazy. But since we'll need to query this data, whereas for our video content we just needed to pull it from a store, I don't think it should go in an S3 bucket. I think we can go with something like Postgres, since we're already using that right here for the video static content. But we will need it to be highly available, because like we talked about, we want low latency. We have the option of going NoSQL with something like Cassandra, but I think in this case we should use an RDBMS, and if we're going that route, we definitely need to shard the system. I would say we could shard the user metadata indexed by user ID, which I think makes sense since it's so user-
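The metadata estimate and the shard routing can both be sketched briefly. The storage figures are the interview's assumptions; the shard count of 4 is an illustrative choice that keeps each shard at about 5 TB, inside the 1-to-10-terabyte target mentioned below:

```python
# User-metadata storage estimate, using the interview's assumptions.
USERS = 200_000_000
VIDEOS_WATCHED_PER_USER = 1_000
BYTES_PER_VIDEO_RECORD = 100

per_user_bytes = VIDEOS_WATCHED_PER_USER * BYTES_PER_VIDEO_RECORD  # 100 KB
total_tb = USERS * per_user_bytes / 10**12
print(total_tb)  # 20.0 TB

# Hypothetical range-based sharding: contiguous user-id ranges map to one
# shard, so all metadata for a given user lives together on one shard.
NUM_SHARDS = 4                          # illustrative: ~5 TB per shard
USERS_PER_SHARD = USERS // NUM_SHARDS   # 50 million users per shard

def shard_for(user_id: int) -> int:
    return min(user_id // USERS_PER_SHARD, NUM_SHARDS - 1)

print(shard_for(125_000_000))  # 2
```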
focused. Each shard we can say holds 1 to 10 terabytes of data, and that way we can have really quick reads and writes on the related data.

Interviewer: To recap, that means each shard is actually assigned to a range of user IDs, right?

Andreas: Yes, exactly.

Interviewer: Gotcha, okay, great point.

Andreas: Yeah, not an individual user ID per shard; that would be a lot of shards.

Interviewer: Okay, so that makes a lot of sense. Is there a particular reason why you chose Postgres instead of NoSQL?

Andreas: Yeah. Although it's more complex for us to set up a sharded system, and NoSQL would give us a lot of this out of the box, being able to make complex queries on this large set of data, including for the background jobs we're running, is in my opinion the advantage of Postgres over NoSQL here.

Interviewer: Right, okay, that makes sense.

Andreas: Cool, awesome. So I'll also connect this up here to our API services. And we can use an in-memory cache for this, maybe Redis or Memcached; I don't know if I mentioned that before, but we'll cache both pieces of content, the static content and the user metadata, and this is Postgres also. Okay, awesome. Actually, I'm realizing one thing I wanted to mention as we're going through this: since we have these horizontally distributed services, we'll want a load balancer to sit in front of them to distribute the load evenly between all of them.

Interviewer: And this would be the load from users, right, not from internal servers?

Andreas: Yes, great point. Since we'll have asynchronous jobs, I imagine we won't need to balance load from those. However, if we are uploading a lot of information, we
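The write-through caching policy mentioned above can be sketched as follows. This is a toy model: the dicts stand in for Redis/Memcached and Postgres, and a real deployment would also need TTLs, eviction, and failure handling:

```python
# Write-through cache sketch: every write goes to the cache AND the backing
# store, so reads can usually be served from the cache without a DB trip.
class WriteThroughCache:
    def __init__(self, db: dict):
        self.db = db        # stand-in for Postgres
        self.cache = {}     # stand-in for Redis / Memcached

    def write(self, key, value):
        self.cache[key] = value   # update the cache...
        self.db[key] = value      # ...and persist to the backing store

    def read(self, key):
        if key in self.cache:     # cache hit: no DB access needed
            return self.cache[key]
        value = self.db[key]      # cache miss: fall back to the DB...
        self.cache[key] = value   # ...and populate the cache for next time
        return value

db = {}
c = WriteThroughCache(db)
c.write("user42:last_watched", {"video_id": "vid123", "timestamp_sec": 1800})
print(c.read("user42:last_watched")["video_id"])  # vid123
```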
might want to consider a load balancer on that side too, for the back-end calls that upload videos.

Interviewer: I think that's a great point. Okay, great. Are there any ways that we can improve the speed of the requests?

Andreas: Yes. As I talked about, we have this Redis cache, and that's where it comes in: because we're using write-through caching, we'll have speed on both the user metadata and the static content.

Interviewer: Okay, great. So for each user, you basically cache their most recent metadata?

Andreas: Yes, exactly, and keeping that hopefully also increases our availability.

Interviewer: Gotcha, okay.

Andreas: Now that we have all this put together, there's probably one last consideration, also related to speed and lowering latency: the global nature of these videos. Since we're serving them all around the world, how can we make this fast? Right now we're assuming maybe one data center, but we have users all over the world. In this case, we can add a CDN between our blob store and the end client. We can run a job, I'll call it the CDN populator, that connects to the CDN itself. From the CDN we'll serve the most relevant content to users around the world based on where they're geographically located. For example, Bridgerton came out in the US and a bunch of people were watching it, so we'd probably want to cache it there. The populator will take care of the geographic population of each of the individual CDN nodes based on location.

Interviewer: Okay, thanks so much, Andreas, that was really great. I think now would be a good time
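A simple version of the CDN populator could periodically push each region's most-watched titles to that region's edge nodes. Below is a toy sketch of the selection logic only; the regions, view counts, and the `top_n` cutoff are made up for illustration:

```python
from collections import Counter

# Hypothetical per-region view counts aggregated from user activity logs.
views = {
    "us": Counter({"bridgerton": 5000, "the_crown": 1200, "dark": 300}),
    "de": Counter({"dark": 4000, "bridgerton": 900, "the_crown": 250}),
}

def titles_to_populate(region_views, top_n=2):
    """Pick the top-N most-watched titles per region for CDN pre-caching."""
    return {region: [title for title, _ in counts.most_common(top_n)]
            for region, counts in region_views.items()}

plan = titles_to_populate(views)
print(plan["us"])  # ['bridgerton', 'the_crown']
print(plan["de"])  # ['dark', 'bridgerton']
```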
to debrief a little bit. What do you think went well, and what do you think you would want to improve in the future?

Andreas: I definitely think it was cool walking through how Netflix works. A lot of the time we don't think about everything that goes on behind the scenes, because it's so seamless when we're using it, or hopefully it is. There were a lot of calculations, so I'll give myself a pat on the back for keeping my head on straight through those. If I were to improve on something, it would probably be the manner in which I went through all of this. Doing the calculations and the diagram concurrently felt a little jarring, even to me, going back and forth between them. Instead, I would summarize all the calculations and thoughts I had and then go to the board to design it. That's not the only improvement, but it's what immediately comes to mind.

Interviewer: Yeah, I definitely agree with a lot of those points, actually. I particularly thought you did a really good job of considering what the user experience would be like and incorporating those considerations into your design. For example, you thought about how the user might want really low latency, and they want their videos to load really quickly, so you did a lot of caching to ensure that. You also talked about how we're going to have users all over the world, and the media might vary depending on where in the world they are, so having a globally distributed CDN helps a lot with that. Those are both really great considerations. You also did a fantastic job of considering trade-offs. I always think that being able to analyze a system using the CAP theorem is a great way to decide what to optimize for in a design versus what is okay to sacrifice a little bit, and you did a fantastic job of that. And I agree, sometimes it can be a little bit confusing to start out by trying to list all of the detailed calculations; working through the higher-level design and knowing how different components connect to each other can help you work out the details a little bit more. So once you've gathered the requirements and asked all of your clarifying questions, I think it's okay to start with some of the higher-level components if that helps you better conceptualize the idea in your head. How does that all sound?

Andreas: Yeah, I completely agree. Thanks for the feedback.

Interviewer: Otherwise, this was a really interesting interview to do with you, so well done.

Andreas: Thank you, and thanks to everybody who's watching. Good luck on your upcoming interviews.

Interviewer: Thanks so much for watching. Don't forget to hit the like and subscribe buttons below to let us know that this video is valuable for you, and of course check out hundreds more videos just like this at tryexponent.com. Thanks for watching, and good luck on your upcoming interview.
Info
Channel: Exponent
Views: 16,624
Id: VvZf7lISfgs
Length: 27min 50sec (1670 seconds)
Published: Wed Nov 10 2021