This course is a detailed introduction to system design for software developers and engineers. Gaurav Sen developed this course. He is an experienced software engineer and he also has a popular YouTube channel. You will learn about basic engineering design patterns that are used to build large-scale distributed systems. In the second part of the course you will learn how to use the principles from the first part to design and code a live streaming video app.
Hello everyone. Welcome to the System Design course. This course is for beginners in the
sense that if you have never done system design before, or you have done
very little system design, or you have just heard of it or read about it
but have not had an opportunity to actually do it, this course is right for you. At the end of this course, you'll be
able to identify some basic engineering design patterns, which are used to
design large scale distributed systems. Let me define each of
these terms in detail. Take "large scale distributed systems".
The first part of the term, large scale, means something
which is being used a lot or is very intensive in terms of compute, data,
or any other computing resource. For example, take Google Maps. This is large scale because
A: it has a lot of data. The whole world's map has
to be stored inside it. B: it's being used by a lot of people. C: it's being updated very
frequently. And D: it has a lot of performance expectations placed on it. You don't expect Google Maps to go down. You expect Google Maps to
return results quickly. You expect it to be
accurate, and so on and so forth. So it's a very good product
being used at a large scale. Distributed systems means that the server
or the code that is actually executing this program is not in one place. It's distributed all around the world. So you might have one server in
India, one server in the US, and one in Japan for fault tolerance, so that if one
of these servers crashes, the rest of the servers can take the load. It's also for performance. If users in India want a Google Maps result,
they go and talk to the Indian server because that gives a quick result back. If they go to the US server instead, the request
has to go across continents and come back, which is a slow result. Okay? So that's the point of a large
scale distributed system. Many of the companies which
provide these solutions expect their engineers to know about system
design, to know how to build these large scale distributed systems. To do this, their engineering
teams depend on design patterns, which we mentioned earlier. Design patterns are particular practices,
principles, or processes which are used by engineers to build these systems. For example, you have the common
problem of a celebrity posting on social media and that post being
distributed to a lot of their followers. Whether Brad Pitt posts
something on YouTube or posts on LinkedIn, the problem is very similar. It's one piece of content which has
to be made into an event and notified to potentially millions of people. You want to
notify them quickly, but you also don't want to put too much load on your service,
so that the rest of the requests coming into it continue being served. This is a common problem, so you
extract out the common problem and you solve it using a common solution. This would be a design pattern. Right now, you don't know what design
patterns exist, but one popular design pattern for this kind of problem is
the publisher-subscriber model, where Brad Pitt is a publisher and his event is
subscribed to by millions of people. An intermediary decides the pace at
which you send these notifications, which keeps the server load low and also
makes sure that all the notifications actually reach all the subscribers. Seeing this, you can tell that as an
engineer you take business requirements and convert them into
technical solutions, and engineers use system design patterns to make those solutions reliable,
scalable, and maintainable. If this sounds exciting to you,
watch the rest of the video and you won't be disappointed. Okay, let's start with an example. Let's say that you join a company
which broadcasts videos to millions of people. This could be Hotstar, it could
be YouTube, it could be Zoom also. So, some sort of event which is
being broadcast to millions of people. And you as an engineer have to devise
the solution for this, come up with some technology which can handle this problem. The first thing that we need to
do as engineers is to define the requirements from the user's perspective. Often engineers don't do this themselves. There are product managers in a company
who write a product requirement document based on user feedback and data. That's good to have. You have a well-documented, well-thought-out,
business-backed document, which the engineer can then read and decide
how they can make it happen. Amongst these requirements, you pick up the
most important features first. Taking the example of a live streaming
system, users who tune into your system should be able to watch the stream live. Whether you see it at HD quality or
at 480p is probably a secondary issue, but being able
to see the video is a primary issue. So for that, you have to make sure
that your server doesn't go down, and you also have to make sure that
you have sufficient bandwidth. Second, you reduce these
features to data definitions. One of the features which might be
defined is that users should be able to like a video or like a comment. What you then do is look at
the concept of a like, the abstract concept of somebody liking something. What does that mean? It means that a particular user
likes a particular comment. Okay, so it has a like ID, and it
has the user ID of whoever did this. It has a timestamp: when was this video liked, or
when was this comment liked? The comment itself is an abstract concept. It's probably going to
be mapped into an object. It has an ID, it has a user
who typed this comment, it has a creation time, and maybe it also
has a thread in which it was posted. So you're seeing that I'm taking features,
these abstract concepts from the product requirement document, and
converting them into data definitions, which are useful for an engineer. These definitions can then be
mapped into objects, which can then be mapped into the database. Once you've defined the data that
you need to store, you need to define endpoints through which this
data can be manipulated or queried. So I want to read comments: give me an
API, an HTTP API or an FTP API for all we care. The network protocol at this point
is not important. What matters is some method by which I can send an electronic
signal from one place to our server, and our server can respond to that
signal with the data that is needed. So our server is now encapsulating
this data as per the user requirements and defining some
endpoints, which are called APIs, so that external users can
query and manipulate our data. We do this for every feature that
we have, core features and also optional or good-to-have features. Usually a product requirement
document does not define optional or good-to-have features. The features required
in the document are all core features. The good-to-have features will
probably be picked up in the next document, so you don't need to think
about which features are optional, but as an engineer you do think about which features
are most important. The second thing is you also have
certain engineering requirements. When you're creating these designs or
coming up with some code, you want to make sure that your services don't all fail
if there is an outage. So for example, if you have an outage in India and your
comment service is in India, then you don't want the entire system to collapse. We talked about this a little
earlier, that you have multiple servers spread across the world to
avoid a single point of failure. You may also have multiple
servers in India itself, so if one Indian server fails, another
one picks up its responsibility. Here you might have some sort
of data duplication, because of which the other server is quickly
able to pick up the responsibility. Or you have some sort of partitioning,
where 50% of the users are going to this server and 50% of the users
are going to another server, so just 50% of the users are affected.
Another common and often missed engineering requirement is
extensibility, which means it's not just about the technical solution
that you come up with, but also how easy it is to change that solution. For example, say you write code to
send a message to millions of users, and you need to make a small tweak to
that code because now you don't want to just send it to a million users. You also want to check whether
they have read those messages. So, read receipts. The problem with writing code which is
highly coupled with the feature is that whenever there's a changing
requirement, you have to put in a lot of effort to redesign, test,
and actually deploy that code again. So you want your
engineering features to be extensible. For this, you
have to take out your engineering crystal ball and gaze deeply into it. Actually, what you do, especially as an experienced
engineer, is look back at your past projects, look back at things that you have done,
and look back at things that other people have done. You know what the world is doing. Based on that combined knowledge,
you build a system that you can reasonably expect to scale and
extend as requirements change and as more and more users join you. Finally,
this design needs to be tested. This is an important part
of system design which is not really covered in interviews, but in the real
world, when you have a system which is designed by, let's say, a senior engineer, that design has to
be tested. You run through a couple of requests, with edge cases and with common
cases, and check whether these requests have a sensible
flow through the system at a high level. The other thing that you can do is
use sophisticated tools to load test this design. You can do some sort of capacity estimation
to see whether this design is actually feasible. Importantly, you
have to test this design before you start getting into the code. Okay.
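The capacity estimation mentioned above can start as back-of-the-envelope arithmetic before any load-testing tool is involved. Here is a minimal sketch in Python. The viewer count and the bitrate are assumed numbers chosen only for illustration, and the function names are hypothetical.

```python
# Rough capacity estimate for a live stream.
# All inputs are illustrative assumptions, not figures from the course.


def egress_gbps(viewers: int, bitrate_mbps: float) -> float:
    """Total outgoing bandwidth in gigabits per second."""
    return viewers * bitrate_mbps / 1000


def storage_gb(duration_s: int, bitrate_mbps: float) -> float:
    """Storage needed for one rendition of a stream, in gigabytes."""
    megabits = duration_s * bitrate_mbps
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


if __name__ == "__main__":
    viewers = 1_000_000   # assumed audience size
    bitrate = 2.5         # assumed Mbps per viewer
    print(f"egress: {egress_gbps(viewers, bitrate):,.0f} Gbps")
    print(f"2 hours of video: {storage_gb(2 * 3600, bitrate):.2f} GB")
```

If the egress number is far beyond what a single region can serve, that is a signal at design time that the system needs multiple servers or a distribution layer.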
Let's recap with an example. Let's say we have a live streaming system. How would we go about designing this? Requirements would be streaming
video, processing video, and sending video to multiple customers, so that is
broadcasting. Not failing is a requirement. So are showing advertisements, allowing
reactions, showing disclaimers or news flashes, having graceful
degradation of video quality in case you have low bandwidth,
allowing multiple device support, and so on. Amongst these, you'll see that
the main product requirement is showing video to a lot of users. That's the major requirement. And of course, comments are a thing,
and being able to react is a thing. You may also consider showing a banner
in case there's a problem to be a product requirement. It is, but the core requirement, of course, is just showing a video,
so I'll pick this feature first. This means that I have to capture
video from a source, let's say one which is shooting at 8K, and I should
be able to store it someplace in my server so that I can query it
later. In a live streaming system, the "later" in "query it later"
is probably milliseconds, so I may not want to query that data at all. I may want to directly stream it
from the video camera to millions of people. Taking a step back, that looks impractical because if I'm
shooting at 8K at very high quality, and this is raw footage, sending that much data to people on their
mobile phones is unreasonable. So yes, I need to store it in some
sort of database or file system, and then I should be able to stream
it or query it out so that I can distribute it to all of my customers. But I don't want my customers
to know exactly how I'm doing this, and I don't want them to have to know if there's a
change in implementation tomorrow. I want this to be a black
box so that they can pay me and relax. They just hit an API and their problem is solved. Of course, I might give them
clients, downloaded from the app store, which take care of all of this API querying,
so that as end users they can just enjoy watching an event instead
of thinking about how the technical part is actually happening. These APIs have well-defined signatures. You can tell that if I want this video ID in a particular format, then I have to query a
particular API called get video, which is going to
return objects of type frame. And those frames are also well defined. If you have ever written a program,
you know that these API signatures are very similar to method signatures. The only difference is that these APIs
might be queried not through a programming language, but through a network protocol
like gRPC, HTTP, FTP, any kind of protocol which defines exactly how an
electronic message is going to be taken from one place and sent to another, and also how the response is
going to come back. The behavior of this interaction is defined by the protocol. So this is great. We have a system which is storing some
data, which is valuable to us because we want to watch this event live. And this data is going to be queried using APIs. Also, these APIs are going
to be tested beforehand, so that the clients who are using and paying for
this service are not disappointed when they actually query it. As engineers, we have to think
of various failures and abuse. What if the database which is
storing your videos crashes? What if a particular firewall
on the internet starts blocking all of your requests? What if one of the services that
you've written in your system, one program, one piece of
code, starts misbehaving because you have introduced a bug in it? Or what if somebody has
maliciously entered the system and changed the code? We have to use some of the design principles
that we talked about earlier. The challenge may also be a feature
request. Say you want the musician who's playing in this live event to
be able to talk to the audience, to do a back and forth with some audience
members who are selected either randomly or based on their activity. Then you have to display those users to the
musician live and be able to broadcast both parties to millions of people out there. So taking these requirements, let's try
to design a live streaming application. There are two ways to approach this. One is to go from our customers, our
clients, to our server, which is out there in different parts of the world, and then to our database, which may again be
in different parts of the world. That's one way to think. The other way to think is
from our database to our server. What kind of data do I need
to store to enable my server? And what kind of APIs do I
need to expose to enable my customers to use my product? Okay. Both of these approaches are fine. They require different
ways of thinking, though. When it comes to the database-first approach,
you will need to consider what kind of data you need to store, and often you'll be thinking of these pieces of data as tables. So you need to store a video with an ID. The video has a name, it has a size,
and it has some data. That's one way to think. I prefer the other approach, where
customers define their problems, which are fulfilled using APIs on the server,
which are in turn fulfilled by storing some sort of data in the database. And that data is then mapped onto tables. So for this system, this is the
approach we'll be using. Okay? In our case, customers are live streaming
customers, so they may be streaming from their cell phones or laptops or the TV, and
we can't assume which device is being used more often. In certain countries you'll have
certain devices being used more. There could also be a tablet, of course,
from which they're streaming this video. The point is that there are multiple devices which need to be catered to. This is a front end UI design problem. System design is more to do
with the distributed systems backend part of things. There is some system design involved when
it comes to API interactions and how to store or cache data on the front end, but
that is not what we are focusing on here. We are focusing on the
backend part of the system. Okay? So these clients need to be able
to query our server in real time. So what does our server need to have? Its APIs are going to be something like
get video, where you pass in the video ID and you pass in your device type,
so I send you the resolution based
on your device type. You might also pass a particular offset:
you know, I've already seen the first 10 minutes of the video, show me everything after the 10th minute. If I have a video API like this,
I also need to return something. So the return type could be some video frames. Let's say each of these frames is 10 seconds long. So a single frame is what I
send back, such that it is of this video, for this device, and after 10 minutes. So the frame from 10 minutes to
10 minutes and 10 seconds is picked up and sent to the client. If you know about API design, you'll notice that this API is not well named. Get video means you get an entire video,
but you're returning a frame. So maybe we can rename
this to get video frame. The other thing is, if you're using REST over HTTP, GET is going to be defined in the request itself. So the GET is already there in the request, and "video" is
enough to say that you want a video. In our case, like we said, it's a frame, so
maybe get frame or get video frame is what we are looking at. You'll notice that Vimeo also has a similar
API when it comes to getting the next 10 seconds. YouTube also has an API which lets you say from which point to
which point you want to see the video. I'll give you all those things. Great.
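To make the get video frame API described above concrete, here is a minimal sketch in Python. It assumes an in-memory dictionary in place of real storage, and the names `get_video_frame`, `VideoFrame`, `STORE` and the device-to-resolution mapping are hypothetical, not something defined in the course.

```python
from dataclasses import dataclass

SEGMENT_SECONDS = 10  # each "frame" in the talk is a 10-second chunk

# Assumed mapping from device type to resolution.
RESOLUTION_BY_DEVICE = {"mobile": "480p", "laptop": "720p", "tv": "1080p"}

# Hypothetical in-memory store keyed by (video_id, resolution, segment index).
STORE = {}


@dataclass(frozen=True)
class VideoFrame:
    video_id: int
    resolution: str
    start_s: int
    end_s: int
    data: bytes


def get_video_frame(video_id: int, device_type: str, offset_s: int) -> VideoFrame:
    """Return the 10-second segment that contains offset_s."""
    resolution = RESOLUTION_BY_DEVICE.get(device_type, "480p")
    index = offset_s // SEGMENT_SECONDS
    data = STORE[(video_id, resolution, index)]  # KeyError if nothing is stored
    start = index * SEGMENT_SECONDS
    return VideoFrame(video_id, resolution, start, start + SEGMENT_SECONDS, data)


if __name__ == "__main__":
    STORE[(10, "480p", 60)] = b"segment-bytes"
    frame = get_video_frame(video_id=10, device_type="mobile", offset_s=600)
    print(frame.start_s, frame.end_s, frame.resolution)
```

Everything the function needs arrives in its arguments, so calling it twice with the same arguments returns the same segment. That is the same shape as a request like `GET /videos/10/frames?device=mobile&offset=600`, where the URL layout is again only an assumption.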
So we have this, what else? You should be able to
comment on the video. Okay. So what you should be able to do is
post a comment. Again, I'm using a well-known concept in REST: POST means you want
to put something on the server, you want to manipulate some data, you
want to add some data to the server. So if I say post a comment with an ID,
here I don't care about which device it is coming from, because the comment is
just data which has to be persisted on the server. I'm not
going to be changing my response, or changing the way
in which I'm posting the comment, based on the device. So I don't need that. I don't need an offset, but
I do need the comment data. I need the author of the comment. I need the post on which
this comment was made, so in our case the video on which
this comment was made. So, the video ID. Similarly, for each requirement, we
can expose APIs which allow us to query and manipulate the data as we want. So that roughly defines our
server side capabilities. Okay, we haven't spoken about anything in detail. We haven't talked about the network
protocol that we use either, but roughly this is what it is. Okay. Now let's go for the next part,
which is the database side of things: what kind of data we need to
store to satisfy these APIs. Returning frames is
something we want to do. Storing comments is something we want
to do, and also probably getting those comments. And to get video frames, we also probably
need to put video frames. What kind of database should we use? Comments are rather simple. You can store them in an SQL database,
in an SQL table having an ID. The data of the comment is stored as text. The author is a foreign key
to a user table, right? So in the user table you have an ID, and this author is going to be
mapped to an ID over there. Along with that, we have a video ID. So there's a video table also,
which has an ID and some data, and the video IDs are going to
be mapped in the same way, okay? An example of this would be:
video ID 10 is a cricket match. Video ID 11 would be a musical event. Okay? And this is the comment table. So for this cricket match, a comment
was made which said "Hey", and the author was author number one. Author number one is a row in the user table, and that row holds their username. So if you have to display all comments
in the front end for a particular video, you want
to get all of the comments in this table for video 10. Then you want to get the data for the
user, because you want to display the username of whoever made that comment. And you also want to get some video data,
because you have to display that on top of the page where the comment was made. So in this way, we are able
to satisfy our requirements of posting comments and posting video. Now, overall, this system is complete. We have a system which can answer queries. Okay?
This is good enough to start with. This diagram is not complete, though. It's very rough,
very high level. There's nothing here which is concrete, nothing
which is, let's say, useful when it comes to implementation. At this point in time, we have a
rough idea of how the parts are going to be talking to each other, not what we
are going to use to make this happen. So let's get into those
implementation details. Firstly, on the client side, is
there something we need to do? Yes, different APIs require
different behaviors. Posting a comment means that I'm posting
this once and I'll maybe be querying that comment soon, but it's not like
I need something to keep happening. I don't need continuous
updates on the comment. Those notifications can be given
to me periodically, or maybe after a few months I don't need
notifications on that comment at all. So you see, over here you
have non-real-time behavior. What about a video frame? When I ask for a video frame, I
usually need to ask for the next video frame immediately after that. So I'm watching a live video,
I ask for a video frame, and I am sorted for the next 10 seconds. In five seconds, I'll be
asking for the next video frame. So that behavior is different. It's a more continuous behavior. Okay? So maybe we need to use different
network protocols to make this happen. What would I use for a comment? I would use the most common
network protocol when it comes to distributed systems, which is HTTP. HTTP gives us the benefit that you
have a stateless server. You don't need to keep any information
on the server between requests. A stateless server basically says: I have no idea where you are from or what you want. Define everything in the request. Okay? Gaurav wants the next 10 seconds of video. What do you mean by the next 10 seconds? Who is Gaurav Sen? So what should be there
in the request is: here's Gaurav Sen, this is his user ID. You can actually go and
look it up in your database. It's not "the next 10 seconds" that he wants; it's the video from minute 10 to 10 minutes 10 seconds. Okay. That video length is also well defined. There is no concept of
"next". And the video ID is mentioned. It's not like Gaurav wants the next 10
seconds of the video he was watching. No, I have no idea what video he was watching. You define which
video needs to be pulled up. You might think that this is an obvious
point, like what's so special about defining everything in the client itself? It looks a little tedious and stupid, but doing it is fine. Doesn't everyone do it? Not
necessarily. When you ask for the next 10 seconds of video, for
example, very often as a client you don't know what the next 10 seconds should be. Take an example. Give me the video from 10 minutes to 10 minutes 10 seconds. Okay? That's one chunk. Let's say you make that request, and the server is taking
time to give a response. Now you come back and say,
give me the same chunk again, because you didn't receive it in time. The server is going to look
at that request and serve you the same response again. The other approach to this would
be the client making a request: give me the next 10 seconds, where
"next" is known to the server. So here the definition is not 10
minutes to 10 minutes and 10 seconds. The definition here is
the next 10 seconds. The server is now going to take
this request, look at the user's current pointer, and then decide
which 10 seconds to pull out. So in the first case, it'll send 10
minutes to 10 minutes 10 seconds. In the second case, it's going
to make a decision to send you the video from 10 minutes 10
seconds to 10 minutes 20 seconds. What's the benefit? The client doesn't need to know which
part of the video it wants to watch; it lets the server handle it. It makes the overall network more
efficient because the client is not making duplicate requests, and it makes
the client code a little simpler. You don't need to define everything
every single time; you let the server figure it out with context. That's the difference between a
stateful and a stateless protocol. HTTP is stateless. Its benefit is that the
server is kept simple. If the server crashes, there is no
context in memory that is lost. With a stateful server, if you forget where the client was currently
pointing, that is a serious problem, right? You can go and store that in your database
instead and make the service stateless. Therefore, the protocol that you're going to use, HTTP, can be stateless. The best part about this is
that you can add new servers without any issue. You have no state being stored in the server, so whenever a new server pops up, a new request comes with total context in the
request itself, which you use to query your data, right? What about video frames? Here you want to query video frames and then get the next one. What kind of protocol will be nice? You can use HTTP here also, but
a much better protocol would be something which is designed for video
transmission, because in video you have to consider some other things. What happens if I have a mobile device
which does not have much resolution? What happens if I have poor bandwidth? What kind of video am I transferring? If it's a live streaming thing, then the
current video is the most important video. What happened previously
does not matter anymore. If I missed a packet, if
that packet was dropped in the network, let it go. If it's a live streamed lecture,
where there needs to be full context to understand the
next part of the lecture, then you probably want to send it properly. You want to send it over a reliable network and over a reliable protocol. Okay. So if you want a reliable protocol,
a TCP-backed protocol is a good idea. If you want a real-time, efficient protocol,
then a UDP-backed protocol is a good idea. So over here I'm going to use
a protocol like WebRTC. WebRTC is a peer-to-peer protocol, so you are actually able to send
video from the server to the client. Certain protocols have a
client-to-server expectation: the client is the only party who can send data, who can make a request to the server. The server cannot send data
to the client by itself. Okay? So what I'm doing here is
using a peer-to-peer protocol for the video, but for comments, it's
always going to be client to server. So you see that network protocols
are also important when it comes to designing systems. Okay, finally, on the server side, what
are some considerations we have to take? Similar to the client side, we have to think about how we are going
to talk to the database, but most database solutions, let's say MySQL
or Postgres, define exactly how you're going to be talking to them. It's usually a TCP-backed protocol,
but the protocol is well defined, so we don't need to think about that
when it comes to talking to the database. Elasticsearch, for example, has an HTTP-based protocol. Cassandra, Amazon DynamoDB, MySQL, Postgres: for all of them, the protocols are defined. The problem then becomes which
database solution we should use, because there are a ton of solutions
out there, and they have tradeoffs. We could store the video data in a MySQL database also, but it's going to be expensive, and
it could potentially be very slow. So what is this video data? It's effectively a file. So storing it in a file
system is not a bad idea. You don't want to build
a file system yourself, so you want to use a
well-known file system solution for this, maybe HDFS. That's one solution. You could also use a video
hosting solution like Vimeo. Yeah, it's off the shelf. You can just use it. You can use Vimeo, by the
way, to host events also. So it's best to mention this in the system design: maybe an enterprise solution from a provider like that
will take care of the entire thing. But if you can't do that,
if you have to build the live streaming part yourself, maybe because you don't want to lose your job, then you can use a file system like HDFS or an object store like Amazon S3. The benefit of S3 or
HDFS is that it is cheap, it's easy to query, and you can store very large files inside it, okay? A
database is not cheap, but it is also easy to query and you can store very large files inside it. The capabilities that a database gives you in terms
of updating data or querying data may or may not be very relevant
to you when you have a static file, which is a video file, okay? You're primarily looking at low cost
when it comes to storing video. What about the user and comment tables? These two tables can have a
SQL solution backing them, let's say MySQL or Postgres. You may say that a comment
is a complex data structure. Every time a requirement changes, the comment table also needs to change. Maybe you want to persist a lot of data
in the comment table per entry. Maybe you want to persist all the replies
to the comment in the same entry. Then a MySQL database, or any SQL database,
is not what you're looking for. You're looking for a NoSQL database,
which is not ideal when it comes to transactions or relational joins, but you don't
have that requirement. You're looking for scale. When it comes
to comments, you just want to persist that data in a key-value fashion. So that's NoSQL. Great. This is a very rough idea of how we are going to be designing
a system which satisfies the requirements, and also of defining
the protocols or the solutions that we'll be using to make it possible. For example, we talked about WebRTC
and HTTP when it comes to network protocols, and when it comes to
database solutions, we talked about MySQL and a file system. A design can go into more
and more depth based on how important that requirement is. So doing a recap, we see that we have
a system, a very rough blueprint even now, which talks about how our customers
are going to be accessing our APIs and how those APIs are going to be
accessing the data in our database. The data in our database may be
filled by customers. On Facebook, for example, the customers are usually the people
who are filling in the data. Or it may be filled by an external service. In a live streaming system, it's probably going to be a really high-quality camera which is
recording the video live and persisting it to our database. Okay. The network protocol that you probably
want to look at is RTMP, the Real-Time Messaging Protocol, which is a reliable protocol because it runs over TCP. You don't lose any data
when you are shooting video using this. With WebRTC, you might lose some data. That's okay. It's the end user watching a live stream. They want data quickly and in real time. But at the source
of everything, you don't want to lose any data, because everybody else will lose
the data if you lose it at the source. So a highly reliable, high-bandwidth, expensive network can
be set up over here, because that video camera is really going to need
it to push that amount of data into your
database. And then this data can be sent out. You can see that I've skipped a lot of requirements. I can't take this high quality data
and just broadcast it to everybody on the planet. What I need to do is transform this data. Okay? So what we have done is look at a high level at what the solutions are going to be, and now we are really getting into the nitty-gritties
of designing the system. So the first part of this is: how
do we take this raw footage, this data that comes over the high bandwidth network
to our database, our file system, and convert it into data
that we can serve to our customers? 480p, 720p, so on and so forth. You can't serve 8K definition. So how do you do that? There needs to be some sort of a
transformation service over here, which is going to be taking this live
stream and converting it into different resolutions: 1080p, which is full HD; 720p, which is HD; and 480p, which is decent for quite a few mobiles. And I don't know if this is something
you want to do, but if the live part of your stream is really
important, then 144p is also okay.
Because as long as people are getting
information and they're able to see roughly what's going on, they're going to be happy. Sometimes these resolutions are defined well in
the product requirement document itself, the PRD, after speaking
to customers and getting their real feedback on which resolutions are acceptable or tolerable for the
customer. There might be some customers who ask for premium 4K transfer also. Okay, so that's fine. If you're watching it on TV, then you probably
want to watch it at 4K. Okay. This raw video needs to be
converted to these resolutions. How do you do that? So the first thing you'll do is
break this video into segments. This entire raw video is going to be
broken into segments of 10 seconds. So you collect video for 10 seconds, chop
it off, that becomes one segment, and then you give it for processing
at point number 0, 1, 2, 3, and 4. So let's say you have these zero
to four, which is five programs, which take the raw video and
convert it to different resolutions. So if you pass in 144p,
then a particular program picks it up. If you pass in 480p, then another program picks it up. Or maybe you have the same program, which is
running as five concurrent instances or five threads, and based on the resolution that you pass, it converts the ten second
video footage into that resolution. Similarly, you can think
of different formats. So if you have a device which is
an Android device, it probably has to see the video in a different format. Okay. It can't read all sorts of video. It has some formats which are well defined
inside the device, so it can read those formats. Similarly, Apple devices might
have a different format. So these formats define how you're
going to read this piece of data, this video data, which is going
to be sent over to you through the network. A common video format is H.264. Apart from H.264, you might
have your own proprietary formats. Let's say you have a format which is more
efficient than H.264 for your particular type of event, say a music event. You research and you find that there's a way to store video more efficiently. This is more of a very large scale
problem, like Netflix has its own formats, but if you're a small company, H.264 is very good. Okay? So different resolutions, different formats. We need to take our raw video
footage and convert it into a combination of a resolution and a format. So over here we use a design
pattern of MapReduce. I won't go into too much detail of
this design pattern, but the basic idea is you can take one video, split it
into pieces which are 10 seconds long, and send them to different
servers to get different outputs. Okay? You might have something else also
in the process, apart from just transforming the type of video. Maybe you want to compress
it in the next step. That can be step two. The servers being used here might
be S3, S2 and S1. So you see S1 is
being used here and here. S3 is being used here and here. So there is no guaranteed execution server that
you have for one part of the process. Any server which is free will pick up a task and execute it, whether it's compression
or transformation. Finally, you get three
different outputs, which you can store in your database. So this would be the MapReduce pattern. It's a very high level overview of this. It's almost a caricature of the
MapReduce pattern, because there's no reduce in this process over here. But have a look at this design pattern. It is useful when it comes to
taking a single piece of data or a data lake and converting it into
the data streams that you need. Next, how is this data actually
going to go to the users who are looking to view it? So this data has to go over here, to the server which is exposing these APIs of GET, PUT and POST. When you query this data using a
protocol, you should be able to get it. We spoke about WebRTC earlier, which
is a peer-to-peer protocol and it's really good for video conferences,
because multiple people are
streaming video together. However, in this case, it's a
broadcast, not a conference. So we can take a step back and say,
instead of WebRTC, we are going to use a protocol which is more suitable for streaming. There are a couple of protocols here also. The most popular one would be
MPEG-DASH. DASH stands for Dynamic Adaptive
Streaming over HTTP, which means that depending on your bandwidth,
depending on the network that you're on, you are going to be able to see high quality
video or low quality video. For
example, you are going in a car, you're watching the video
at 1080p, you enter a tunnel, and your network is really poor. So you start watching it at 144p. As
a client, you don't want to handle it. You want the network
protocol to handle it. It's
defined in MPEG-DASH. A very similar protocol is HLS. This is
useful for iOS or Mac devices. Okay. And finally, what kind of data do
you want to store on the server? Do you want to store any data at all? Do you want to make it totally stateless? Yes, statelessness is useful
when it comes to serving requests without keeping context for every user. But for some things you can keep some state. When it comes to video, you can cache the last 10
minutes of video on your server. So anybody asking for video which is inside the last 10 minutes
is going to get it from the cache. Instead of making a full network call
to the database all the way over here, you serve it from the server itself. So you avoid this network call,
saving both time and bandwidth. So these are the high level
considerations that you have when it comes to a system design. Like we said, this is a large
scale distributed system. Our assumption is that you have
a lot of users, because of which this much planning and this kind of a design makes sense cost-wise and engineering-effort-wise. We mentioned that there needs to be
fault tolerance here. There needs to be performance here. So you can use things like CDN solutions. You can use content delivery
networks to persist some static data and have the clients actually pull
the static data from there. Web pages are a good example. You can also have some video data hosted on the CDNs, in which case authentication
becomes a bit of a challenge, because you don't want everyone to be able to see this data. Does the CDN do the authentication for you? Do you write some code which does the
authentication and then host it on the CDN? These are the challenges that you have,
especially when you are designing a large scale distributed
system where performance is key.
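
To make the "cache the last 10 minutes of video on your server" idea concrete, here is a minimal sketch in Java. It is only an illustration under assumptions: the class name `RecentSegmentCache`, its methods, and the choice to key segments by their start time in seconds are all made up here and do not come from the course.

```java
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical in-memory cache that keeps only segments inside a sliding time window. */
class RecentSegmentCache {
    private final long windowSeconds;
    // Key: segment start time in seconds since the stream began; value: encoded bytes.
    private final TreeMap<Long, byte[]> segments = new TreeMap<>();

    RecentSegmentCache(long windowSeconds) {
        this.windowSeconds = windowSeconds;
    }

    /** Stores a segment, then evicts everything older than the window. */
    void put(long startSeconds, byte[] data) {
        segments.put(startSeconds, data);
        long oldestAllowed = segments.lastKey() - windowSeconds;
        // headMap returns a live view, so clearing it removes the stale entries.
        segments.headMap(oldestAllowed, false).clear();
    }

    /** Returns the cached segment, or empty if the caller has to go to the database. */
    Optional<byte[]> get(long startSeconds) {
        return Optional.ofNullable(segments.get(startSeconds));
    }

    int size() {
        return segments.size();
    }
}
```

With a window of 600 seconds, a request for a segment from the last ten minutes is answered from memory, and anything older falls through to the database call that the cache was meant to avoid.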
So at a high level, these are the things we need to consider when
we are designing any sort of a system where the requirements are defined. Maybe they are in the product requirement
document. Maybe it's a startup at a very nascent stage and there is no requirement document. You're directly talking to the
customers and coming up with a technological solution which is
going to satisfy the requirements. The important things to notice are
that you define the requirements as abstract concepts, like objects. These objects then need to be able to be manipulated
and queried using APIs on your server. The data representations need
to be stored in databases. Once this high level blueprint is done,
we start thinking about what exactly we need to make this system possible. So what protocols are we going to use? What kind of database
solutions can we use? In some cases, you also think of what intermediate design pattern solutions
you can use: you have load balancers, you have message queues. These are all design patterns, and these have been converted into
tools by various companies. Redis, for example, provides caching. Load balancers and many other things are also
provided by cloud solution providers like AWS. You want to use these already
well-built, well-tested solutions instead of rebuilding them yourself, unless you know the trade-off in terms of cost and performance
is significant and it's worth it. Finally, once you have decided on the
tools, you think about the interactions of these tools and the interactions of
these services to meet your requirements. So we thought about how we are going
to take a video and show it to the user. Oh, we have to transform that video. We have to convert it into different
formats, otherwise it won't be viewable on different operating systems. We then thought about how we are going
to stream this video to the server. We then thought about how we
are going to stream this video to the client device. And finally, we touched upon actually
streaming the video through a CDN solution instead of from your server, because
maybe your server is in the US, and even if it is close to the person who's streaming, you don't want
your server to handle all of that load for some static content. You want to be giving that load
away to well-known solutions like content delivery networks, which
can be tied up with the ISPs. You can have a look at
how Netflix does it here. Another part of system
design is low level design. This is in contrast with what
we just saw, which is high level design, where we took different components of the
entire system and thought about how those components are going to interact
with each other using network calls, using APIs, and how you're going to be sending data
from one place to another. But largely, if you look at these systems, they own the business data related to that service. Now what we are going to do
is take certain functions of these services and try to code them out, which means we
are going to go into much more depth. But because of that, we are probably
going to lose some breadth. Okay? When you look at the high level design
of a system, you are looking at everything, from how the users will be interacting with your gateway, to the gateway
actually sending these requests to your internal services, to the internal services
looking at databases. You can't code all of this out. Even at work, you can't just have this all
in one document, one page. So you take small chunks of the system
and you try to elaborate on each chunk. That would be the low level design. The core functionality of live streaming
is to be able to view a video as a customer, as a person who's paid. Maybe it's a subscription,
maybe it's a one-time purchase. But what we are looking at here is how
does a user fetch video, view video, fetch more of the video,
and continue doing so till the video ends, right? We are not looking at onboarding
the video, onboarding a movie, or how the live stream is going to be moved from the
source camera to our system. We are looking at the other side,
the user side, of how it's going to be consumed by the users. To do this, we again have two approaches. One is to think of the code from the start. So if you are an object oriented
programming language person, you might think: what kind
of objects do I need to have in my system? How are these objects going to interact with each other? Is there any kind of inheritance I have
to take into consideration? But as earlier, this is a little difficult
to think of unless the requirements are extremely well specified, or they're
so generic that you have to look at the data that you're storing first. Before you think of the function, I
would suggest you think of the user. So even over here, we think about what are the actions
that a user can perform? So what I'll be doing is, on my phone
or on my desktop, I'll be scrolling to a particular place to watch the video. Okay? So I'll scrub the video up to a
particular point. That can happen. I can also click on play from the start, so start the video at the beginning, at timestamp zero. Both of these are very close to each other. So effectively the action is play video at timestamp X.
Another functionality that I need is to pause the video. Okay. In which case, what do you do? Do you continue fetching more
segments from the backend, or do you stop fetching segments? So the behavior is dependent on the
user experience that you want to give. If you don't want to clog up their mobile
with the entire video while they've just kept it in the background, then you have to be smart about it. Probably the next 20 or 30 seconds, or
one or two minutes, can be buffered into your device. So that behavior also has to be
coded in: despite pausing, we fetch the next two minutes of video
from the place that you have seeked to right now. Another important requirement
is that depending on your device, the video quality which is going to be
fetched is going to be either HD, let's say you are using
a desktop, or 480p, let's say for low resolution devices. And one final feature is: up
to what point have you played a video? Which means if you had a one hour long
video, let's say there's a cricket match between India and Pakistan, and it's
one hour long and you have seen up to the 20th minute, if you log out and come
back and want to watch the same video, we should start at the 20th minute. We should store that somewhere so
that you have a good user experience. You might want to cache the video like we
were doing earlier, the next two minutes. You want to have that buffer so that you start playing immediately when the user
has come back, or you may not want to do that. You might want to do that just for some videos
which have been recently watched. The older videos can then be kicked out of the cache. What I'm considering here are
memory optimizations, user behavior, and API calling. This is largely what low level design
will mean when it comes to interacting with services. Depending on your level of seniority,
and depending on the use case, you might have issues of concurrency, latency
and throughput also come in here. An example use case will be
the workflow that we had for chunking and transforming videos. I'll leave that as an exercise to you. But there are cases where you want to increase
your throughput to the maximum, so you don't really care that much about latency. How quickly do you get those videos
through the pipeline? That's not your concern. You want to make sure that
your pipeline is continuously functioning, right? There are other cases where the moment a
new video comes in, you want to respond to it as
quickly as possible, with the assumption that the video is really important. So there's going to be some sort
of context switching over there. Okay.
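
As a starting point for the throughput-oriented version of the chunking and transforming workflow, here is a small Java sketch. It assumes a fixed pool of worker threads where any free worker picks up the next (segment, resolution) task, which mirrors the "any server which is free picks up a task" idea from the high level design. The class name `SegmentTranscodePipeline`, the method names, and the stubbed `transcode` step are all hypothetical; a real system would call an actual encoder and would likely use separate machines rather than threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical fan-out of ten-second segments to a pool of transcoding workers. */
class SegmentTranscodePipeline {
    /** Stand-in for real encoding work; it only builds a label for the output. */
    static String transcode(String segmentId, String resolution) {
        return segmentId + "@" + resolution;
    }

    /** Submits one task per (segment, resolution) pair; any free worker runs it. */
    static List<String> transcodeAll(List<String> segmentIds, List<String> resolutions, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String segment : segmentIds) {
                for (String resolution : resolutions) {
                    tasks.add(() -> transcode(segment, resolution));
                }
            }
            List<String> outputs = new ArrayList<>();
            // invokeAll blocks until every task is done and returns futures in task order.
            for (Future<String> done : pool.invokeAll(tasks)) {
                outputs.add(done.get());
            }
            return outputs;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("transcoding failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

This version only cares that the workers stay busy. The latency-sensitive case, where an important new video should jump ahead of queued work, would need some form of prioritisation that this sketch does not attempt.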
I'll leave that as an exercise to you. I won't give you too many details, but it's
interesting to think about how these different low level design requirements
affect the code that we write. Similar to what we did earlier,
we are going to take a structured approach to solving this problem. The first tool that we like to
use is called a use case diagram. As the name suggests, we think about
what are the use cases that we need to fulfill for every user. For example, you can see
three actors in this system. An actor is a person who can do actions,
so an admin can do actions in our system. They can add videos. Maybe a videographer is a person who
can upload videos, so it's not
necessarily the case that every video which has
been uploaded will be added to the system. Before it gets added, it
has to have some metadata added to it. What's the description? What are the timestamps? What kind of video is this? How do you tag it? That might be taken care of by an admin. The videographer shoots the video, uploads it, and specifies the quality and everything else. So that's a separate actor. And then a customer, the person who consumes the video. We
said that this is the most important person for us, so this is the only actor
that we're going to be thinking about. Let's get rid of these two actors. We mentioned the four things
that we want the customer to be able to do. Let's note them down. The first thing is to be able
to play a video from a timestamp. Okay? Then comes another requirement, which is them coming back to
the old video that they watched partially, where they want to start watching again
from the left-off timestamp. We also want them to view at the maximum quality allowed by the network and device. So if you're on a low quality network, that's okay. Low quality video is also fine. But if I'm on my home wifi, I want to see the best quality video that I
possibly can, having paid for the subscription. There are things which are also happening
in the background, of course, like we said: concurrency,
fault tolerance, throughput. These are things which the
end user does not need to think about. You might say latency is something that affects the user. It does. But we are not looking into it. We are assuming that all of our requests
are answered quickly enough, within 10 seconds. And this brings us to our next point. We need to continuously buffer our video, right?
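
The continuous buffering requirement can be sketched on the client side as a small planning function. This is only an illustration, assuming ten-second segments and the roughly two-minute buffer mentioned earlier. The class `BufferPlanner`, its constants, and the assumption that `bufferedUntilSec` falls on a segment boundary are all hypothetical and not taken from the course.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side helper that decides which segments to fetch next. */
class BufferPlanner {
    static final int SEGMENT_SECONDS = 10;       // segment length used in the talk
    static final int BUFFER_AHEAD_SECONDS = 120; // "the next two minutes"

    /**
     * Returns the start times (in seconds) of the segments still needed so that
     * playback is covered from the playhead up to two minutes ahead, or up to
     * the end of the video if that comes first.
     */
    static List<Integer> segmentsToFetch(int playheadSec, int bufferedUntilSec, int videoLengthSec) {
        int target = Math.min(playheadSec + BUFFER_AHEAD_SECONDS, videoLengthSec);
        // Start from whichever is later: what is already buffered, or the playhead.
        int from = Math.max(bufferedUntilSec, playheadSec);
        // Round down to the segment that contains that point.
        int firstStart = (from / SEGMENT_SECONDS) * SEGMENT_SECONDS;
        List<Integer> starts = new ArrayList<>();
        for (int s = firstStart; s < target; s += SEGMENT_SECONDS) {
            starts.add(s);
        }
        return starts;
    }
}
```

A player could call this after every segment it finishes playing, and whenever the user seeks, and then request the returned segments in order.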
So, have nonstop play when watching videos. This is assuming, of course, that bandwidth is not messed up. If your bandwidth is messed up, you
can't buffer, so that's not our problem. But if my bandwidth is fine,
then for the next two minutes, I should be able to watch the video. Okay? This is a customer. If we can fulfill these requirements, they're
going to be a happy customer, which matters a lot as an engineer, okay? These would be called use cases, right? You can have many use cases. Some of them are core use
cases, some of them are not. When it comes to system design, the
expectation is that if there is a PRD that has been given to you, a
product requirement document, usually it's just on one use case, right? You add a feature and
each feature is important. So we'll take all of these and
mark them as reasonably important. The next step is to
convert these requirements into classes and objects. This is where things usually go wrong. When you look at the use case diagram, it looks like everything is to do with the
customer: view at maximum quality allowed by network and device. But is the customer actually doing that? Is the customer saying, please
give me the best quality video? No, that is obvious. That part is obvious. So who's going to handle that? If the customer's not going to handle
that part of the system, then the system has to handle that part. Okay? So there needs to be another actor in this
system, but it's not somebody who does actions, like physically
interacting with your system. So: view at maximum quality offered by the network. Who's going to handle that? There needs to be some sort of a controller, the brains behind how much video you should send. So I'll just call this the speed limiter. Okay. Sometimes the entire service,
the entire functionality, can be handled just by using a tool. So we offload the problem to a
tool, or in our case, what would be ideal is to use a network
protocol which takes care of this. Depending on my device and depending
on my network, you take care of the bandwidth. An adaptive protocol is
going to handle this. MPEG-DASH will handle it. So we'll assume that this entire service,
which was going to look at the user's requirements and then offer them video at particular bandwidths,
can be taken care of by a simple protocol. So the speed limiter does not need to exist. Instead, whenever you're connecting to our system,
it's going to happen over MPEG-DASH, and that is a pretty big deal. One entire use case is taken care of just because you
know what tool to use. Now, in an interview scenario you might be asked, how
does MPEG-DASH work exactly? How does it work internally? For
that, you can either read some papers which help you learn the protocol internals (database internals are also very similar, operating system internals
are also very similar), or you can guess. In this case,
you should probably specify that, hey, I'm just guessing over here, but I think what's going to happen
in this adaptive bitrate will resemble exponential backoff in TCP. Now for the next bit: play video from a
timestamp. This means that every video needs to actually store a corresponding
timestamp for a particular user. Okay, that can be done. We need to have an object here. Our service, which is a video
server, a video consuming service, is going to be used by end users. Play a video from a timestamp. Okay? So this has to be taken
care of by the video consuming service. So that will be play video for a user, and this is the video ID. Okay, so I'll
just remove the play for this. And I'll also mention the
timestamp that is being passed. You see that I'm still thinking in terms of APIs. The clearer your APIs are, the
easier your low level design will be. The more you think about how
each feature is going to be implemented using the services that you have,
the easier your design will be. Okay? Go back to a video and watch from the left-off timestamp. This is quite straightforward. You have a seek for this user and this video:
what is the seek position? So let's say we call it
getSeekPosition, but like we said, if you're using a REST API, then you can just mention
in the method that it's a GET, so that will send you back the seek position. It's best to mention the return type also. Like I said, the clearer you make your API, the
better it usually is. So a video frame will be sent back here. And similarly, what will be
sent back here is the timestamp. So what's going to happen is, when you
come back to a video which you have left off, you're going to first seek: what
is the position where you should go to? So that will give you a timestamp. And then you say, okay, play this video
for this user from this timestamp. Alright? Have nonstop play when watching videos. This is interesting. We need to get frames in the future. So what do we do? Do we say play here, or do we
say get video frame for a particular user and a video with a given timestamp? Do you see that these two are very
similar, since what this API is also going to return you is a video frame,
and you're going to stack these video frames together and buffer them
into the video, into the phone? Yes, I see a very close link between this API and this API. At this point, you ask yourself, what does the product need? Is play a different action from
fetching the future content? And here I'll make a decision
of yes, it is different. It's strange to think of, because
they're doing the exact same thing. In one,
the user is saying, play me this video from this timestamp. That's okay. And in the other one, the
video player is saying, get me the video frame for this timestamp. They look
extremely similar, but the business use cases are very different. If the user says, get me a video from
this timestamp, it means that they probably seeked to that position. They liked what they saw there, or they
found it exciting, and they clicked on it. So it's very different behavior
compared to, hey, get me the next two minutes of video in the background. Okay. That's one very important thing. When you are designing any API, think about who's using it, what is
it being used for, right? And then think about the
common functionalities. So these two APIs are very similar. And on the server side, what
I'd really like to do is just merge these two APIs. The client should figure out by itself which video frame it wants to pick. So maybe you cut down on the APIs that you have
to maintain by replacing the current API of play
with what we have here. So maybe every time you play from a
particular point, we should send an event to the backend service saying that
this is an exciting part of the video. The user actually came to this
timestamp, seeked to it, and clicked on the video. Okay. Then there is a seek API, which is primarily for when you've left off a video.
watching from there, maybe it's not super exciting, it's just the place that you're leading off from you, you wanna continue from. But during the video while watching
it, if you came to a particular position and started viewing,
that requires an event that shows that, this part of the video is sought after. So it feels like you're splitting
hair here, but from the side of the business or from the side of analytics,
they're very different use cases. From the side of engineering,
it's the exact same behavior. Someone asks you for 10 seconds of video, you
give them 10 seconds of video. But for analytics or for the
business, this part should be converted into a three load. This part is just for user experience,
so it depends on you As an engineer, I personally would take this API and
I would see who's making this request. So a flag would probably say
that the user made this request or the device, the mobile made
the request all good buffering. And depending on that flag, I would then file an event for showing interest in this part of the video. Alright, so that is the use case diagram. We can now think about the class diagram. Let's draw the class diagram out. The first class that we need is a video. Two things that we need to store for
every class are states and behaviors. States are data that an
object needs to perform behaviors. For example, I am speaking right now, so I need a throat. I need a tongue. I probably need a brain to speak, and what
I'm doing right now is teaching. So the behavior will be
teaching, while the data that I need is my body parts to actually work in tandem. Similarly, you might have a video
which has certain data that it needs. So that would be the bytes of the video. Let's say frames, because we have been using
this term. That's in every video. You might have
some metadata: who's the uploader of this video? How long is it? What kind of tag do you want to add? So on and so forth. What operations can you perform on the video? You can get a
frame, right? That's pretty much it. Okay. You can't add a frame after you've added a video. You can't do anything apart from just getting a frame
from a particular point. So that's a simple class. You also have the class of user. I'm not focusing on that in the class diagram. So a user has a name, an email, more metadata around them, but nothing else really. The most important thing is probably to
have an ID, which in your case could be an email. A video also will have an ID, and for
the user, you can probably get their ID. Okay. Very simple class again, which
brings us to the most exciting class, which is watched video. Okay, so a watched video is
going to be an action by a user who's basically watching a video. So this needs a video ID: which video have you watched? Who is watching it? That is the user ID. An ID of the
action, so that you can refer to it later. Up to what timestamp have they watched this video? So, seek timestamp. And coming back to the use cases,
we might want to buffer up to a particular point, but let's assume
that the client handles that. The client knows how much
video you have buffered already in the device. Going back to a particular place is possible
because of the timestamp. The final class is
the video consuming service. This also requires a class
to be explicitly shown here. So that is the video consuming service. Let's first define the
behaviors that this class has. So that is API one and API two. And to do this, it just
needs a set of watched videos for each user, okay? That takes care of our entire class diagram. You've seen, this is a
very simple class diagram. What is more challenging
actually is the sequence diagram. Okay? How is a user going to watch a video? So this would be called a class
diagram, where we have defined what state and what behaviors are possible
for each object in our system. Fine. These two diagrams, class and use
case, are sufficient in most cases. However, in some places where
the interaction is complex, like over here, you need another diagram which defines
the sequence of actions. In this case, it's not very
clear how the user is behaving. These are the actions that they can do. Okay? These are the things that you need
to store and the behaviors you need to expose for the action to happen. That's also right. But what is the sequence of actions? What happens first? What happens second? That's not clear. And for that you need a sequence diagram. So let us try doing that. This is what you call a timeline. The y axis basically is time, but running downward. So you have three timelines here. One is that if a user does certain
actions, things happen to a video, okay? And the video consuming service
actually uses a video's current seek time to get the next frame. So this might not just be a video,
it could be a video service, which is providing an API to consume the next frame of the video. We'll see how this happens. First the user sends a message. What message is this? Like we said earlier, this is going to contain your information
and the video information. This is responded to by the video
consuming service immediately, because it knows up to what point the user has watched the video. So that is, you return a timestamp. Now the user is going to make
another interaction, which is play or get video frame, right? So get the video frame for this user and this video at a timestamp. The tool I'm using here is Lucid. As you can see, if you use tools, it helps,
instead of reinventing the wheel or doing it
halfway. This is much better. The video consuming service
cannot get you the frame, though. However, it's best that the user interacts
with the video consuming service, so it could send a message to the video service and say, get me the video frame. Okay. The important thing to notice
here is that the video service has no idea who the user is, or whether they're
authenticated or not to watch this video. It just takes a video and a timestamp. Okay. I could have taken this request and
sent it directly to the video service. Okay. And the video service could say, sure, I don't care about what, who, which user is is trying to access this video. Just tell me the video and the time
stamp and I'll give you the response. But authentication would
be a bit of a problem. So that's the reason why I'm
assuming authentication is going to happen here for every frame that you ask. And then the video timestamp, as we mentioned,
I'm going to be sending back a response of a video frame. I could also send back multiple
frames in the hope that you can use them later. But I'm assuming that the
frame lasts for 10 seconds. And if you really feel like you can send a
request in between and get another frame. So it's a single frame and
this is then sent back. To the user. Okay, that's the interaction for a while. The moment you get a frame,
you want to repeat this action. And so the data keeps flowing. Very similar to how a TCP connection works actually. Initially you have some sort of a handshake, right? You get some initial information, set
up the connection, and then what's happening is you're constantly pinging. One of the drawbacks that I can see here in my diagram is that the video
consuming service is an intermediary. There's so much communication happening
between the video service and the user. This is just wasting time. I don't know if that is worth
it, if just authentication is worth complicating our flow this much. You can put some authentication here in the
video service itself, so you'll save on two network calls, which is a lot. So yeah, maybe I should just take this, put this here and take this, put this over here. Don't go into the video consuming service. That's what we'll be doing in the code. So now we finally jump to coding. Remember to use the diagrams that we have made
as a reference because most of our thoughts and most of the interactions
are documented well over there. It would be a waste not to use them. The whole purpose of making these
diagrams is to take away the thinking effort required by coding. Coding is basically us typing
out or writing the things which have been mentioned or thought
through in these diagrams. Okay, that will speed things up and also help you avoid mistakes. So the first class that we talked about
is a Video class. The state that we talked about is it has an id, it
has frames, so maybe a set of frames here, but then the frames are also ordered. So an array of frames makes sense. And then some metadata. So in our case, I'll
just say metadata JSON. Of course, in the real world, you
would actually have the creator of the video, the uploader of the
video, and many other parameters, the length of the video, everything else. But we don't need that right now. So I just create a class. An IDE helps you create classes much better. These classes that I'm creating are
just plain Java objects. You can use the same thought process for C++, for C#, and many other languages. Any object-oriented programming language. If you're coding in Python or Scala it might be slightly different, right? But the logic is very similar. It comes down to defining things as
objects and the interactions between them. And finally, define the states
and the behaviors of every object and then put them out. Okay?
So this is largely language agnostic, this approach. You
have a video which needs a Frame. A frame is going to be a bunch
of bytes, so that's okay. And maybe it has a timestamp, that is, int timestamp. You also have a class of
User from the diagram. A user has a String id, it has
a String name, String email. These are basically metadata. And we talked about some behaviors
that these classes need. So return a frame
when someone says getFrame. And which frame should we return? You should return the frame which is being asked for at a particular timestamp. So you see, our class diagram
missed that. We can go back and fix it, or we can fix it in code. I would suggest fixing it in both
places, because it helps if the documentation is up to date. So, int timestamp. You can also have timestamp as an object, for making sure that the integer is greater than zero and all
that, all the validations that you want. But I'm just keeping things simple. Pass in a timestamp. Here we are going to be
iterating over the frames. This can be improved of course, but
what you want to do is you want to go over the frames and return the frame
only if the start timestamp of the frame is less than or equal to the timestamp you have asked for. And there also has to be some
sort of an end timestamp. So we can assume that every frame is of 10 seconds, in which case the start time is this. This has to be less than or equal to,
and the end time, mentioned over here as plus 10, is greater than or equal to this time. Okay. Or rather, greater than. Then
this frame belongs to this timestamp. If you don't get the frame,
then you just return null. You can also throw an exception here. Let's see. I would say throwing an exception makes sense, because the meaning of an exception is that I didn't
know how to react to this. Null would mean that, oh, you get a blank. So if a timestamp of 20 hours is sent for a video
having just one hour, a blank frame is not exactly what you want. You want to say, oh, you're out of bounds. So an IndexOutOfBoundsException being thrown here is going to be okay. Now there's another small
problem I see here, which is the magic constant of plus 10. You don't really want to do this. It doesn't
look good, because tomorrow if your frames get optimized and can
store 30 seconds of video, then you have to go and change it in the code. So you might have this in a constant,
in the video itself, let's say. No, a class constant, so public
static int frameTime equal to 10. And then what you have is, if you
need to make any change, then you make the change in a single place. But even that would not be advisable, because this is
something related to the frame. So let's take that, put it in
the Frame class, and then what you have is Frame.frameTime. It has to be this. That's fine. It's good. There's just one data point for the entire class. All of your frames have the same amount of frame time. But it could be different. It could be that some frames are really high quality. There's a lot of movement over there. So it's not even 10 seconds long. It's just two seconds long. And some frames are like this coding frame over here. You can store 10, 15 seconds of video because
there's not much happening over here. There's not any face moving or something. So I'll just take this and
I'll leave it to the object, not the class, but the object. So it does not become a class property,
which is going to be the same for every object. It is defined by each object. So any timestamp, and then this becomes start timestamp. Alright, so this is better.
I don't need to say, I don't need to
mention start timestamp. I just said timestamp has to be greater. Okay. This is much more flexible code. This is what will set you apart from just a coder. You're going to be an engineer then. Okay. So now we have a user. You can get their id. This is not really related to
what we are doing, but that's okay: getId, and return the id. Okay.
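To make the walkthrough so far concrete, here is a minimal sketch of the three plain Java objects and the frame lookup. It is one possible reading of the steps above, not the exact code typed in the video: the field names `startTimestamp`, `frameTime` and `metadataJson`, the helper `covers`, and the use of an ordered `List<Frame>` in place of an array are all assumptions made for illustration.

```java
import java.util.List;

// One chunk of video. Field names here are hypothetical.
final class Frame {
    private final byte[] bytes;
    private final int startTimestamp;  // seconds from the start of the video
    private final int frameTime;       // seconds covered; set per object, not per class

    Frame(byte[] bytes, int startTimestamp, int frameTime) {
        this.bytes = bytes;
        this.startTimestamp = startTimestamp;
        this.frameTime = frameTime;
    }

    byte[] getBytes() { return bytes; }

    int getStartTimestamp() { return startTimestamp; }

    // True when startTimestamp <= timestamp < startTimestamp + frameTime.
    boolean covers(int timestamp) {
        return startTimestamp <= timestamp && timestamp < startTimestamp + frameTime;
    }
}

final class Video {
    private final String id;
    private final List<Frame> frames;  // assumed to be ordered by start timestamp
    private final String metadataJson;

    Video(String id, List<Frame> frames, String metadataJson) {
        this.id = id;
        this.frames = frames;
        this.metadataJson = metadataJson;
    }

    String getId() { return id; }

    // Linear scan over the frames; throws instead of returning a blank frame.
    Frame getFrame(int timestamp) {
        for (Frame frame : frames) {
            if (frame.covers(timestamp)) {
                return frame;
            }
        }
        throw new IndexOutOfBoundsException("No frame covers timestamp " + timestamp);
    }
}

final class User {
    private final String id;
    private final String name;
    private final String email;

    User(String id, String name, String email) {
        this.id = id;
        this.name = name;
        this.email = email;
    }

    String getId() { return id; }
}
```

For example, with a 10-second frame starting at 0 and a 2-second frame starting at 10, `getFrame(11)` returns the second frame, while `getFrame(12)` throws because no frame covers that timestamp.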
The next class is more interesting. There's WatchedVideo, which
is a class. Here we have the ID of this action. We have the video id. We also have the user who's actually
watching this video, who's doing this action, and we have a seek timestamp. So that is seekTime. Okay, we mentioned that
there's going to be a method to get seek time. Important to notice that APIs following
REST already carry the verb, like GET, in the request, so you don't mention in the path that
you're posting something or getting something, but the objects here can have
a get of their own because you don't exactly know what's happening, right? It's not an HTTP API here, it's a Java object, so getSeekTime makes more sense here. So this defines everything, except for the last couple of behaviors, which are
actions by a user or actions by a system. So the video consuming
service: show me the seek time. This is going to be a public int getSeekTime,
returning, for this watched video, the seek time, given a user ID and a video ID. Let's assume that getting the watched
video is a database call, and once you have this watched video,
what you want to do is you want to get its seek time and return that. Okay. For this to work, we need to have a database, which we are going to
create as a dummy object. Okay? Create that and return. Okay. This is great. It gives you the seek time. And then finally, the other
API that we look for is class VideoService. So the final class is going to be using
a file system, let's say, because videos are usually stored in the file system. And let's define that
class, this class FileSystem. Okay. Now there's a public method similar to the previous
service, which is going to be returning a frame. And that is getFrame for a particular video
with a video ID and a timestamp. Okay.
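Before moving on to the frame lookup, here is a minimal sketch of the seek-time side described above: the WatchedVideo object, a dummy database, and the video consuming service that reads from it. The in-memory map, the `save` method, the key format, and the choice to return 0 for a video the user has never opened are assumptions for illustration, not something the walkthrough specifies.

```java
import java.util.HashMap;
import java.util.Map;

// Records how far a given user has watched a given video.
final class WatchedVideo {
    private final String id;
    private final String videoId;
    private final String userId;
    private final int seekTime;  // seconds

    WatchedVideo(String id, String videoId, String userId, int seekTime) {
        this.id = id;
        this.videoId = videoId;
        this.userId = userId;
        this.seekTime = seekTime;
    }

    String getVideoId() { return videoId; }

    String getUserId() { return userId; }

    int getSeekTime() { return seekTime; }
}

// Dummy database: an in-memory map standing in for a real store (hypothetical).
final class Database {
    private final Map<String, WatchedVideo> rows = new HashMap<>();

    void save(WatchedVideo watched) {
        rows.put(watched.getUserId() + "|" + watched.getVideoId(), watched);
    }

    // Returns null when the user has no record for this video.
    WatchedVideo getWatchedVideo(String userId, String videoId) {
        return rows.get(userId + "|" + videoId);
    }
}

final class VideoConsumingService {
    private final Database database;

    VideoConsumingService(Database database) {
        this.database = database;
    }

    // Assumption: a video with no watch record starts from 0.
    int getSeekTime(String userId, String videoId) {
        WatchedVideo watched = database.getWatchedVideo(userId, videoId);
        return watched == null ? 0 : watched.getSeekTime();
    }
}
```

The service holds no state of its own; everything it returns comes from the object it loads, which is what keeps the method body down to a lookup and a getter.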
And how's the interaction going to happen? We first need the video, so
let's say our file system gets us a video with this video ID. Once you have the video you need to return the frame
corresponding to this timestamp. And that's it. That's pretty much it. You have something backing your services. Some sort of database or file system. Once you have them persistent, you can always get
to them and pull data from them. Once these systems are being
backed by a database, then you can actually manipulate this data that you have by exposing APIs. And the best way to manipulate the data
is to abstract them out into objects which have their own state and behavior. So the code is simplified and you
can reuse a lot of the behaviors for different types of use cases. This would summarize a
low level design process. Alright then, that's a reasonably
detailed introduction to system design.
If you want to look at more videos on
system design, there's a free section that I have on Interview Ready, which
is all about the design patterns and the basics of system design: load balancing, rate limiting, sharding, and scaling. And if you want a more advanced version
of system design, I have a paid section in that same course. I would suggest you go ahead and check
out the free resources first. If you really like it and you
think that, you know, it's time to level up, go for the paid section. All the best.