Hey everyone! Today we are going to
talk about content delivery networks. And specifically, let's take this example. Let's say that you have users from
three different countries, India, USA, and Netherlands. And
we also have a server which has HTML pages that are served to all of these users. When you type in facebook.com, you get an HTML page through which
you interact with the server. You also have databases. These databases store information like user information and session information: anything that is specific to your application and dynamic is stored here in this database. Meanwhile, you have a lot of static information, like HTML pages and so on. The problem is that whenever a person needs to connect to our server, they go and hit our server to ask
for an HTML page. This is fine if you're only catering to a few users. But not only could you make this much faster, you could also make it much more efficient, using ideas that we commonly
use in computer science. So firstly, you can see that a lot of distance is travelled every time a user asks for a single page. So can we use the first principle, which is caching? That's idea 1: we want to cache static pages on the server so that we don't need to make this call every time. Maybe this sits in a distributed file system. The second thing is that you
may have different types of HTML pages that we want to send to different
types of devices. So this is a desktop, and this is a mobile. The HTML page for a mobile is going to be, let's say, smaller and longer. It's also probably going to be more responsive. Depending on the device, you are changing the
requirements of the page. So there is customized
data that you want to send (and this is according
to device or location). So how many device types do you have for the application? Maybe you note down 50 devices, or maybe just 5: desktop, laptop, and so on. You get five combinations that way. And for each combination you're also looking at the location. Let's say there are 200 countries. Then at most (if you're serving all 200 countries with unique webpages) you have 5 × 200 = 1000 unique pages to serve, which is a manageable number. You can keep all of these in the cache that we are going to maintain. The other thing, point 3, is
that you want this to be fast. You want your web pages to be
served to the users quickly, because if they have decided on buying something on Amazon, let's say, and the page takes two seconds to load, their interest in the product will drop dramatically. The same thing applies to Google search: when you open a new tab and type something, you want the search to complete quickly. Otherwise it's irritating; it causes cognitive load. So you want this to be reasonably fast. Taking all of this together, what principles can we use? Firstly, we don't
need to keep the cache on the server. We could keep a specialized global cache, which is going to be taking all these
requests for HTML pages and serving them. So in front of the server, you have another entity, which is a cache, and this holds just static information. Let's say that we don't want to deal with the problem of dynamic data, cache invalidation, and stuff like that. We just want to read that information.
Sounds simple enough: a "write once, delete never, update never" cache. So all the requests for getting
an HTML page or for getting static content are now redirected
to this cache and all the other more interesting requests are being
sent directly to the server, which then consults the database to get
relevant information and sends it back. There's one possible problem with
this cache. Whenever you see this nice bottleneck in the system, the first thing that should probably
strike you is that this is a single point of failure. If this crashes, the whole system collapses. So what can you do to
mitigate this problem? We can make it a distributed cache! All you need to do is draw a couple of extra boxes in the diagram and it becomes a distributed cache.
At least on the whiteboard. You have a distributed cache. Let us assume that distributed
consensus is no longer a problem: you have Paxos and you have
Raft to take care of that. Even then we have some issues. If we take users from India, they are only concerned about five pages, five Indian pages, and yet we are storing a lot of information that is irrelevant to them in this cache. So we are going to split this cache based on country, based on location; that is horizontal sharding. You might even go finer: within the U.S., maybe California and Texas need to be shown different pages, so states within the U.S. will be served different pages based on the location of the user. So we understand that this
cache is a distributed system. It has this group consensus. But more importantly you're taking
requests from different users and treating them differently. So the green user is going to
the green part of the cache (one of those boxes) and the blue one is
going to the blue part of the cache: one particular instance. This makes sense. However, there is still a problem. The problem is that you have the
distributed global cache in your cloud (in your system), which is probably hosted in a single country like the USA. So the Indian users are being sent to a group of cache nodes, but it's still not serving the purpose, because the Indian users are still sending a message which travels all the way to the US and then comes back to them, just to get a silly HTML page.
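To picture the fix we're about to describe: the routing layer should send each user to the cache region nearest to them instead of always crossing the ocean. A rough Python sketch, where the region table and endpoint hostnames are entirely made up:

```python
# Invented mapping from a user's country to the nearest cache region.
REGION_OF = {"India": "asia", "USA": "us", "Netherlands": "eu"}

# Invented hostnames for the regional caches.
ENDPOINT = {
    "asia": "cache.asia.example.com",
    "us":   "cache.us.example.com",
    "eu":   "cache.eu.example.com",
}

def nearest_cache(country: str) -> str:
    # Fall back to the "us" region for countries we haven't mapped yet;
    # that default is purely an assumption for this sketch.
    return ENDPOINT[REGION_OF.get(country, "us")]
```

A real CDN does this with DNS and anycast rather than a lookup table, but the routing decision is the same idea.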
So you don't want to do that. What you want to do is either (1) construct a data center in India, which is how things seem to be going, or (2) create a cache in India through which Indian users can access the HTML page. With that change you are able to serve
Indian users using an Indian cache, which is placed very close to them geographically: in India itself, let's say, or somewhere in Asia. So the request and response times are quick. The latency is lower, which means that the system is now fast. That is one requirement
that is taken care of. Similarly for the U.S users and
similarly for the Netherlands users. The second thing is that this customized
data is now stored in the relevant places, according to the location
of that data. So Indians, Chinese, people in Asia can be served by
a single cache, which is close to them. The people in the USA maybe speak a different language, or maybe they want to see different things on the Amazon page; they can see that in the U.S. cache. And similarly for the Netherlands. So this is taken care of. And
the third thing is "Caching". So why are we talking about a cache?
Well, in any distributed system, there needs to be a
single source of truth, which is here and here. So your server always has
access to that source of truth. What is being stored here in
these boxes is not, let's say, the source of truth. Because if someone needs to know what exactly InterviewReady is talking about, they don't go to the CDN to check; they actually ask the server. Which makes the data in these boxes copies of the original data, of that source of truth.
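In code, that relationship is just a read-through lookup: the cache serves copies, and a miss falls back to the origin. A hedged sketch, where ORIGIN is a stand-in dictionary for the real server and database:

```python
# The origin owns the data (source of truth); the cache only holds copies.
ORIGIN = {"/index.html": "<html>v1</html>"}
cache = {}

def get_page(path: str) -> str:
    if path in cache:
        return cache[path]   # serve the local copy
    page = ORIGIN[path]      # miss: ask the source of truth
    cache[path] = page       # keep a copy for next time
    return page
```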
So that could be called replication, a copy. But because these blocks of data can be removed, we call this a cache. And I said "insert once, remove never". Well, that's not practical, because if you are going to change the HTML page for a particular country, you have to remove the old page, or at least version it and take it out of service when you bring in a new page.
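One common way to do that versioning (a sketch, not how any particular CDN implements it) is to put the version in the cache key, so publishing a new page never mutates the old copy:

```python
# Invalidation by versioning: publish a new (path, version) entry and
# point readers at it. Old copies become unreachable and can be
# garbage-collected later.
cache = {}
current_version = {}

def publish(path: str, html: str) -> None:
    version = current_version.get(path, 0) + 1
    current_version[path] = version
    cache[(path, version)] = html

def get(path: str):
    # Readers only ever see the currently published version.
    return cache.get((path, current_version.get(path)))

publish("/home", "<html>old</html>")
publish("/home", "<html>new</html>")
```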
So these are effectively caches. As a small company like InterviewReady, how do you afford to build caches in different countries? How do you make sure that regulations have been followed? How do you make sure that these are invalidated at the right time? Ta-da! People have already done it. If you have a look at certain
specialized solutions like Akamai, what they provide is a
content delivery network. A content delivery network
takes care of all of this stuff. Hosting boxes close to the users
is something that specialize in. Making sure that they follow
that country's regulations
is something that they specialize in. And also invalidation or
posting content in these boxes is easy because they provide a
nice little UI. This does this for you. If you are an engineer here, all you need to do is use the UI that
they are providing to fill out these boxes in different places. And then you don't need to worry about
when and how you'll be able to manage these caches. Most of these caches also have a "time to live". Sometimes you want data to exist for only 60 seconds, or 30 minutes, whatever you like, and you get a nice UI to set this up. Otherwise you would need to
actually create a cache and then have some sort of cron job, or a check on every "get" request: "Hey, is this entry invalidated or not? If it is, go and fetch the data from the source." All of that is taken care of by the solution provider, which is the content delivery network. The most popular content
delivery network, I think, is S3. We use this at InterviewReady
to host our files. So in case you are seeing some sort
of "system design" or any kind of "architecture" diagram, it's on S3. And the reason we use it is
because (A) It's super cheap. (B) It's very reliable.
And (C) It's easy to use. So all three things for a business are ticked off when you use S3. But if you know about any other solutions, do let me know in the comments. Of course, if you have any doubts or suggestions on this video, let me know in the comments as well. If you want more detailed
videos on system design, you can head to
https://get.interviewready.io
! And if you like the video, make sure to hit the like button!
I'll see you next time. Bye-bye.