Hi everyone!! Welcome to CodeKarle. My
name is Sandeep, and in this video we'll go over how do we design a navigation
application like Google Maps. Now let's go over some Functional and Non
Functional Requirements(NFRs) that we want this platform to support.
So the very first thing is when a person wants to move from point A to point B, this
platform should be able to give them a couple of information .
1) What is the route that they should follow 2) Given that route how much distance
they'll have to cover and how much time it will take. As an enhanced variant we
could also build a system in which we show them 2-3 routes and then let them
choose that they want to minimize the distance or minimize the time.
3) The next thing is that it should have a very pluggable model. So let's say we make a
very basic design which just has no traffic data or anything of that sort.
Now let's say if you want to plug in traffic information it should be easy
enough to add it. Now post that let's say if you want to add information about
weather or accidents or construction going on somewhere or road blockages or
something of that sort, it should have a very easy way to input this data and not
make a change in whole of the architecture. So that's the main idea of making a pluggable.
4) There's one more thing that we'll cover in a very brief
thing, that is basically how do you identify roads. So there are two three
ways in which we can identify new roads. So one very basic or the most efficient
way is to just ask government sources to give us all the information about
all the roads data that they have. Now they might not have all the data but at
least we'll have a good starting point. Now beyond that we can capture a lot
of organic data. So let's say if a lot of users are going on to a place or a route
kind of a thing which we don't identify as a road, then we can easily say that it
there is a road that we don't know of and we can kind of onboard that Road
onto our system. Now we can use some data like number of people traveling on it,
which sides are they traveling to, what is the volume of traffic, what is the
speed and correlate that with the existing roads that we have and try to
even come up with how wide the road is!! Is it a single lane road or if it's a
four lane/six lane kind of a road. All of those information could be figured
out. But we will still not go into this in much depth over the video. Mainly what we'll focus on is How do we efficiently find the route between two points? Now coming to the non-functional requirements the very first thing is the system should be always Available. This should not go down.
The next thing is it should have a good enough accuracy. Now let's say if you are
not giving the very best route but if even if you are giving a good enough
route that should be okay, unless it's not a terrible route, okay. And it should
not be too slow so let's say if it takes 1 second/2 seconds/3 seconds to
do all the calculations and come up with output that should be okay. It's not that
it should respond back in a millisecond with the best route between two points.
That's a bit too much to ask for, a couple of seconds one, two, three seconds
should be okay. But it should not be too slow. Like it should not take 15 seconds to
come back with the response, okay!! Coming to the scale, again I do not know
if these numbers are authentic, but from what I've heard and read, Google Maps
roughly has a billion MAU (Monthly Active Users) and these users generally access
Google Maps a couple of times in a day or maybe a couple of times in a week at least. What that means is we'll probably
have roughly five to ten billion requests coming in a month for time,
distance and route between two points from users. Then there are also five million
companies which use Google map so companies could be companies like Uber
who are using Google Maps to come up with the Navigation and Pricing System,
or it could be smaller companies which have a very similar use case.
But that means that we need to keep these things in mind that we have a very large volume
of users who are coming into system who will be using this platform and design it accordingly. now building this Navigation App is a very hard problem to
solve. Not because of the kind of Algorithms that we'll use. But more because
of the kind of requirements this has. So for example, if we try to calculate the
number of roads in the world there are various theories, which you know come up
with various numbers. In general people cannot claim that there are somewhere
close to 50-100 million roads in the world. It could be you know even a much larger number.
Now if we try to model it as a graph it would probably have somewhere
close to 50 million vertexes and maybe you know hundreds of millions of edges
if we divide the roads into multiple chunks and all of that, right. So that is a
very very massive data set.Now a lot of companies don't even have access to this
kind of a data, where in what road is connecting from where to where. That kind
of information is kind of non-existent throughout the whole world and there is
no single place where you can fetch that, which makes it a very hard problem
to build an navigation system like this. The next thing is, it is very
hard to even quantify a lot of attributes that will impact the ETA.
For example traffic and weather condition and Road quality. Things like these are
very hard to quantify so there is not a very easy mathematical formula that you
can come up with to calculate the ETA that will be required to cover a
particular distance. One more challenge this is, a lot of things into the whole
road systems are very unpredictable. There could be and a random accident
happening somewhere, there could be a road closure due to any kind of reason
and those things are not really predictable and that's the main reason
why you know a lot of companies have not been able to build this kind of system successfully. We will be using an ideology of Dynamic Programming while we
try to solve this problem. So basically what we will do is, we will try it for a
very small area, and then we will try to solve it for a larger area, and then even
larger area and then across cities and countries and so on and so forth. So
before we get started let me introduce you a concept called "Segment". Now this
is not an industry standard term I just made up this name on my own, so just keep
that in mind. A Segment is basically a small area that is small enough that we
can easily operate on to it. Think of it like this we can have segments of one
kilometer by one kilometer throughout the globe. So let's just say if you have
a city, something like this, what we'll do is we'll divide the city into multiple
segments. Now ideally there will be a lot more segments than what we are drawing
but for understanding this should be good enough, okay. Now the idea behind
these segments is that it will have four coordinates which are the four corners
of a Segment, through which we can identify the segment boundaries and each
segment would have an identifier. You can say this segment s1, this is s2, something
of that sort. Now the thing with segments is whether a point lies within a
particular segment? - is something we should be easily be able to calculate.
So for example think of it like a coordinate system you have these x axis
and y axis, this is your point (0,0), this is your point (0,1), this is
your point (1,0) and let's say this is your point (1,1). This is your
square of one each, right. Now whether a point lies within this or outside this we can
easily identify based on what its coordinates are right and similarly for
this based on its XY coordinates we should be able to identify whether this
lies in this particular segment or not. So what we will try to do is each user
will try to map it to some segments basis their coordinates, right! Now one
good thing about these segments is because these are functions of Lat-Long,
we should easily be able to identify some approximate distance
between two segments. So let's just say if you want to identify this distance
between these this segment and this segment, right. So basically let's say if we come up with the center points of both of these and these are two
lat/longs we can easily find the distance between (aerial distance) between
these two points. We use this in the later sections but it's a very easy
mathematical calculation given two lat/longs, find the aerial distance
between these, right? Now what we'll actually do is, let's say the whole globe
for example, we will actually divide the whole globe into multiple
segments some of the segments would just have water and nothing else but most of
the segments would have some localities and we will be solving for one segment
each and then we will extend the solution for multiple segments segments
which will then cut across multiple cities and countries and whatnot, okay!!
Now the way we model the road network is like a graph. So think of two points
let's say this is point A and this is point B, and then the road between these
two points, okay. So this becomes one vertex(A), this becomes another vertex(B) and
the road between them can be represented as an edge. Now each road would have
a weight. Now will not have one weight like you normally have in usual graphs,
will have multiple weights. so let's say this road is of length 2 kilometers so one of
the weights on this road would be 2 We could have a bit more
weights along with that. We would have a traffic weight, we could have a
time-based weight. So let's just say we come up with the average number thing in
general it takes 150 seconds to cross this road, so 150 becomes another weight.
okay! So this is how we represent a particular road and similarly there could
be other weights that we can add. Now a road, even though it looks like a
non-directional thing, but it is actually are directed relationship. Normally
how would you represent it? - you'll say that this one Road from A to B which is 2
kilometers long, takes 150 seconds. There's also another road which is from B to A,
which is again two kilometers long, but this one probably takes 200 seconds
to cover, right! Why do we need a directed relationship here? - Firstly in order to
handle if there is a different ETA or different traffic pattern or something
of that sort. Plus more importantly, if this says that it takes infinite amount
of time to reach here that's basically a way to say that it's a one-way Road. Only
you can go from point A to point B but not from B to A, right! So we can use all
of these ways to track various things like one ways or various different kinds
of roads. Now even though if it's a directed graph but for all the like
talking within this video I will draw a single line thinking of it as a non
directed relationship, because it makes the drawing easier and explanation
easier, but when you are actually going to implement it, you will implement it as
a directed relationship. Now let's look at what happens when you want to find
the distance between two points within a segment or even ETA or something of that
sort? So let's say this is a graph of all the roads, within a particular segment
and there are these ABCDEFGH... are the junctions or the points where you might
want to start or end from, and everything else is basically in edge. And
edges could have weights, assuming that all the edges have certain weight. Let's say this
edge has a weight of, first attribute is the length which could mean 2 kilometers the
next attribute might say that how much time it takes to cover that 2 kilometer. This would have a length of 5 kilometers and would take 700 seconds to
cover that. Now let's say if you want to go from point A to point C, how
can we go? - There could be multiple routes. So one of the routes
could be from A to B to C. There could be another route from A to B to E to C
there could be another route from A to I to H to G to B to C and there could be
another, a lot more number of routes. How do you find the best route
from going from point A to point C? This is a very standard problem in the
Graph Data Structures, there are various algorithms to solve it. We can use
anything like a Dijkstra or a Bellman Ford Algorithm to come up with the shortest
path between point A to point C. Once you have calculated this and we
might as well decide to cache it. Because within a segment, it's good to
know how can you move from point A to point B. Normally what you would do is
you would run something like a Floyd-Warshall Algorithm, which calculates the
shortest path between all possible edges in all possible vertexes within a
particular segment and then store that information so that you do not have to
recalculate it. So what it will end up being is, there are these all
normal roads which are edges and then there could be a calculated edge also
which says that, for A to C, the shortest path takes ten kilometers, it takes two
thousand seconds and it is via Point A then B then E then C, something of
that sort could be a calculated edge as an output of Floyd-Warshall Algorithm,
which can be stored in another Data Store which can then be looked up if you
want to quickly look at how much time does it take from reaching a point A to
point C. Now what we looked at earlier was given a junction A and given a
junction C how do we calculate the shortest distance between those two
junctions. What if the point with what we are looking at is not really a junction
which is the meeting point of two roads but any random point in the globe which
just comes on a particular road. Let's say that point is X, okay. And now we want
to find the shortest distance between X and C. What that translates is basically
find this distance (X->A), lets you call it . Find this distance (X->B) let's say we call it J.
Basically find that two short... two or three closest junctions from that point.
Normally if that is on a road it'll just be two junctions right, and find the
shortest distance from those two junctions. So whatever is the distance
from A to C and B to C, add I and J respectively into that and then
whichever is the shortest one that becomes your answer. Now if you want to
extend it further and say that I want to find out the distance between X and Y? -
What you basically mean to solve is A to C, the distance between A to C,
the distance between B to E, and then these two good respective distances
add it. So this (C->Y) is let's say i`, this(E->Y) is j` So what you want to do is a comparission
between [ i + i` + distance(A,C)] or this the distance of [ j + j` + distance (B,E) ] So this is how you
can handle any point even if it is not a junction. Now whenever you're calculating remember we looked at that we can use a
Floyd-Warshall Algorithm and then calculate all the possible distances
within a segment, right. What we should also do at that time is look at what all
possible exits are there some a Segment. Why that's important? we'll come to.
Let's say there are these two roads which are exiting a particular segment. Let's
say this is a Exit 1(E1). Let's say this is a Exit 2(E2). And the segment Id
is let's say S1. So what we say is this particular Junction is S1E1 and this
particular Junction is S1E2. What we'll also calculate is distance of each
Junction not from just other junctions but also from these two points. So we
will also know that what is the shortest route from reach from C to S1E1
and from C to S1E2. And we'll store this also as a cached information. Now
let's come to how do we solve it across multiple segments. Now let's say if you
have this point A. From now on we will just look at junctions because that
makes the calculation easier okay. You want to calculate distance from point A
to point B and all these rectangles that you see are unique Segments and
these two points are in different Segments. Then how do you actually
calculate the distance between the two points and the same logic would be used
to calculate the ETA and anything between these two points okay. So what we
are essentially saying is, we have this point A and we have this point B. Now
given the lat/longs we would be able to identify an aerial distance between them.
Let's just say that the aerial distance between them is 10 kilometers, okay.
We also know that each segment is one kilometer by one kilometer in length and
width right. So what we can say is we'll at max go... so because this is 10
kilometers on aerial distance, what we can say is.. adding some buffers we will
not cross more than 20 Segments on each directions either going from North to
South or going from East to West. That 20 number is basically a buffer that you
can come up with based on your conversation with your interviewer.
But the idea is this number(10 KMs) will give us some input onto how many segments do I
want to look at. So 20 segments on the North-South side and 20 segments on East-West side is what we'll look at to come up with the optimal route from point A to point
B. Because normally what would happen is there will be millions of roads into the
world. And if you start doing a Dijkstra across the whole globe just to
calculate distance between these two points, that will be very un-optimal.
So we have to break that Dijkstra at some point in time. So this number would be used to
break that Dijkstra saying that we will not go further that beyond this point.
Now one obvious way to solve this is you have all the roads connecting the whole city
which cuts these two points as well. Run a Dijkstra across the city and then come
up with the shortest distance between point A to point B. But if you look at it
and if you look at it that way that there are probably millions of users
querying at the same time to find distance between any two points, Running
those many number of Dijkstra's Algorithms would be heavily inefficient.
So we'll do some slight optimization there. What we'll do is, there are... we'll
basically say that there are two exits out of this block(Segment having A). Basically to reach
this block(Segment having B) we need to exit this block and there are these two ways. Let's say if we have
calculated that these two are the only exists within this segment. Let's say this a
segment S1. So A belongs to S1 and these two are the exits, this is S1E1
and this is S1E2 and they have some great W 1 and W 2 let's say.
Now what we'll do is, this segment... this junction is basically having two points.
One is S1E1.. two identifiers basically, S1E1 and S2E1. this is segment ID
S2 and this is exit number one of that segment so this has another identifier
for S2. What we are saying then is this has another point on the same route
which is S2E1 and weight between them zero. similarly there'll be a lot of points.
This graph will fan out. On the other side for B,
let's say B belongs to segment S4 there are another two exits through which you
can reach B. One is S4E2 and one is this one which is S4E1 even. Now if we just connect all the exits to exits for ten blocks(Segments) on the North side,
ten blocks(Segments) on the South side, ten blocks on this side and ten blocks on that side,
we'll basically be able to come up with a crisscross of a lot of Road which connects these four points right and there'll be a lot of junctions within
that, there'll be a lot of roads, but these will be basically just entry points and
exit points of the segments. We don't really care what are... what is happening
within the segment when we are cutting across segments. Why? - because to reach
this segment we anyway need to cross all of the other segments right. Now we don't
care if this route which we are calling as let's say this is segment S6 and this
point is S6E1, this point is S6E2 Now this is basically a calculated
path from these two exit points. This exit point to that exit point basically.
This real path would look something like this right but we don't really care
what's happening within the segment. All we care about is from this point(S6E1) to this
point(S6E2), I know the distance, and I'll use that. Then from this point to this point I
know the distance I'll use that. So we'll basically be using all the exit points distances and use them as weights, and come up with this graph at runtime. Now given like ten
segments on each side, we'll probably have this graph of a couple of hundreds of
edges at max right. Now we can run a Dijkstra on this graph staying with that
what is the shortest path to go from here(A) to here(B). This Dijkstra runs on
top of the graph which was created by just entry points and exit points in
everything else, and entry point and start point and entry point and end point in the
starting and ending segments. Everything else was just a connector
from individual end point to individuals basically exits of a particular segment.
Now this is fine when we have to travel for like ten odd kilometers we can still
create a graph of these 100-200 edges and calculate. What if you want to go
inter-city? What if you want to go from City 1 to City 2 and then you
want to calculate it? Then it becomes a big trouble to... because let's say the
distance between those two city is thousand kilometers then there are
possibly thousands of segments multiple thousands of segments that you'll
look at, right. That would too complicated to run a Dijkstra at runtime. Now like we
extended the solution from one segment to multiple segments, we can basically
create a Mega-Segment, right? We can say this whole thing that you see
here is 1 Mega-Segment and there could be this exit to this Mega Segment this exit
to this Mega Segment and probably an exit here to this Mega Segment, right? And
then what we can say is, let's say you have this whole map of a country. I
know it looks terrible but let's say this is one country, what you can say this is 1
mega segment you can basically divide this country into mega segments.
And then recursively solve for it even further. Now when you were to if you are
to solve in the mega segment world you have basically converted this whole
country into possibly 50 mega segments and now if you want to go from here to
here, you basically need to just connect the exit points of individual
Mega Segments. You don't really care what is happening within a mega
segment, you don't really care what is happening within segments under those
mega segments and roads under those segments. All you care about it from my
start point how do I get to the nearest exit or all the exits of that mega
segment and from these exits of mega segments how do I get to further mega
segments recursively and then somehow reach this end point, right. Now you can
have n levels of nesting. Ideally three levels of nesting is more than enough.
Now what is the ideal size of this mega segment? - Again there is no right answer
you can come up with based on some calculation with your interview again on
this. But the idea if you still want to calculate this thing across cities or across states, this segment approach would still not scale you need to do an
abstraction on top of segments which can be represented as a Mega-Segment and
then solve it recursively. All good so far but we have not really
talked about how do we come up with the weights on an edge and how do...what all weights do we even consider? So the very first obvious weight is the amount
of distance between two points right so that is one thing that will definitely have.
Along with that we will also keep an ETA that normally under normal
traffic conditions, in the normal weather conditions, how much time does it take
for a user to go from point A to point B. Now when we have both these
information we can obviously keep a next obvious thing saying what is the average
speed of vehicles. Speed = (distance/time) so basis that will
be able to come up with an average speed. Now you might think that the next
obvious you things are weights like traffic and weather condition and all of
that but I would definitely not recommend that. Even... let's assume
this is what we have build already and now somebody comes up and asks you to
implement traffic data into this okay now if you want to add traffic as a
weight onto your edges that'll be a very very bad design. Why? - Because with an
additional attribute that you are adding you are basically changing the logic of
your underlying Dijkstra which runs on a segment, right. Now let's say tomorrow
somebody comes up asking you for a weather information you will again
change the whole logic of your Dijkstra. That piece is a bad design, okay. So ideally..
basically the set of things that a good design should entail is basically that
it should not restrict addition of new features plus each change should be
small enough and contained. It should not basically impact the whole feature.
So what we'll say is traffic and all all of those in things are basically attributes
which impact your speed okay and basis whatever your average speed is you can
calculate the ETA as a function of distance and average speed, right?
So all we'll say is traffic, weather, roadblocks all of those they'll never be weights on your
graph, okay. They would just be attributes that impact your average speed. So
average speed would be a function of traffic, it could be a function of
weather it could be a function of any number of
fields that you want. Let's say if you want to add a accident, or construction work
going on someplace or whatever you want to add. So all of those would basically
impact your average speed. Now normally you don't have to do all these fancy
things, never ever! Why? - Because it's for a company like Google Maps they know...
let's say there are hundreds of people going from point A to point B. They
anyway know how much time they took right plus a very important feature
about this kind of data is, this traffic data would always be normally
distributed. So if we want to plot, Let's say this is basically any metrics let's say
ETA of how much time it took for people to go from point A to point B on a graph
it would always look like a normal distribution, okay. What that means is you
do kwow that there is this point which most of the users are hitting and as a...
with some statistical calculations of one standard deviation from each of
these point you will be easily be able to calculate this range that normally for
most of your users at least 50-60-70% of your users, this is the
window of time that it will take. Let's say this is 5 seconds this is 7
seconds and let's say average is 6 seconds for example so statistically you can
come up with this number as a function of lot of users who are crossing the
roads. So you don't really care how much is the traffic, what is the weather, when
you anyway have this real-time information So for all the
geographies where you have people using your application, there you have organic
data from real users at this point in time so you don't really care about
traffic and all of that. Traffic and all other things will come into picture only
in places where you can either not get enough data from users or there is some
other reason why you're maybe legally not allowed to get information of users
right because of which if you don't have real data users, then is when you will
bother about using traffic and weather and all those attributes to come up
with this average speed number basis that you can even know the distance from
point A to point B, you can come up with a ETA, okay.
Now how do we quantify traffic? you cannot say that if there are 805
vehicles, it will take this much amount of time, right. The number of vehicles on
a road and things like that cannot be calculated at all. No matter how good
technology you build right. So when these things cannot be quantified how do you
use them as a measure? So what we can say is basically create multiple "Tiers".
Let's say this is traffic measurement and this is Weather Management.
All you can say is traffic is low or it is medium or it is high and whether it is
good or it is bad, right. So you will have these kind of Tiers for each of the
attributes that you have and all you can say is every time traffic goes from here(Low)
to here(Medium), it has a 20% increase onto average speed or something of that sort.
right every time traffic goes from low to medium, average speed reduces by 20
percent. Basis these transitions of traffic volume... now you can always say that
traffic is low or high based on the number of users that are probably using
your application or from some other sources, right. Let's say you have this
kind of a measure, so there are companies like Waze and all, which will give you the
kind of information that how's the weather, how is the traffic in a very
abstracted form. Now once you have this information you can calculate
average speed as a relative number from the original value calculated. so let's
say with the good weather, with the medium traffic you knew the average
speed was later 20 kilometers per hour Now if the traffic goes to high, you
kind of reduce it by 20%, so it basically becomes 16 kilometers per hour. Basis is
this kind of a calculation, you will calculate the ETA. Now as and when
inputs are coming, you are making changes into your weights on the graph. How
does it impact the overall system? So coming back to the previous drawing that
we made remember we had something called as a segment and we had lots of roads
within the segment and let's say the traffic increase we got a signal from
some third party saying traffic has increased on a particular Road. Now do
whatever you want to. Now let's say there was this one exit point and one exit
point and distance between this was some Let's say there are some X weight to this
particular theoretically calculated road. And let's say it was actually
using this particular road on which the traffic is increased, let's say it
actually was following this kind of a graph, something of that sort.
Now similarly all other roads are getting changed. So all you can say is remember
in this all the theoretically calculated roads that we have to be calculated we
were caching that what all real roads are a part of this theoretically
calculated road. Which is... it is not a real route, but a calculated edge.
So each time this thing changes, weight changes on any real road, you basically
try to infer, what all actual cached roads are impacted. And then you
basically make a change into the cached roads also. So you say that okay the time
taken initially was 10 seconds, for this path, now it has become 15. So let me
increase the value of this particular calculated road from point E1 to E2 by 5 seconds Now let's say you do this for one
segment, then what basically happens is you recursively bubble it up over a lot of cached things. So let's say if there was a mega segment who's...
who had this kind of an entry-exit point that this is ME1 and this was in ME2
and you know that ME1, the route from every ME1 to ME2 uses this E1-E2. So what
you will do is you will basically bubble up the amount of time for this route by 5
seconds. So what we are essentially saying is as then when you get traffic
data you basically update the ETA on a particular road. Then update the ETA on
all the calculated routes that this road is a part of.
And update and then basically that becomes at a segment level. And then do the same
thing recursively for all the segments that are a part of a mega segment.
So basically what you are doing is you're bubbling of all the calculated values
and then updating whole of the graph. Now these things are not happening
at once not all the signals are coming up at one, right. So as in when something
changes you bubble up till the level where it goes and then stop. So let's say this
road was not a part of this ME1 ME2 for example, then it will stop at the
segment level, right. But if it is a part of a mega segment road also, then it will
bubble up in the mega segment level. Now, other than just traffic and weather,
there's also one very important factor that you can use. So you can also use
historical ETAs to come up with a real or a predicted ETA for a particular Road.
And this historical ETA is basically calculated in a way that you
basically look at the day of week and the hour of day so let's say Monday, 5 p.m. this road takes 20 seconds to cover, but on Sundays 5 p.m. is this road
takes just 10 seconds to cover so something of that sort is what you can cache
at a level of a day of week and hour of day. And then use that historical data to
infer. So let's say if you do not have traffic information also, then you can use this
kind of an information to somehow come up with the predicted ETA.
Now coming back to the segment, due to some activities within the segment so let's
just say there are a lot of roads on the segment. There is there are these two
exit points and there are these points which have some roads now let's say
we've calculated that the ETA of fastest way to go from point
SE1 to SE2, involves 500 units of time. let's say 500 seconds, okay. And it uses
the route SE1 to A to B to C to SE2, okay. Now let's just say that there's a
traffic increase on A to B. Now we could identify it by a lot of attributes. Let's
just say people moving between A to B their average time increases right.
Or we get a signal saying you know the traffic is increasing significantly. For
whatever reason let's say we increase the time of
basically the ETA for going from point A to point B. So it would impact the ETA from
SE11 to SE2. Now there is a limit to which it can bubble up right so
let's say it increases from 500 to let's say 550 due to some increase in A to B
so all that is okay. But let's just say that we have real data of users, and for
some reason it is taking a lot of time for some people but we put to go from
point A to point B and let's just say this increases to let's say 850. Now a
jump from 500 units of time to 850 units of time is a very big jump. It is
having a.. roughly a 70% right. So now it is possible that the
fastest route from SE 1 to SE2 does not include this route ABC! It could be some
other route. So what we will do is if ETA increases or decreases, generally in case of increase only we'll do this. if ETA increases by greater than some
percentage. It's a configurable amount. Let's just say we have it set that if
the ETA increases by 30% then we recalculate things within that segment.
Because all the updates that you are getting are against a
segment only. So whenever a particular road's time increases you can
calculate everything within that segment, right. Which means recalculating paths from SE1 to SE2. Let's just say, that we figured out that
now the new shortest path is not this path and it is basically the pass from
SE1 to A to H to G to F to E and to SE2. So, this basically becomes your new shortest
path, right. Now if it is the shortest path, then we need to bubble up that
information, right. First of all we need to say that whoever was traveling from
SE1 to SE2, now needs to follow a new path right which is a path from
SE1-A- H - G - F - E - SE2, right. This is one part of the things. But what if there was
a mega segment over here which encompasses a lot of segments, whose exit
points were also including this particular path. This is let us say ME1
and let's say ME2. Now if the exit point path of this
Mega Segment also includes this part then even that mega segments path would
change so the part where they use ABC now that would get converted into this,
right. So these are the things that you'll have to do each time a weight changes.
so when the increases beyond the threshold then you have to do. So let's
just say increase happen from 500 to 510 then all of this calculation is probably
not even worth it. Because the best you can get is probably somewhere between
500 to 510. Now let's say 10 seconds would not be too much of an enhancement
right. But if it is such a big jump from 500 to 850 for example then you might as
well recalculate everything within the segment and then bubble up whatever is
required. Now you'll not just calculate with respect to exit points, you'll also
calculate the shortest path to reach from let's say SE1 to A, or SE1 to B.
All of those will also be calculated but for representation this was a good
enough way to explain you the concept. Now let's look at the architecture of
the whole system. So we will do it in two parts: First we'll look at the flow wherein all users are connected to the system
wherein we are capturing their location information from all the users even if
they are not using our map service. This is basically for improving our map
service altogether and a bit of user profiling. In the next section we'll look at
how does the actual navigation flow works for the user who want to find the
shortest route between two points and what are the systems required for that.
And at the end we'll look at some of the analytics. With that let's get started.
A bit of a convention to start with: Things in green are basically user devices
could be mostly these will be mobile phones, could be web browsers as well.
Things in black are load balancers plus reverse proxies plus an authentication
authorization layer in between which will authenticate all the external requests coming to the system.
Things in blue are the things that we have built on our own. It could be web services, Kafka
consumers or anything of that sort. Things in red are either some infrastructure
component that we've used or some databases or some clusters like Kafka, Hadoop and all of that, cool. So let's look at how the user flow begins.
The very first interaction with the user is when they when they have the app
installed on their phone, with their location services turn on, so we get
regular pings from a user's device every few seconds. Now if the user is
stationary, the device will have the logic to not send the location ping
every five seconds and maybe send it every five minutes or ten minutes or
something of that sort, because that will reduce the load on our system if the
user is anyway at the same point + Better battery life. But while the user is in transit, we will
increase the frequency, so as to make our data more accurate.
This talks to.. this basically keeps the persistent connection with the
device. The reason for a persistent connection is, so that we don't have to
create a new connection everytime. Also this would be generally a Websocket connection,
because sometimes we might need to send some information on to the user devices as well.
So this WebSocket handler is basically the service that talks to all
the user devices. Now there would normally be thousands of such WebSocket
handlers for talking to all the users that are currently online. So we need
somebody who can manage the fact that which handler machine is talking to
which user. So then comes something called as a Web Socket Manager.
It basically keeps a track of which machine(read user) is connected to which WebSocket handler,
so which user device is basically connected to which machine of
this WebSocket Handler Service. It basically keeps that information in a
Redis. If let's say it lost that information is lost we can kind of pull
that information again from the servers. So we don't need to kind of store that
information in a persistent Data Store. The next thing what happens is, as in when
users are sending their locations the frequency of the location pings is
handled by the device and then the device location comes to us
basically coming at the Location Service. Now Location Service is basically a
repository of all the location related information . It basically has some
endpoints through which you can ingest some users information, something like a
user ID with the timestamp with a particular Lat/Long. That kind of
a structure. It puts all of that information into a Cassandra as a
permanent data source to keep a track of the fact that which user was at what
location at what point in time. Remember this is not just for users who are using
our app right now for navigation, but for all the users in the world. How this
data would help us we'll come to in a while. Now, as and when it's getting location
pings, it is also putting all of those pings into a Kafka.
All the location pings that are going into this Kafka, are basically
read by this Spark Streaming Consumer. This does a lot of calculations,
some of them are here and some of them we look at in the later sections.
First of all what it does is, if let's say a lot of users are traveling at a place
where.. which we do not identify, okay. But their movement pattern tells
us it looks like a road. So this basically is a job which would be
a Streaming Job which would look at last ten minutes of data and see how the
users are moving and basically it'll be used to add new roads into our system.
Now as and when a new road is added each time a new road is added that would
basically impact the particular Segment within which the roads are added and
then basis that all the Segment related information that we calculated
earlier that we talked about earlier that would be recalculated. So that might impact
a lot of segments and Mega Segments recursively if that's a very critical new road that we've found. The next thing is
basically an average speed job. What that does is, it basically looks at again if
we look at within a segment a lot of people moving from point A to point B, it
tries to figure out what is the average speed that these users have over let's
say last fifteen twenty minutes and use that information to as a proxy for a
real-time information and then suggest that and basically it will then update the weights of the graph that we talked
about earlier. Again that would bubble up to various Segments and Mega
Segments if that's a part of a critical road. Now how would that bubble of
process happen? - So this Spark Streaming Job, individual jobs would actually write
back things into Kafka. So let's say if a new road is identified, it would write
that to a separate topic saying I've identified a new road which
connects point X to point Y. Now there will be a map update service which
listens to that particular topic and it updates it in a Data Store. It updates it
in a graphs Cassandra, which is managed by this Graph Processing
Service, so each time a new road is found, Map Update Service would invoke Graph
Processing Service to tell that I have found a new road just store that in your
Datastore which sits on top of a Cassandra. Now what this service is, we'll look
at in the next section in more detail. Now, each time average speed job finds out
there is some change that has happened it would again put an event into Kafka
in a separate topic now and there'll be something called as a Traffic Update
Service that listens to this topic which would update the traffic related
information again into the same Cassandra through Graph Processing Service.
There is something called as a Hotspot Identifier. So let's just say
there's a particular locality on which there is some average number of people
per given area kind of a metric that we have. Now let's have suddenly a lot of
people start gathering in that area it gives us a signal that there is some
activity happening at that point. It could be that there is some social
event, some sports thing happening or but something is happening there so that
could be able to identify hotspots within the graph. Because eventually once
we identify that hotspot we'll know that after probably a while, there will be a
lot of traffic in that area as well right. So that could help us in certain
ways. Now for certain things Spark Streaming would itself
insert a lot of information and put out events but in general it will dump all
of that data into Hadoop cluster, which has location pings of all the users
across a lot of time. Now we could do some ML Jobs onto it which
classify these certain other things. So let's say if you've identified a new
road but we do not know what kind of a road that is. If it's a one-way, if it's a
two-way Road, if it's a single lane road or maybe a six-lane Road. So all of those things would be identified by this Road Classifier and this particular job could now put that event into that road topic and Map Update Service could now store that additional information saying now that I have not just identified that
road but now I also know that this is a probably a six-lane Road or something of
that sort There could be something called as a Vehicle Identifier. While you are getting location pings, if you
start getting those pings at a very high frequency let's say every one
second if you are getting that you can input a lot of information about the
vehicle in which the person is traveling. So let's just say that the traffic is
going normal at certain pace but one particular person is stopping every few
kilometres let's say, okay. Now what that could mean that could mean that either
there's a red light or there's a traffic jam there, possibly. But we know what are
the junctions you know where the red lights can be we know how the
traffic jam are because we know the data for the user as well. Now if it's not
any of these two criteria then that means that the person is traveling in a
public transport something like a bus and then the bus is stopping at each bus
stop, all right. Other kind of things that we can infer is if let's say the ride is
bumpy or if the person is going left and right here and there very frequently or
having a very fast acceleration and very fast breaking it's probably a two wheeler.
if not in all other scenarios if it's a smooth enough right stable enough
acceleration going straight in one road it is very likely a four-wheeler. Now we
do tell information to Google specifically that we are actually
traveling in a two wheeler or four wheeler while we select a map but if
let's say that option is not available then this kind of data would be good
enough to infer which kind of vehicle a person is traveling in, which will help
us in a lot of other attributes. So these individual jobs what all they identify
they could put that into back into the same Kafka in probably a separate topic for each job and then those information could be
analysed even further. Now let's look at the user journey when they want to
actually get the navigation from point A to point B. So normally what people do is
they search for a particular location which then gets converted into a
particular lat/long right. So all of those is being done by Area Search
Service. This basically does two kind of things. One is, it has some areas within
Elastic Search on which it provides fuzzy searching and once the person has search
for a particular area or locality or location anything. It then
gets converted into a lat/long. So basically it stores the lat/long along
with the place name, description all of that, in a document. The other thing is it
tries to dynamically figure out: Given an address kind of a thing which lat/long
that address points to. So it does that address resolution into a lat/long as well.
Now let's say... so post the response from this user will have a start point
and end point as a latitude longitude both of them and then they would want to
request the path, okay. First let's quickly go over what is this Navigation
Tracking Service. So while the person is actually started the navigation and
they're going towards the destination all of their location coordinates will
be tracked as part of Navigation Tracking Service. This does two main things:
1) If the user is deviating from the route that we suggested, we will kind of inform
the app back and then that will probably show a pop-up kind of a thing, saying that
you are probably going wrong, maybe you want to course-correct to the right path.
2) It will also store all the data so it will push all of the data
into Kafka, while the person is traveling then that data could be used for later
analysis saying, at this point in time a person searched for a route from point A
to point B, and this was the route that we recommended. It could be used for
figuring out how good the recommendation of the path is. That being said, let's
jump to the main piece which actually figures out how to connect point A to
point B, okay. So all of those requests are being handled by this Map Service.
This Map Service over here is basically an interface service which provides
multiple interfaces for people to request the directions from some point
to some point. One of the interface is given to the user devices. A similar
interface could be given out to companies where in all the rate-limiting
and their throughput calculation against their account information is being
calculated and stored, okay. It's not an intelligent service it just
forwards the request to something called as a Graph Processing
Service. Graph Processing Service basically does a lot of things.
First of all it queries Segment Service which sits on top of its own Database of
Cassandra, in which it stores all the segments and their corner coordinates
along with a lot of information of those segments. So let's see if they if the
Graph Processing Service wants to figure out that what are the segment's for the
start point and the end point, it will get that information on the
Segment Service. Now if it so happens that both the points are within the same
segment then it's a short duration ride. Then the Graph Processing Service might choose to
do a couple of things it might look up that, Do have this
data already cached? if yes, it will directly return to the Map Service saying there is a response. If not, it'll then decide what do I do next?
It can now if it is the same segment it can probably say that I will run a
proper Dijkstra's algorithm and then try to figure out the shortest path between
the two points. It can do that and then return the response. Alternatively, if
it's in different segments then it will try to fan out into the whole basically
it will try to create a pick the sub graph of the main whole world's graph
that we want to look up as part of the Dijkstra and run it, okay. Now for doing
that it will look up on a lot of things. First of all it will look up for all the
roads within those segments, the entry-exit points of the segments,
into its Cassandra. It will also have live traffic information which would
have updated the ETAs into the graph again from the same Cassandra. Now it would have also got some input from something called as a Third Party Data Manager. Now it doesn't query Third Party Data Manager at runtime. Third Party Data Manager in fact, pushes data into Graph Service which then
impacts the live traffic information so let's say if it says that there is a
huge amount of traffic at certain point it'll update that. Let's say if it says
that there is a massive kind of a rain happening somewhere, it will update the
live traffic information thing there is some rain and that would probably cause some delays and the average speed would slow down and ETA would increase,
something of that sort. Now using all of that information it basically creates
the whole graph that we looked at in the previous sections and then it either
looks up in the cache and returns the result or it might run the Dijkstra's Algorithm or some Algorithm to basically figure out the shortest path between two
points and then return the result. let's say if it doesn't have any traffic
information, any ETA information it might also query Historical Data Service to find out at this point in time how much ETA can expect from point A to point B. if it doesn't have that information for some of the points. Now Historical Data Service sits on top of its own Cassandra which basically stores the information of the form that at some day, day of the week like Sunday, Monday, Tuesday in some hour, let's say 4 PM, 5 PM, something of that sort, from point A to point B, how much
was the average speed of people normally and how much ETA did it take. So all of that kind of information would be stored in Historical Data Service. Bases all of that, Graph Processing Service will return back the response to Map Service
and then to the user going by the exact same set of logic that we looked at in
the previous sections. Now all of this is done a lot of events have gone
into Kafka. One of the common events was that by the Area Search Service which we
didn't go over. So each time a search is happening the service will put an event
into Kafka. Now what that would do is it will basically give us a list of areas
that are being searched most often. So we can do some optimizations for them.
Another thing would happen is, so not we can do a lot
of analytic on top of this data okay that is coming into Kafka. One of the
very first thing we need to do is to figure out that out of all the third
party that we are integrated with which of them is actually giving right
information and which of them is actually given the wrong information. How
do we do that? So let's say some third party gave us some
information saying from point A to point B, there is a lot of traffic, and
ETAs would increase. But our data if it says that ETAs have not been impacted at
all, that means that particular information was incorrect, right. Now if
that happens from the same Third Party a lot of times, then we can safely say
that, that third party has a wrong information or it gives us wrong information. right. Now this Third Party Data Manager, abstracts out on a lot of third party companies, so we could be pulling data from a lot of companies all of that is abstracted under this Third Party Data Manager but the reporting
would be done at an individual organization level. That being said, there
are a lot of other kinds of analytics that we might want to do. First of all we
might want to identify what is the road that a particular person is on, right? so
as part of navigation you want to probably have a audio kind of a thing
saying turn right in ninety meters to get on Mahatma Gandhi Road,or something of
that sort, right. All of that inferences could be done by the Spark Streaming
saying where exactly I am in how much time am I about to reach a
junction and what is the road name on which I will get to next. So all of that
could be done by this. There are a lot of other analytics that we could also do on
top of this data. Let us look at it next. So out of all the events that are being
put into this Kafka, there is a lot of analytics that we can do and we should
do. So one of the very important things to analyze is how accurate our ETA
predictions are? So let's just say we predicted that it will take 40 minutes
to cover a distance from point A to point B and if it takes something around
38-39 minutes then that's fairly accurate. But if in real world it takes somewhere
around 20 minutes then that's a fairly bad prediction. So we need to have this
data on how accurate our predictions are so that we can figure out that we need
to invest more time into you know upgrading that algorithm that comes up
with the ETAs. Along with that it will also give us some information about
which of the routes are the ones that we are recommending to people, but people
are not taking. Maybe there is something wrong with the routes that exist in our
database, right. So all of those things could also be inferred. Some common things
that we can infer is what are some common hotspots in the city that people
gather at. These could be some social gathering places or things of that sort.
So that will kind of give us some points of interest that we can build over time
using this data to start with. Other than that some obvious things that we can also
figure out is, what are the home locations and work locations of some of
the people. We could also infer user profile data. So for example if a person
is going to a lot of pubs for example then they are more of a social kind of a
person. If a person is traveling outside the city very frequently on
weekends, the person likes travelling. So things like that can also be inferred
basis just a location data. Coming to the last thing, now this normally wouldn't be
a part of a design interview but there's something you should be aware of because
this is an important thing that companies like Google Maps have to take
care of. So how do you handle disputed areas? So while in Google Maps there is a
clear-cut boundary of states and countries and all of that but if there
is a disputed region how does Google Maps decide where to make the boundary of the
country or state right. So a classic example of this would be a dispute
between the countries of India, Pakistan and China in the state of Jammu and
Kashmir. So India claims that whole of the land is under their territory. China
claims that some of the parts of that state belongs to China and Pakistan
claims that some of the parts of that state belongs to Pakistan. So now how
does Google decide or any such mapping provider decides, that how where do they
define the country boundaries, right. Now this becomes very tricky for Google
because they cannot take anybody's side and whichever decision they make they
are going to kind of piss off some of the countries. Plus that's even worse
in this scenario because these three countries combined together have more
than one third of the world's population. So that's a very rich source of revenue
for Google. So how can you handle? - so what Google does is something very smart and
also very tricky. So what they do is, based on the country where you are coming from
they'll show you a different country boundary. So for people who are coming
from India, to all of those people they show that the whole part belongs to
India. Now to the people of Pakistan they show that the part that they are
claiming belongs to Pakistan and the parts that are disputed between India
and China are shown as the dotted line. Similarly for people coming from China
they show the part that China claims is theirs they show under China's
territory and remaining disputed area between India and Pakistan is shown as a
dotted line with no country boundary specified. So this is something if if you
come across this kind of the thing in an interview could try to replicate but
keep this out of the scope of the interview at least, but still something
good to know. So yeah I think this is it for Google Maps kind of a system.