Hi, I'm Gayle Laakmann McDowell, author of Cracking the Coding Interview. Today I want to talk about graphs and common
operations, like breadth-first search and depth first search. So to go back to the
beginning, a graph is basically a collection of nodes where each node
might point to other nodes and these edges can be directed so like one-way
streets or undirected so kind of like two-way streets. Now suppose you want to go walk
through this graph, and specifically suppose you want to do something like,
figure out is there a path from one node to another. There are two common ways of
doing this and we'll talk about both of them. The first one is a little more
simplistic and it's called depth-first search, and it's a typically recursive
algorithm and the way it looks like it's saying, okay we have have this initial
node that I'm gonna call s. Now you basically are asking a question, hey S, do you have a path to node T, and S says, hmm, I'm not sure, let me go ask my
children. And first S goes to node A and it says hey A, do you have a path to node T? If you do then hey I'm done I can give you my answer. If you don't then let me go ask B and
then C and then D. And so, the first person we ask is A, so A gets asked, hey A do you
have a path to T? I'm not sure let me go ask my children. And eventually we might get
to a node who says why yes of course I have a path, I am T. And so then
we go and say boopboopboop all the way back up. Yes there's a path and that's the basis
of depth-first search. It's called depth-first search because
we go deep into some node before you even ask any of the children. Now the
problem with this is that we might run really really far away. So imagine for example that B actually
has a path directly to node T has edge directly to node T. A might go to all of its children, all
of its children, all of its children, before you even get to B, and he was right
there. There could have been a really fast connection so that's why we might
often prefer to use breadth first search instead. Breadth-first search says hey go a
level by level out. So first we ask S hey do you have a path to T? And S will say
well let me check if any of my nodes are T. No, there's no edge right there. So each of those get in line. And then we
ask the second level out, and then the third level out, and the next level and the next
level and the next level. And so we go level by level, breadth, we go wider before you
go deep. That's why it's called breadth-first
search. So I want to talk at a high-level about the implementation of each before
I dive into some of the details. So depth-first search is implemented with a
recursive algorithm. It's probably the simpler algorithm to implement. The only
little trick is that we have to make sure to use it is visited flag so that
you don't wind up in some sort of infinite loop where there is, you know, a
cycle and you keep asking each node of it has, you start running around
in a circle. With breadth-first search, the main trick need to remember is you
want to use a queue. So when you look, when you look at S, you say hey, do you have a path to T? You're going to go add all of its children to the queue, and rather than going recursively
you pull out the first element from the queue, check if it has a path, check if it is this final element and if not go add all its children to it. So use a
queue so that you go through things in the correct order. So that's the high-level, how it works.
Let's turn to the real details of the implementation now. I've gotten a bit of
a head start on the implementation but I'll show you just quickly what I've
done. So first we have this node class here, and it's going to have some sort
of ID that represents the node ID and what I've done is give ourselves a
mapping of from node ID to the actual node and this is mostly going to be used
for things like get node and add edge. This way we can actually just go and get, get immediate access
to the node of the particular ID. And then I've also given us this has paths
depth-first search method and it's going to call after this recursive method. So
if you remember with depth first search, we need to have a way of
flagging nodes to say hey I've already visited this don't retry it. So one thing we can do is we can
actually modify the node class to give ourselves and is visited flag, but that
requires then making sure we clear that flag later on. Another way of doing
it is giving ourselves a hash set that lists all of the IDs that I've already,
of the nodes that I've already visited. It's sort of a replacement for a flag so I'm gonna
do it this way, this way I don't have to modify and add a whole bunch of flags in
and then make sure to clear them later on. So I get the source node, I get the
destination node, and then I get this, create this visited hash set and then I
go out and call this recursive method. So now is where the fun begins. So first if I've already visited this
node, if visited dot contains the source ID then return false because there is no
path then. Okay otherwise, then what I want to do is, so now I want to go and
make sure I update this, mark this node as visited, and then I want to say okay
if I'm at my source then if I'm at my destination rather, then
return true because yes there certainly is a path then. Otherwise go and check
all my children and see any of them have a path because if there's a path from a child
to me then then there's certainly a final, then there's certainly a path from me
to my destination. So for each node child in source dot adjacent, if there's
a path from child to destination passing in again visited, then return true and
that will bubble all the way up the stack. If I get down to the very end and I
haven't found a path yet then there is no path from me to my destination. Now
let's turn to how breadth-first search works. Ok so breadth-first search, so what I
need with breadth-first search is I need a linked list of sort of what I'll call
the like next step. So these are the, I'll call this
next to visit, these are the nodes that I need to visit next. And just as before I need
this visited hash set that represents everything I've already visited, and then
I want to say next to visit, because the first I need to visit is in
fact my source. Then as long as there is nothing in, so while visited, sorry
next to visit dot is empty so while it's not empty, keep going. Alright so first thing, I need
to look, grab my very first node to visit so node node equals next to visit dot, I'll
call this remove so remove the very first node in that list, if this is my
destination then there is certainly a path. Now I also want to do my visited
checking so if visited dot contains of node ID, then actually just continue,
let's go to the next value, otherwise visited dot add of node dot ID so mark it as
visited, and then go and actually add my children so for each node child in node
dot adjacent go and add each of those to my next to visit. So next to visit dot add of child. Right.
And that's all there is to depth and breadth first search. And then of course
if i get down to the very end and I haven't found a path yet return false. Let's walk through this code again and
make sure that this makes sense. So has path DFS takes in the source and
destination IDs and then I get those nodes and then I create this hash sets,
that should be hash set not a hashmap. And then goes and actually does this
recursive method, so the recursive method has this visited thing so we use a
visited hash set instead of marking the actual nodes with a flag. So visited, check
if it contains the source ID, if it does contain that, if I've already visited this,
return false because there's no path. Then otherwise go and add this visited,
mark this node as visited, read you know, check my sources, my destination children
already there return true. Otherwise go and search all
of my children. Then so that, and then if I haven't found a path go to return
false. So with BFS we are taking in a source
and a destination. So here are views nodes, I'll switch this actually to be symmetric and use IDs again and so I'll make that public
and this private now. Ok so I take in the source and
destination IDs and then I go and call this recursive method. My recursive method
are, sorry this other just helper method, so this takes in the source and
destination, I create this list of the next nodes I have to visit, and this
visited hash set, that should be hash set again, and this marks all the nodes that
I've already visited. So I say okay next one to visit is my source then as long
as there's something left to visit pull out the next node to visit,
check if I'm where I should be if so, return true, otherwise check and
update my visited, who I've visted and then go and queue up my
children at the very end of the queue to be visited next. And those, and this will ensure that my children not visited immediately, but are visited once, sort of everything, everything scheduled has already been visited, so it will ensure it
will visit level by level by level. And then if I get down to the very end and I haven't
found my destination after all of this, then just return false. So that's how breadth-first search
operates. In many cases when we want to find if there's actually a path,
breadth-first search is often the better approach, because otherwise we could wind
up searching really really far away when there's actually a very short connection,
and it's certainly better when we want to find the shortest path. So now that
you've seen breadth first and depth-first search why don't you try these out on a new problem.
Good luck.