Algorithms: Graph Search, DFS and BFS

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Hi, I'm Gayle Laakmann McDowell, author of Cracking the Coding Interview. Today I want to talk about graphs and common operations, like breadth-first search and depth first search. So to go back to the beginning, a graph is basically a collection of nodes where each node might point to other nodes and these edges can be directed so like one-way streets or undirected so kind of like two-way streets. Now suppose you want to go walk through this graph, and specifically suppose you want to do something like, figure out is there a path from one node to another. There are two common ways of doing this and we'll talk about both of them. The first one is a little more simplistic and it's called depth-first search, and it's a typically recursive algorithm and the way it looks like it's saying, okay we have have this initial node that I'm gonna call s. Now you basically are asking a question, hey S, do you have a path to node T, and S says, hmm, I'm not sure, let me go ask my children. And first S goes to node A and it says hey A, do you have a path to node T? If you do then hey I'm done I can give you my answer. If you don't then let me go ask B and then C and then D. And so, the first person we ask is A, so A gets asked, hey A do you have a path to T? I'm not sure let me go ask my children. And eventually we might get to a node who says why yes of course I have a path, I am T. And so then we go and say boopboopboop all the way back up. Yes there's a path and that's the basis of depth-first search. It's called depth-first search because we go deep into some node before you even ask any of the children. Now the problem with this is that we might run really really far away. So imagine for example that B actually has a path directly to node T has edge directly to node T. A might go to all of its children, all of its children, all of its children, before you even get to B, and he was right there. There could have been a really fast connection so that's why we might often prefer to use breadth first search instead. Breadth-first search says hey go a level by level out. So first we ask S hey do you have a path to T? And S will say well let me check if any of my nodes are T. No, there's no edge right there. So each of those get in line. And then we ask the second level out, and then the third level out, and the next level and the next level and the next level. And so we go level by level, breadth, we go wider before you go deep. That's why it's called breadth-first search. So I want to talk at a high-level about the implementation of each before I dive into some of the details. So depth-first search is implemented with a recursive algorithm. It's probably the simpler algorithm to implement. The only little trick is that we have to make sure to use it is visited flag so that you don't wind up in some sort of infinite loop where there is, you know, a cycle and you keep asking each node of it has, you start running around in a circle. With breadth-first search, the main trick need to remember is you want to use a queue. So when you look, when you look at S, you say hey, do you have a path to T? You're going to go add all of its children to the queue, and rather than going recursively you pull out the first element from the queue, check if it has a path, check if it is this final element and if not go add all its children to it. So use a queue so that you go through things in the correct order. So that's the high-level, how it works. Let's turn to the real details of the implementation now. I've gotten a bit of a head start on the implementation but I'll show you just quickly what I've done. So first we have this node class here, and it's going to have some sort of ID that represents the node ID and what I've done is give ourselves a mapping of from node ID to the actual node and this is mostly going to be used for things like get node and add edge. This way we can actually just go and get, get immediate access to the node of the particular ID. And then I've also given us this has paths depth-first search method and it's going to call after this recursive method. So if you remember with depth first search, we need to have a way of flagging nodes to say hey I've already visited this don't retry it. So one thing we can do is we can actually modify the node class to give ourselves and is visited flag, but that requires then making sure we clear that flag later on. Another way of doing it is giving ourselves a hash set that lists all of the IDs that I've already, of the nodes that I've already visited. It's sort of a replacement for a flag so I'm gonna do it this way, this way I don't have to modify and add a whole bunch of flags in and then make sure to clear them later on. So I get the source node, I get the destination node, and then I get this, create this visited hash set and then I go out and call this recursive method. So now is where the fun begins. So first if I've already visited this node, if visited dot contains the source ID then return false because there is no path then. Okay otherwise, then what I want to do is, so now I want to go and make sure I update this, mark this node as visited, and then I want to say okay if I'm at my source then if I'm at my destination rather, then return true because yes there certainly is a path then. Otherwise go and check all my children and see any of them have a path because if there's a path from a child to me then then there's certainly a final, then there's certainly a path from me to my destination. So for each node child in source dot adjacent, if there's a path from child to destination passing in again visited, then return true and that will bubble all the way up the stack. If I get down to the very end and I haven't found a path yet then there is no path from me to my destination. Now let's turn to how breadth-first search works. Ok so breadth-first search, so what I need with breadth-first search is I need a linked list of sort of what I'll call the like next step. So these are the, I'll call this next to visit, these are the nodes that I need to visit next. And just as before I need this visited hash set that represents everything I've already visited, and then I want to say next to visit, because the first I need to visit is in fact my source. Then as long as there is nothing in, so while visited, sorry next to visit dot is empty so while it's not empty, keep going. Alright so first thing, I need to look, grab my very first node to visit so node node equals next to visit dot, I'll call this remove so remove the very first node in that list, if this is my destination then there is certainly a path. Now I also want to do my visited checking so if visited dot contains of node ID, then actually just continue, let's go to the next value, otherwise visited dot add of node dot ID so mark it as visited, and then go and actually add my children so for each node child in node dot adjacent go and add each of those to my next to visit. So next to visit dot add of child. Right. And that's all there is to depth and breadth first search. And then of course if i get down to the very end and I haven't found a path yet return false. Let's walk through this code again and make sure that this makes sense. So has path DFS takes in the source and destination IDs and then I get those nodes and then I create this hash sets, that should be hash set not a hashmap. And then goes and actually does this recursive method, so the recursive method has this visited thing so we use a visited hash set instead of marking the actual nodes with a flag. So visited, check if it contains the source ID, if it does contain that, if I've already visited this, return false because there's no path. Then otherwise go and add this visited, mark this node as visited, read you know, check my sources, my destination children already there return true. Otherwise go and search all of my children. Then so that, and then if I haven't found a path go to return false. So with BFS we are taking in a source and a destination. So here are views nodes, I'll switch this actually to be symmetric and use IDs again and so I'll make that public and this private now. Ok so I take in the source and destination IDs and then I go and call this recursive method. My recursive method are, sorry this other just helper method, so this takes in the source and destination, I create this list of the next nodes I have to visit, and this visited hash set, that should be hash set again, and this marks all the nodes that I've already visited. So I say okay next one to visit is my source then as long as there's something left to visit pull out the next node to visit, check if I'm where I should be if so, return true, otherwise check and update my visited, who I've visted and then go and queue up my children at the very end of the queue to be visited next. And those, and this will ensure that my children not visited immediately, but are visited once, sort of everything, everything scheduled has already been visited, so it will ensure it will visit level by level by level. And then if I get down to the very end and I haven't found my destination after all of this, then just return false. So that's how breadth-first search operates. In many cases when we want to find if there's actually a path, breadth-first search is often the better approach, because otherwise we could wind up searching really really far away when there's actually a very short connection, and it's certainly better when we want to find the shortest path. So now that you've seen breadth first and depth-first search why don't you try these out on a new problem. Good luck.
Info
Channel: HackerRank
Views: 915,333
Rating: undefined out of 5
Keywords:
Id: zaBhtODEL0w
Channel Id: undefined
Length: 11min 48sec (708 seconds)
Published: Tue Sep 27 2016
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.