Search A 2D Sorted Matrix - Fundamentals of Search Space Reduction

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

okay so the problem that we're going to be investigating today is a searching problem and I think we need to do more searching problems because I think it's a good topic to become sound in and what we are going to look at is how do we search a two-dimensional matrix that is sorted so there's two variants to this problem that we're going to address we're going to address this one and then there's gonna be another one that I'll describe so here's here's how it goes we have a two-dimensional matrix I excluded the brackets here but you get an array of arrays and what you need to do is you need to find an item and the special property of the matrix is that every row is sorted every row is sorted every row is sorted there's a reason I keep saying sorted because anytime you see the word sorted you immediately immediately immediately think my optimal solution will be logarithmic somehow some way if there is not a logarithm in your final solution there's probably it's probably not the most optimal solution given that the array is sorted or if it ever has a sorted property you know it's just a pattern you notice and whenever you see sorted always think this is going to go towards binary search but we're not gonna get there yet we're gonna we're gonna go towards the thought process and thinking about this but what we see is the rows are sorted a special property about this matrix is that the last elements of every row is less than or equal to the first elements of the next row so look at that three that three is less than or equal to four right it's less than or equal to the first element in the next row so okay okay let's think about this but when I'm thinking about searching immediately brute force it we are looking for an item's say we're looking for five what could we do well we could look at every single item in the matrix that's going to entail doing M times and time I've given us the columns we have n columns for columns we have M rows we have three rows and what I want you to see is if we search every single element in the matrix we touch 12 elements M times n four times three we upper bound to the rows times the columns M times n now this is one way we could do it but I mean this is obviously not what our interviewer would 1 because we we got a very special property matrix very special and there's no there's no way it would be this simple right so let's think a little more so I said binary search we're computer scientists we know about our logarithms we know about our sorting algorithms and our searching algorithms so we can think a little harder we can be smarter than this right so what we can actually do is why don't we do a binary search in every row I mean that makes sense every row is sorted right if every row is sorted if we do binary search we maintain the invariance that if we chopped a search space in half the search space stays sorted and we can continue to do chopping so that is how binary search operates right so what would we do let's think about the time complexity for every row Big O rows times what what is the work I'm gonna do in each row well binary search runs in logarithmic time upper bounded asymptotic ly so we would be doing logarithm of the columns in the matrix right every row would have a certain number of columns we're gonna search against so we would do rows times log of columns or Big O of M times log of n so I don't have notes for this video so I'm going off the top of my head so I'll probably catch any mistakes in editing but that is what's gonna happen if we do a binary search on every row but there's there's a problem here I mean immediately I saw this when I when I first saw this problem I saw the little trick we could do but let's say we don't notice it where where where the trick is in this problem let's look at the time complexities put them right there those are the asymptotic bounds or the class of common asymptotic bounds we normally work with we see we have our linear time we see we have our logarithmic time we have n log n time so where we're sitting at right now is right around and well again we have two variables so technically it's not in log n but can we do better than this can we do better than this could we jump to linear time or could we jump to something that is logarithmic in some sense so in fact we actually can do that and one special thing that I want you to notice is the two special properties that were given to us one the rows are sorted number two the last element in every single row is less than or equal to the first element in the following row what does that mean for us so this is a mental jump I need you to make this mental jump because we can't see how we find the optimal solution unless we can really see this pattern so I have I have this matrix shaped like this and I gave each each cell a number like that this is the indicee that is the index I don't even know if indices a word someone told me that wasn't word each of these cells signifies a index and you see a two-dimensional array but in fact we know we know that the data that the data does not dictate the way we interpret it we saw that in heapsort we had an array but our visualization the way we interpreted the data was not how the data was formatted in memory we had a mapping system we had a visualization system right and here in this problem we have a very similar thing to that we have a 2d matrix but what I want to see is a one dimensional array I want to see a one dimensional array because if I can see a one dimensional array I can perform binary search on that one dimensional array so think about this we have M times n items rows times columns items if I were able to turn this 2d matrix into a 1d array I am going to be able to chop M times n items in half chop chop chop in half and then my time complexity becomes log of M times n log of the total items how many times can I cut M times n items in half log base 2 of M times n again this is logarithms and we have videos on that that explain logarithms but that allows us to go logarithmic remember how I said if we have a sorted array the optimal solution if I do not see logarithms in it there's a problem and I'm probably missing something so ok we can go logarithmic in time but my problem the only the only hitch to this problem is our mapping system so how do we generate a mapping and it's actually kind of tricky but whenever we try to do something with wraparounds and with with arrays that map indices to where we need to wrap around between different arrays we normally use a division sign and a mod sign so this is what the equations look like right there so our row if you give me an index to find the row I'm in I need to take the index divided by the number of columns so think about this six six divided by the number of columns 4 6 divided by four is the same thing as 3 divided by 2 which is going to be 1.5 and we're going to truncate the decimal place because these are not floats these are integers truncate the decimal place and now we have 1 what row is 6 in row 1 now let's try another example so the 10 the 10 divided by the number of columns 10 divided by 4 is the same thing as 5 divided by 2 so what we're going to have is we're gonna have to point 5 so 2 point 5 truncate the 5 get the 5 away we have row 2 so again we can do 11 divided by 4 11 divided by 4 is going to be something like 2 points what's 2 point 7:5 right remove the 75 and we get to the second row so it would take a little thinking it would take a little playing to see that we use the division sign to find the rows and that's fine because if you really think about like think about it if we're trying to find the row that an item sits in the the row that an item citizen is contingent to how many items we put in a row again the row that an item sits in is contingent on the amount of items we can fit in a row if I can fit four items in a row that's going to affect the row in item sits in because the less items we can put in a row the farther down our item probably will be if that if that kind of makes sense that's kind of explaining the intuition behind the rows and now let's explain the intuition behind the columns why do we use the modulo why do we why do we use mod let's let's first validate our our calculations so if I'm sitting at six again if I do six mod for six mod four is going to leave us a remainder of two and what column is six in it's in zero one two yeah that's correct so if we have eleven eleven mod four well if we do 11 divided by four our remainder is going to be three what column is eleven in we have zero one two three column three this is just kind of one of those things where you'd have to draw the pattern like zero mod 4 is 0 1 mod 4 is 1 2 mod 4 is 2 3 mod 4 is 3 and then 4 mod 4 will bring us right back through the beginning right so I think the best way I can explain this is if we are adjusting for wraparound we're going to use the mod operator because mod helps us with wrapping around a fixed sized you know section or fixed sized array / thing right and that's how we generate our mapping system so we have a row and column mapping system and now that we have our mapping system it's just a matter of doing the binary search so what we can do here is why don't we actually execute a binary search this will be free really fast so what I want to do is I want to search for the number seven so we're looking for seven so what happens is I set my left and right bounds so I set my left at zero I set my right at eleven so all the time this whole time I speak in one dimension but I have my mapping system to take me to two dimensions at all times all times so what we need to see is okay 0 and 11 so now what I want to do is I'm going to calculate the midpoints so the midpoint is just going to be you can calculate the smarter the code in the description does a smarter but it's going to be 11 divided by 2 which is going to be 5.5 and that's going to become just 5 so our middle element sits right there our middle element sits at that position but how do we get that in terms of rows and columns so we need to do our math so we're going to find the rope we're going to do the 1d position 5 divided by 4 is going to give us 1 so we're in row 1 and our columns are going to be 5 or 1d position mod 4 which is going to be 1 so we're in Row 1 column 1 it's that guy right there so is that equal to 7 we're searching for 7 so we see that that's not equal to 7 so which way do we go do we go left no we don't go left we would lose value we need to increase value so we go to the right so our new bound for left becomes middle plus 1 it becomes 6 or a 1d coordinate 6 so 6 to 11 so ok now we need to do another finding of the midpoint 6 plus 11 is 17 17 divided by 2 is 8 point 5 truncate the point 5 we have 8 so how do we get to 8 we need to do our mapping system we do 8 divided by 4 is 2 we're going to be in row 2 and 8 mod 4 is going to be 0 so we're going to be in row 2 column 0 so that guy right there and I I never did this way so you can immediately see the position but we still have to do the calculations so what we need to see is that's our midpoint did we overshoot or undershoot eight is greater than seven I need to go to the left so let's go to the left we need to move our right endpoint one minus the midpoint because we need to move to the left so we need to narrow it and now we come to seven so we're between six and seven are left and right balance in one dimensional coordinates are six and seven so let's find the midpoint so six plus seven is going to be 13 13 divided by two 6.5 truncate the 0.5 we're going to be a position six so position six what is that in terms of rows so six divided by four is going to be one point one point five we remove the point five we see we're in row 1 and we're going to do 6 mod 4 which is going to leave us with just 2 and we're going to be in column 2 and that's correct row 1 column 2 and we're indexing off zero so what we see is we under shot in value 6 is less than 7 go to the left so our left mid our left point becomes 1 plus the midpoint so now we have our right and left pointing at 7 we do our calculations our midpoint is going to end up being 7 and bang we found 7 that's the end of the search if our left pointer went past the right our search ends we didn't find the item so isn't it isn't this fascinating we were able to take a two-dimensional structure and flatten it into a 1 dimensional structure conceptually we did not touch the array at all but the the the special properties of the array allowed us to execute this we were able to execute binary search our final runtime is Big O of log M times n because we're going to have 12 items we can cut 12 how many times log base 2 times so we would be able to cut 12 2 6 6 2 3 3 2 something like 1 point 5 1 and then cut it one more time down again that is this approach that is this variant and now let's look at another variance and investigate further how we can take this searching story to another application and another type of 2d search okay so now it's time to investigate a different variance on this problem where we have the same property on the rows the rows are sorted going from left to right and the columns are also sorted going from top to bottom but the the change here is that there's no guarantee that the last item in any row is related or less than or equal to the item the first item in the next row right so we can we can see an example of that look at the last item in the first row 11 it's not less than or equal to the first item in the next row so our visualization gone we can't turn this 2d matrix into a 1d matrix there's no way we can do that because we can't draw the same relationships we can't merge everything into one one visualization one conceptualized structure we can't do that we need to find another way to search this so if this was the first variant you get I mean immediately you of the brute force you have m times n again we have the same amount of rows which is three rows same amount of columns which is four columns so same thing M times n time we upper bound to all the elements in the matrix we're going to search every element and do constant time constant time work in every element so that's just em times n upper bound so let's think about this we really need to do something do something that's critical when we're doing search problems whenever you get a search problem the number one thing you think is how can I reduce my search space how can I reduce my search space when we are looking at logarithmic problems well the way we reduce it is we chop it in half and we maintain an invariant we maintain the invariants that the array is sorted we chop it in half so we can't we can't really do the same thing here as straightforward well the most off solution sort of it's like that but we can't really do that here so back to the drawing board we need to rethink how we are approaching things so let's let's do an example let's start or let's look for item 10 we're looking for 10 what I am doing is at any point in time what we're trying to keep in mind is how do I reduce my search space and a reduction of search space entails a decision and in binary search our decision is go left or go to the right or go left to the right to you guys but the decision is going to fundamentally brick it's going to bring us closer to an answer while maintaining the properties that the array is sorted so what we need to think of is what are the fundamental operations I can define to search this matrix what are the fun I keep saying fundamental I'm gonna stop that what are the core core things the core choices I can make it any step to reduce my decision space that means we need to take a forking a forking in decisions we need to either go lower in value or go up in value so ok we need to go lower in value or up in value I said that we need to keep that in mind so how do I go up in value well we need to think about this this matrix is gradient 'add it there there are gradients on it there are gradients going from the left to right because the rows are sorted there is a gradient running from top to bottom because the columns are sorted and remember how I said we need a fundamental decision so what we need to see is we need to see that if I go left to right I increase value if I go right to left I decrease value if I go top to bottom I increase value if I go from bottom to top I decrease value so let's put that right there for reference and what I just establish there is something I mean it's right in front of us it's not something where we would need to think too hard on it it's it's given by the problem these are our actions we can do to reduce our decision space and whenever we're doing a search problem we are looking for a reduction of the decision space and that's that's our desired that's our desire to solve this so what we need to think about is we need to think about how do these decisions play in to finding the most optimal solution and what we're going to achieve is we're going to be able to achieve a near optimal solution the most optimal solution is fairly involved and it's actually addressed by cracking the coding interview it's it's not in the code sample because it's fairly lengthy but we're going to address the very close to the most optimal solution here which is a linear time reduction of the search space with respect to the rows and columns so how is this achieved so follow me think about this and stay stay with me here if I am looking for the item 10 what position do I need to place myself so I can express my abilities to decide to narrow my search space so let's think about this let me try starting from the bottom left so if I start from the bottom left what can I do I can go up in value I can go this way I can go down in value I can go this way okay yeah that works for me what if I start at the top right I can go up in value I can go down and I can go lower in value I can go this way okay that works for me those are two places we could start this is theoretically possible now let's try something else let me start at the top left corner if I start at the top corner there what can I do I can go up in value but then I can go up in value so do you see how if I start at the top left I'm not able to express my core decisions like I can do here like I can do there if I start at the bottom right I can go lower in value or I can go lower in value so do you see the problem with the top this dis corner and that corner the problem with those is they don't allow us to narrow our search space they don't allow us to narrow it in a valid way where we wouldn't need to search everything so follow me we're gonna do an example and you'll really see why the the core thing we can do here is starting here or up there and why our decision or operations allow us to stay within the width inability to find our answer so let's start at the bottom left corner so we're starting there is 11 greater than or less than 10 11 is greater than 10 so immediately what do I understand 10 cannot be in the last row it only gets worse it only gets worse in the last row 11 is the smallest quantity in that row because it's the first quantity and going to the right I only can go upwards 10 can't be there 10 cannot be there so let me express my let me express my abilities to decide I decide to go down in value how did we establish we go down in value well we're gonna go up right we're gonna go that way so we're gonna come to the 8 is 8 greater than or less than 10 and again if you equal 10 we just return true 8 is less than 10 okay what are my decisions I can go up which is gonna decrease value or I can go this way I can increase value so I want you to think about this if 8 is in the middle of the column can 10 be above 8 no it's impossible why the column is sorted 10 cannot be above 8 because we lose value so what you need to see is where it eights the only way to go is to go up in value how do we go up in value we go this way now we're sitting at the 9 again 9 is less than 10 if I go upwards I I can't do anything 10 cannot be up above 9 because we've lose value we need to keep going and then we find 10 so we found 10 and okay that works so my theory was right and now we need to look at the top right corner so let's start there at 11 so let's look for 10 again we're gonna look for 10 again and what we need to do here is we need to look how 11 relates to 10 so 11 is greater than 10 so I need to lose value if I go downwards I increase value I'm getting farther away from the answer I'm I'm losing time I'm wasting time that's not going to get me to the answer so what I need to do is 11 I need to reduce value I need to come this way I jump to my 7 7 is 7 greater than or less than 10 7 is less than 10 I need to increase my value if I go left in the same row I'm getting worse I'm gonna go from 7 to 6 2 5 2 4 possibly I'm not going to get 10 I need to increase value so what we need to do is we need to go down and we literally reduce or eliminate the whole row and then we are sitting at 10 and that's our value we want right what we're doing here is we're reducing the search space at every step in the problem and we're not breaking anything because it is impossible it is impossible for the item to be in any of the positions that we just established that it cannot be there's nothing we're doing we're not breaking any rules we're not saying it might be over here and and it possibly could just be over there it is literally impossible that the 10 could be to the right of that 11 why because we know that the row is sorted right so that's what this solution is all about ok yes that makes sense so what I need to what I need to do is I need to analyze this so what is the worst work I can do and I mean really think about how what we were just doing we were going either up or this way or we were going down and this way what is the farthest I can go left or right well the farthest I can go left or right is the amount of columns n what's the artist I can go up or down the farthest I can go up or down is the amount of rose so we can upper bound the total amount of operations to be the at worst at worst the total operations we do is the number of rows plus the number of columns and the reason reason that's the case is because at worst we're doing we're moving we're doing a little shift operations shift / shift over at worst I can shift this far and I can shift this high right so that's why we add it would come to a concrete value so actually let's find the concrete value just to really nail this in so if I went all the way that way that would entail one two three four checks which is four that's how many columns we have and then if I went upward we would do one two checks and then we might not find the item but how many checks did we just do one two three four five six six checks what is the Rose plus the columns or what are the Rose plus the columns well we have seven total four plus three is seven so that is an upper bound that upper bounds the amount of operations we could do so asymptotically we say Big O M + n now there's actually an even faster solution to this and that is actually a very lengthy solution it's addressed in cracking the coding interview I put it in the in the code in the description and it's very lengthy and like I said if you don't see a logarithm in your time complexity and something is sorted then there's probably your you probably don't have the most optimal solution so it doesn't run in logarithmic time with respects to the rows and columns but I'm going to leave that for another time because this video is already very long so yeah that is how you search a 2d sorted matrix and yeah so that is all for this video if you liked this video hit the like button subscribe to the channel my goal is to build one of the world's largest software engineering resources it might never become the largest it well probably won't but I wanted to be one of the world's largest software engineering resources because I think we just need more resources in this space of stuff and I think it's a cool thing to do and yeah yeah I always think like this stuff useful kind of isn't I mean it kind of makes you think harder but I mean I still feel like

Info

Channel: Back To Back SWE

Views: 42,074

Rating: undefined out of 5

Keywords: search a sorted 2d matrix, search a matrix, search a sorted matrix, searcing algorithms, sorting algorithms, Back To Back SWE, Software Engineering, computer science, programming, programming interviews, coding interviews, coding questions, algorithms, algorithmic thinking, Google internship, Facebook internship, software engineering internship, swe internship, big n companies

Id: FOa55B9Ikfg

Channel Id: undefined

Length: 29min 30sec (1770 seconds)

Published: Sat Mar 09 2019