Search A 2D Sorted Matrix - Fundamentals of Search Space Reduction

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
okay so the problem that we're going to be  investigating today is a searching problem   and I think we need to do more searching problems  because I think it's a good topic to become sound   in and what we are going to look at is how do we  search a two-dimensional matrix that is sorted so   there's two variants to this problem that we're  going to address we're going to address this one   and then there's gonna be another one that I'll  describe so here's here's how it goes we have   a two-dimensional matrix I excluded the brackets  here but you get an array of arrays and what you   need to do is you need to find an item and the  special property of the matrix is that every row   is sorted every row is sorted every row is sorted  there's a reason I keep saying sorted because   anytime you see the word sorted you immediately  immediately immediately think my optimal solution   will be logarithmic somehow some way if there is  not a logarithm in your final solution there's   probably it's probably not the most optimal  solution given that the array is sorted or   if it ever has a sorted property you know it's  just a pattern you notice and whenever you see   sorted always think this is going to go towards  binary search but we're not gonna get there yet   we're gonna we're gonna go towards the thought  process and thinking about this but what we see   is the rows are sorted a special property about  this matrix is that the last elements of every   row is less than or equal to the first elements of  the next row so look at that three that three is   less than or equal to four right it's less than  or equal to the first element in the next row   so okay okay let's think about this but when I'm  thinking about searching immediately brute force   it we are looking for an item's say we're looking  for five what could we do well we could look at   every single item in the matrix that's going to  entail doing M times and time I've given us the   columns we have n columns for columns we have M  rows we have three rows and what I want you to   see is if we search every single element in the  matrix we touch 12 elements M times n four times   three we upper bound to the rows times the columns  M times n now this is one way we could do it but   I mean this is obviously not what our interviewer  would 1 because we we got a very special property   matrix very special and there's no there's no way  it would be this simple right so let's think a   little more so I said binary search we're computer  scientists we know about our logarithms we know   about our sorting algorithms and our searching  algorithms so we can think a little harder we   can be smarter than this right so what we can  actually do is why don't we do a binary search   in every row I mean that makes sense every row  is sorted right if every row is sorted if we do   binary search we maintain the invariance that if  we chopped a search space in half the search space   stays sorted and we can continue to do chopping so  that is how binary search operates right so what   would we do let's think about the time complexity  for every row Big O rows times what what is the   work I'm gonna do in each row well binary search  runs in logarithmic time upper bounded asymptotic   ly so we would be doing logarithm of the columns  in the matrix right every row would have a certain   number of columns we're gonna search against so  we would do rows times log of columns or Big O   of M times log of n so I don't have notes for this  video so I'm going off the top of my head so I'll   probably catch any mistakes in editing but that  is what's gonna happen if we do a binary search   on every row but there's there's a problem here  I mean immediately I saw this when I when I first   saw this problem I saw the little trick we could  do but let's say we don't notice it where where   where the trick is in this problem let's look  at the time complexities put them right there   those are the asymptotic bounds or the class of  common asymptotic bounds we normally work with we   see we have our linear time we see we have our  logarithmic time we have n log n time so where   we're sitting at right now is right around and  well again we have two variables so technically   it's not in log n but can we do better than  this can we do better than this could we jump   to linear time or could we jump to something  that is logarithmic in some sense so in fact   we actually can do that and one special thing  that I want you to notice is the two special   properties that were given to us one the rows are  sorted number two the last element in every single   row is less than or equal to the first element  in the following row what does that mean for us   so this is a mental jump I need you to make this  mental jump because we can't see how we find the   optimal solution unless we can really see this  pattern so I have I have this matrix shaped like   this and I gave each each cell a number like  that this is the indicee that is the index I   don't even know if indices a word someone told me  that wasn't word each of these cells signifies a   index and you see a two-dimensional array but  in fact we know we know that the data that the   data does not dictate the way we interpret it  we saw that in heapsort we had an array but our   visualization the way we interpreted the data was  not how the data was formatted in memory we had a   mapping system we had a visualization system  right and here in this problem we have a very   similar thing to that we have a 2d matrix but  what I want to see is a one dimensional array   I want to see a one dimensional array because if  I can see a one dimensional array I can perform   binary search on that one dimensional array so  think about this we have M times n items rows   times columns items if I were able to turn this  2d matrix into a 1d array I am going to be able   to chop M times n items in half chop chop chop in  half and then my time complexity becomes log of   M times n log of the total items how many times  can I cut M times n items in half log base 2 of   M times n again this is logarithms and we have  videos on that that explain logarithms but that   allows us to go logarithmic remember how I said  if we have a sorted array the optimal solution if   I do not see logarithms in it there's a problem  and I'm probably missing something so ok we can   go logarithmic in time but my problem the only  the only hitch to this problem is our mapping   system so how do we generate a mapping and it's  actually kind of tricky but whenever we try to   do something with wraparounds and with with  arrays that map indices to where we need to   wrap around between different arrays we normally  use a division sign and a mod sign so this is   what the equations look like right there so our  row if you give me an index to find the row I'm   in I need to take the index divided by the number  of columns so think about this six six divided by   the number of columns 4 6 divided by four is the  same thing as 3 divided by 2 which is going to be   1.5 and we're going to truncate the decimal place  because these are not floats these are integers   truncate the decimal place and now we have 1 what  row is 6 in row 1 now let's try another example   so the 10 the 10 divided by the number of columns  10 divided by 4 is the same thing as 5 divided by   2 so what we're going to have is we're gonna have  to point 5 so 2 point 5 truncate the 5 get the 5   away we have row 2 so again we can do 11 divided  by 4 11 divided by 4 is going to be something like   2 points what's 2 point 7:5 right remove the 75  and we get to the second row so it would take   a little thinking it would take a little playing  to see that we use the division sign to find the   rows and that's fine because if you really think  about like think about it if we're trying to find   the row that an item sits in the the row that an  item citizen is contingent to how many items we   put in a row again the row that an item sits in  is contingent on the amount of items we can fit   in a row if I can fit four items in a row that's  going to affect the row in item sits in because   the less items we can put in a row the farther  down our item probably will be if that if that   kind of makes sense that's kind of explaining the  intuition behind the rows and now let's explain   the intuition behind the columns why do we use  the modulo why do we why do we use mod let's   let's first validate our our calculations so  if I'm sitting at six again if I do six mod for   six mod four is going to leave us a remainder  of two and what column is six in it's in zero   one two yeah that's correct so if we have eleven  eleven mod four well if we do 11 divided by four   our remainder is going to be three what column  is eleven in we have zero one two three column   three this is just kind of one of those things  where you'd have to draw the pattern like zero   mod 4 is 0 1 mod 4 is 1 2 mod 4 is 2 3 mod 4 is 3  and then 4 mod 4 will bring us right back through   the beginning right so I think the best way I can  explain this is if we are adjusting for wraparound   we're going to use the mod operator because mod  helps us with wrapping around a fixed sized you   know section or fixed sized array / thing right  and that's how we generate our mapping system   so we have a row and column mapping system and now  that we have our mapping system it's just a matter   of doing the binary search so what we can do here  is why don't we actually execute a binary search   this will be free really fast so what I want to  do is I want to search for the number seven so   we're looking for seven so what happens is I set  my left and right bounds so I set my left at zero   I set my right at eleven so all the time this  whole time I speak in one dimension but I have   my mapping system to take me to two dimensions at  all times all times so what we need to see is okay   0 and 11 so now what I want to do is I'm going to  calculate the midpoints so the midpoint is just   going to be you can calculate the smarter the code  in the description does a smarter but it's going   to be 11 divided by 2 which is going to be 5.5  and that's going to become just 5 so our middle   element sits right there our middle element sits  at that position but how do we get that in terms   of rows and columns so we need to do our math so  we're going to find the rope we're going to do   the 1d position 5 divided by 4 is going to give  us 1 so we're in row 1 and our columns are going   to be 5 or 1d position mod 4 which is going to be  1 so we're in Row 1 column 1 it's that guy right   there so is that equal to 7 we're searching for  7 so we see that that's not equal to 7 so which   way do we go do we go left no we don't go left we  would lose value we need to increase value so we   go to the right so our new bound for left becomes  middle plus 1 it becomes 6 or a 1d coordinate 6 so   6 to 11 so ok now we need to do another finding  of the midpoint 6 plus 11 is 17 17 divided by 2   is 8 point 5 truncate the point 5 we have 8 so how  do we get to 8 we need to do our mapping system we   do 8 divided by 4 is 2 we're going to be in row 2  and 8 mod 4 is going to be 0 so we're going to be   in row 2 column 0 so that guy right there and I I  never did this way so you can immediately see the   position but we still have to do the calculations  so what we need to see is that's our midpoint did   we overshoot or undershoot eight is greater than  seven I need to go to the left so let's go to the   left we need to move our right endpoint one minus  the midpoint because we need to move to the left   so we need to narrow it and now we come to seven  so we're between six and seven are left and right   balance in one dimensional coordinates are six  and seven so let's find the midpoint so six plus   seven is going to be 13 13 divided by two 6.5  truncate the 0.5 we're going to be a position   six so position six what is that in terms of  rows so six divided by four is going to be one   point one point five we remove the point five  we see we're in row 1 and we're going to do 6   mod 4 which is going to leave us with just 2  and we're going to be in column 2 and that's   correct row 1 column 2 and we're indexing off  zero so what we see is we under shot in value   6 is less than 7 go to the left so our left mid  our left point becomes 1 plus the midpoint so   now we have our right and left pointing at 7 we  do our calculations our midpoint is going to end   up being 7 and bang we found 7 that's the end of  the search if our left pointer went past the right   our search ends we didn't find the item so isn't  it isn't this fascinating we were able to take a   two-dimensional structure and flatten it into  a 1 dimensional structure conceptually we did   not touch the array at all but the the the special  properties of the array allowed us to execute this   we were able to execute binary search our final  runtime is Big O of log M times n because we're   going to have 12 items we can cut 12 how many  times log base 2 times so we would be able to   cut 12 2 6 6 2 3 3 2 something like 1 point 5 1  and then cut it one more time down again that is   this approach that is this variant and now let's  look at another variance and investigate further   how we can take this searching story to another  application and another type of 2d search okay so   now it's time to investigate a different variance  on this problem where we have the same property on   the rows the rows are sorted going from left to  right and the columns are also sorted going from   top to bottom but the the change here is that  there's no guarantee that the last item in any   row is related or less than or equal to the item  the first item in the next row right so we can   we can see an example of that look at the last  item in the first row 11 it's not less than or   equal to the first item in the next row so our  visualization gone we can't turn this 2d matrix   into a 1d matrix there's no way we can do that  because we can't draw the same relationships we   can't merge everything into one one visualization  one conceptualized structure we can't do that we   need to find another way to search this so if this  was the first variant you get I mean immediately   you of the brute force you have m times n again we  have the same amount of rows which is three rows   same amount of columns which is four columns so  same thing M times n time we upper bound to all   the elements in the matrix we're going to search  every element and do constant time constant time   work in every element so that's just em times n  upper bound so let's think about this we really   need to do something do something that's critical  when we're doing search problems whenever you get   a search problem the number one thing you think is  how can I reduce my search space how can I reduce   my search space when we are looking at logarithmic  problems well the way we reduce it is we chop it   in half and we maintain an invariant we maintain  the invariants that the array is sorted we chop   it in half so we can't we can't really do the  same thing here as straightforward well the most   off solution sort of it's like that but we can't  really do that here so back to the drawing board   we need to rethink how we are approaching things  so let's let's do an example let's start or let's   look for item 10 we're looking for 10 what I am  doing is at any point in time what we're trying   to keep in mind is how do I reduce my search space  and a reduction of search space entails a decision   and in binary search our decision is go left or  go to the right or go left to the right to you   guys but the decision is going to fundamentally  brick it's going to bring us closer to an answer   while maintaining the properties that the array is  sorted so what we need to think of is what are the   fundamental operations I can define to search this  matrix what are the fun I keep saying fundamental   I'm gonna stop that what are the core core things  the core choices I can make it any step to reduce   my decision space that means we need to take a  forking a forking in decisions we need to either   go lower in value or go up in value so ok we need  to go lower in value or up in value I said that   we need to keep that in mind so how do I go up in  value well we need to think about this this matrix   is gradient 'add it there there are gradients  on it there are gradients going from the left   to right because the rows are sorted there is  a gradient running from top to bottom because   the columns are sorted and remember how I said  we need a fundamental decision so what we need   to see is we need to see that if I go left to  right I increase value if I go right to left   I decrease value if I go top to bottom I increase  value if I go from bottom to top I decrease value   so let's put that right there for reference and  what I just establish there is something I mean   it's right in front of us it's not something  where we would need to think too hard on it it's   it's given by the problem these are our actions we  can do to reduce our decision space and whenever   we're doing a search problem we are looking for a  reduction of the decision space and that's that's   our desired that's our desire to solve this so  what we need to think about is we need to think   about how do these decisions play in to finding  the most optimal solution and what we're going   to achieve is we're going to be able to achieve  a near optimal solution the most optimal solution   is fairly involved and it's actually addressed  by cracking the coding interview it's it's not   in the code sample because it's fairly lengthy  but we're going to address the very close to the   most optimal solution here which is a linear time  reduction of the search space with respect to the   rows and columns so how is this achieved so follow  me think about this and stay stay with me here   if I am looking for the item 10 what position  do I need to place myself so I can express my   abilities to decide to narrow my search space so  let's think about this let me try starting from   the bottom left so if I start from the bottom  left what can I do I can go up in value I can   go this way I can go down in value I can go this  way okay yeah that works for me what if I start   at the top right I can go up in value I can go  down and I can go lower in value I can go this   way okay that works for me those are two places  we could start this is theoretically possible now   let's try something else let me start at the  top left corner if I start at the top corner   there what can I do I can go up in value but then  I can go up in value so do you see how if I start   at the top left I'm not able to express my core  decisions like I can do here like I can do there   if I start at the bottom right I can go lower in  value or I can go lower in value so do you see   the problem with the top this dis corner and that  corner the problem with those is they don't allow   us to narrow our search space they don't allow us  to narrow it in a valid way where we wouldn't need   to search everything so follow me we're gonna  do an example and you'll really see why the the   core thing we can do here is starting here or up  there and why our decision or operations allow us   to stay within the width inability to find our  answer so let's start at the bottom left corner   so we're starting there is 11 greater than or less  than 10 11 is greater than 10 so immediately what   do I understand 10 cannot be in the last row  it only gets worse it only gets worse in the   last row 11 is the smallest quantity in that row  because it's the first quantity and going to the   right I only can go upwards 10 can't be there  10 cannot be there so let me express my let   me express my abilities to decide I decide to go  down in value how did we establish we go down in   value well we're gonna go up right we're gonna go  that way so we're gonna come to the 8 is 8 greater   than or less than 10 and again if you equal 10 we  just return true 8 is less than 10 okay what are   my decisions I can go up which is gonna decrease  value or I can go this way I can increase value   so I want you to think about this if 8 is in the  middle of the column can 10 be above 8 no it's   impossible why the column is sorted 10 cannot be  above 8 because we lose value so what you need   to see is where it eights the only way to go is to  go up in value how do we go up in value we go this   way now we're sitting at the 9 again 9 is less  than 10 if I go upwards I I can't do anything 10   cannot be up above 9 because we've lose value we  need to keep going and then we find 10 so we found   10 and okay that works so my theory was right and  now we need to look at the top right corner so   let's start there at 11 so let's look for 10 again  we're gonna look for 10 again and what we need to   do here is we need to look how 11 relates to 10  so 11 is greater than 10 so I need to lose value   if I go downwards I increase value I'm getting  farther away from the answer I'm I'm losing time   I'm wasting time that's not going to get me to  the answer so what I need to do is 11 I need to   reduce value I need to come this way I jump to  my 7 7 is 7 greater than or less than 10 7 is   less than 10 I need to increase my value if I go  left in the same row I'm getting worse I'm gonna   go from 7 to 6 2 5 2 4 possibly I'm not going  to get 10 I need to increase value so what we   need to do is we need to go down and we literally  reduce or eliminate the whole row and then we are   sitting at 10 and that's our value we want right  what we're doing here is we're reducing the search   space at every step in the problem and we're  not breaking anything because it is impossible   it is impossible for the item to be in any of the  positions that we just established that it cannot   be there's nothing we're doing we're not breaking  any rules we're not saying it might be over here   and and it possibly could just be over there it is  literally impossible that the 10 could be to the   right of that 11 why because we know that the row  is sorted right so that's what this solution is   all about ok yes that makes sense so what I need  to what I need to do is I need to analyze this so   what is the worst work I can do and I mean really  think about how what we were just doing we were   going either up or this way or we were going down  and this way what is the farthest I can go left   or right well the farthest I can go left or right  is the amount of columns n what's the artist I can   go up or down the farthest I can go up or down is  the amount of rose so we can upper bound the total   amount of operations to be the at worst at worst  the total operations we do is the number of rows   plus the number of columns and the reason reason  that's the case is because at worst we're doing   we're moving we're doing a little shift operations  shift / shift over at worst I can shift this far   and I can shift this high right so that's why we  add it would come to a concrete value so actually   let's find the concrete value just to really nail  this in so if I went all the way that way that   would entail one two three four checks which is  four that's how many columns we have and then if   I went upward we would do one two checks and then  we might not find the item but how many checks did   we just do one two three four five six six checks  what is the Rose plus the columns or what are the   Rose plus the columns well we have seven total  four plus three is seven so that is an upper   bound that upper bounds the amount of operations  we could do so asymptotically we say Big O M + n   now there's actually an even faster solution to  this and that is actually a very lengthy solution   it's addressed in cracking the coding interview I  put it in the in the code in the description and   it's very lengthy and like I said if you don't see  a logarithm in your time complexity and something   is sorted then there's probably your you probably  don't have the most optimal solution so it doesn't   run in logarithmic time with respects to the  rows and columns but I'm going to leave that   for another time because this video is already  very long so yeah that is how you search a 2d   sorted matrix and yeah so that is all for this  video if you liked this video hit the like button   subscribe to the channel my goal is to build  one of the world's largest software engineering   resources it might never become the largest it  well probably won't but I wanted to be one of the   world's largest software engineering resources  because I think we just need more resources in   this space of stuff and I think it's a cool thing  to do and yeah yeah I always think like this stuff   useful kind of isn't I mean it kind of makes  you think harder but I mean I still feel like
Info
Channel: Back To Back SWE
Views: 42,074
Rating: undefined out of 5
Keywords: search a sorted 2d matrix, search a matrix, search a sorted matrix, searcing algorithms, sorting algorithms, Back To Back SWE, Software Engineering, computer science, programming, programming interviews, coding interviews, coding questions, algorithms, algorithmic thinking, Google internship, Facebook internship, software engineering internship, swe internship, big n companies
Id: FOa55B9Ikfg
Channel Id: undefined
Length: 29min 30sec (1770 seconds)
Published: Sat Mar 09 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.