Google Coding Interview Question - firstDuplicate

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
what up coders today we're going to be solving a Google coding interview question so you're applying for Google or you're gonna interview Google you know this can come up at any company realistically but this is asked by Google according to code signal you're gonna check this out of town code in the interview practice section so doing first duplicate so here we go so instead of reading that problem description I think it might just be better for me to explain it to you guys it's kind of mumbo-jumbo sometimes and they explain weird basically they're gonna give us an integer array so let's say I'm giving you an integer array that means an array of numbers right so this is an array of numbers there's an array of numbers this is an array of numbers too but what the constraint is is the values in that array those values these values in the array the numbers in there right they have to be between 1 and the length of the array right so the numbers in the array can only be between 1 in the length of the array so nothing less than 1 so 0 is less than 1 so you know that cannot be part of the array so this is invalid that's why there's an X over it and anything greater than the length of the array is wrong too so that can't be in there either so you can see the length is 1 2 3 4 5 so nothing greater than 5 should be in this and there is a bunch of stuff greater than 5 so this is just you know pretty much garbage now this could be a valid input right because the length of the array is 1 2 3 4 it has values between 1 & 4 only so 1 2 3 & 4 is perfectly fine right this is also valid because the length of the array is 1 2 2 3 4 5 and it has values between 1 & 5 it doesn't have to be all the values between 1 & 5 it just has to be the values in the array just have to be between the boundaries of 1 & 5 so you know there could be multiple ones there gonna be multiple twos multiple 3 is whatever just has to be between 1 and the length of the array the values have to be there okay we get it we get it we get it all right so now what we know we're given now let's look at our goal for this problem we're given this array what's our goal what do we have to do with this array all right so now that we understand our input let's look at the objective so we know that our array can have duplicates right this is this is there could be duplicate values right and I've labeled some of these here right you could see in this input array we have two ones two twos two threes in this input array we have two twos two threes right so let's assume that most of our inputs are gonna be duplicates this is some kind of real-world problem where we have a bunch of arrays of duplicates thousand probably doesn't come up too often but it might maybe we have thousands of numbers and maybe a lot of them are duplicates and what we want to do is we want to find the first duplicate and that is the name of the problem right so what does first duplicate mean though does that mean like okay one is the first duplicate two is the first duplicate right it's the first number No two is on the first duplicate in this input so what we're gonna go by is the second number of the duplicate the second number so you know this is the first number of a duplicate this is the second number first number second number first number second number we want the first occurrence of the second number so this second number occurs before this second number and the second number so it's the first second number of the dupe all of the duplicates so that is going to be the value we return so we're going to return one we're gonna return this one in this array and we have two 2s and two threes but the second occurrence of three comes before the second occurrence of - so we're gonna return three in this input and in this input there are no duplicates at all so when there's no duplicates at all and you can't return anything you're just gonna return negative one if there was just one duplicate that would be the what value return but yeah that's pretty much it hopefully you guys understand so your objective is to find the first duplicate meaning the second occurrence of that duplicate comes before the second occurrence of any other duplicates that exists if you cannot find any duplicates you just return negative one that is your objective all right it's time to accomplish this objective how do we do that there are usually multiple ways to solve problems some better somewhere we want to do things fast in computer science so we want to get that time complexity as small as possible we want to get as fast as we can to solve this problem because if we had to do with thousands thousands and thousands of data like arrays and numbers and stuff you know it can end up taking a long time so we want to get our time folks to be pretty low so first thing I'm thinking like brute force is the worst solution right so usually it's easiest one to come up with and oftentimes it's gonna be that ol ven squared solution and in this case it's gonna be that of N squared solution so in this case I'm kind of thinking like if we look for every duplicate like we could loop like the first thing I thought of I was like okay well if we loop through the array one by one you know you find it you start a two and you go okay and then you you can see duplicates and then it's like okay how do you know you need two pointers right so if we have two pointers we've take two pointers and then for each number so I'll leave a pointer here and then we take our second pointer we go boom-boom boom-boom-boom and we find the duplicate we can keep a minimum variable maybe something like minimum index of duplicate second Val that's there very long variable name you wouldn't want to use but I want to be descriptive here we would want to keep track of the index that is minimum where that's the second value where we see a duplicate right so we see this too and we get down here and every time we see the second a duplicate for the second time right we sell this value whenever we find the duplicate so the second number of the duplicate we're gonna check for the minimum we're going to update our minimum index to see if we can find we want to find the minimum index right so in this case we have 0 1 2 3 4 5 5 is the index of the second number here right so we found a duplicate second numbers index is 5 right so we update it right now we're gonna move our arrow right we got to the end of the array now we're gonna do another number and we're always starting with our arrow at I plus 1 so you know this would the red one would be I and the yellow one would be I plus 1 so we're gonna start here so this is gonna 1 this is equal to 1 this is equal to 1 now okay next number 3 then we go BOOM BOOM up we found another three now we are gonna check against our minimum which is set to five right now this is index 4 so 4 is less than 5 so this will be updated so we've updated our minimum index because we found an index that was smaller than 5 so it's 4 now and we keep going ok and then you go through the rest of the array there's no more duplicates and nothing's gonna happen and that's it they both end and there we go we found the minimum index of you know the all the duplicates so all the duplicates we keep updating a minimum with the minimum index of the second number we see and at the end it's like okay we have the index so how do we get the value now well you can just access array of the index and you'll be able to get that value because this array of 4 is gonna give us the correct value 3 and that's what we want so here's what our code would look like for that solution you can see we have this I shortened the variable name in second index is set to the length because that's not even a valid index we're making it big as big as possible so that we can update it with a minimum and we loop through this outer loop would be our red pointer like a red arrow and this inner loop would be you know the J would be our yellow arrow so the red arrow and then the yellow arrow goes you know one at a time looks looks for all the duplicates if they match up you know then we update that minimum just like we did a second ago and we update it with the minimum of J which would be the yellow arrow looking for that second value and the current minimum and the current minimum starts at length which isn't even an index so the first time we find a duplicate it would update to J immediately and every time we find a duplicate where the index is less than the minimum of the second value we update that min second index and if it remains the length of the end we know we didn't find any duplicates so we returned negative 1 otherwise we'd just return that value right so this is a double for loop solution and what and if you guys know anything about time complexity you'll know that a nested double for loop solution like this is going to have a time complexity of o of N squared ovince squared is not something that Google is going to accept or any of these big companies for that matter because we can do much better right what this ovine squared means is that if this array is size you know ten thousand it's gonna take oh of ten thousand squared time to solve that problem right but what we want to get it down to actually is linear right so o of n so that it's instead of ten thousand squared it'll just be of ten thousand time so you solve this problem so if we want to get this down to linear that means we're gonna want to get rid of this pointer and that means that we're gonna want to make it so that we're just doing one for loop through the array somehow and getting that minimum second value index now how can we do that well one thing you guys should know is that space complexity can be used to speed up time complexity so that we can use like a data structure or some kind of storage to store things while we try and solve a problem and speed up our time complexity instead of doing that double for loop maybe we could store something check back in with it and use that storage to speed up our answer so one data structure that comes to mind when we're dealing with duplicates or unique values or something like that is a hash set whenever you hear our duplicates or unique values think of hash set a hash set is built specifically to keep track of unique values and it's really good data structure because some data structures it takes time to access the elements of them so you store things and it'll take time to access stuff from them hash sets constant time no time it takes no time so when you access things from the hash set this is a set or a hash set we're gonna be putting things in we can access things that we put into it back from it in constant time meaning it doesn't take any time at all so let's have a hash set and here's the idea let's loop through these numbers one by one and let's add them to the hash in the first time we see one that's already in the hash set that is going to be the first duplicate that we see so just walk through this example right we see two right so we're gonna say is two in the hash Hut no ok so let's add it in the hash that I've got a twos in all right now we will get one is one already in the hash set no it's not so let's add one into the hash shot now we look at three is three already in the hash it no it's not so let's add three now we see five is five already in our set nope it said that it's okay and then we see three is three already in yes three actually is already in our set so that is our first complete duplicate we already saw it once these are our scene values and we saw the second closing one and that means that was the first duplicate that we found those the first one because we saw the first part of it and that was the first closing one we saw so that's pretty much how you do it it's just code this out really quick too so in Java it's called the hash set in other languages it could be called anything else like a set in Python or something like that so you can just make a scene initialize it it's just a basic for loop through all the numbers and then your if you can check use it these are constant time checks like I said so if seam dock contains a of I so if we already have the value in the hasha then we return that value because we already saw it in there once otherwise we just put it into the hash set right just say we did so we just do seen dyad these are just methods on the back end of a hash set if you don't know about hash this I'd recommend looking them up a little bit more there's some really good videos hacker ranked gala walk moved out talks over these and we return negative one if we don't see it right you can see how it's looping through each index in the array it's adding them to the hash set until it sees it you know it's saying is it already in the hash set if it is then we saw we saw it twice so that's the first complete duplicate we returned and we're out of the function so that's the first time we see a duplicate otherwise we're just adding it each time each time each time until we see the complete duplicate right yes the only downside to this solution like this is a good solution right this is linear so we went from N squared to end huge difference right that's really good improvement and Google would probably be happy with that big tech companies would probably be happy with that like that's fast enough the only problem with this solution is that now we're using space right just takes up memory and we don't really want to do that if we don't have to so although we're fast enough this is o of n space assuming like maybe everything's a duplicate and we have to put a bunch of numbers in and maybe the first two bookid is one of the last numbers so this could essentially waste a lot of space that we don't really want it to so if there's a way we could do this without space we would definitely want to do that and impress our interviewer [Music] so with a little bit of creativity and critical thinking there actually is a better solution we can get this down to linear time no extra space how the heck do we do that though it is not intuitive at all but there is one thing we never thought about and this is important when I read cracking the coding interview they made it very clear that every detail of a problem matters when you're dealing with these algorithms data structures problems these interview questions they're not lis putting things in here for no reason right have we ever stopped and thought about why the values have to be between one in the length of the array we never thought about that well you know we just were like oh okay that's fine but there is a reason because with that information we can use that and actually construct a solution and it's very tricky but this is the solution if we know that the values are between 1 and the length of the array then that means that these values we can use them as indices of the array as well in the trick we're gonna do here is we're going to take the current value as we loop through and we're gonna take that index minus 1 because if we took you know index 5 you know you can end up going out of bounds if you had like the maximum value if the length is 1 2 3 4 5 6 say we add a value 6 if we took index that's out of bounds right there's no index six it's only up to index five so we're gonna take the index minus one right so two minus one index 2 minus 1 which would be 1 so 0 1 2 minus 1 which is here and we're gonna make that value negative right and while we loop through these values we're gonna keep going in the next time we see a 2 we're gonna check if that the values index if this index minus 1 so 0 1 2 this index minus 1 is already negative that's how we know we found our complete duplicate right so let's just run through this example here so we see 2 right we're gonna look up this current array index 2 so this is index 0 this is the next one this is the next 2 minus 1 so we don't cook because we can't go to bounce so this is the first element and is this negative no it's not so we're gonna make it negative ok all right where are the next elements this is negative 1 though because we've changed it so there is no array of negative 1 so that's out of bounce so we're actually gonna have used the absolute value to get the original values right because as we move this red arrow along we're not checking for negatives right now we're checking for negatives when we do that lookup of the current value as an index minus 1 right so we're gonna use the absolute value as we go through so that we don't we get imagine that we're not using negatives we're not modifying these we do the map apps as we do the regular for loop you'll see that in the code and then when we look this up as an index we're looking up 0 1 okay it's this element minus 1 so we make this element negative now now we're at 3 so we look up array of 3 0 1 2 3 minus 1 it's DeRay of 2 and we make this one negative just happen to be it now we're at 5 we look up array of 5 0 1 2 3 4 5 minus 1 is 3 so we make this one negative now we're at 3 again the math absolute value so we look up array of 3 0 1 2 3 minus 1 is this it's already negative so that was our first duplicate that we found and we return that value so when you do that lookup of the index minus 1 and you know it's R and you see it's already negative that's when we found our duplicate right because the whole premise or idea here is that this three is when it looks up array of 3-1 or Rev 2 it's gonna look up 0 1 2 it's gonna make this negative when this looks over if 3 minus 1 which is 2 it's going to the same spot so it makes it negative here the first time and then when we see it again it's already negative so when we see it's already negative when we do that look up in the code we found our first duplicate we returned that right away so it's a little bit confusing it's a little bit jumbled up but if you look at it for a second it's exactly what we just did you're looping through right the red arrow this for loop living through element by element and every time we see an element we're making we're looking the array of the value of that element minus 1 so that index minus 1 and we're making it negative right that's basically what we did and then when we look up the value of that minus 1 as an index we look that up and it is negative already that's when we found our first duplicate and we just returned the math dot absolute value of that element otherwise we return negative 1 if we don't find any duplicates so you're making those values negative once again because when you see a 2 at the beginning of the array and the 2 at the ending of the other array when you see those duplicate values they're gonna map to the same index based on this because it's just the value minus 1 they're gonna map to the same index so you set it negative once at that random index that it maps to and then you see it again later it maps to the same element and it sees it's negative and we are able to check that right here and return the value of that element those are the final that's the final and best solution right that is no space that is a linear solution that's optimal Google is gonna you know you're gonna pass your interview if you come up with this solution those are three solutions the first one being proved for second one being pretty good but this is being optimal and yeah that's it hopefully this helps you get past your interview let me know how you like this style video I suggest this is my first drawing video and please like and subscribe so I can help grow my channel also please support me on patreon got the link below and I've study guides for the technical interviews down there thanks for watching this video and I will see you guys in the next one alright bye peace [Music]
Info
Channel: Nick White
Views: 214,248
Rating: undefined out of 5
Keywords: coding, programming, computer science, software engineering, google, tech, technology, algorithms, data structures, developers, software development, google coding interview, coding interview, technical interview, coding interview questions, firstDuplicate, code signal, arrays, fang, time complexity, hash set
Id: XSdr_O-XVRQ
Channel Id: undefined
Length: 20min 17sec (1217 seconds)
Published: Mon Jan 27 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.