Flattening a JSON Object Using Recursion in Python

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
what's up coderbeit welcome back to another data structures and algorithms video we're here today with another video in our back to basics series where we're going over you know just random good interview problems and i'm just live coding them and trying to solve them for you just right before your eyes this problem comes from an actual interview problem i got at my current job my manager my current manager was i also worked with him at my last company he was our team lead at my last company he recruited me for this next job but um we're trying to interview people now to fill some positions on our team and he reminded me of this question which he asked me at the last job that i worked with him at so hope you enjoy this this question has worked out for me obviously positively so means something to me a little bit and it's a good one so let's get into it this week's problem write a function to flatten a json object with nested keys into a single level okay so this is pretty simple right off the bat i think we've all seen json and how it can have you know deeply nested properties um where there are objects inside the object or rays etc and it can go kind of endlessly deeply nested so this is a pretty common thing in normalizing data or pulling out all of those fields and essentially we want to still retain the structure of the json we want to know that something might have been deeply nested but we want to have everything on in a single level in the json so let's look at an example really quick so here we have um a a large object the keys are places and you can see that we have some kind of you know there's the first level which is uh countries united states canada mexico and then there are some deeply nested things um states in the united states cities you know cities um new jersey new york etc and then there are even further nested keys which are neighborhoods or areas so here we have newark and brooklyn and manhattan and finally in each of these um for this example i just put some friends or people i know who live in those places and you can see that we have strings we have arrays but essentially we have some nesting and it could be infinitely nested we're not sure how many levels there are right so we want to go from this structure to something that looks more like this which is just one you know there's just one level to this json and you can see that each key is retained and then we append to the key with an underscore the nested keys inside so here we have the key united states underscore new jersey underscore newark underscore zero jeremy because at the zeroth index here we have jeremy and then under score one we have stacy because she's the second person in this array right and etc this is pretty self-explanatory all right so let's code this baby up okay so we're gonna use python again just because we're trying to change things up maybe people like python as well and as per usual i am in pycharm and i just have a file called flatten.pi so as per usual i'm going to um add my um you know if dunder name is main uh you know run our function so if done under name equals main right we're gonna call uh let's you know say flatten json and let's define flattened json and that's going to take our object i can just put pass in there so it doesn't hate me okay so we're all set so i think what is important for this question particularly is that we don't know how many how how deeply nested is this nested json right we don't know exactly how far we'll have to traverse so immediately i'm thinking recursion because we can't just use a loop right for off you know for i in equals three and then loop through three levels we might have infinitely nested uh fields so we are gonna have to do some recursion in order to kind of continue until we've hit some base case or something so right off the bat whenever i think of recursion the first thing i like to think of is what are my base cases because that's the building blocks of your recursive function the base cases really represent when you're signaling to the computer you know hey stop right we've we've finished something here and so for this i think first of all let's get our example in the uh in the editor so that we can actually you know talk about something and know what we're talking about so i'm just going to copy the example i had in the slides here let's say nested object equals that so here we have our nested object and we're going to pass that in here right and i guess we can also print because we know that we're going to want to see what we get from this function okay so let's talk about our base cases i think our base cases will always be if it's if the value right if the value at a certain property or key is an object or a list um or actually i think only if it's an object we want to continue to recurse right we want to continue to go down if it's a string or an integer or a list we're going to want to do something right we're going to want to somehow capture that and then signal to the function that we've uh kind of we've reached our our end so i think what we'll want to do is the first thing we want to do is we're going to want to return an object right so this is going to take this is going we're going to want to return our object with our own with our single level so let's say like our return here is that and we're definitely going to return that right return the red um okay so this is going to be um the object with the with the single key and i think for this function we might want to create another function within flattened json because what we want to do is we actually want to not recurse over this whole object but really we want to for each of these keys we want to recurse over its you know quote-unquote children right so we want them for united for the united states we want to capture united states and recurse all the way down and so that's going to be kind of um it's going to be one branch of our recursion and we're going to have a branch of recursion that we set off for each of these top level keys so i think for that we're going to want to have another function in here that we're going to that's going to be the function that we actually actually recursively call so let's say this is just you know flattened and what this is going to take is going to take the object right we're going to want to flatten x or whatever i don't think we can call it an object because it might not be an object it might be an array or it might be actually just the the end of the line right so an integer or a string or whatever so i'm just going to call this x because it's kind of arbitrary like what what what's happening here and i think the second argument for this function is going to be that the key right because we're going to have to with each recursive call we're going to have to keep track of the of the key that we build right we're going to want to have united states underscore new jersey underscore newer and even though we're recalling the function over and over recursively we still want to retain what that key is that we're trying to build here so i think the second argument is going to be the key and the key if we don't have one yet we want to initialize it right so i think we can just set it to be an empty empty string and that's kind of signifying that we haven't we haven't started yet we're at the beginning of the recursive call okay so we have flatten json which is going to keep track of our uh kind of big object that we're gonna actually return and then we have this little function flatten and this is going to be the thing that we call for every single key in the uh nested json okay so what are our base cases here i think our base cases are right if we haven't if we're not at a list and we're not at an object right that's the base case here and at that point i think that what we'll want to do is we will want to take that key right this is the flattened key let's rename that we're going to want to take that key and add it to our return object right that's our base case here so at that point we want to set this flattened key we want to put it on our out object excuse me our red object right and we essentially want to set it to whatever we are at in the recursive call and at that point that will be x right the base case is that x is an integer or a string so we're just going to set that equal to x right so this is going to be an else because we're going to have a few other ifs but i'm going to make a comment here this is x is equal to a string integer vtc boolean whatever right okay so that's kind of our base case here right so what about if it's not if it's if it's an object or a list so let's go through if it's a list first so let's say if the type of x is list what do we want to do we don't want to recurse at that stage right we just want to loop through this list and append the index of where we are in the list to the key so in this case we can just do something like um you know uh for [Music] lm in x so now we're looping through that list we want to call flatten with each element because then what we do in this case is we want to basically end up in our base case right this is where we want to ultimately end up so we will call flatten which with each element in the list and what is the key going to look like the key is going to basically we're going to want to get the index right the index of where we are in this list so we can say i is equal to 0 to keep track of the index and we can say that we want to flatten it right we want to call flatten with whatever flattened key we're already at so this is going to be the flattened key and then what we're going to want to do is we're going to want to basically just concat that index to the key right so we can say that we want to do a string of i right and then we just want to add an underscore right and that's going to be our new key and that's how we're going to recurse over this structure okay so that takes care of our lists so now what do we do if we have an object so if we have an object right so if type of x is a dictionary right because that's what that is in uh python so what we want to do in this case is we want to loop over the um the keys in the object right and we want to recursively flatten each one of those we want to set off a branch of recursion for each of those right while retaining the flattened key that we have so we can easily do that we can say for current key indict uh excuse me an x at this point right we want to call flatten with x at that current key right so that's how we're going to pat we're going to kind of move this function along and continue to pass the nested json and then the key that we want to kind of retain here is we want to take the flattened key which is what we have uh from previous you know uh whatever function parent uh function call parent is calling this particular function right at this stage in the recursion and we want to add to it the current key uh plus an underscore right and what that will do is that with each call to flatten we kind of append these append the nested keys to the original key okay so we're gonna have to do some updates here right so we have an if here we're gonna have to make this an alif right and then here this is now going to be another else right so this is like the base case everything else will will come here okay so let's see how we did and let's see what we get when we run this um and so far we haven't actually made any calls to flatten so let's do that right here we're just gonna originally the first thing we do is we're gonna flatten it with the object right and it's not gonna have any keys because it's gonna be at the top level so all the keys are gonna be initialized as empty strings and it's going to recurse so let's see what we get here when we run this okay okay so we get something and let's see what we get here so i'm going to put this into a scratch file where's my scratch json okay and i'm going to do that and of course it's all upset because these are single quotes and not double quotes but that's all right okay so we basically got what we wanted except now at the end we have extra underscores everywhere right so why is that that is because let's look at what we have here because every single thing in this um when we flatten we append an underscore after every single key even when it gets to this stage so at this stage it already has an extra appended underscore because of the call before it so what we can do here is there's a cool thing in um in python that you can just kind of in place remove the the last character from a string and the syntax for that is just a colon and then negative one so that will just strip that last thing from the uh uh from the and i we have to do this too okay so that will just strip the last character from flatten key and return whatever it is minus that last character and then that should take care of that so let's see what we get here let's copy and paste this okay and that is exactly what we wanted to get right we have united states newark zero um uh but we don't have we have new jersey we only have one new jersey so why is that okay so here we are we have jeremy right and so here is our for lm in x flatten so that all looks good so why is that um missing jeremy here uh i know because i is i'm not uh updating the index as the loop continues so let's do that right here so that will push it along so let's try this one more time all right so now we have exactly what we wanted so we have united states underscore new jersey underscored newark underscore zero jeremy underscore one stacy etc so whenever i'm doing any uh recursive uh problem or a lot of things really i find the debugger super useful um but especially when i'm doing anything recursive because it's it's helpful to visualize the call stack and exactly what's happening at any given point so let's go through this uh we'll step through the function using a debugger just to kind of visualize exactly what's happening here so in pycharm it's super easy you can just start to add breakpoints so i'm going to add a break point here and here and here and we'll see what we get here i'll put one there as well um okay so let's run this with the debugger okay so this is might be uh hard to see because there's a lot going on here on the screen but at this point we're at the flatten object right object is just the big the big original nested object and then um rhett is obviously an empty dictionary so let's continue to uh step through this okay so we're at the first case if type of x is so let's see what is x x is in fact a dick right so let's step into and see what's happening so the current key is united states the current flattened key is an empty string and what we do is we go back into the function with whatever is at x of current key which is the united states so let's see what happens then okay so we're now in another this is our first recursive uh call to flatten and the flattened key that we have so far is united states underscore that was from the previous call right here right so we can continue to do this again so at united states there's another uh object right so this is the dick that we're dealing with at this point it's the one with new jersey and new york etc so we can continue so now we're in the new jersey part of the object and we've appended new jersey and another underscore so we can do it again okay okay that was the newer part so now we're uh back into uh if type x is so what is x at this point x is not a dick anymore we're at a list right so this is the jeremy stacy list so it's going to skip over that and go straight to the lift right and it's going to start to loop over the elements in the list so here we go we're looping over we now have this flattened key i is 0 because we're at the first index of this list and we are going to go back out into the function so at this point what we have x is finally a string so we've reached our base case and we have our key which we've constructed which is united states underscore new jersey underscore newer underscore zero so let's see what happens at our base case here okay so here we finally were at a base case so we want to set this entire key on our return object and strip that last underscore right so that's good okay so now we popped that off the stack and we are back in our looping over our list at stacy right so now at this point elem is uh jeremy uh okay so first we're going to we're in the jeremy loop we're finishing it i is now equal to one and now lm is going to be stacy and we're going to do the same thing we're going to call flatten with stacy it's going to be the same key as before right united states underscore new jersey underscore newark except now we're going to append a one and here we are um we're at x x is equal to stacy our key is you know that whole thing underscore one which is exactly what we want and we are back in our base case again and we set another entry in our return object which will be uh underscore newer underscore one equals stacy and we're back in our loop we finished over the list right so now we are popped back out and we are back into um the uh looping over the current keys of x uh and here we have that was newark so we're going to finish that call right and we're going to move on to new jersey et cetera so i'm not going to continue but i highly recommend that you all play with this it's a super powerful tool understanding any sort of recursive function it's really helped me in understanding recursion and uh yeah i i can't recommend a debugger enough that's all i got today for you folks i hope you enjoyed another back to basics video i'm really enjoying these they're a little less polished but by design and i hope that it's you know simulates a little bit more of what a real world kind of problem solving might look like and yeah look forward to more videos like this coming down the pipeline and make sure you comment below if you're liking this sort of content have a good week everybody
Info
Channel: Coderbyte
Views: 11,009
Rating: undefined out of 5
Keywords:
Id: aUuleLd0lwM
Channel Id: undefined
Length: 23min 30sec (1410 seconds)
Published: Tue Oct 05 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.