Working With APIs in Python - Pagination and Data Extraction

Video Statistics and Information

Captions
In this video I'm going to show you the basics of working with an API in Python. We're going to run through a code demo of how you would make a request to an API and how we can actually deal with the data that comes back.

The API that we're going to be working with is the Rick and Morty API, and there's a reason why I've chosen this one. It's open, which means you don't have to authenticate, so we can worry about authentication at another time. It also has a good REST API and even better documentation. The first thing I would always say when you're trying to work with a specific API is that you need to go through its documentation, so you can work out what information is where and how you're going to get it out.

If we look at this, we can see we have a base URL for our REST API. This is important, and we're going to take it as well. We can see that if you hit this endpoint you get this information back, which tells you that you can query for characters, locations and episodes. I'm going to copy this and paste it into my browser, and we get back the same information we saw right here. So we can interrogate this as if we were working with a website, although we're actually asking the API for bits of information. I'm just going to make this bigger so we can see what the main response looks like.

Before we go ahead and start working with the code, I'm going to run through this response and explain what all these bits are. It's really important, because when you look through the API documentation it will give you a response and a schema of the data that is coming back. We can see that we have two main things here: the results, and this info thing at the top. The info bit tells us how many results there are for the endpoint that we've queried and how
many pages there are. This is going to be really important information as we go forward, and it also gives us the URL to the next page. You don't always get this, but there will usually be some kind of total number of results and number of pages. Some APIs do it by giving you the next load of results, so the first would be 1 to 100, then the second would be 100 onwards, etcetera. In this case we get a total count and a number of pages.

We can collapse that, and now we can see that we get a list of results. If I collapse that, my browser is telling me there are 20 items, and we can see that we have all 20 here. Each one has 12 specific bits of information in it: an id, a name, etc. Let's say that we want to compile our own list, for whatever reason, of characters from this TV series. We're going to take the name and then maybe the list of episodes. That might be quite long, but we'll see when we get there.

Now we need to go back to the documentation. We can see what I just explained here, the info block and so on, and then the character endpoint, which is what we just looked at. It says you can access different pages with the page parameter, so that's really important. We saw that in the next link when we looked under info: it says page is equal to 2. That's a really common way of doing it. You can also see the character schema that I showed you, with all the information you can get out.

We're going to do "get all characters", so I'm going to copy this URL here and go to our code. To make requests to a server using Python we need to use the requests module. If you don't have this installed you can do pip install requests.
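The response layout described above (an info block plus a list of results) can be sketched offline with a hand-written sample. This is only an illustration: the key names are the ones mentioned in the walkthrough, while every value, including the URL in the next field, is a placeholder assumption rather than data copied from the live API.

```python
# Hand-written stand-in for one page of the character endpoint.
# Key names follow the walkthrough; all values are illustrative assumptions.
sample_page = {
    "info": {
        "count": 671,  # total number of results for the endpoint
        "pages": 34,   # total number of pages
        "next": "https://rickandmortyapi.com/api/character?page=2",  # assumed URL form
    },
    "results": [
        {"id": 1, "name": "Example Character", "episode": ["ep-url-1", "ep-url-2"]},
        {"id": 2, "name": "Another Character", "episode": ["ep-url-1"]},
    ],
}

info = sample_page["info"]
first = sample_page["results"][0]

print(info["pages"])          # how many pages to request in total
print(first["name"])          # a single field from the first result
print(len(first["episode"]))  # number of episodes for that character
```

Working against a saved sample like this is also a way to explore the structure without sending repeated requests to the server.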
You can google that and you will find it. That is the main one that we're going to use; even the main Python documentation recommends requests. I already have this installed, so I'm just going to import requests up here at the top. Then I'm going to set my base URL to this, but I'm going to remove the character part, because this is the base URL that we're going to use, and then I'm going to say that our endpoint is equal to character. Notice that I've left the trailing slash on our base URL.

What I'm going to do with this is make a request to this API with our code. I'm going to say r is equal to requests, and we're making a GET request, so we do dot get, and then our base URL plus the endpoint. Because these are both strings we can just use the plus symbol to concatenate them. Then I'm going to print out r. When I run this we get back a response of 200, which is a good response. If you're getting something other than that, a 400 is a bad request, a 404 means it's not there, and a 500 is a server-side error.

But we don't want the response code, we want the information that's in that response, and the easiest way to get it is r.json(). This is going to give us the JSON response back from that API. I run that in the terminal and we can see we get a load of information back. This is exactly what I showed you in the browser, except now we can actually do stuff with this response: we can take this information out and get the bits that we want.

So I'm going to say data is equal to r.json(), and then print data. We can access the keys using square brackets, like you do with a dictionary, and I'm going to type info. Let's see what we get back. Now I've returned just the information part of the response. Now what we
want to do here is grab the number of pages, because we want all of the characters from this API, so we need to know how many pages there are. You could grab the next page link every time you go through and just do a request on that. You could absolutely do that, but generally I like to make one request at the start to know what we're dealing with and then do it that way. So let's do pages: again we're going to reference the key, pages, and this returns the integer 34. We can save this into a variable called pages.

Now we want to work out which of the other bits of information we want. If we come back to the documentation, it shows us up here that we get results, and this is a list. So I'm going to come back and say print data results, and I'm going to ask for the first item in that list. Let's run that, and we can see that we get this information back. Within here we have the id and the name, etc., some other keys that we can access, and then all of the episode information.

Let's print out the name. We're going to stick with the first item in the list, because we're just working out what information we want, and I'm going to say name. Let's print that. OK, so we get the name; let's save that, so name is equal to this. I'm going to copy that and say episodes. If we look for the key up here, we can see we have episode, and it returns a list. So I'm just accessing different parts of the JSON data. Let's remove that so it's a bit clearer, and print out episodes. There we go, there's our list back.

It's worth mentioning at this point that if you're new to this and you're still trying to
work out how to interrogate the JSON data properly, don't keep sending requests to the server. Go ahead and save the response: you can just copy and paste it out into a JSON file and save it to your hard drive, so you can work out how you want to get that information out. Otherwise you're just sending unnecessary requests to the server. I'm pretty confident doing this, and I've only done five or six requests at this point, so it's not a big deal.

What we're going to do for this is say we want to know how many episodes this character is in. Instead of printing out all of them, I'm going to ask for the length of episodes, and that's going to give us how many there are. We can see there's 41, so this character is in 41 episodes.

Now we've worked out what bits of information we want to get, we can go ahead and start to write our code out properly, so we make the right number of requests to the API, get back the information that we actually want, and then put it into a nice list. So we get the total number of characters and how many episodes they've been in. Again, this is just demo information, but you'll get the idea.

The first thing I'm going to do is start writing some functions. Up here we're going to say def, defining our new function, and call it main_request. We need to give this the base URL, and let's pass in the endpoint as well. This just means that we could use it to go to the other endpoints of the API if we wanted to. Now we're going to put our r is equal to requests.get in here, and then return r.json(), the JSON response. We can get rid of this, and I'm going to move this up to the top so it's out of the way. Now instead of data is equal to this, we want to do data is equal to main
request, and because we've called these base URL and endpoint anyway, we can just copy that out there. Now if we run this we should get the same results back. Let's print name just to check that our function is working, which it is.

Now we want to work out how many pages there are that we need to loop through. We're going to say def for our new function, get_pages, and we need to pass in the JSON response, so we'll call the argument response. As we worked out earlier from info and pages, we can copy this out and say pages is equal to that. We just need to change it ever so slightly, because we're working with the name response and not data within our function, and then we can return it back out. You can do it like that, or you can make it easier and put the whole thing on one line. There we go. Now after we do our main request, if we call get_pages on the data, print that out, remove this print statement from down here, and run this, we should see 34.
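The two helper functions described so far can be sketched as below. This is a minimal sketch and not the exact code from the video: the extra get argument (any callable shaped like requests.get), the raise_for_status call, the FakeResponse class, the fake_get function and the example.invalid URL are all assumptions added here so the example runs without network access.

```python
def main_request(base_url, endpoint, get):
    """Request one endpoint and return the decoded JSON body.

    `get` is any callable with the shape of requests.get; injecting it is an
    assumption of this sketch so the function can be tried offline.
    """
    response = get(base_url + endpoint)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
    return response.json()


def get_pages(response):
    """Return the total page count reported in the info block."""
    return response["info"]["pages"]


class FakeResponse:
    """Offline stand-in for a response object (hypothetical helper)."""

    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        pass  # the fake never represents an error status

    def json(self):
        return self._payload


def fake_get(url):
    # Illustrative payload; the numbers are the ones quoted in the walkthrough.
    return FakeResponse({"info": {"count": 671, "pages": 34}, "results": []})


data = main_request("https://example.invalid/api/", "character", fake_get)
print(get_pages(data))
```

With the requests package installed, passing requests.get as the third argument together with the real base URL would perform the actual call.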
There we go. So now we're making our request and we can work out how many pages we need. What we're going to do now is write a function to work through the character information on each response that we give it. Let's say def for function, and we'll call this one parse_json, and we're going to give it the response again. We want to put these two lines in here, since this is where we found the information, so I'm going to remove them from down here. But we need to make this into a loop, because here we're indexing the first item in the list. So we say for item in response, accessing the results key, which we've got here, so we can remove this part. We want to print item, referencing the name, and then the episode, but we want the length, because we wanted to know how long that list of episodes was. I've got too many things going on there. There we go, so I can get rid of these now. For the moment I'm just going to return from this function so I don't get any errors, and I'm missing a bracket, so let's put that in.

Now we can call our parse_json function, and we need to give it the response, which in this case is data, because the response is coming from the main_request function. We have put a print statement in here temporarily just to check that it works; if it doesn't, we can tidy up our errors before we crack on. There we go. It looks like we've got our 34, which we are printing out from get_pages, and we have a list of characters and the number of episodes that they've been in. So we're halfway there. Let's get rid of that.

Now we want to save this parse_json part into a dictionary so we can actually do something with the data. Instead of printing these two things out, I'm going to put them on separate lines, just like this, and we're going to say our character, or we'll just call
it char for now, and we're going to create a dictionary that includes these two items. I'm just going to indent that in there. We're going to say the key is name, and that is equal to the item name, and then the number of episodes is equal to that there. We were missing our bracket. There, we have our character dictionary populated by these fields, but we need to add them all to a list so we can return the whole list from this function. So at the top here I'm going to create a list, call it character list, and here we do character list dot append with the character that we are adding. Then I'm going to return the character list out of the whole function. Now if I print out what comes out of this function, just to see if it's right, hopefully it is. There we go. Now we have a list of names and the number of episodes that they've been in, just like we had before, but in a more useful format.

Now we've got that working, we want to work with the pagination, which is why we have this get_pages function. Let's collapse these ones out of the way. We know that our main request works here, but if we look at the documentation again we can see that it has this question mark and then page is equal to 20 at the end. I'm going to copy that. What that will do is tell the request which page to work with. I'm going to add that in at the top here: we need to give this a number, which I'm going to call x, and we do plus, and then page is equal to x, like this, and we make this an f-string so we can add that all together. When we make our main request now, we're going to start with the number one. Let's check that that works: we get the same information back. Now if we go to number two, we can see we've got some different information, which means our pagination is working. So we can just use this main_request function. Once we do the
first one on page one to get how many pages there are and parse that part of the data, what we can do is say for x in range. Let's get rid of this, well, not get_pages, let's just get rid of the print part. So we can say for x in range, pass in our get_pages of data there, and print x. If I just get rid of this for the moment, we can see it goes to 33, but we actually have 34 pages, and it starts at zero. We just need to shift that over, so I'm going to say for x in range, starting at 1, to get_pages of data plus 1. If we now print x we should get 1 to 34, as opposed to what we had before, 0 to 33. There we go, 1 to 34, so those are going to be our page numbers.

Now we want to use our main_request function, which is the one that returns the JSON data. I'm going to copy that out and put it in here. I'm going to leave the print statement in so we can see the pages go by, but we're going to do main_request of the base URL and the endpoint, which are both up here, and then x, which is the page number. We could store this into a variable, but we actually want to parse the information from it, so I'm going to grab our parse_json function, bung that in, and put that there.

Now we want to store all the information that comes out. We get a list out of this, so we can have a new main list, probably not the greatest name, and put that in here. What we can do is use extend, which is basically going to add everything that comes out of that list into this new list as we go through each time. Now if I do print, not inside my for loop but out of it, let's print the length of the main list. This is a good way to find out if you have the right number of records; I think it was 671 we were looking for. We can see this is the request we're making here to
the server, and we need that to go up to 34 pages. Hopefully 671. So we've got the right number of results, and I'm happy with that.

Now we can export this to a CSV file, so let's import pandas. Pandas is basically a really powerful data science library, and using it just to create a data frame and a CSV file is possibly overkill. However, it works really well and I'm happy with it. So I'm going to import pandas as pd; again, pip install pandas if you need to. Instead of printing the length of the main list, I'm going to say df, for data frame, is equal to pd.DataFrame of the main list. Make sure you call your data frame something useful when you're writing code in an actual project; don't just write df, because it'll be confusing. Then I'm going to print df.head() and then df.tail(), just to double check that the top and the bottom of our data are different and we haven't duplicated it or something like that, and then we can work on exporting it. I have actually got 670 here, so counting from zero it would be 671. This looks like a duplicate, but if we go back to our actual response, and this is page 34,
we can see that we do have two lots of the New Improved Galactic Federation Guard. To be honest, we should probably stick the id in there as well; that would make a lot more sense, so I'm going to do that. Up here in our parse_json we can put the id in. I'm going to make a new line in our dictionary and say id is equal to item id, because the key was the id key; we can see it there.

Now I've added that in, instead of printing this out I'm going to do df.to_csv. This is why I use pandas, because it's so easy to do this. We're going to call this character list dot csv, and when working with pandas I always tend to set index equal to False, because generally I don't want the pandas index, which is the zero-based index down the side. I just want it to look like you would expect it to. I'm going to save that and run this again, and this will be the last request I make to your API, I promise. We will have our CSV file back, with all the characters that have ever appeared in this show and how many episodes they've appeared in.

Let's click on that and close this so we can see. Now we have our id, our name and the number of episodes, all the way down to all these characters who only appeared once, including our New Improved Galactic Federation Guards, who are the newest characters if the id number is anything to go by, which I'm sure it is.

That's going to do it for this video, guys. Hopefully you've got a good understanding now of how you can start to work with an API in Python and make your own queries. There's a lot more to it than this; we could filter, for example, but I'm not going to cover that in this episode. Maybe I'll cover it in future ones. If you've enjoyed what you've seen, please consider dropping a like, leaving me a comment, or subscribing to the channel. I've got lots of stuff like this with Python, I've got lots of web scraping
content, and we're moving into some app building content, etc. So if you've enjoyed that, hit that subscribe button. Until then, thank you very much, guys, and I will see you in the next one. Goodbye.
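The whole flow from the walkthrough (parse each page, loop over the page numbers, export the rows) can be sketched end to end as follows. This is a sketch under stated assumptions, not the code from the video: the fetch_page callable, the collect_all and to_csv_text names, the no_ep column name and the fake_pages data are all hypothetical. It also differs in two ways that are worth naming plainly: it reuses the first response instead of requesting page 1 twice, and it writes the CSV with the standard library csv module rather than pandas.

```python
import csv
import io


def parse_json(response):
    """Reduce one page of results to id, name and episode count."""
    char_list = []
    for item in response["results"]:
        char_list.append({
            "id": item["id"],
            "name": item["name"],
            "no_ep": len(item["episode"]),  # column name is an assumption
        })
    return char_list


def collect_all(fetch_page):
    """Fetch page 1 to learn the page count, then walk the remaining pages.

    `fetch_page(n)` is any callable returning the decoded JSON for page n.
    """
    first = fetch_page(1)
    pages = first["info"]["pages"]
    main_list = parse_json(first)
    # Pages are numbered from 1, so the range has to stop at pages + 1.
    for x in range(2, pages + 1):
        main_list.extend(parse_json(fetch_page(x)))
    return main_list


def to_csv_text(rows):
    """Serialise the rows to CSV text without an index column."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id", "name", "no_ep"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Two hand-written pages standing in for the API (illustrative data only).
fake_pages = {
    1: {"info": {"pages": 2}, "results": [{"id": 1, "name": "A", "episode": ["e1", "e2"]}]},
    2: {"info": {"pages": 2}, "results": [{"id": 2, "name": "B", "episode": ["e1"]}]},
}

rows = collect_all(fake_pages.__getitem__)
print(to_csv_text(rows))
```

Against the live API, fetch_page could wrap requests.get with the page number added as the page query parameter, and the pandas route described in the video would replace to_csv_text with pd.DataFrame(rows).to_csv(..., index=False).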
Info
Channel: John Watson Rooney
Views: 95,758
Keywords: working with apis in python, wokring with apis, python api, learn python, rest api python, rest api explained, get data from api, api pagination, api pagination python, api pagination example
Id: -oPuGc05Lxs
Length: 22min 35sec (1355 seconds)
Published: Sun Jun 20 2021