API Endpoints? Get data from the web easily with PYTHON

Video Statistics and Information

Captions
Hi everyone, welcome, John here. In today's video we're going to be looking at another way of scraping data from the web. In previous videos we've looked at getting the HTML and parsing that, and we've also looked at rendering the page in different ways, but what we're going to do today is try and go directly to the API source, at the endpoint, and get the information from there.

A lot of modern websites work like this: when the page loads, it sends off a request to the API for the information, then picks that up and renders it on the page for you to see. So what we're going to be doing is going to the endpoint, which is basically the end of the communication channel from the API. The API takes the request from the website and then sends the requested data back to it. What we're going to do is try and get in the middle of that: we're going to send our own request off to that API to get that information back.

I've got an example up here, and this is the FIH Pro League hockey. We can see here that we've got a lot of fixtures; some have scores and some don't, and you can see that unfortunately some have been cancelled. If we go to Inspect Element, then go to the Network tab and click on XHR, which is where most of these requests will be, and then reload the page, we should be able to find out where this information is coming from and how we can replicate that request programmatically to get that data.

If we look through all of this, we can see the type, and it's JSON data, which is very good, very useful for us. I can see here there is one that is a POST request, and it's a get match stats list, statistics list, and we can see under the Response part of the Network tab that we actually have all this information here. If I scroll back to the top, we see the first game is China v Netherlands and it was 0-3. We can see here the current team's score is 0 and the vs. team's got 3, and the teams were China and the Netherlands. So this is exactly what we were hoping to find: this is where the website has sent the request off to the server, and the server has used the API to send it back to the website as JSON data. Now we really can work with this, and we can get this information back ourselves programmatically.

To do that we need to use an external program. I use Insomnia on Linux, but you can use Postman on Windows; that works really well as well. What you want to do is find the request that we just used to find this data, right-click, and then choose Copy as cURL. Then go off to Postman or Insomnia, whichever API program you want to use, and create a new request. I think this was a POST request, but I don't think this really matters. Let's click Create, and up here I'm going to paste the data in, and then I'm going to hit Send, and hopefully we'll get exactly what we just saw in Inspect Element back in our API program. If you're using Postman, you just need to go to Import and then paste the raw data in; I think it's called Import Raw Data. Paste in what we copied from the website.

We can see right away here that we have exactly the same JSON data; we've managed to get a response back. If we look under the headers that we sent, there aren't too many that are interesting here, but depending on what you're doing you might find that there are more interesting headers that you could manipulate or change and then send the request off again. Maybe it's got a number of returned results, and you could change that to a higher number to get more data back, or something similar. This one doesn't seem to; it just seems to send all of the information back to us.

What we can do is go ahead and click on Generate Code. I'm not entirely sure what this is called in Postman, but it's very similar. I'm just going to copy this and move over to our text editor, Sublime. We're going to paste that in and save it, and we can see here that it's copied the payload, the headers, the URL, everything for us. If we go ahead and run that, hopefully we get back the JSON data. There we go, that's it right there: all of that information, all these scores. We can see the first one here.

You could then take this information and store it in a variable, pull out exactly what you wanted, or whatever you wanted to do with it. You could create your own dataset by putting it into a database like we talked about in a different video. But basically, with this one I wanted to show you another way, where you could go directly to the API endpoint and get back the data you're interested in from a website. It's definitely worth exploring these different methods if you're trying to scrape data. Depending on the website and how it works, you may have more success with other methods than with this one, but this is definitely worth knowing how to do. You saw how simple that was for me to find that and then go ahead and get all of that information.

So hopefully you guys have found this useful. That's another quick video. Subscribe to the channel if you're interested and you want to see more of this sort of thing, and drop me a like or a comment if you want to find out more. Great, thanks a lot guys, cheers, bye.
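The generated code from the video is not shown in the captions, so the following is only a minimal sketch of what replaying such a request in Python might look like. It uses the standard library rather than whatever the API program generated. The URL, payload, headers, and the JSON field names (`home_team`, `away_team`, `home_score`, `away_score`) are all hypothetical placeholders; the real values would come from the request copied out of the browser's Network tab. Only the sample result (China v Netherlands, 0-3) is taken from the video.

```python
import json
import urllib.request

# Hypothetical values: replace with the URL, payload, and headers
# copied from the request in the browser's Network tab.
API_URL = "https://example.com/api/match-statistics-list"
PAYLOAD = {"competition": "placeholder"}
HEADERS = {
    "Content-Type": "application/json",
    "User-Agent": "Mozilla/5.0",
}


def build_request(url, payload, headers):
    """Build a POST request carrying the payload as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_scores(matches):
    """Pull (home, away, home_score, away_score) out of each match dict.

    The key names are assumed for illustration, not the real API's.
    """
    return [
        (m["home_team"], m["away_team"], m["home_score"], m["away_score"])
        for m in matches
    ]


# Offline sample shaped like the assumed response, so the parsing
# step can be tried without touching the network.
SAMPLE = [
    {"home_team": "China", "away_team": "Netherlands",
     "home_score": 0, "away_score": 3},
]

if __name__ == "__main__":
    for row in extract_scores(SAMPLE):
        print(row)
    # With real values filled in above, the live call would be:
    # data = fetch_json(build_request(API_URL, PAYLOAD, HEADERS))
```

Keeping the request-building, fetching, and parsing steps as separate functions means the parsing can be checked against a saved response before any live request is sent.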
Info
Channel: John Watson Rooney
Views: 9,032
Rating: 4.9849625 out of 5
Keywords: learn python, api endpoint, web scraping with python, web scraping
Id: uRlik_-puEw
Length: 5min 38sec (338 seconds)
Published: Thu Mar 26 2020