Web Scraping Weather Data with Python

Captions
Okay, so in this video I'm going to show you how to scrape the weather data from Google using Python. This is going to be a beginner-level tutorial, and at the end I'm going to explain why you might not want to do it this way and might want to do something else. But for this case, let's get started.

I'm going to be using requests-html for this. If you don't have it, you can pip install it. I'm going to do from requests_html import HTMLSession. This gives us access to the session object, which we can use to create our requests. Just note that when you pip install this it's a dash, not an underscore, but it's an underscore when you import it.

The next thing we're going to do is create our session, so I'm just going to do s = HTMLSession(), with the brackets on the end. Then we're going to set up our query string, which I'm going to leave blank for now, and our url, which I'm also going to leave blank and which we'll fill in in just a minute.

Now I'm going to construct the request, so we're going to do r = s.get, because we're using s, our session object, here. Let's give it the url, and we also need to specify some headers. Headers can be passed in directly as a dictionary like this, or you can create a dictionary separately and then put its name in here. I'm only going to be adding one header, so I'm just going to do it this way. This is going to be the User-Agent, like this, and it's going to get us past the Google "are you a bot?" thing. So what we're going to do now is swap over to the browser, and I'm just going to search up here for "my user agent". Let's grab this whole string, move back to our code and paste the whole thing in there. I just need to tidy it up, and there we go.

Now we're going to grab the url from the page we're looking at. What we need is just the search bit, as you'll see in just a second; we don't need the rest of the stuff around the url. We just need this part here, so up to the q, and then these are the search terms that we're going to be putting in. But what we want to do is put our query string into our url, so I'm going to turn the whole url into an f-string, which lets us put whatever we have in our variable into this string here. I'm going to remove the word "london", put our curly brackets here, and then put the word query inside them. So now when we run this, whatever word is in query gets put in here. It's just a nice, easy way to put different things into urls.

So now we want to actually think about running this. Let's go ahead and print out r.html.find, and let's do the title, and we'll see if we get something back. I'm going to put london back in the query for now. Let's run this, and we get our title element down here. It's important to note that when you're using requests-html and you use html.find, it only returns the element, and then you can do something with that element. You'll also notice it has square brackets around it. That's because it always returns a list, even if there's only one element in there. So what you want to do, after your CSS selector, which is what this is, is add first=True. That stops it returning a list, and we just get the actual element that we wanted down here. Now we can add .text at the end, and this time it gives us the text within that element, so we should see that appear right here. This would be the title of the page.

But we don't actually want the title, we want the actual weather data. So let's go back to the page, have a look at the actual elements, and see which ones we want for the bits of information we're after. What we've done is just clicked on Inspect, and we can click on this little selector tool up here and then hover over the piece of information that we want. If you look here, you'll see that the class names on these element tags are a bit odd. That's quite common. To me, without doing any further looking into it, that might mean these are dynamic and might change over time, and that leads into the conversation we're going to have at the end once we've done this.

What I want to do is grab the 17 from this element. So what I'm going to look for is... okay, we have a class, and we have this id as well. So it's a span tag with this id here. I'm just going to grab that id, come back to our code, and change this to span, and then an id is written with a hash, so let's paste that in there. Now if I run this we might get 17 back, we might get something else. We do indeed get 17 back, so now we've got the actual number for the temperature. That's good. Let's save this into a variable, so let's just call this temp, like this, and then let's work on the rest of the information.

The next thing we want to look for is the unit of measurement here. Otherwise, 17 what? It could be, I don't know, 17 elephants warm. Without the unit it's no use. Again, let's hover over it, and here we have this tag. Now this is interesting, because we have this class of w-o-b-t, and I can see that it will match all over the place because it's used everywhere. But what we can actually do is search for this tag here, this div tag, and then use that to find the span tag. There are two ways we can do that. So I'm just going to copy the class of this div tag and come back to our code. Now I'm going to do print again, and r.html.find because we want to search for stuff. It was a div tag, and the class is written with a dot, and it has this. Now, it has a space in it; you can generally get away with filling that space with a dot. I'm going to do first=True again, and then I'm going to run this, and hopefully we should have our element back. There we go, we do.

Now we've got two options. This other element that we were looking for is a span with a class here, but it's also the first span tag underneath this div tag. So what we can do is chain the two together. I'm going to copy the class here, and if we just go here and put a space, then we're searching for something else that's underneath this div tag, which was our span with that class. So our selector is saying: find this, and then find this that's underneath it. If I run this again now, we should get our element. We do. We could see it if I could just move this up and out of the way of my head, which I can't, thank you VS Code. Anyway, it's the right element, just believe me. Now we're going to add .text, and we'll run this again, and you'll see we get the unit out there: degrees C. So far we've got the temperature and the unit of measurement, which is good. That's two bits of data.

I think we should get one more. If we go here again, we can see there's this "partly cloudy" bit, like a little description, and we have this here. Now again, this span id is a bit too vague; I think we'd hit lots of results if we searched for that. You could index through them, but what we can do instead is search for this tag here and then chain the finds together. This is a div class, so let's do that. Let's come here and do print r.html.find, and this was a div with that class, and again first=True. What we can then do is .find again, and search for this span id (which I've just deleted) within this div class tag. So it's going to search within everything in here to find this one. Chaining the finds together is quite a handy way to do things. It was a span with an id of that, and again first=True and .text. I'm just going to make this all a bit smaller, it doesn't need to be this big. There we go. So now we've chained our finds together, and if we print this we should get that element: partly cloudy. There we go. So now our description is equal to that.

So we've got our three bits of data, and we've also got our query up here. Let's change the query: let's go to Toronto and run this. I didn't print anything out, so let's change that. Let's print out the query, the temp, the unit and then the description, and now when we run this we should get our information back. So we've got Toronto, 11 degrees and sunny. Let's try somewhere else. Let's give ourselves a space there. Let's try Miami, maybe somewhere a bit warmer would be nicer. What do you reckon, 30 degrees? 28 degrees, lovely. I can tell you that's much nicer than it is in the UK at the moment, obviously.

So now we've completed this. This is how to scrape the weather data from Google, and it's nice and easy. But like I said at the beginning of the video, if you actually want constant weather data for an application, or something you're making that is for more than just yourself, you're going to want to find an actual weather API. There are loads out there, plenty of them free, that you just need to sign up for and that probably give you plenty of requests. Things like this work, and they're good for learning and good for personal projects, but I wouldn't put anything like this further out, just because it's too dependent on anything that Google might change. So definitely, if you need data reliably for an application that you're making or some other purpose, find yourself a good API. Otherwise, stuff like this is really useful.

So thank you very much for watching. If you've enjoyed this one, you're probably going to like this video here, which is more web scraping.
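
The steps described in the captions can be pulled together into one short script. What follows is only a rough sketch under stated assumptions, not the code shown in the video: the captions never spell out the search url, the User-Agent string, or the ids and class names, so every constant below is a guess and may not match Google's current markup. The helper names build_url, text_or_none and scrape_weather are made up for this example. The requests-html calls used (HTMLSession, get with a headers dictionary, find with first=True, and .text) are the ones named in the captions.

```python
from urllib.parse import quote_plus

# ASSUMPTIONS: none of these strings appear in the captions. They are
# illustrative guesses and may not match what Google serves today.
SEARCH_URL = "https://www.google.com/search?q=weather+{query}"
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"  # replace with your own browser's string
TEMP_SELECTOR = "span#wob_tm"                     # span tag with an id
UNIT_SELECTOR = "div.vk_bk.wob-unit span.wob_t"   # div with two classes, then a span under it
DESC_PARENT_SELECTOR = "div.VQF4g"                # parent div for the chained find
DESC_SELECTOR = "span#wob_dc"                     # span with an id, searched inside the parent


def build_url(query):
    """Put the search term into the url, escaping spaces and symbols."""
    return SEARCH_URL.format(query=quote_plus(query))


def text_or_none(element):
    """find(..., first=True) gives None when nothing matches, so guard .text."""
    return element.text if element is not None else None


def scrape_weather(query):
    """Fetch the results page and pull out temperature, unit and description."""
    # Imported here so the helpers above work without the library installed.
    from requests_html import HTMLSession  # pip install requests-html

    s = HTMLSession()
    r = s.get(build_url(query), headers={"User-Agent": USER_AGENT})

    temp = r.html.find(TEMP_SELECTOR, first=True)
    # One selector with a space: find the div, then the span underneath it.
    unit = r.html.find(UNIT_SELECTOR, first=True)
    # Two chained finds: find the div first, then search inside that element.
    parent = r.html.find(DESC_PARENT_SELECTOR, first=True)
    desc = parent.find(DESC_SELECTOR, first=True) if parent is not None else None

    return {
        "query": query,
        "temp": text_or_none(temp),
        "unit": text_or_none(unit),
        "desc": text_or_none(desc),
    }
```

Calling scrape_weather("toronto") and printing the result would be the equivalent of the final print in the video. Any value that comes back as None means the assumed selector no longer matches, which is exactly the fragility the video warns about at the end.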
Info
Channel: John Watson Rooney
Views: 23,627
Keywords: web scraping weather data python, pytohn web scraping, web scraping, web scrapping, python tutorial, python, python beginner tutorial, beginner web scraping, python web scraping, requests-html, html parsing, html scraping
Id: cta1yCb3vA8
Length: 10min 47sec (647 seconds)
Published: Sun Oct 17 2021