Scrape Data from Google Maps (in 2024)

Video Statistics and Information

Captions
In today's video you are going to learn how to scrape leads from Google Maps in 2024 with a premium web scraping tool named Octoparse. I made the same video a year ago, and although the big principles remain the same, the video is overall outdated, so it's time for a quick recap. The link to download Octoparse is in the description. Unfortunately, this tutorial has two limitations: first, you can scrape phone numbers but not email addresses; secondly, you can only scrape 120 companies per search. If you want to scrape Google Maps at scale, with emails, I suggest you use Scrap.io: you can get a CSV or Excel file within a couple of clicks. You will also find the link in the description.

So, ladies and gentlemen, we are going to scrape restaurants in Dublin. Please note that this tutorial will only work if you have selected English (United States) as a language. If you want to change your language, you click here, Language, and you select English (United States). I mean, it will work as well if you have selected another language, but the point is that this video will be based on a lot of formulas, and you can find them all in the description. These formulas are called XPaths. I've made an entire video about the topic, but there are two things to remember about XPaths. The first is that XPaths are really powerful, because they help you scrape your data in a more accurate way. The second is that XPaths can change over time, so if these XPaths are no longer valid, that's fine: you can still take a look at my video and write your own, or you can do a point-and-click; we will see how to do that in this video as well.

All right, so here is your URL: restaurants in Dublin. Obviously you can choose another search if you want to. I copy my URL, I jump into Octoparse, I paste it here on the homepage, and I click on Start. I get my task. The first thing I have to do is remove this popup, because I don't have access to Google Maps, not yet. What I have to do is turn on browse mode. I will change the language to English
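To make the XPath idea concrete outside of Octoparse, here is a minimal sketch using only Python's standard library. The markup and class names below are hypothetical, simplified stand-ins; real Google Maps markup is different and changes over time, which is exactly why the video warns that XPaths can go stale.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified listing markup -- real Google Maps class
# names differ and change over time (the reason stale XPaths break).
html = """
<div>
  <div class="listing"><a href="/place/1"><span class="title">Cafe A</span></a></div>
  <div class="listing"><a href="/place/2"><span class="title">Cafe B</span></a></div>
</div>
"""

root = ET.fromstring(html)

# ElementTree supports a useful subset of XPath syntax:
titles = [s.text for s in root.findall(".//span[@class='title']")]
links = [a.get("href") for a in root.findall(".//div[@class='listing']/a")]

print(titles)  # ['Cafe A', 'Cafe B']
print(links)   # ['/place/1', '/place/2']
```

If the site renames the `title` class, the first XPath silently returns nothing, which is the failure mode the video describes and the reason point-and-click selection is offered as a fallback.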
and I click on Reject All. There we go. As I've said, for the example I will change my language to English, and to make sure that the popup won't appear anymore, I have to save the cookies. Therefore I go to Options, Use Cookie, Use cookie from the current page, and I click on Apply. The only thing that will not change is the language, but that's fine, we will manage it afterwards.

Let's start creating our task. What do we have to do first? We have to scroll down to the bottom of the page as many times as possible, because we've got like three, maybe five elements so far, but if I scroll down again and again and again, we can get up to 120 companies. Why 120 companies and not more? It's simply due to a Google Maps limitation. Something interesting as well is that we have to scroll down to the bottom of the page for this specific part of the screen only, and not the main page; we will have to take that into account as well.

Let's come back to Octoparse and create another element. I add a step. What do I have to add? A loop. I add a loop, I click on my loop, I rename it "scroll", and as a loop mode I select Scroll Page. For the scroll area, I can choose between Default and Partial. What have we said previously? We want a partial scroll area, this part only, and I have to select my XPath. If the XPath is no longer valid, you can do a point-and-click, meaning you do something like this; but this part is pretty hard to do for the scroll area, so it might be better to write the XPath for that one. I have my formula, I copy it, I paste it here, and I click on Apply. I scroll down a little bit more, and I can choose between scrolling to the bottom of the page or for one screen, meaning one screen at a time. I suggest you select "for one screen", even if "to the bottom of the page" can work to some extent. I repeat the scrolling process a thousand times; I do not care, as long as I keep this box checked: end the loop when there is no more content to load. As a waiting time, 3 seconds is good. I click
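The scroll loop described above has a simple logic: repeat up to a large cap, but stop early once a scroll reveals no new results. A minimal sketch of that control flow, with a simulated page loader standing in for the browser (the function names and the 20-results-per-scroll figure are assumptions for illustration; only the 120-result cap comes from the video):

```python
def scroll_until_exhausted(load_more, max_scrolls=1000):
    """Scroll one screen at a time; stop early when nothing new loads.

    load_more() simulates one scroll step and returns the number of
    items now visible. This mirrors Octoparse's "end the loop when
    there is no more content to load" option with a high repeat cap.
    """
    seen = -1
    for _ in range(max_scrolls):
        count = load_more()
        if count == seen:      # no new content appeared -> stop early
            return count
        seen = count
    return seen

# Simulated results panel: each scroll reveals up to 20 more results,
# capped at 120 (the Google Maps limitation mentioned in the video).
items = []
def fake_load():
    items.extend(range(min(20, 120 - len(items))))
    return len(items)

print(scroll_until_exhausted(fake_load))  # 120
```

Setting the repeat count to a thousand is harmless here for the same reason it is in Octoparse: the early-exit check fires long before the cap is reached.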
on Apply, and I'm going to test whether my scrolling process works or if there is any mistake I have to correct. I click on Run, Standard Mode, and we will take a look. I click on Pause because I have to change my language: I click here, Language, and English. I click on zero and I can resume. If I look at the browser... let's see... the scrolling process works fine, so I can stop my task.

Then, once we have reached the bottom of the page, we can select all elements at once. I will add another step, another loop, but this time it will be a loop item. As a loop mode I select Variable List, because I have a list of items and it is a variable list, as simple as that. Same thing as before: I can do a point-and-click, but I do not like this way. I've got my XPath, I paste it here, I click on Apply, I click on a blank space, and I click on my loop item one more time in order to see if everything works fine. So far I've got four items; that's okay, because we didn't scroll down to the bottom of the page yet. I go to Options and I will wait for 1 second.

Okay, what do we have to do now? We have to click on each item in order to extract data from the detail pages. In other words, we have to do something like this: I click on the first element and extract data from it, then I click on the second element (it will automatically be done thanks to our loop item), then the third, then the fourth; you get the idea. So I add a step and I add a Click Item element. I select "relative XPath to the loop item", because my XPath ends with an "a" tag, and an "a" tag in an HTML document implies a URL, so I click on each URL relative to each element. I go to Options, I wait for one second each time, and I load the page with AJAX; I add a 10 seconds timeout. It's only a maximum: if the page is loaded before 10 seconds, it will scrape the data right away. I click on Apply, and to see if that works, I click on Loop Item and on Click Item. It does work. As you can see, it's displayed in another way, but it doesn't matter, because at the end we
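"Relative XPath to the loop item" means the inner XPath is evaluated against the current list element, not the whole page. A small stdlib sketch of that two-level pattern, again with hypothetical markup (real Google Maps class names differ):

```python
import xml.etree.ElementTree as ET

# Hypothetical results panel: each listing contains its own <a> link.
page = """
<div>
  <div class="listing"><a href="/place/1">Cafe A</a></div>
  <div class="listing"><a href="/place/2">Cafe B</a></div>
  <div class="listing"><a href="/place/3">Cafe C</a></div>
</div>
"""

root = ET.fromstring(page)
hrefs = []
# Outer loop: the "variable list" of loop items.
for item in root.findall(".//div[@class='listing']"):
    # Inner query is relative to this item only ("./"), which is what
    # Octoparse's "relative XPath to the loop item" option expresses.
    link = item.find("./a")
    hrefs.append(link.get("href"))

print(hrefs)  # ['/place/1', '/place/2', '/place/3']
```

Each collected URL is the detail page the workflow clicks through to; the per-click wait and the 10-second AJAX timeout are simply upper bounds on how long to let that navigation take.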
will get this layout. All right, we can extract our data now. I add a step and I add an Extract Data step. I will wait for 7 seconds, I click on Apply, and what are we going to extract? The first thing I'm going to extract is the URL of the page, so I click on Add Custom Field, Page-level Data, and Page URL. I uncheck "extract data in the loop", and I click on Apply in order to see my URL. What do we have next? The title, the rating, the number of reviews, the category, the address, the website, the phone number, the number of photos, and the opening hours. I'm going to show you how to do it for the first one, and we will do the opening hours together at the end, because that one is really interesting.

How can I extract my title? Same thing as before: I can do a point-and-click, just like this, or I can click on Add Custom Field, Capture Data on the Page. I give a name to my column, I click on Absolute XPath, I paste my XPath, I click on Confirm, and I've got my title. Something which is good to do as well (it's not mandatory) is to remove any white spaces before and after the text. Sometimes there are, sometimes there aren't, so just in case, it's better to remove them: I click on Clean Data, Add Step, Trim Spaces, Trim Both, Confirm, and Apply. I'm going to redo the same process for the rating, the reviews, the category, the address, the website, the phone number and the number of photos, and I'll see you back once it's over. [Music]

As promised, we will now take a look at how to scrape the opening hours. This part might sound a bit difficult, particularly if it is the first time you have ever heard of web scraping, so I will try to explain it in simple terms. I am on Google Chrome, which means I can use a Google Chrome extension called XPath Helper; it's a free one. I click on it, and the XPath to scrape the opening hours is this one. Of course you can pick another one if you want to, but in my opinion it might be the best one, and I'm going to explain it to you in a minute. I copy it, I paste it, and I've got
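The "Trim Spaces / Trim Both" cleaning step has a direct one-line equivalent in Python: strip leading and trailing whitespace from every captured field. The field values below are made-up examples, not data from the video:

```python
# Hypothetical raw capture: scraped text often carries stray
# leading/trailing whitespace, sometimes it doesn't -- trim anyway.
raw = {
    "title": "  Example Restaurant ",
    "phone": " +353 1 234 5678  ",
    "category": "Irish restaurant",   # already clean; strip() is a no-op
}

# Equivalent of Octoparse's Clean Data -> Trim Spaces -> Trim Both:
clean = {field: value.strip() for field, value in raw.items()}

print(clean["title"])  # 'Example Restaurant'
```

Trimming unconditionally is safe precisely because `strip()` leaves already-clean values untouched, which is why the video recommends applying it "just in case".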
zero results. So let's take a look at the opening hours... and now I've got 56 elements. It is a lot, and it entails that at some point we will have to combine all of these items into a single cell; we will have to merge them. Well, I've got my XPath, I come back to Octoparse, and I'm going to create another loop: a loop in a loop. I add a 1 second timeout and I select Variable List as the loop mode. I insert my XPath here. Let's see if it worked... it worked: we have 14 items. And once I'm here, I'm going to extract data, of course, and I keep "extract data in the loop" checked this time. I add a custom field, Capture Data on the Page, and I do not insert any XPath, but I keep "relative XPath to the loop item" checked. I click on Confirm, and I've got 14 lines of data. If I want to merge them, I click on More and Merge Field Data, so all of these data rows will be merged into a single one.

And that's it, I think we can run our task. I click on Standard Mode and I'll see you back once it's over. I said we can get up to 120 data rows, and we got 120 data rows. I can export them in CSV or Excel format, and here is what it should look like.

This is the end of the video. I hope you have enjoyed it; if that's the case, you can give a thumbs up and subscribe to the channel. And if you want to scrape Google Maps in an easier way, you can take a look at Scrap.io. I work with them, and to my mind it is the best Google Maps web scraping tool on the market. The link remains in the description. See you next time!
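The "Merge Field Data" step collapses the nested loop's many rows into one cell, after which the task exports to CSV. A stdlib sketch of both steps; the opening-hours values and column names are invented examples, and the `"; "` separator is an assumption (Octoparse chooses its own delimiter):

```python
import csv
import io

# Hypothetical rows produced by the inner (nested) loop, one per
# opening-hours fragment on the detail page:
hours = ["Monday 12:00-22:00", "Tuesday 12:00-22:00", "Wednesday 12:00-23:00"]

# Equivalent of "Merge Field Data": collapse the rows into one cell.
merged = "; ".join(hours)

# Export a single merged record the way the final CSV export would.
record = {"title": "Example Restaurant", "opening_hours": merged}
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)

print(buffer.getvalue())
```

Merging before export matters because one company should occupy one row in the spreadsheet; without it, each opening-hours fragment would land on its own line.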
Info
Channel: François from Octoparse
Views: 6,818
Id: jB7QtbCBc6k
Length: 13min 49sec (829 seconds)
Published: Fri Jan 12 2024