How to Extract Multiple Web Pages by Using Google Chrome Web Scraper Extension

Video Statistics and Information

Captions
Hello viewers, this is Azharul Rafy. Once again, welcome to this new video. I'm going to show you how to scrape web pages by using the Google Chrome Web Scraper extension. Here is the extension I'm going to use, and I will add the link to it in the description of the video. First I have to add this extension to my Chrome browser, so I'm going to click on "Add to Chrome", then "Add extension", and here it is: Web Scraper has been added to Chrome.

Now it's time to visit the web page I want to scrape. I am going to yelp.com to extract some business data. Here I am going to extract dentists, so let's search for "dentists" and switch the location from San Francisco to Chicago. Then I have to right-click anywhere on the web page and find "Inspect element", or you can use the keyboard shortcut Ctrl+Shift+I. I am going to click on Inspect, and here I have got the developer view of this web page. Now I can see the Web Scraper tool that I have just installed. I have to click on Web Scraper, and let's make the panel a bit bigger.

After clicking on Web Scraper, you have to click on "Create new sitemap" to create a sitemap for this web page, that is, for the resource link to be extracted. So I'm going to click on "Create new sitemap", then "Create sitemap", and then enter the sitemap name. I could put "Yelp dentists Chicago", or just "Yelp dentists". Then comes the start URL, the resource URL, so I'm going to copy this URL and paste it here. Great, and now it's time to click on "Create sitemap".

On this page we have got maybe ten businesses, and we have got 5,738 results in total. There are multiple pages, and I'm going to show you how to extract all of the pages at once by selecting the elements.
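As a rough illustration of what has been created at this point, a sitemap is essentially a name, a start URL, and an (initially empty) list of selectors. The sketch below builds such a structure in Python. The field names `_id`, `startUrl`, and `selectors` are assumed to mirror the JSON that the extension exports, and the start URL is a placeholder, because the video only shows the Yelp search URL being copied from the address bar.

```python
import json

# Placeholder start URL (an assumption): the video copies the real Yelp
# search URL from the address bar, but the transcript does not spell it out.
START_URL = "https://example.com/search?find_desc=dentists&find_loc=Chicago"


def new_sitemap(sitemap_id: str, start_url: str) -> dict:
    """Return an empty sitemap: a name, a start URL, and no selectors yet.

    The key names are assumed to match the extension's exported JSON.
    """
    return {"_id": sitemap_id, "startUrl": [start_url], "selectors": []}


sitemap = new_sitemap("yelp-dentists", START_URL)

# Pretty-print the structure so it can be compared with an exported sitemap.
print(json.dumps(sitemap, indent=2))
```

Selectors such as the listing links and the pagination links would then be appended to the `selectors` list.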
Now it's time to click on "Add new selector". We have to put the selector ID, like "dentists", and the type should be Link, because these listings are going to be links. So I'm going to choose Link. There are multiple listings, as you can see: we have got ten listings on this single page, so I'm going to tick "Multiple". Then we have to click on the Select button (let me make this a bit bigger) and click on a business name, that is, its link. I'm going to click on the first one, and it's highlighted in red. Then we have to click on another business name, and now if we scroll down we'll see that all of the business names have been selected automatically. It's time to tick the option "Enable different type element selection", then click on "Done selecting", and then on "Save selector".

Now we have to click on our parent selector, so I'm going to click on dentists. Under dentists we have to add new selectors for the business name, address, website, and all the other information. I'm going to open one of the listings first. Then I click on "Add new selector". The ID is "dentist's name" and the content is text, so I'm going to keep the type as Text. Then I have to click on Select, click on "Chicago Dental Studio", click on "Done selecting", and then click on "Save selector". So the first element has been selected.

Now it's time to click on "Add new selector" again and add the categories. The ID is "categories" and this should be Text as well, so I'm going to click on Select and then click on each of the category elements here. These are selected, so it's time to click on "Done selecting" and "Save selector". Now again "Add new selector": we want to extract the address information, so I'm going to enter "address", again Text, then click on Select, click on the address text, click on "Done selecting", and, now that it's selected, click on "Save selector". Then we want the phone number, so I'm going to add a selector with the ID "phone number", again Text, then click on Select, click on the element, click on "Done selecting", and "Save selector". Then we want the website address, so it's time to click on "Add new selector"; its name is "website", the type is Text, then Select, click on the element, and click on "Done selecting". These are repetitive actions on this page. It's been selected, so again click on "Save selector". So here we have added all the elements for the first page.

Now it's time to click on Sitemaps and go back to the parent, root. We want to extract all the pages. Let me show you the pages here: all of these nine pages will be selected, and maybe more. To extract all of them at once, we have to add a selector at the parent level. So click on "Add new selector". This is called pagination, so I'm going to put the ID "pages". The type should be Link, because these are links, and we're going to select multiple pages, so I have to tick "Multiple". Then click on Select and click on the page numbers. All of them have been selected, so it's time to tick "Enable different type element selection", click on "Done selecting", and then click on "Save selector".

Now we have got two selectors. The first one is for all of the dentists on the first page, with elements such as business name, website, phone number, address, etc. The second selector has all the pages; it instructs the extractor to visit all the pages and collect the same kind of information. So now it's time to go to the first selector and click on its Edit button. Here we have to select the pages we want to extract: root is already selected, and in addition we select pages, so that all of the pages are extracted at once. Click on "Save selector".

Now it's time to open the selector graph and check that everything is fine in the selection. Here is our root. Under dentists we are going to get dentist's name, categories, address, phone number, and website. Then from pages we are going to get the same information from each of the pages: dentists, and under it dentist's name, categories, address, phone number, and website. All looking great, and it's time to scrape them at once.

I'm going to click here and then click on Scrape. The request interval should be 2000 milliseconds, and the page load delay should be 2000 as well; these are the minimum values I kept for the time frames. Now it's time to click on "Start scraping". A pop-up window will appear and it will start extracting the information. So I'm going to click on "Start scraping", and here we go: a pop-up has appeared, and you will see that the page is loaded automatically within a few seconds. As you can see, it has started extracting: the first page is being extracted, and now it's extracting information from the second page. I'm going to fast-forward the video to get all of the results, and then I will show you what I have got.

Okay guys, as you can see, the scraping has finished, and it's time to click on Refresh to see the data. Here we go: we have got all the extracted pages, 93 or maybe 90 listings. I'm going to download them from here: "Export data as CSV", then "Download now". Let's go to Documents and save it here. It's been downloaded, so let's open this file. I had to wait a few minutes to get the data extracted, so you might need to wait a few minutes as well.

Let's see what information we have. Here is the dentist's name. Actually, this one here is a duplicate entry, so I'm going to delete it. This column is the source link. Here is the dentist's name, and the category. I forgot to tick "Multiple" on the categories selector, which is why I've got only the first category. Then here is the address information, if I widen this and wrap the text. Here are the phone numbers, and the website for each business. These other columns are information that I don't need. So here we have got information on 97 businesses extracted from Yelp. I have got all the listings, one by one, from all of the pages.

So this is the process of using the Google Chrome Web Scraper extension. I hope you have found this video helpful, and if you have, please consider liking it to support me. Let me know if you have any questions, or your opinion on the video, by commenting below, and subscribe to my channel to get more videos like this one. I hope to see you in the next video. Thank you very much for watching.
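The selector hierarchy built in the video can be sketched as a small parent/child graph. The Python below is only an illustration under stated assumptions: the key names (`id`, `type`, `parentSelectors`, `multiple`) and the type names `SelectorLink` and `SelectorText` are assumed to follow the extension's exported sitemap JSON, the selector IDs are normalized versions of the names spoken in the video, and CSS selectors are omitted because the transcript does not state them. The point it shows is the pagination trick: "dentists" has two parents, root and "pages", so the same fields are collected from every page.

```python
from collections import defaultdict

ROOT = "_root"  # assumed name of the sitemap's root node


def selector(sel_id, sel_type, parents, multiple=False):
    """Describe one selector; key names are assumed, not taken from the video."""
    return {
        "id": sel_id,
        "type": sel_type,
        "parentSelectors": list(parents),
        "multiple": multiple,
    }


selectors = [
    # Listing links: reachable from the first page and from every page link.
    selector("dentists", "SelectorLink", [ROOT, "pages"], multiple=True),
    # Pagination links found on the start page.
    selector("pages", "SelectorLink", [ROOT], multiple=True),
    # Fields read from each listing's detail page.
    selector("dentist-name", "SelectorText", ["dentists"]),
    # "Multiple" was left off in the video, so only one category is returned.
    selector("categories", "SelectorText", ["dentists"]),
    selector("address", "SelectorText", ["dentists"]),
    selector("phone-number", "SelectorText", ["dentists"]),
    selector("website", "SelectorText", ["dentists"]),
]


def children_by_parent(sels):
    """Map each parent id to the ids of the selectors nested under it."""
    tree = defaultdict(list)
    for sel in sels:
        for parent in sel["parentSelectors"]:
            tree[parent].append(sel["id"])
    return tree


def render(tree, node=ROOT, depth=0, path=()):
    """Return indented lines for the graph, stopping on any cyclic link."""
    lines = ["  " * depth + node]
    if node in path:
        return lines
    for child in tree.get(node, []):
        lines.extend(render(tree, child, depth + 1, path + (node,)))
    return lines


tree = children_by_parent(selectors)
lines = render(tree)
print("\n".join(lines))
```

The printed outline lists the five text fields twice, once under root > dentists and once under root > pages > dentists, which corresponds to the two branches checked in the selector graph before scraping.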
Info
Channel: Azharul Rafy
Views: 282,239
Rating: 4.8680439 out of 5
Keywords: web scraper free, web scraping, web scraping google chrome, data scraping from a website, data scraping tutorial, how to extract multiple web pages, how to extract web page data into excel, google chorme web scraper, how to use google chorme web scraper to extract data, extract data web pages into excel or csv, how to scrape data from a website into excel, how to scrape data from a website, chorme web scraping, web scraping tutorial, data scraping from websites into excel
Id: Gz3fbdXnjmw
Length: 15min 38sec (938 seconds)
Published: Fri Feb 08 2019