Web Scraping in Google Sheets" IMPORTHTML, IMPORTDATA, IMPORTFEED, & IMPORTXML - SheetAI App

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
that you five different ways of scraping data inside your Google sheet in just few seconds out of these four are Google sheet functions and one is the custom code a small limit let's see how so first of all we have this website which is Wikipedia and over here we can see a list of different planets and what I want to do is I want to scrape this particular table inside the Google sheet so the first thing what I will do I'll just create a new Google sheet by typing sheet.new this will create a new Google sheet and to do this to scrape this table we will be using the function called import HTML now to do that we need to just visit this website click on inspect element now the reason why I'm going inside this inspect element is because there are multiple table inside this webpage if so I need to know which particular table is this is the number one number two number three and number four which one is it so for that I can use this small snippet of code which you can find inside the description just copy it and paste it over here and hit enter once you will hit enter you can see all of these are listed over here now if I hover on all these tables you can see it this one gets highlighted when I hover on the fifth one which means this table is at fifth position so now what we will do is once we know that this is the fifth number of table we will go back to our Google sheet use this import HTML function by calling it like this is equals to import HTML import HTML function first parameter is the URL so I'll copy this URL it's a string so I'll surround that with the double quotes I'll type comma and then it asks for the query so queries basically we want to get the table the third parameter over here is the index which is fifth now I'll just close this function and hit enter and now within seconds you are able to scrape all that table so this is how you can use the function import HTML moving on to import data now import data basically helps us scrape CSV data you can see this is the CSV data and what I want is I want to import this CSV data inside my Google sheet so what I will do is I'll copy this I am going to create a new sheet is equals to import this is import data right so inside that I'll provide the URL and hit enter give it some time because this is a very long CSV file and you can see all of these country name country code year values are sorted and here for you to use now moving on to the next function which is import feed import feed basically helps you uh to scrape RSS feed now for example there is a major of these websites if they want to get indexed on Google they usually create a site map an RSS Suite so that Google can understand all these different links and all that which can be useful so for example NASA have this RSS feed for different things so for example we have an RSS feed for breaking news image of the day and education news and more so I'm going to click on the images today and this is the RSS feed so what I'm going to do is copy the RSS go inside here I'm going to write is equals to and then I'm going to write import feed once I write emote import feed I just have to provide the URL hit enter and wait for few seconds and you can see we have all the data now we are going to see how we can use the function import XML to scrape the data from the website so for this we are going to scrape the data or these blogs from this website called the hackernews.com now to do that we need one thing which is this Chrome extension called selector gadget for selector Gadget the link to download will be in the description now the reason why we need this is because this import XML function requires two things the first thing is the URL of the particular web page and the second thing is second thing is the X path so what is XPath XPath basically shows what exact element location is so that helps to scrape that okay so now if you know the code you can just go to the inspect element search through it and right click and find it like that but this is much easier if you use this add-on okay so let's see how we can do that I'll visit this website called the hackernews.com I'll click on this particular color and then I'm going to click on the title you can see all the title has been selected now I can just move myself so total 9 has been selected now I can click on this XPath copy that I'm going to create a new sheet this is the X path we need the URL and now we can use this function called import XML and provide the URL provide the expat and hit enter after we provide that the data will be there now you might be wanting sanskar all this function works really great why do we need another way so what happens is when I try to use this function for example create something from LinkedIn if you provide that let me show you so if I want okay I want to get the name of that particular person so I'll use this XPath method which we used in the import XML I'll select this particular username I'll copy the URL I'll go over here let me create another sheet I'll paste the URL here and then I'm going to click on the Xbox let me get myself out of here click on XPath copy that visit the Google sheet again paste the expat and then use the formula import XML provide the URL and provide the Xbox and if you wait for a second you'll realize that it cannot fetch the URL same thing happens if you try to use like Twitter and other websites so there are many websites which you can not scrape with this method so that's why I found a simple coding way don't know don't worry you don't have to know anything about coding for this it's just a simple script you just have to paste it and then do as we did for the XML way of using the x-bar so let me show you how now let's say the same thing which you want to do we want to describe the username and the description for example this particular description and let's say we also want to scrape some of the other information I'll show you how to do the name and the description you can do others as well depending on your website whatever you're using I'm doing this for LinkedIn you might be doing it for something else so the first thing to do that is you need to use this piece of code so this is a function uh this is code you don't have to worry about it basically what this does it will get the element by providing that XPath which we were using for the remote import XML function so we will just go do the right click come over here to the console and paste this here once you do that now you have the function which can use the same XPath which we were using for the import XML but directly like inside the console so you can just uh maybe you want to scrape the 10 things 20 things however you want you can do that so if I hit enter now over here you can see the function is available now if I provide the x path over here so let's say I provide like that and to get the X path I'm going to use the extension click on that click on the XPath copy the XPath click over here paste and hit enter so before I hit enter I am going to get the inner text so that's how you can get the values um but yeah I hope you find all these five ways helpful if you did make sure to hit the like button make sure to subscribe and I'll see you in the next one bye
Info
Channel: SheetAI App
Views: 5,591
Rating: undefined out of 5
Keywords: web scrape in google sheets, web scraping with google sheets, how to automate web scraping using google sheets, importhtml, importfeed, importdata, get data from website into google sheets, how to automate web scraping, web scraping, web scraper, web scraper intro, web scraper tutorial, data scraper, web scraper video tutorial, online web scraper, web scraping software, web scraper chrome tutorial, google sheets, extract data website, web scrape without code
Id: NwnxF96HIH4
Channel Id: undefined
Length: 6min 44sec (404 seconds)
Published: Thu Jan 26 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.