How to scrape websites using Selenium in C#

Captions
Hi, and welcome to this video. In this video I'm going to teach you how to make a very simple web scraper using Selenium in C#. So first of all I'm going to make a new project, and for this video I'm just going to make a console application called web scraper demo.

Okay, so before we can actually code the scraper, we first need to download the Selenium library, or framework, and I'll do this by clicking the Tools tab, then NuGet Package Manager, and then find Selenium, and then I'll pick the solution and install. There we go.

Now I also need to download a web driver, so I'm going to do that as well. I'm going to go to the SeleniumHQ download page, and if I scroll down I can actually see there are many different web drivers, but for this video I'm going to use the Chrome driver. Feel free to use whatever you want. I'm going to download the latest stable release, and win32 because I'm on a Windows machine. There you go. And then I'm going to put it inside the bin folder so I don't have to specify a path within the code. I'm going to open the folder in File Explorer and go to the bin folder, Debug, and then I have it right here, drag it in, and I'm actually done now.

Now I can begin to code. I'm going to delete these comments. Okay, so basically when you want to make a controllable browser, you just have to create an instance of IWebDriver, and since I'm using the Chrome web driver I'm going to make a ChromeDriver, and I'm just going to import all these packages. There we go. So if I run this code right now, it will actually open a new browser, and that's it. It will not even close it.

For this video I'm going to make a web scraper for the Google search results, so I'm going to say driver.Navigate() and then go to Google, and this will basically just make it go to the Google home page. And actually, if I go to the Google home page, I'm going to need to type in some words, right? I'm just going to make it search for webshop, and then it should be able to retrieve all these. Actually, notice these are ads, but these ones, right.

So in order to do that we'll have to make it search, so we'll try to find this input element. There are many ways you can do this: you can do it by class name, or name, or ID, or XPath, for example. So you can say driver.FindElement and then we specify a By, and since I just copied an XPath I'm going to do that. There you go. I'm just going to do it like this. Oh, okay, yes, I can't do that. Okay. And so we should actually get an element right here, and then what I want to do is search for some keyword, so I'm going to SendKeys, and then we're just going to search for webshop, and I'm going to Submit. Now this just searches for this keyword, and that's basically it. We have actually made a very simple bot for searching, but we have not been able to scrape anything yet.

So when we actually go to this page, we can do the same thing we did with the input field, and that is find some random element, for example this one, and then right-click, Copy, and then XPath. Now the problem in doing this is that when you copy an XPath, it will be for a very specific element. So now I have selected this element, but it might not apply to this element as well. We can actually check this by searching for the XPath I just copied, and as you can see right here, there's only one element. We can try and copy another one right here: if I just find another, and then copy the XPath, and then put it in, you can see that there's actually a difference between the two XPaths. Now I know that I can just do like this, and then we go into this one and we find the elements by this XPath. So we can say titles, and then say driver.FindElements, and then we do the same thing again, and that's basically it.

Now we can loop through these elements, and each one of these will be contained within this collection of elements. So I'm going to say a title of, I mean, in titles, and then we can say Console.WriteLine of that, for example its Text. And yeah, we can try and test it and see what happens. Yeah, it searches for the webshop keyword, and then, as you can see right here, it finds all these different titles, and it ignores the ads.

Now this is one way you can do it. I did it using XPath, but you could also use a class name, for example this one [unclear]. You can try and see what happens if we search for that: find, or get elements, right, this is JavaScript if you don't know. Oh, okay, I can see now: if you do it by a class name we'll actually get the ads as well, and that's not what I want. Actually, we get some other kinds of text, and that's not what I want. You can try the next one. Okay, so we could also do it like this: class LC [unclear]. Yeah, so I basically just copied this class name, and then instead of doing the XPath I'm going to use FindElements and then By.ClassName, and we would get the same result.

Now the above was a lot easier. However, this is a very simple example, so often you want to use the class name instead, because the XPath can be very different for each element, so you can't really just generalize it like I did with this one. And yeah, we can try and run it again, and we get the same results.

So yeah, that's how you can make a very simple web scraper in Selenium. This is just the beginning, but I think just by doing this you can actually make some very advanced web scrapers. I hope you enjoyed this video, and feel free to give me some feedback. Otherwise, have a nice day.
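
The steps described in the captions can be sketched roughly as follows. This is a minimal sketch, not the code shown in the video: the captions never spell out the copied XPath or class name, so the two locators below are assumptions (the search box is located by the name `q`, and result titles by a generic `//a/h3` XPath), and Google's markup may differ or change. It also assumes the `Selenium.WebDriver` NuGet package is installed and that a ChromeDriver executable is available next to the built program (for example in `bin\Debug`) or on the PATH.

```csharp
using System;
using System.Collections.Generic;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace WebScraperDemo
{
    internal static class Program
    {
        // Assumption: the search box is found by name "q".
        // The video uses an XPath copied from the browser instead.
        private static readonly By SearchBox = By.Name("q");

        // Assumption: a generic XPath for result titles.
        // By.ClassName(...) with a copied class name is the alternative
        // discussed in the video.
        private static readonly By ResultTitles = By.XPath("//a/h3");

        private static void Main()
        {
            // Disposing the driver quits the browser, so the window
            // does not stay open after the program finishes.
            using (IWebDriver driver = new ChromeDriver())
            {
                // Go to the Google home page.
                driver.Navigate().GoToUrl("https://www.google.com");

                // Find the input element, type the keyword, submit the form.
                IWebElement input = driver.FindElement(SearchBox);
                input.SendKeys("webshop");
                input.Submit();

                // Collect every element matching the locator and print its text.
                IReadOnlyCollection<IWebElement> titles = driver.FindElements(ResultTitles);
                foreach (IWebElement title in titles)
                {
                    Console.WriteLine(title.Text);
                }
            }
        }
    }
}
```

Keeping the two locators in one place makes it easy to swap the XPath for a class name, which is the comparison made at the end of the video. In practice a consent dialog or a slow page load may get in the way, so an explicit wait before `FindElements` may be needed.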
Info
Channel: Scrapax
Views: 6,432
Rating: 4.9633026 out of 5
Keywords: Selenium, Web scraping, Automation, Bot, C#, .Net
Id: CpugqTr2j60
Length: 10min 14sec (614 seconds)
Published: Sun Aug 25 2019