Learn Python - Browser automation auto login with Selenium web scraping Part 1

Captions
Hello and welcome to today's video. If you can't already tell, we are at 1,000 subscribers, which was the 2020 goal, and we've nailed that in August. So I'm really excited, really happy. Thank you so much. I'm kind of speechless, to be honest. It warms my heart to think that a thousand people want to tune in each week and, you know, learn Python, make data useful, and together grow as a community.

So today's question comes from the comments, and it pretty much asks, "Hey, how do I log into a website with Python to be able to scrape that website?" It's a two-part episode: part one is now and part two will be out tomorrow. So if you're not already subscribed, make sure you do so you get that update, and let's just get down to business.

Okay, down to business we go. We've got the website, the username and the password, so let's go ahead and input those into our Jupyter notebook. Now, if you haven't got Jupyter Notebook installed, I highly recommend you use that for your Python coding, especially when you're still learning. It gives you instant feedback. For example, I've put in three variables here: url, username and password. Shift-enter on that, and then I can quite easily call upon them and get them straight away without having to boot anything up.

Now, in today's session we are going to be using Selenium, and along with that we are going to be using the Selenium webdriver manager. So, for starters, if you haven't got Selenium installed: pip install selenium. PyPI is a really good resource for understanding different packages. And webdriver manager just makes it really easy to use Selenium. When you're partnering with the webdriver manager, all it means is you don't have to worry about having the right driver installed or the right version number; this takes care of everything for you.

So let's go ahead and write some code. What we're going to do, first of all, is: from selenium import webdriver. Alrighty, so import the webdriver; that's what
we're going to be using to boot up the browser and actually control the web browser through Python.

The next couple of things we are going to import, I'm going to paste in here and talk you through them, because there are a few words. Let's have a look. We are going to import selenium.webdriver.common.keys; it's quite the mouthful, and this simply allows us to send keys. So whether we're typing something into a username or password field, hitting the escape button, hitting the enter button, all those good things we can do from the Selenium webdriver common keys.

The next thing I like to import is that Chrome manager I spoke about. That one is as simple as from webdriver manager, which we can see here, webdriver manager, and then simply dot chrome, because we are using the Chrome browser, and with Chrome it's ChromeDriverManager.

Cool. So what we need to do next is define a variable that will be used to drive our Chrome browser. In this case it's going to be driver is equal to webdriver.Chrome, ChromeDriverManager, install. This line here would normally say something like, you know, chromedriver, and if you're on Windows it might even say .exe. But instead it actually says ChromeDriverManager, open and close parentheses, dot install, open and close parentheses. The reason it does that is it's going to go ahead and try to install the correct Chrome driver if it's not already installed; otherwise it'll use one that's been cached.

So I'm going to go ahead and run this and you'll see what I mean. Shift-enter, shift-enter, and straight away a couple of things are happening. One, the actual browser has started up, and it says here "Chrome is being controlled by automated test software". The second thing that happened was our Chrome driver manager went ahead and informed us that we have the latest version and it's using a cached version for us. If we didn't have the latest version, it would go off and download and install it automatically for us. So I love that about
the Chrome driver manager; I'm a big fan of that package.

Cool. So the next thing: we have our url, our username and our password, so it makes a lot of sense that we first of all navigate to the url. We've got our driver defined here, so we can simply say driver.get, and we're just going to pass in the url. Shift-enter on that, have a look at our automated browser, and there we are: we've navigated to the website.

The first thing I've noticed is there is a pop-up. Is that going to cause us any trouble? Maybe. So what I might do is navigate to the website, wait a few seconds, and then press the escape key just to get rid of that pop-up. Let's not do it now, though; let's actually get Python to do it for us. What we need to do is say driver dot get element by tag name, and the tag name here is simply body. That's just us selecting pretty much the entire page. Okay, once we do that, shift-enter... that doesn't actually do anything. Oh my god. [Music] Always good, errors. Get element by tag name: what have I done wrong there? Let's think about it. Oh, silly me, don't listen to me: it's find element. You know what, I got confused with get; it's a common mistake of mine. Shift-enter on that, and all that's telling us is, hey, we've found something.

Okay, what we're going to do is dot send underscore keys, open and close parentheses, and inside we put capitalized Keys, dot, and capitalized ESCAPE. And if we go ahead and shift-enter on that... another error. What have I done this time? Oh yeah, okay, it's Keys [unclear], not keys. Shift-enter on that, head back to our automated browser, and it has now pressed the escape key for us and got rid of that pop-up.

The next thing we probably want to do is log in, and I can see a nice big login button here. Now, before we go ahead and click that, I want to learn a little bit more about that login button. How can I identify that login button and
actually get Python to click it for us?

So what I might do here is go ahead and right-click and go to Inspect. That's going to open up our developer tools here, and I'm just going to, one more time, right-click and inspect the login button. When that's done, it's highlighted in the elements, in the actual source code, where that button is. Straight away I can see it's got a class that identifies it. This class is typically used for things like JavaScript and mainly CSS, but in our case we're going to say, hey, for the class of login link, we do want to go ahead and click that. So I'm going to copy login link, and I'll close that off so the browser stays the same.

What we're going to say is driver dot find, making sure I'm typing this correctly, dot find element by class, and the class is called login link. And what did I do wrong this time? Find element by class name. This is what happens when you try to memorize every function, method and attribute that exists in all these different packages: you get it wrong a lot. But I still recommend it, weirdly. Copying and pasting code is fine, but if you're in a position where your internet's dropped out and you don't have Stack Overflow at hand, it helps to learn these things as best you can, especially if you are using them quite a bit.

So let's go ahead and try this: driver dot find element by class name. Shift-enter on that, and this is a good sign; it means it has found that element. What we're going to do now is say dot click, open and close brackets. Shift-enter on that, and if we go back to our automated browser, we can see it's gone ahead and clicked that. We now have the opportunity to type in our customer ID and our password, and in our case we have two variables, username and password, which we're going to use as input for each of those. But before we do, we need to learn a little bit more about those form elements. So, same
as before: right-click, inspect that element, and one more time, inspect element. What we've got here is an input which has an id, and the id is j underscore username. It has a class, but the class just says text, text, form control, so that's not specific enough at all. When you are selecting these elements, you want to be really specific. So what we'll do is find the element by id; it's called j underscore username. Copy that, and let's go ahead and say driver dot find element by id, and we are going to look for j underscore username. Shift-enter on that. Good sign: it has found it. Then we are going to send keys, and the keys we're going to send are simply our username, so pop username in there. Shift-enter on that.

Alrighty, come back to our automated Chrome browser and, as you can see, the username that we had in a variable in Python is now loaded into that input field for the login box, which is really exciting.

We do need to do the password as well. Similar to before, we can right-click and inspect element, and what we're finding here is it's j password. So that piece of code is going to be almost identical. The difference is, obviously, we're going to swap out username for password, and we've got a variable called password, so we'll pop that in there. Shift-enter on that and, just like magic, we now have the username and the password inputted into this login box.

Alrighty, so the next step. We could probably send the key that hits enter, but let's get really specific: let's actually get the Selenium browser to click the login button. To do that, we right-click and one more time inspect that element. What we've got is a button, and the button's got an id, it's called login id. You know what, that's good enough for me; let's go ahead and try that. So the name of it is login button. We're just going to steal some
of this code. Okay, paste that in there, but rather than looking for j password we're going to use the login button, and rather than send keys we are going to use dot click. So let's go ahead and attempt that: dot click, shift-enter on that, navigate back to our page and, as you can see, the good news is we are now logged in. So that's all well and good.

Let's go ahead and shut down that browser. Let's say driver dot quit. Quick? No. Weird. Open and close brackets, shift-enter on that.

Now that we've quit the driver, we can go ahead and package that up into a login function. Call the login function and it's going to boot up the page, escape the pop-up, hit the login button, input the username, input the password and hit the login button. So let's go ahead and give that a try. What I mean by packaging this up is we're just going to put all the code together. So we'll keep going: driver, driver, driver. And do we want to quit at the end? No, we don't; we'll leave that out for now.

Alrighty, so the first thing I'm going to do is keep driver up here for now, just to make sure that's available beyond the function. Everything else is fundamentally part of this little login dance that we're doing. So what we might do is create a function, def login, and that's going to be made up of these steps here.

Now, one thing I did notice: we are going and pressing the escape key, and I'm just not sure if that's going to be happening too early. My understanding is that, by design, the Selenium browser should wait for the page to completely load, which it will do, but I imagine that pop-up might come a quarter of a second or half a second later, and we might miss the opportunity to escape it, which will probably break the whole thing. So what we can do is say import time, and once we've imported
time, we can say, let's see: the page will load, and once the page is fully loaded I'm just going to give it, you know, a random number of five seconds, just for enough time, because even though the page is loaded, that pop-up does take a little bit to pop up. Once that's popped up, we're going to hit the escape key. Then I like to give everything a little bit of sleep between each action. That way everything's loaded, everything's good, and if there is any detection, it's less likely to catch you out, because you look like a human who's taking the time to click things.

So we're hitting the escape key and then we're going to sleep for a second. Then we're going to click on the login link and, you know what, we're going to sleep for a second. Then we're going to type in our username and, you guessed it, we're probably going to sleep for a second. Then finally we're going to enter the password and sleep for a second, and then we can hit the login button, and we don't need to sleep anymore.

Shift-enter on that and, obviously, nothing happens, because all we've done is define the function; we haven't actually called it. To call the function, let's first of all clean this up a little bit more. Shift-enter on that, and we are going to go ahead and call that function. So if I just go here: login, shift-enter on that, and let's have a look. We've logged in, we're waiting, I think, five seconds, and we're going to hit the escape key. My hands are here. Hit the escape key, yep, and hit the login button. Good. Username, password, login. Beautiful.

Guess what? This is a two-part episode, and part two is coming out tomorrow. We're going to learn how to use Selenium as the browser and go through and scrape all 2,600 items on this website, which I did some pre-looking at, and it's going to be really fun. So thank you so much. If you're not a subscriber, I would love for you to be one, but hey, no pressure. A thousand other
people before you have chosen to do that, and I'm so grateful. Thank you so much. It blows me away that a thousand people want to sit down with me on a regular basis and just, you know, write code, make data useful and learn Python together. I think it's awesome. So once again, thank you so much, I really appreciate it. Tune in tomorrow for the next installment of web scraping with Selenium, and have a really good day.
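
The login function described in the captions might look roughly like the sketch below. It is not the presenter's exact code. It uses the current Selenium 4 call style, `driver.find_element(by, value)` and `webdriver.Chrome(service=...)`, because the `find_element_by_*` methods typed in the video were removed in later Selenium releases. The class name and element ids are assumptions written down from how they are spoken (`login-link`, `j_username`, `j_password`, `loginButton`) and should be checked against the real page in developer tools. The function names `start_chrome` and `login` and the parameters `popup_wait` and `pause` are made up for this illustration.

```python
import time

# Plain-string forms of selenium's By.TAG_NAME, By.CLASS_NAME and By.ID, and of
# Keys.ESCAPE. Literals are used so the login steps can be read, or tried
# against a stand-in driver, without selenium being imported.
BY_TAG_NAME = "tag name"
BY_CLASS_NAME = "class name"
BY_ID = "id"
ESCAPE_KEY = "\ue00c"

# ASSUMPTION: spellings guessed from the spoken names in the video;
# verify each one with right-click > Inspect on the real site.
LOGIN_LINK_CLASS = "login-link"
USERNAME_ID = "j_username"
PASSWORD_ID = "j_password"
LOGIN_BUTTON_ID = "loginButton"


def start_chrome():
    """Start Chrome with a driver that webdriver-manager downloads or reuses from cache."""
    # Imported here so that only this function needs selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    return webdriver.Chrome(service=Service(ChromeDriverManager().install()))


def login(driver, url, username, password, popup_wait=5, pause=1):
    """Run the login steps from the video against an already started driver."""
    driver.get(url)
    time.sleep(popup_wait)  # the pop-up shows up a moment after the page loads
    # Send Escape to the whole page to dismiss the pop-up.
    driver.find_element(BY_TAG_NAME, "body").send_keys(ESCAPE_KEY)
    time.sleep(pause)
    # Open the login form.
    driver.find_element(BY_CLASS_NAME, LOGIN_LINK_CLASS).click()
    time.sleep(pause)
    # Fill in the credentials, pausing between actions as in the video.
    driver.find_element(BY_ID, USERNAME_ID).send_keys(username)
    time.sleep(pause)
    driver.find_element(BY_ID, PASSWORD_ID).send_keys(password)
    time.sleep(pause)
    # Submit the form.
    driver.find_element(BY_ID, LOGIN_BUTTON_ID).click()
```

In a notebook cell this would be used as `driver = start_chrome()` followed by `login(driver, url, username, password)`, keeping `driver` outside the function so it stays available afterwards, and calling `driver.quit()` once finished.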
Info
Channel: Make Data Useful
Views: 13,885
Rating: 4.9748955 out of 5
Keywords: python selenium, selenium tutorial, python for beginners, python automation, python web automation
Id: BZMVoYhA7KU
Length: 13min 41sec (821 seconds)
Published: Sat Aug 15 2020