Website login using requests library in Python

Captions
Hello guys. In this video we are going to learn how to log in to a website using the Python requests library. You must have already used the requests library in Python for scraping content from a given web page: you are given a URL, and you use requests to fetch the content of that page. But there is no authentication involved there. What if we could create a session using the requests library, log in to that website, and then get the content of any given web page? That way we would be able to scrape much more useful information from the website. That is what we are going to learn in this video, step by step, so without any delay let's get started.

Here I have a website called codechef.com. It's a competitive programming website, and our goal in this video is to log in to it using code, with the requests library in Python. You already know how to do it by hand: you just press the login button. But what actually happens behind the scenes? Behind the scenes there is nothing but an HTTP POST request, in which you pass some form data, and that data is used to check whether the user is legitimate or not. So all we need to do is reproduce that request's specification: what data goes into the request, which URL is being hit, and so on.

For that we are going to inspect this web page. I will open the browser's element inspector and select the Network tab, which shows all the requests going out from this page. Now I'm just going to press the login button and see what happens. Look at that, some
requests are being sent, and then finally I have logged in and a lot of things have appeared. If I try to find which request was made, I think it was this one. Look at that: this is a POST request and the status code was 302, which gives us a hint that this request is the one used for logging in a user on this website. So this is the request URL, this is the request method, and this is the status code. If we scroll down a bit we get to know more: we see the request headers and the form data. Oops, you need to be careful here, because your password may be shown unencrypted; mine is being shown here.

So now you have the form data that the website actually uses for authenticating a user. You've got the name, you've got the password, you've got something called the form build ID (let us see later where I can find this particular thing), then there is a form ID, which contains the value "new login form", and we have "op", which I think means operation, which is "login". These are the fields this website uses for authenticating the user.

So all we need to do is create a session using the requests library, first make a GET request to the codechef.com web page, and then send a POST request to this particular URL with this form data. By that stage we should have logged in, and then we will be able to browse the complete website by hitting any URL. That is what we are going to do, so let's start typing.

Let me import the requests library first of all. OK, I have imported the requests library, and now I have to create a session, because login works with a session: you are served a series of web pages while you browse through a website, so what you need is a session, so that the login information can be
maintained throughout those web pages. For doing that we are going to create a session: with requests.Session() as s, let me call it s. So I have just created a requests session context here, and in that context I am first going to specify the main URL I will be opening, which is codechef.com. I will just copy this, so url is codechef.com. After that I have to make a request to this web page; let's see what happens. My requests session object is s, so I will use that for making the GET request: it's s.get(url). Now let me print the content that I get, and let me run it.

Hmm, look at that: I get 403 Forbidden. Why might this be happening? It must be because I am not providing any headers. The request headers are used by many websites to identify whether the request is being made through a browser or by a bot. In our case it's a bot, and the site has detected that, so it's 403 Forbidden. But how can we bypass this? There is a very simple solution: you just have to use the User-Agent header. All you have to do is specify a headers dictionary containing the user agent as the key, like this, and this as its value. This is a browser property, and we will pass these headers with our request to see whether the site is able to differentiate between us and a browser or not. So headers is what I am going to pass now: headers equal to headers. Now let us run it. Look at that, we are able to bypass that, because we have got a lot of HTML content.

I think now we can move to the next part, which is making a POST request to the login endpoint. But before that I need to find out what the logic behind this form build ID is, because I know the name, I know the
password, but I do not know this particular value, and I feel that it might be important. So I'm going to search for "form build ID" in this HTML code, and look at that, I got it. There is an input tag whose name is the form build ID, and its value is this. This is what I need to extract from the content that I get when I make the GET request.

For doing that I'm going to use the BeautifulSoup library: from bs4 import BeautifulSoup. OK, I just made a mistake here; from bs4 import BeautifulSoup. Now I'll create a soup object out of this HTML content. That is done by BeautifulSoup with r.content, and I'll also specify the HTML parser that I'm going to need here. That will make my soup object. If you are not familiar with web scraping or with the BeautifulSoup library, I have provided some article links below in the description of this video; you can check them out to know more about it.

OK, moving on. What I'm going to do now is try to find the form build ID. The form build ID is present in an input tag, so "input" is the tag name, and I have to find the particular tag which has a special attribute: the name of that tag is the form build ID. This will give me that HTML element, and from it I just need its value, so that is what I have done. Let's take a look back at that thing again: I found the input tag by specifying that the name of that HTML element should be the form build ID, and then I took its value, because that is all I need. So this is how you do it, and this is my form build ID.

Now let me create the complete data object which I'm going to send. I'm specifying it as login data is equal to, let me copy it. OK, so here is the login data: the user name, then a comma, the password
that I will change, and then there is the form build ID, which is, you can say, dynamic, so I will specify that later in the session itself. Then there is the form ID, which is "new login form" (I hope that is not going to change), and then there is this operation, "op", which is "login" here. So that's a partial login data. OK, I made a mistake here, a comma was missing. Now I will specify that in my login data the form build ID is equal to this value. This is how I'll be able to specify my complete login data, and now we are good to make a POST request for logging in to the website.

We are going to hit the same URL again, the data is going to be my login data, and I am also going to pass the same headers again in case they are needed. Finally I will print the content to see whether I have logged in successfully or not. So this is it; let me try it and see what the output is. It will take a bit of time. I feel that we have done it; let me check by searching for my username in this output. Yeah, look at that: there's a link for the user page, which is slash user slash nickel casing 97. And this is what is shown right here on the page: "Hello nickel casing 97". So yeah, we have found it.

So this is how we have logged in to this website, and now we can continue this session by making GET requests to any other page on the website and getting the data from those as well. This is how you maintain a session on a website using the requests library. In this video you also learnt how to log in to a website by inspecting its network traffic and finding the particular request that leads to authentication. I hope it was clear. If you still have any doubts, you can post them in the comment section below. That's it for this video. Thanks for watching.
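
The steps described in the captions can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact code typed in the video: the captions only give the field names as spoken, so the spellings `name`, `pass`, `form_build_id`, `form_id` and `op`, the values `new_login_form` and `Login`, the `https://www.codechef.com/` URL, the User-Agent string, the choice of `html.parser`, and the helper names `extract_form_build_id`, `build_login_data` and `login` are all assumptions made for illustration. The site's login form may also have changed since the video was published.

```python
import requests
from bs4 import BeautifulSoup

# Assumed URL spelling; the captions only say "codechef.com".
URL = "https://www.codechef.com/"

# Assumed example value: any realistic browser User-Agent string should work.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
}


def extract_form_build_id(html):
    """Return the value of the input tag named form_build_id, or None."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("input", attrs={"name": "form_build_id"})
    return tag.get("value") if tag is not None else None


def build_login_data(username, password, form_build_id):
    """Assemble the form data for the login POST (field names are assumed)."""
    return {
        "name": username,
        "pass": password,
        "form_build_id": form_build_id,  # dynamic, read from the page
        "form_id": "new_login_form",
        "op": "Login",
    }


def login(session, username, password):
    """GET the page, read the dynamic field, then POST the login form."""
    r = session.get(URL, headers=HEADERS)
    r.raise_for_status()  # e.g. 403 if the headers are rejected
    form_build_id = extract_form_build_id(r.content)
    data = build_login_data(username, password, form_build_id)
    # Same URL and same headers as the GET, as described in the captions.
    return session.post(URL, data=data, headers=HEADERS)
```

To try it, one would call `login(s, username, password)` inside a `with requests.Session() as s:` block and then check whether the username appears in the returned response's `text`, as is done by hand in the video. Later `s.get(...)` calls in the same block reuse the session's cookies.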
Info
Channel: Indian Pythonista
Views: 145,128
Rating: 4.887023 out of 5
Id: fmf_y8zpOgA
Length: 12min 30sec (750 seconds)
Published: Fri Aug 10 2018