How to find broken links & Images using Selenium Webdriver

Video Statistics and Information

Captions
Welcome to Naveen AutomationLabs. Guys, please subscribe to this channel and press the bell icon to get some interesting videos on Selenium and automation. We will be discussing some amazing tools and technologies. Welcome again, this is Naveen once again.

So guys, today I'm going to cover one very famous interview question: how will you check broken links on a particular web page using Selenium? In fact, you can save a lot of manual testing hours, especially for manual testers who spend a lot of hours checking which broken links are on a page. Let's say we have one website with around 100 pages, and on each and every page we have around 50 to 100 links or images. Maybe some images are broken or some links are broken: the moment you click on one, you get a 404 error, page not found. Sometimes we get a 500, or we get some broken images. So how to check that? Let's say you have 50 pages, and on each page around 100 links are available. 50 into 100 means you have to click 5,000 times, one by one, manually. That's a tedious task, and why should I do that? I should do some smart work instead. In that case I can use Selenium and create one utility. They will also ask you this at the time of the interview. It's a very famous interview question: how will you verify broken links, how will you check the number of broken links available on a particular page? There are some link validator tools available, but these are paid tools, so no need to go for those. We can simply write a basic Selenium script, check how many broken links are available and which link is broken, get the URL, and you can
immediately raise the bug with the developers. OK, so let's do that. Let me clear this output. Let's create a class in my Selenium session package, and let's say my class name is BrokenLinkTest. Select the main method and click on Finish. We will start with one basic Selenium script, so we will launch Google Chrome. What is happening? Just a minute, let me add the Selenium JAR files once again.

So what I'll do is simply launch Google Chrome and then open one website. I already have the code for this particular site, so I'm not wasting my time writing it again. We will take this code for freecrm.com: we will enter the username and password, click on submit, jump to the frame, and after that we will verify everything. I think in most of our other tutorials we have taken this particular site. And just add the throws declaration for Thread.sleep. For new folks, this is the freecrm.com site. The first time you open it, you have to log in. I have a username and password: the username is naveenk and the password is test123.

Then on the home page, you have to check the number of links available. See, all of these are links. So many links are available: some footer links, some other links, maybe some hidden links, whatever. Always remember, first of all, that links are represented by the a tag. If I spy on this Calendar link, you will see that an a tag is there. So we know that links are available in the
form of the a tag; links are associated with the a tag. And we know that images are associated with the img tag. In most cases these two will be there, either a or img: either you click on a link or you click on an image, and it will redirect to some different URL. Whenever you see an a tag, you always see this particular property, the href property. href means hypertext reference. If you click on the link, the developers are redirecting you to this particular URL. Now what we have to do is get this href property.

Sometimes what happens is that the href is blank. Or let's say my expectation is that I want to navigate to www.google.com, but maybe by mistake the developers have written google123.com or some other wrong URL. The moment you click on this link, it will try to navigate to google123.com, which means that link is broken; that link is not working. It may be an internal link or it may be an external website link. So how do we verify that the href properties defined by the developers are not broken links, that they are proper links? Let's say a thousand links are available. In that case you are not going to check manually whether each href property is correct, and you are not going to click on each and every link again and again. That's a very tedious, very lengthy task. So let's do some automation, let's do some smart work. It's the same thing for images. Sometimes
images are available. In this particular case there is no image, I think; there is no img. But sometimes a logo is there, and sometimes we have to click on that logo. Let me give you some example. Maybe this is an img or not, I don't know. If you go to facebook.com or any other site and spy on it, you will see that this is an SVG; this is not an img. Let's open amazon.com. See, all of these are a tags. And see, this is an image: this img is available, and this img has one property, src, which points to this particular URL. So sometimes you have to check the links as well as the images. Let's check this image. See, this is an img, and for an img this is also a kind of link: we have to check that this image is not broken.

So how to do that? Let's take one simple example. With the a tag we have this href property, and for images also, let's say, we have this href property, and after clicking on the image we are also moving to some other website, let's say http://www.test.com. And both are working fine, google.com and test.com, something like that. So we have to check for a and we have to check for img as well. After coming to the home page of freecrm.com, first of all we will check how many links are available. So we will collect all the links and all the images together, and then, one by one, we will fetch the href property.
We will fetch the href property from the links and the images, and then we will check whether that href URL is correct or not. How to check, I'll tell you. So let's do this first.

The first step is to get the list of all the links and images. How to get the list? Simple: we have the method driver.findElements(By.tagName(...)), and we know that all the links are represented by the a tag. We know that driver.findElements will give you a List object, so I'll write it like this: a List of WebElement, linksList, equal to driver.findElements(By.tagName("a")). Guys, we have already covered driver.findElements. It will return one List object, and the object name is linksList. Just import WebElement from Selenium and List from java.util. So by tag name a I have stored the links; that is fine.

Now, in the same linksList, I'll write one more line of code. On this linksList we have the addAll method, so you can add all the other elements to the same list: linksList.addAll(driver.findElements(By.tagName("img"))). So we have this list in which we have stored all the links by using tag name a, and to the same linksList we have added all the elements which are represented by the img tag. So in this list we have the a tags and we have the img tags as well. What exactly will it look like? I'll show you. One object will be created like this, this list object. This linksList object is representing
this object, and all the different a tags will be here, and the different img tags are also available here. Let's say this part is img and these are a tags. I'll write it like this: a tags, and this is the img tag. So let's say there are 50 images and 50 links; they will be stored in this particular linksList.

Now what do we have to do after that? Let me get the pointer back. See, out of these 50 links or 50 images, maybe for some the href is not available at all. There is no href, so I will ignore that part, because I don't need it. Let's say one link is available, and there is no href property defined on it. Maybe the business says, OK, this is a link, but we don't want to navigate to some other site on click; there is no address. We will check only when the href is available: that href should not be broken, that URL should not be broken. But if the href is not available, we will not consider those elements; we will not test them. It is out of scope for us.

So I will create one more list, a List of WebElement called activeLinks, only for the links which are active. Active means having an href property. I will create one simple list, a new ArrayList of WebElement; it will contain all those web elements. See guys, I'm simply creating one list object over here, and import ArrayList from java.util. Now, this linksList has all the images and all the a tags, so we will iterate over it. The second step is: iterate linksList. How to iterate? By using a for loop: for int i = 0,
because a list stores its values on the basis of indexes. So on the basis of indexing, I will start my loop at i = 0, up to the size of linksList, i++. Then I will put one condition: if linksList.get(i), which the first time means the first element, the first link, and then getAttribute("href"), so give me the href attribute value of that element, is not equal to null. This is my first condition. If the href attribute value of this object is not equal to null, it means I will exclude all the links or images which don't have any href attribute. So I will take only those images or links which have an href. I'll simply write: if linksList.get(i).getAttribute("href") != null. Only then do I take it. I have this activeLinks list created, so I will simply add this web element into it. And what is my web element? It is linksList.get(i).

I'll repeat, because it's a little confusing. I stored all the images and links in this list object, and I created a separate list object. Let's say the total count in the first list is around 500 links and images; maybe the href is there, or maybe the href is not there. But in this activeLinks list I will filter them: I'll take only those links and images having an href property. So let's say around 450 links and images are available with it, and 50 images and links are there
that don't have any href property. So obviously my count will be less in that case: if you have 500, it will be, let's say, 400 or 450, something like that. So how exactly am I storing them? On the basis of getAttribute("href") being not equal to null, and then simply activeLinks.add, storing all the values one by one into this list. So my second list is ready.

Then I'll simply get the size of the activeLinks list: System.out.println, size of active links and images, and I am printing activeLinks.size(). And I'll do one more thing: first I will get the size of the full set of links and images, meaning linksList; let's say around 500 or 600 links and images are available. First I'll print that, then I will filter on the basis of the href property being not equal to null, and then I'm printing the size of only those links and images having an href property. That's it, and it will print the sizes on the console.

So this is ready. Let's run this program and we will see how many links are available, and after filtering, how many links have an href property. So it's running. It will enter the system after login, and from the home page, see: size of full links and images, there are 160 links and images available, and active links and images are only 140. It means there are 20 links or images which don't have any href property, so we are not bothered about them. So obviously 160 is the total number of links and
images, out of which only 140 have an href property. So up to here it is fine.

Now I'll iterate this second list. The third step will be: check the href URL with the HttpURLConnection API. I'll use the HttpURLConnection API to check whether the href is correct or not. I am not going to click on the link. We have an API in Java, in the java.net package: the HttpURLConnection API. Through it I will simply check whether the link is a correct link or not. So I will start a simple for loop: int j = 0, j less than activeLinks.size(), and then j++.

Now guys, there is one class, the URL class, and we have to create its object. This is a little bit difficult syntax for us: new URL(...). What do I pass in it? I know that in activeLinks I have all the images and links which have an href property, so simply activeLinks.get(j), one by one, the first time j equal to 0, and then getAttribute("href"); it will pick the value of the href. I know that the href is available because we have excluded those links and images which don't have an href. This is now the pure list which has all the images and links having an href property. So one by one I'll take the href, and what will I do with it? There is one method, the openConnection
method. But before that, we have to cast this into one class. The class name is HttpURLConnection, which is available in java.net, and we have to cast this entire thing into it. So I am simply doing a cast, like this. Try to understand the concept and the syntax: this HttpURLConnection, we have cast the output into it. We have to import java.net.URL first, and java.net.HttpURLConnection also. Just a minute. OK, they got imported. See, even I don't remember the exact syntax; now it is coming. So we have the openConnection method. What exactly does it do? It gets the href property and opens the connection, so internally it will check whether that URL is correct or not. openConnection will try to connect to it. Then I will store it in an HttpURLConnection object, and let's say my object name is connection. We have to add a throws declaration, or surround it with a try/catch block. No issues, let's not write the try/catch block; I'll simply add the throws declaration. And what exactly is it saying? There are no errors. Maybe some problem with my Eclipse, but it's not showing any red marks, so there are no errors.

Now guys, what will I do with this connection? Now that I have the connection, there is a method, connection.connect(). See, now the error is gone. So connection.connect(): what exactly will it do? See, now what exactly is the
purpose? This is a little complex for you guys. What I have done: the URL class is available, and new URL creates a URL object from the string representation. This is my URL string: activeLinks.get(j).getAttribute("href"). I am taking the href property and passing it inside the URL, and on that URL I am opening the connection and casting it into HttpURLConnection. What exactly does HttpURLConnection do? It is a URLConnection with support for HTTP-specific features. We know that all these URLs are associated with the HTTP protocol, so it will open the connection with that particular URL. Once the connection is open, it will return one connection object, and we are storing it inside this HttpURLConnection object. With this we will try to connect, and now we have connected to that particular URL.

After that, on the connection we have one more method, connection.getResponseMessage(). getResponseMessage will return whether it is OK or not. If the link is perfectly fine, if the link is google.com, it will return OK; otherwise it will return something else. If google123.com is written, it will not return OK. So once I make the connection, I get the response message, and then I'll disconnect my connection; I will simply use the disconnect method. Make the connection, get the response message, check whether it is correct or not, and then disconnect. And the moment I disconnect, I simply write System.out.println, and I will write that this is
the active link; I will print exactly what response we are getting. activeLinks.get(i), correct? No, not get(i); get(j), sorry, and then getAttribute("href"), so simply print this href, and then I'll append the message to it. getResponseMessage will return one String, so I will store it in one String: String response equal to this. If the response is OK it will print like this, and if the response is not OK it will print like this: www.google.com, an arrow, and then the response, OK. Or sometimes we see Not Found, the 404 Not Found error. 200 means OK. There are different response codes, guys, and they are always defined like this: if we are getting 200, it means OK; if we are getting 404, it means Not Found. Some links give you 500; sometimes we see on a website a 500 internal error, so it will give you some internal error. Sometimes we see 400; 400 means Bad Request. So the response status code is there: if everything is fine it will return OK, and if a link is broken it will give you Not Found. Let's say the link is not available, the link is gone. Or let's say there is a Facebook image, you are clicking on it, and facebook.com is down, it's not available; so obviously it will give you a Not Found error. So we are printing them one by one like this.

So let's run this program and see if it is working or not. It will open the site and log in. Logged in. After login, first it will check how many links and images are available, and then it will print on the console. See: 160, and after excluding, 140 links are available.
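The third step described above can be sketched with the JDK alone. This is a minimal sketch under assumptions, not the code typed in the video: the class name LinkChecker and the method checkLink are hypothetical, the timeout values are arbitrary, and the href string is assumed to be the value returned by getAttribute("href") for one element. It catches exceptions instead of adding a throws declaration, so one bad link does not stop the whole run. Two details worth knowing: openConnection() only creates the connection object, and network traffic happens on connect() and getResponseCode(); and a host that cannot be reached typically surfaces as an IOException rather than as a 404 response.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

class LinkChecker {

    // Returns a one-line result for a single href value.
    static String checkLink(String href) {
        if (href == null) {
            return "SKIPPED: no href";
        }
        URL url;
        try {
            url = new URL(href); // throws for unknown protocols such as "javascript:"
        } catch (MalformedURLException e) {
            return "MALFORMED: " + e.getMessage();
        }
        HttpURLConnection connection = null;
        try {
            URLConnection raw = url.openConnection(); // creates the object, no traffic yet
            if (!(raw instanceof HttpURLConnection)) {
                return "SKIPPED: not an http(s) URL";
            }
            connection = (HttpURLConnection) raw;
            connection.setConnectTimeout(5000); // assumed timeouts, in milliseconds
            connection.setReadTimeout(5000);
            connection.connect();                     // opens the network connection
            int code = connection.getResponseCode();  // sends the request, reads the status
            String message = connection.getResponseMessage();
            return code + " " + message;
        } catch (IOException e) {
            return "UNREACHABLE: " + e;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample values; a real run would pass each element's href.
        String[] hrefs = { "https://www.google.com", "javascript:void(0)" };
        for (String href : hrefs) {
            System.out.println(href + " --> " + checkLink(href));
        }
    }
}
```

Returning the numeric code together with the message makes it easier to tell a 404 from a 500 when the output is scanned later.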
Now see, it is checking one by one. See, the first link is this upgrade .cfm page, which is fine. The next .cfm page is also fine. The freecrm index link is fine. But after one, two, three, on the fourth link it is giving an exception: MalformedURLException, unknown protocol: javascript. I will show you what exactly this problem is. Sometimes some links are represented by a javascript: value. What is that? I will show you. In this loop I'm also printing the href, a simple System.out.println of the href property of each element in linksList, and then I will show you. Sometimes we get these kinds of URLs also.

Guys, please practice this. I know it's a little confusing for you, but it's a very important interview question. They will definitely ask you how you will check the broken links in a system.

Now see, it's printing all the links one by one. So many links are available. Currently we are not checking. No, it's showing 346. Now it is showing 366 images and links. Out of 366 it will check one by one. This is the href property of the first link, the second link's href property is there, the third link's href property is there, but for the fourth one, a link is available where the href is equal to, sometimes they write it like this, javascript: with void or open. So obviously this is not the correct format of a URL. So what we have to do is exclude the hrefs with this javascript: value. If that property is starting with javascript, we will not include it. Or maybe you can do some string manipulation, take this particular URL
and then you can check it. But in this particular demo, I'll simply exclude those a tags or img tags which have this kind of href value, because this is not the correct form of a URL. It should start with http or https, something like this. That's why it is giving you the MalformedURLException at this particular line: new URL is expecting a good URL, a string following the URL specification.

So I will put one more condition with &&: linksList.get(i).getAttribute("href").contains("javascript"). If that href contains javascript, then ignore it. So how to ignore it? If it contains javascript, contains will return true, so I'll negate it: I'll put this not, the exclamation mark, over here. So the first time it will come here: the href is not null, which means the href is available. Then it will check the second condition: the href is available, but if the href contains javascript, that is true, and the exclamation mark will make it false, so it will not come inside the if block. So it will not include those hrefs which contain javascript. So guys, now this is more robust, more accurate: I am taking only those hrefs which don't have any javascript. That's why I have written the not like this.

Now let's run it. If you run this program, what exactly will it do? First it will print the href values of all the images and links, and then one by one it will check if
it is a correct link or a wrong link. See, it's printing all the hrefs of all the images and links. Now see, one by one it is checking. The total number of links and images is 136. And you can see the output, guys: freecrm.com is fine, the second link is also fine, the third link is also fine, the fourth link is also fine. This is a very good website; it's available in production, so obviously you won't find any Not Found error or something like that. It will check all the 136 links one by one, quickly. So you can easily automate these kinds of things, and you can put in any site, Facebook or whatever.

Now I will give you an example. I'm going to terminate my program. One site is there, makemysushi.com, and on one particular page some images are available which are broken images. Let me open it and spy on this. So this is a broken image, an img, and maybe some other links are available which are totally broken on that page. This is the example I found: on this particular page of this site, some images and links are available which are totally broken. So let me check with this site. I will copy the URL of the page, just a minute, and I'll put it over here instead of freecrm.com, and I'll comment out this login part. So simply: open makemysushi.com, get the list of all a tags and images, print how many links and images are available, and filter on the basis of href: if the href is null then don't take it; if the href contains javascript then don't take it.
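The filtering rule recapped here (skip a null href, skip a javascript href) can be expressed as a small standalone helper. This is a sketch under assumptions: HrefFilter and activeHrefs are hypothetical names, and the input is a plain list of href strings as getAttribute("href") would return them, so it runs without Selenium. It checks startsWith on the trimmed, lower-cased value rather than contains, a stricter variant of the condition in the video, so an ordinary URL that merely has "javascript" somewhere in its path is not dropped. As a side note, img elements normally carry their URL in src rather than href, so a variant of this utility might read src for images; that goes beyond what the video shows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class HrefFilter {

    // Keeps only the href values that are worth sending to an HTTP check.
    static List<String> activeHrefs(List<String> hrefs) {
        List<String> active = new ArrayList<>();
        for (String href : hrefs) {
            if (href == null || href.trim().isEmpty()) {
                continue; // no href attribute, or a blank one
            }
            String lower = href.trim().toLowerCase(Locale.ROOT);
            if (lower.startsWith("javascript:")) {
                continue; // a script call, not a fetchable URL
            }
            active.add(href);
        }
        return active;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for values read from the page.
        List<String> all = Arrays.asList(
                "https://example.com/help", null, "javascript:void(0)", "");
        List<String> active = activeHrefs(all);
        System.out.println("Size of all links and images: " + all.size());
        System.out.println("Size of active links and images: " + active.size());
    }
}
```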
And then this activeLinks list is the accurate list: it has an href, the href property is not equal to null, and the href value doesn't have any javascript. Then I'm simply iterating over it, using the HttpURLConnection API to make the connection, checking whether the response is OK or Not Found, and printing one by one what the response is for each and every URL. So let's run this. See, first of all it's printing all the hrefs, and see, 64 are there. See, this is a link. Can you see that, guys? Some broken links are available: Not Found. Let me terminate it. You can do one thing: just copy this and raise a bug. See, for this link it is Not Found, but this one is fine, this is also fine, this is also fine, but again for this one the Not Found error is there. It means this link is broken. So I will simply copy this URL and send it to the developer as a bug.

For developers also it will be so easy. Let's say out of 64 around 50 broken links are found: just get the list of 50 and send it to the developers. They will just copy this, find this particular URL text in their code, and immediately fix it. So that is the kind of smart work you can do. Sometimes we see that broken links are there in production, and then people will raise the question: why didn't you test it? It means it got missed by us. Not by the developer; it got missed by us, that we didn't test it properly. So it's the tester's responsibility. If at least your logo image is broken, that is a huge risk. Or maybe some important links, like the privacy policy. See,
guys, sometimes we see that on a site, let's say CRMPRO, a Help link is available, and sometimes this Help link is broken: let's say the help site is not active. In that case obviously we have to raise it. You cannot check it manually every time, so you can catch these kinds of things this way.

So please practice this one, guys. It looks a little scary, but it's very, very simple. Once you write the code you will be able to understand it properly. I'm simply using Selenium's By.tagName and the findElements concept, creating two lists and iterating over them. And this is the new concept you need to remember: how to do the casting, connect, and disconnect after getting the response. That's it, and you are good to go. It's a very famous interview question; I'm telling you, definitely, 100%, they will ask you. First of all, how will you check the total number of links available? By tag name. But how will you check that the links are not broken? That's a tricky question. You should know that, and it will give a very good impression.

So guys, that's it. I'll upload this code to my Git repository. If you go to my Git repository, I'll commit it over here under the Selenium Java code, where you can find all the Selenium session Java classes. You can find the repository URL in the description of this video. And please, guys, subscribe to this channel if you haven't subscribed, and share it with your colleagues and friends. I have already uploaded some interesting videos on WebDriver architecture, Javadoc and a number of other things. Have a look, and let me know if you have any issues. Thank you so much, guys.
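To hand developers a list of broken URLs, as suggested in the video, the results can be collected instead of only printed. A minimal sketch, assuming the status code of each URL has already been obtained (for example with HttpURLConnection.getResponseCode()); BrokenLinkReport and brokenLinks are hypothetical names, the sample URLs are made up, and treating every status of 400 or above as broken is an assumption rather than a rule stated in the video.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BrokenLinkReport {

    // Returns "url --> status" lines for every entry whose status is 400 or higher.
    static List<String> brokenLinks(Map<String, Integer> statusByUrl) {
        List<String> broken = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : statusByUrl.entrySet()) {
            if (entry.getValue() >= 400) {
                broken.add(entry.getKey() + " --> " + entry.getValue());
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        // Hypothetical results; LinkedHashMap keeps the order in which links were checked.
        Map<String, Integer> results = new LinkedHashMap<>();
        results.put("https://example.com/", 200);
        results.put("https://example.com/missing.png", 404);
        results.put("https://example.com/report", 500);
        for (String line : brokenLinks(results)) {
            System.out.println(line);
        }
    }
}
```

The returned lines can be pasted straight into a bug report, one URL per line.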
Info
Channel: Naveen AutomationLabs
Views: 71,955
Keywords: Selenium, How to find broken links in Selenium Webdriver., How to Find Broken Links on Your Website, how to find broken images using selenium webdriver, how to verify link in selenium webdriver, how to find broken images in a page using selenium, how to fetch all links and click those links one by one using selenium webdriver, finding broken links using selenium, how to get all links from a web page using selenium webdriver, broken links learn automation
Id: f_8yUC52g34
Length: 45min 35sec (2735 seconds)
Published: Tue Oct 03 2017