Excel VBA Introduction Part 57.6 - Working with Multiple Tabs in Selenium

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
welcome to this weizel excel vba tutorial in this video we're going to explain how to work with multiple tabs in google chrome using selenium we'll begin by explaining how to open up a single new tab using a bit of javascript code and how to navigate to a different url in that tab we'll explain how you can switch between the open tabs in your chrome instance how to close tabs down and how to avoid stale element reference errors we'll show you how you can loop through the entire collection of tabs you have open and then a couple of practical examples to finish off the video which both involve opening up one tab for each item in a list and we'll show you an example using a wikipedia list and one for yahoo finance as well so quite a lot to do in this one let's get started to get started i'm going to assume that you already know the basics of working with selenium and vba and if not you can use this playlist and in particular this video to help you get up and running the first basic example we're going to write is going to navigate to the wikipedia main page in a new instance of chrome so i'm just going to copy the url to the clipboard while i'm here and then i've got a brand new blank excel workbook saved as a macro enabled file and in the visual basic editor we can insert a new module into the project and then create a subroutine called something like using tabs or using multiple tabs something along those lines i'm then going to set a reference to the selenium type library by going to the tools menu choosing references and then scroll down the list to find the selenium type library and then simply place a check in the box next to that entry having done that i can click ok then i'm going to declare a module level variable to hold a reference to a chrome driver so i'm going to say dim cd as selenium.chromedriver i'm making some module level variables so that it doesn't disappear at the end of the subroutine so we can see what's going on inside the subroutine i'm going to create a new instance of the chrome driver by saying set cd equals new selenium dot chrome driver and then we can start the driver by saying cd.start and then navigate to that url by saying cd.get and then in some double quotes paste in the url i've just copied having done that i can just give the quick subroutine a test by running it by pressing f5 and there we go our new instance of chrome with the wikipedia main page displayed next i'd like to open up a new tab in the same browser and we can do that by executing a little bit of javascript with the execute script method of the chrome driver to do that we can head back to the visual basic editor and we can say cd.executescript and then in some double quotes the bit of javascript we want to run in this case it's window.open followed by an open and closed round bracket if we then run the subroutine again we will see the next instance of chrome that opens up has wikipedia on the first tab and then an empty blank page on the second now of course it would be helpful if we could say which web page we wanted to navigate to when we open the window so let's just close down that instance of chrome and then inside the round brackets of the open method we can pass in the string representing the next url we want to go to i'm just going to quickly copy and paste the url for the ysl website so i'm going to go to a separate instance of chrome that i already had open and then inside the open method round brackets i can paste in in a set of single quotes the url to wiseal once i've done that i can run the subroutine again and this time the new instance of chrome that opens has wikipedia on one page and the ysl homepage on another a little later on in the video we're going to want to open up multiple different tabs each navigating to a different url and at the moment in order to do that we'd need to concatenate each url into the script we're executing that's not a huge amount of effort concatenating strings is pretty easy to do but we can make life a little more convenient for ourselves by parameterizing the script that we're executing to do that i'm first of all going to cut the url for the wizal homepage from the script and then get rid of the single quote characters what i'm then going to do is say arguments and then in some square brackets 0 and then i'm going to use the second parameter of the execute script method to specify which argument or what value i'm going to pass into that argument so in some double quotes i'm going to pass in the url i've just cut and then if i execute the code again we'll end up with exactly the same end result two tabs one with wikipedia and one with the wisel homepage but later on we have a slightly easier task to pass in a different url because it will just go into the second parameter of the execute script method once you've navigated to a page in a new tab it's fairly likely you'll want to do something with it say extract some information from it or interact with one of its elements let's try to do that with the ysl homepage let's try to type in a search phrase in the input box called search here i just need to be able to identify that element so in an instance of chrome i'm going to right click on the input box and choose to inspect it so i've got a few different ways i could reference this item i could use its tag name which is input i could use this class which is search or i could use this name attribute i'm going to go for the name attribute this time the what value so i'm going to go back to the visual basic editor and then i'm going to try to type in a new line of code cd dot find element by css and then in some round brackets and some double quotes and some square brackets i'm going to use the name attribute equals what close the square brackets and the double quotes and the round brackets and then attempt to send keys to that input box let's search for vba on the wiseal homepage having done that let's run the subroutine and we'll see the two pages open up again but things seem to be taking a little longer than they should and i don't see the words vba appear in the input box and when i get back to the visual basic editor i'll find a runtime error saying that element's not found i'm pretty confident that my css selector is correct the problem here is that although we've opened up a new window and that's the visible tab or the visible window in our chrome driver the chrome driver still believes that the wikipedia page is the active tab or the active window so even though you can open up a new tab you also then subsequently have to switch to it so currently our code is searching for an input whose name attribute is what on the wikipedia page if we just end the subroutine and then add a line which says cd dot switch to next window before we attempt to type something into it the next time we run the subroutine we should find that our text does indeed appear in the input box on the wiseal page we encounter a similar problem when we try to close down the active window and reference something on the remaining window let's just add some code to close down the wisel page and then try to type in the phrase vba in the wikipedia input box again in order to do that i need to be able to identify that element so while i've got wikipedia open here i'm going to right click on the input box and choose to inspect it so again i could that hasn't quite worked let's try inspect again there we go so it i could once again use the input tag name but it's also got a name attribute this time called search so let's use the name attribute again i'm going to close down that instance of chrome and then i'm going to add in the code that will close down the active window in the chrome driver so that's cd.window.close and then i'm just going to copy and paste the line which sends the keys vba to the input box and then change the name attribute this time to look for the attribute or the input box whose name attribute equals search so once again if i run this subroutine this isn't going to work we will see the ysl window open up and then close down but we can't see the phrase vba typed into the input box in wikipedia and that's because although we've closed down the window the chrome driver still thinks that is the active window so we get a no such window error so even though we've closed the window down we still need to switch back to the previous window before we type anything into it or interact with it in any way to do that we can add the phrase cd dot switch to previous window and then if we run the subroutine again we should see we open up a wiseal that closes down then we switch back to wikipedia and type in vba and all that's a little bit pointless obviously why on earth would you run a subroutine to do that of course that's not the point of what we're trying to show you here this is the absolute basics of how you open and close windows and then switch between them although you can only have one tab active at a time you can still reference the other windows you have open and there's a couple of different ways you can do that just to demonstrate the basics i'm just going to comment out all the code below we where we open up the extra tab and then i'm going to add a couple of extra lines to print out some information about the windows we have open so i'm going to say debug.print and then i'm going to reference the cd.windows collection and then refer to the first window using its index number of 1 in the round brackets and then write out its title i'm just going to copy and paste that line and then refer to windows 2 and then when we run the subroutine fairly unsurprisingly we'll end up with the titles of the two tabs written out into the immediate window now if you didn't want to do that individually one by one you can also loop through the collection of windows in a couple of different ways we could count through the collection using a for loop with a counter variable but you can also do it using a for each loop so i'm just going to declare a new variable at the top of the subroutine i'm going to say dim w as selenium.window and then rather than just writing out the value of the windows individually i'm going to comment out those two lines and replace that with a loop which says for each w in cd.windows and then i can say next w and inside that loop i can say debug dot print w dot title okay so i'll just clear the contents of the immediate window and then run that subroutine one more time and we'll see once both windows have opened up we'll get the same list of titles but this time generated by looping through the collection now let's start building towards a practical example of why you might need to consider using multiple tabs in the first place for this next example i'd like to be able to loop through the list of news items on the wikipedia main page follow the first hyperlink that i find in each of these bullet points and then return some information from each page that i land on so i first of all need to be able to identify the list items i'm just going to right click on any one of those and choose inspect and i can find there's a list of list item or li elements stored in a ul element and that's stored in a div element with a unique id called mp-itn unique ids are always a good thing so i'm just going to double click on that unique id and then copy that to the clipboard then head back to the visual basic editor and start modifying this code i'm just going to comment out the for each loop for the time being and also the code that opens up a new window as well we won't need that just for the time being i'm then going to declare a new variable dim let's call this news items as dot web elements so i'm going to try to return all the list item elements in that unordered list to do that after i've navigated to that page i'm going to say set news items equals cd dot find elements by css and then in some round brackets and some double quotes to use css selectors to find a unique id you precede that with a hash symbol and then i'm just going to paste in mp-idn and then i want to return all the unordered lists at the next level down in the hierarchy from that element so i can do that by typing in a closed angle bracket symbol and then say ul then i want to find all the li elements at the next level down in the hierarchy from that element and then so i can type in a closed angle bracket and then type in li i can then close the double quotes and close around brackets and just for a quick send check i'd like to debug.print the number of elements that i've found so i'm going to say debug.print and then say newsitems dot count so having done that we can run the subroutine again and we should find that there are one two three four five six elements listed on that page at the moment and i get six printed out into the immediate window next i'd like to loop through the collection of news items retrieve the href for the first anchor element in each item and then just write that into the immediate window so i'll declare another couple of variables to help with that i'll say dim news item as selenium dot web element and then also a simple string dim first link as string after i've printed out the news items count i'm then going to say for each news item in news items and then i'm going to try to capture the first href for the href of the first anchor tag in the first link variable so i'm going to say first link equals and then i'm going to say news item dot find elements by css and this is a nice easy css selector this time it's just the anchor elements of the letter a in some double quotes and then i want to refer to the first one in that collection so of course there might be more than one anchor element in each bullet point so in some round brackets i'm going to refer to element number one i can then refer to an attribute of that element by saying attribute and in some round brackets and some double quotes say href to return the url once i've done that i'm just going to say debug.print first link and then say next news item and then i'll clear the content of the immediate window not that there's that much in it yet but if i run the subroutine again we should find once it's finished we've got a list of urls for the first anchor in each of those bullet pointed lists so one two three four five and six next i'd like to try to follow that link in the same tab that the main page is opening and then extract some information from the page that i found before going back to the main page and moving on to the next news item so it's very much like manually clicking through the links so if i find the first link the first hyperlink in the first bullet point horse racing i can click on that and that takes me to the page about horse racing and all i'd like to do is try to capture the text in the main page title there so if i right click on that element and choose to inspect it it shows me that's an element an h1 element with a unique id called first heading so i'm just going to double click to copy first heading to the clipboard and then what i'd want to do is go back to the main page then go on to the next news item in the list and find the first link in there and follow that to a separate page and so on and so on and so on until i've processed the entire list i'm going to try to do all of that in one tab first of all so just in the same wikipedia tab that we open up so i'm just going to change the debug.print statement for first link to say cd.get to follow that link in the same tab then i'm going to want to debug.print the value of the first heading so i'm going to say debug.print cd dot find element by css and then in some round brackets and some double quotes it's a unique id so i'll precede that with a hash symbol and then paste in first heading close the double quotes and close the round brackets and then refer to the text property of that element and then i'll just want to go back to the original page before moving on to the next news item so i can say cd dot go back okay so having done all that i'll just clear the content of the immediate window and then run the subroutine and we'll see that it starts to process the list you might have just seen it flash up with the horse racing page and then gone back to the main page but then it appears to have stopped and the reason it stopped if we go back to the visual basic editor is with a runtime error referring to a stale element reference a stale element reference error indicates that the element we're trying to refer to no longer exists that could be for a couple of reasons it could be because the element was deleted or it could mean because the document object model was refreshed the element we're trying to reference is no longer attached to the page and in this case it's the latter of those two situations if i click the debug button it's trying to reference the news item element now because we went to a different page the document check model changed so that variable now no longer refers to an element which is attached to the page so a simple way to avoid this and the whole point of teaching you how to open multiple tabs is to open up each link in a separate tab keeping the main wikipedia page open in its original tab then we don't run into this issue now we have a couple of different ways we could structure this code to solve the problem for the first example let's loop through the links and open up each link in a new tab first before we then loop through the collection of open windows let's just restructure our code a little bit i'm going to get rid of the cd.get part and i'm going to replace that with cd.executescript and i'm going to use my parameterized script without copying the specific url to the wisel page i'm going to paste that just in front of the first link variable so that will open up a new tab using the value in the first link variable to work out which url to navigate to i don't then need to go back to a previous page so i'm just going to delete that part entirely then scrolling back down to where i've got my for each loop that loops through the windows i'm just going to uncomment that loop at this point and i've already got the code that prints out the window title i'm just going to leave that in there for the time being before i can search for an element on the page or on the tab we know that we need to activate that window i don't want to do that by using switch to next window so what i'm going to do is use a method of the window object i'm going to say w dot activate once i've done that i can search for or debug.print the first heading text so at this point i'm just going to cut that line from the previous loop and then i'm just going to close down that window before we move on to the next one so i'm going to say w dot close now this of course isn't necessarily the most efficient way to do this to open up all the tabs first before then looping through and closing them but it's a neat way to demonstrate how you can manipulate multiple windows in one routine i'm just i think at this point going to delete all of the commented out code that i've got here as well just to tidy things up a little bit get rid of a couple of blank lines and then i don't think i need to print out the number of news items any longer either so i'm just going to comment that line out as well or sorry delete that line entirely and there's our entire subroutine at this point okay so let's give this one a quick test i'll just clear the contents of the immediate window first and then we can run the subroutine and we should see that we get all of the tabs opened up first and then each one closes down i've included the original main tab in that as well which is why the chrome driver the chrome browser ends up closed as well at the end because of course the collection of windows includes the original one but we should be able to see in here that we've got a full list of all the windows and the title for each window printed out into the immediate window if we wanted to make sure that the main window stays open we've got a couple of different options for how we can do that what i'm going to do is declare a variable to hold a reference to the main window so at the top of the subroutine i'll say dim main window as selenium.window then before i begin looping through all of the windows with the for each loop i'm going to set main window to be equal to cd dot window so that's obviously referencing the currently active window which will be the main window that we started with then inside the for each loop i'm going to write a an if statement which says if not w is main window then we can perform all these instructions so debug.print the window title activate the window print out the value of the first heading text and then close the window i'll indent all those instructions and then add my end if statement to the end and if we run the subroutine this time we will end up with all the windows being opened up then the collection being looped over but we're left with the main window open at the end now opening up all of the windows at the same time might not be the best way to structure this code particularly when we have lots of windows or lots of links to follow so let's restructure our code again i'm going to just make some changes to the original for each loop that loops through the news items once i've executed the script to open up the next link i'd like to switch to that window so i'm going to say cd.switch to next window what i'm then going to do is debug.print the text of the first heading on that page so i'm just going to copy that instruction paste that in then i'm going to close down the current window so i'm going to say cd.window.close and then i'm going to switch back to the previous window which will take me back to the main window so i can say cd dot switch to previous window i can then get rid of the code which loops through the windows collection and the code to set the main window and i can also get rid of the variable which captures a reference to the main window i don't need that any longer either so a slightly different approach but potentially more efficient because we only have at most two windows open at once if i run the subroutine at this point we'll see each window being opened up and closed down in turn until we end up with once again the full list of values printed out into the immediate window so i think we've got the basics of working with multiple tabs or windows in selenium for the next example let's try to do something a little more practical than just writing out news headlines from wikipedia i'm gonna do an example based on yahoo finance now i know absolutely nothing about finance whatsoever so please don't ask me questions about how to get rich because i don't know i think if i didn't know how to do that i probably wouldn't be spending quite as much time making free videos on youtube anyway we're going to use i've picked virgin galactic holdings simply because i think it'd be quite cool to go to space maybe i'll go there on my next holiday um maybe if i get another 100 million subscribers or so maybe i'll be able to afford it anyway i'm just going to copy the url for this uh stock option what do you call it i have no idea that's how little i know about this so i've just copied the url anyway and i'm on the options tab and the reason i'm on the options tab here is because there's a drop-down list which allows me to select different dates and clicking on one of those dates loads the a new page refreshing the details in the two main tables on that page so these are the tables we're gonna copy from into a a single excel worksheet for each date that's available in this list okay so it's gonna involve switching or loading one page at a time copying the details from it coming back to the main page and then selecting the next item in the list to process that entire list so let's get started with the basics if i head back to the visual basic editor i think this warrants a new subroutine so let's give ourselves a couple of blank lines maybe down at the bottom of the module and create a new sub called uh no go to space if i could spell that correctly that would help so we're going to use a variable to hold the base url i'll explain why in just a moment but i'm going to start by saying dim base url as a string and then i'm going to say base url equals the string i've just copied from the instance of chrome i had open what i'm then going to do is say set cd equals and new selenium dot chrome driver and then say cd.start and then cd.get base url now if i execute this subroutine there's going to be one small problem if i open up yahoo finance before i can do anything at the page i've got to get rid of this stupid pop-up so what i'm going to do is just click the accept all button to find out what that's called i'm going to right click on it and choose to inspect it and i might need to do that again let's try that again right click and inspect so it's a button that type is submit it's got a name called agree so i could access that element using its name or i could just try to capture the a reference to the first button i think it's fairly safe here to capture a reference to the first button tag so let's add a line that says cd dot find element by css and then in some round brackets and double quotes refer to the button element and then just say dot click and having done that we'll give it one more test and make sure that when we load ei finance this time the button gets clicked and we land on the page we're interested in so it looks pretty good so far okay next let's try to return a list of the options from that drop-down list and once again we need to know some way of identifying those items so back on the yahoo finance page for version galactic i'm going to right click on that drop-down list and choose inspect so that gives me an item called a select element so that's the tag name at least if i expand the select element i see a bunch of option elements sitting in there so i think we can probably get away with a simple example of just returning a collection of all of the option elements so to make that work let's head back to the visual basic editor and i think a new variable will help with this let's say something like dim the date options as selenium dot web elements and then we'll we know that we want to loop through the collection so while we're here let's say dim date option as selenium dot web element okay having clicked the button then we can say set date options to be equal to cd dot find elements by css and then in some round brackets and some double quotes we can go with the option tag close the double quotes in the round brackets and then we could say something simple like for each try again for each date option in date options we could choose to debug.print something about them i'm going to try to print out the value attribute of each option so i'm going to say debug.print dateoption dot attribute and then in some round brackets and double quotes the value attribute then i can say next date option and let's give that one a quick test if we run the subroutine we'll hopefully end up with a reasonably long list quite a few options in that drop down list we'll get there eventually there we go the value of each option in that drop-down list now the reason the value attribute of each option is useful is because if i just navigate back to that open chrome browser and then i select one of these options from the drop down list at random we'll see that the url is constructed from the base url that we've we've put in with a query string parameter called date which is equal to the value of the option we've just selected so we can construct the next url to find based on the values we've got we've just got this extra question mark date equals part to include i'm just going to highlight and copy and paste that in a moment heading back to the visual basic editor let's declare a new variable let's call this one next url as a string and then inside this loop let's not just print out the value attribute let's say next url equals base url concatenated with the query string parameter date equals concatenated with the value of the date option and then if we debug.print next url and then clear the contents of the immediate window when we run the subroutine at this point we will see eventually perhaps i should have picked an option with fewer items in the drop-down list but we should now see we've got a long list of urls which we can now follow in separate tabs okay so the next little bit that we'd like to plug into this loop is basically the same structure as we use for the wikipedia example we want to follow the next url in a new tab switch to that window do something with that page close that window and switch back to the previous page before moving on to the next item so we can do that mostly by copying and pasting why reinvent the wheel here i'm just going to copy that chunk of code from the previous example obviously it will require a couple of changes but if i copy that and then just after the debug.print next url paste all of that in we can then change first link to say next url switch to the next window of course we don't want to debug.print the first heading text that won't exist on that wiki also on that yahoo finance page we want to do something else useful in there but then we'll close the active window and switch back to the previous one i'm just going to delete the debug.print statement at that point but that's the basic structure there set up so we're pretty much there with the main structure of this code all we need to do now is work out how to extract the information from each table on the page and that's using some fairly standard selenium techniques which you will have encountered if you've watched the earlier videos in this playlist but just to take you through that in case you haven't let's have a quick look at one of our open pages and if i right clicked on one of these cells in the table and choose to inspect it we'll see that that belongs to what it would do if it actually found the right element there we go that belongs to a td tag which is in a tr tag which of course must be in a table tag and there's a couple of tables on this page one for calls and one for puts and i can only assume that those are meaningful terms to people who know anything about finance feel free to teach me all about it in the in the comments below anyway so we've got a couple of tables we want to extract there could be more than just two tables so let's capture all the table elements loop through them and then write out each table onto a single list on one output worksheet so to add that code nothing too complicated let's declare a couple of extra variables i'm going to say dim tbls as selenium dot web elements and then dim t as selenium dot web element and then also dim ws as worksheet okay just before i begin looping i'm going to create the new worksheet by saying set ws equals this workbook dot worksheets dot add and then inside the loop after i've switched to the next window i'm going to say set tbls equals and then cd dot find elements by css in some round brackets and double quotes use the table tag i can then say for each t in tbls and then nice and easy i can say t dot as table dot to excel and i'd like to paste this data in the next available blank row at the end of my worksheet now bearing in mind i've just created this new worksheet so it won't have anything in it yet so a reasonably reliable way to refer to the next available empty cell is to start at the absolute bottom of column a so i'm going to say ws dot range a 1 48 576 there are many other ways to do this by the way if you don't want to have to remember that number now we've covered this in so many earlier videos so i'm not going to explain any other ways i'm just going to stick with this for now dot end excel up dot offset one comma zero and then simply move on to the next table by saying next t i'd like to do a little bit of tidying up at the end as well so after i finish processing all of the options i'd like to say ws.range a1 dot value equals now so we can see what date and time we've extracted this information at and then i'm going to say ws.range a1 dots current region dot entire column dot auto fit to make sure all the columns are the correct width now it won't be beautiful at the end there will be all sorts of tidying up we might want to do you might want each table on a separate sheet etc but those are all fairly standard excel techniques at this point so with all that code written let's just give the whole thing a quick test it might take a little while to run but we'll hopefully see each separate page being opened up one at a time we extract the tables from the page and then the page closes down before moving on to the next one so you can see it's taking a little while i suspect the adverts don't help on yahoo finance anyway um i'm just going to cut off at this point just so we can get to the end result feel free to fast forward to the next little section so there we are that's the entire thing finished and if we check back in excel we'll find our new worksheet with the date and time rendered into cell a1 and a lovely long list of what i am absolutely certain is meaningful data to someone who knows what they're looking at i'm sure you can have hours of fun picking through all that of course it could as i say use a little more tidying up like getting rid of these extra column headings in the list but that's all relatively straightforward stuff to do using basic excel vba techniques so i think we'll leave it there for this video hope you found that one useful i hope you got a bit more of an insight into why you want to work with multiple tabs in in selenium uh do feel free to ask questions in the comments below i'll do my best to keep up with the answers thanks very much for watching see you next time
Info
Channel: WiseOwlTutorials
Views: 3,116
Rating: 5 out of 5
Keywords: excel, vba, visual basic for applications, macros, macro, microsoft, selenium, chrome, google chrome, seleniumbasic, yahoo finance, scrape, scraping, web scraping, scrape website, scrape yahoo finance, wise owl, tutorial, free tutorial, free vba tutorial, free vba training
Id: _IlkdRwgIwg
Channel Id: undefined
Length: 36min 58sec (2218 seconds)
Published: Sun May 02 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.