Web Scraping in Power Automate for Desktop (Full Tutorial)

Web scraping in Power Automate for Desktop is very easy. Let me show you. I prepared a case for you today, but let me first show you how easy it is to do web scraping, and then we'll move on to the case.

Here I have Amazon. This is a very good example page because we have loads of data to extract; this could be any web-based system in your company. For example, I go to amazon.com, click this drop-down, pick Books, and then search for something. Here I will just search for Excel, and then I want to scrape these results out to, for example, an Excel sheet.

So I go to Power Automate for Desktop and create a new flow. I will call this Amazon web scrape, and then I click Create. Here I have a blank canvas; let me just maximize it. First I need a browser instance, so I'll find Launch new Chrome (or you can pick Firefox if you prefer) and drag it in. Usually I want to launch a new instance, but since I already opened up this Amazon Excel search, I'll just use that. So instead of Launch new instance, I'll click the drop-down and choose Attach to running instance. Here I'll search by title, and if you click this drop-down you can see my tabs. The tab I want to scrape is the amazon.com Excel one, so I pick that. This browser instance will get saved in a variable called browser; that is fine. I click Save.

Now I will do the extraction. I go up to Actions again and find Extract data from web page. Make sure you take the one under Web data extraction and not the one under Data extraction. I drag this one in. For Web browser instance, I pick the browser variable that we just created when we attached the tab to this browser instance. To open up the data scraping wizard, simply bring up your browser with the results, and now you can see that the Live web helper opens. Here I can select the elements. I can scrape in
the entire web page; I can scrape everything. But we want to scrape the structured data here in the results, and first that is the title. If I pick the title, I can see the red border around it, but I can also choose the h2, as you can see here. Sometimes you just need to figure out which one you want to use. I think we can go with span. This is very, very important, because we need to tell Power Automate for Desktop that we have structured data here. So I pick span, right-click, choose Extract element value, and take the text. Here you can see this green dotted border; that means we scrape this one. You can also see it in the Live web helper.

Now the important part comes, because I need to tell Power Automate for Desktop that this is a pattern. So I pick this second title, and here you can again choose between span and h2. Pick the same one as you did up there. I picked span, so I take span again, right-click, choose Extract element value, and take the text. There you go: now we extracted all the titles. Wasn't that smart?

So far so good. Now it's very, very easy. I want to take all the authors. Here you can see you can take the "by" span or this second span with the author, so I think I need to take the entire line; that will be the div here. I right-click, choose Extract element value again, and take the text. Now Power Automate for Desktop automatically says: well, you defined a pattern here with the titles, so we will just extract the author line automatically for you. We also get a date; we will see how we can delete that one later in this session.

Similarly, I also want this first price. I find the price here; you can see I can take the dollar sign, the dollars, or the cents. I want everything, so I'll just mark it all, right-click again, choose Extract element value, and take the text. Now we have that one as well. We will leave it here. I will click Finish, and here, if I
scroll a little bit down, I can store my data in either a variable or an Excel spreadsheet. For now we will use the Excel spreadsheet. This will produce an instance called Excel instance. Then I click Save. Now we can run it, and we will scrape the results from this web page. Here you go: you can see Excel open down here, and we have our results. That's how easy it is to scrape data in Power Automate for Desktop.

We have a lot of features and a lot of things that we can customize, which I want to show you now with this example. So let me close this one down again.

The first task is to look at the case. I download the example data: let me scroll a little bit down on the course material page, go in here, and download data.xlsx. That is where our data is. I'll place my data on the desktop so I can easily find it. Let's open the data and inspect it. This is an Excel book with two sheets in it: the AndersJensen.org sheet and the Topics sheet. I want to read the Topics sheet. Here you can see we have a table with one column and several rows. These are the topics to scrape: Excel, SQL, VBA, Java and Python. For each one of these topics I'll go to Amazon, and then instead of Excel I'll search for, say, SQL, and I will scrape the results for SQL. I will have all these results, and later on, as it gets more advanced, I also want results from the next pages.

So let's get started. We will divide this robot into several parts. We will have one part that will initialize the robot, then one part that will read these topics and store them so we can easily iterate through them, do the Amazon searches, and finally write it all back to Excel. Don't worry, I'll hold your hand step by step; just do the operations with me. So here I have the topics to scrape. I will read this column and store it in a data table so I can work with it. Let's close down the Excel sheet. Let me just maximize the
browser again, go to my robot, and also maximize this. Let's delete these two steps; we will add similar steps later in the robot, but let's build it from scratch. So I mark both steps and press Delete.

We will create an input variable for our working folder. Since I have my file here on my desktop, I Shift + right-click it, go down to Copy as path, and click it. Now back in Power Automate for Desktop, I go over to Input/output variables, click the plus sign, and click Input. I will call this file path. Then in Default value I'll paste in the value that I just copied. But I'll only use the folder, so I'll delete the file name at the end and the quotation mark at the beginning, since we don't need that in Power Automate for Desktop. I'll just copy file path as the external name as well, and then I click Create. This is great when we want to reuse the path, or when it changes, say when I move this folder to a new computer.

Now let's build the robot. I'll do a little bit of commenting, because this will be a slightly longer robot, so I get a good overview. You could also create subflows, which is a more advanced topic, but I recommend sticking with comments for now. So here I drag in a Comment and I'll say Initialize started. Here I'll have my initialization steps. Similarly, I will just copy this, paste it in, and change started to ended. So all my initialization steps will go in between.

First we will launch the Excel book that we want to store the results in. So I find Launch Excel and drag it in here. I will just launch it with a blank document; that's fine. Our instance will not be visible. Here we can see that the variable produced is called Excel instance. Because we will have two Excel books, one for the topics that we will read and another Excel book/instance that we'll write the results back to, we will do proper naming so we can easily differentiate which one is which. So
here I'll say Excel instance result, and then I click Save. Nothing much happens here; we just open up an Excel instance for the results.

Now let's read the topics. Similarly, I'll add a new comment for this. I drag a Comment in and say Reading topics started. Again, I'm lazy: I copy it, paste it back in, open it, and just say ended. So now we can read our topics. How do we do that? We launch another Excel instance. I'll find Launch Excel and drag it in. Now we don't want to open a blank document; we want to read the document that we just placed on our desktop. So I click this drop-down and choose to open the following document. Since we already have most of the document path in our file path variable, I'll use that: click the {x}, find file path, and double-click it. Now we just need to fill in the name of our Excel book, so I add \data.xlsx. You should do the same if your file has the same name, that is, if you haven't renamed it after you downloaded it. Our instance is not visible; we don't want to see what's going on while the robot works. It might be fun in the beginning, but then it gets a little bit boring. We will open it as read-only, because we will not write anything back to the topics. Then I can click Save.

Now we want to add a little bit of robustness to our robot. Let me minimize this and open up our data, because the robot will read the active sheet. It's fine when the Topics sheet is active, but if the other sheet is active, we have a problem: the robot will open that one, try to read the topics from there, not find the column, and fail. So we will make sure that the robot reads the Topics sheet every time. Of course it works if we manually close the book with the Topics sheet active, but say that we forget it, or our colleague has this Excel book open and accidentally closes it
down on the other sheet. So we just specify that the active worksheet should be Topics. Let me show you. I go back to Power Automate for Desktop and find Set active Excel worksheet, then drag it in after the Launch Excel. For the Excel instance, we want this Excel instance. We haven't renamed it yet; that is fine, I will show you how we can rename two occurrences at the same time. So I'll just pick Excel instance. Here I'll activate the worksheet by name, and I'll paste in the name of it, which is Topics. I click Save.

Now we have an Excel book with the name Excel instance, but we want to follow the same naming convention as up here. We could change it in this action and in this action, but we can also go over here to the variable and either press F2 or right-click and choose Rename, and then just start writing. I will say Excel instance topics. There you go: it automatically gets updated in both actions.

When we read Excel sheets in Power Automate for Desktop, we need a range definition. Let me show you. We need Power Automate for Desktop to understand that we have Excel data in this range, and we do that with the Get first free column/row action. That one will find the first free column, which is this one here, and then we can just subtract one from it and we end up over here. Of course we only have one column here, but the robot cannot know that. Similarly, we don't know how many rows we have; we can easily add more topics. So we'll find the first free row, which is seven, and then subtract one from it.

Let me show you. I go back to Power Automate for Desktop and search for get first free. That went a little bit fast: make sure you take Get first free column/row and not this one up here. I repeat, make sure you take this one. They look quite similar, but they don't do exactly the same thing. Drag it in underneath Set active Excel worksheet. Here I need to pick an Excel instance, and now it's great that we can easily differentiate between these
two. I can just pick the topics one. This action produces two variables; those are the variables that I talked about before. I'll rename them to first free column topics and first free row topics. Then I can click Save.

Let me show you. Let's try to run the robot. We only open up the two Excel instances, then we set the active worksheet and get the first free column and row. Here you can see that we have column number two; that is the first free one. Similarly, number seven is the first free row.

Now we will read from the Excel worksheet; we need to read the topics. I find Read from Excel worksheet and drag it in. Here I'll pick my Excel instance again, which is Excel instance topics. Under Retrieve, I could take the value of a single cell, but I'll take the values from a range of cells, that is, all the topics. The start column we know is column A. You can also write 1; that's up to you. I prefer A since that is the notation we use in Excel, so I don't confuse myself. The start row is 1. Now I will take advantage of the two variables that I produced. For the end column, I click the {x}, double-click first free column topics, and then go in and add minus one. So we found the first free column and subtract one from it; that is the range of our data. Similarly with the end row: click the {x}, take first free row topics, and again say minus one.

Then we scroll a little bit down and go into Advanced. I always recommend going into Advanced. Here you can see the option First line of range contains column names. We have column names, that was the Topics to scrape header at the top, so I tick this one. Our produced variable is called Excel data. Again, let's rename it so we know what's going on.
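The range logic in this step (end column = first free column minus one, end row = first free row minus one, first line treated as column names) can be sketched in Python. This is only an illustration of the arithmetic, not how Power Automate for Desktop works internally; the in-memory `sheet` list and the helper names `first_free_column_and_row` and `read_range` are hypothetical stand-ins for the Topics sheet and the two Excel actions.

```python
# Hypothetical in-memory stand-in for the Topics sheet shown in the video.
sheet = [
    ["Topics to scrape"],  # row 1: header
    ["Excel"],
    ["SQL"],
    ["VBA"],
    ["Java"],
    ["Python"],
]


def first_free_column_and_row(rows):
    """Return the 1-based first free column and first free row of the sheet."""
    used_rows = len(rows)
    used_cols = max((len(r) for r in rows), default=0)
    return used_cols + 1, used_rows + 1


def read_range(rows, start_col, start_row, end_col, end_row, first_line_is_header=True):
    """Read an inclusive, 1-based cell range; optionally use the first line as column names."""
    block = [r[start_col - 1:end_col] for r in rows[start_row - 1:end_row]]
    if first_line_is_header and block:
        header, data = block[0], block[1:]
        return [dict(zip(header, r)) for r in data]
    return block


first_free_col, first_free_row = first_free_column_and_row(sheet)

# Column A (1), row 1, up to the last used column and row.
topics = read_range(sheet, 1, 1, first_free_col - 1, first_free_row - 1)

for row in topics:
    print(row["Topics to scrape"])
```

Because the end of the range is computed rather than typed in, adding more topics to the sheet needs no change to the read step.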
This one will be Excel data topics. Then I can click Save. We can close this Excel instance down again; we will not use it anymore, since we have our data in Excel data topics. So I'll find Close Excel and drag it in underneath. The Excel instance to close is Excel instance topics. We will not be saving it, and really we can't, because we opened it as read-only up here. So that is fine. I click Save.

Let's try to run our automation. But before we do that, let's save it, so that if something happens we still have our work. Then we can click Run. Again, we just read the Excel data; that's all we do here, and then we come to the data scraping. If I go up here to Excel data topics and double-click it, you can see that we have the data that we want to search for: Excel, SQL, VBA, Java, Python, and this is the header. If we want to refer to one of these data rows in the data table, the first one will be zero, since it's zero-indexed, and then the second one will be one, then two, three, four. That's how it is in programming; you just have to get used to it. Then I can click Close.

Now we are ready to do the actual scraping. The first thing we will do is open up a browser instance and go to amazon.com, that is, this address up here, because we want the robot to navigate to this page and do the search. Let me minimize this again. Up here in the initialization, I will add a Launch new Chrome, just as we had before. I will launch a new instance; we will not attach to a running instance anymore. Then I'll paste in my URL, that is amazon.com, and we will produce a browser variable. This will be the only browser we have, so it's fine to keep browser as the name. Then I can click Save.

Now we move to the end, because now the scraping will start. Let's do some commenting so we can easily get
back to this workflow whenever we want. I drag in a Comment and say Scraping started, and I'll drag another one in (I could also copy it like we did before) and say Scraping ended.

The first thing we will do is find out which topic we want to search for. We have opened up Amazon; then we want to navigate to Books and do a search. We actually know what we're going to search for, because we stored that in Excel data topics up here. You can see we want to search for Excel first, and then SQL, VBA, Java and Python. So we find a For each, so we can iterate through this data table and do the searches. I'll drag in the For each. The value to iterate is Excel data topics, so I click the {x} and take Excel data topics. Every time we move over the data table, that is, as we iterate through it, the current row will be stored in current item. That is just a name for reference, and we can change it. I want to change it to topic. We can actually also delete the percentage signs; Power Automate for Desktop will automatically add them, because this is a variable field. Then I can click Save. So now we're iterating through each one of the Excel topics, and we call the current one topic.

First we have navigated to amazon.com; we're here. Then we want to go to the Books section, which is here. Let me just go back to All Departments again. We want to do the search in Books only, so we find Set drop-down list value on web page. I'll search for set drop-down list, and make sure you take the one under Web form filling. Then we drag it into the For each. The browser instance is browser; fine. Now we need a UI element for this drop-down. So click this drop-down, choose Add UI element, and now we can pick it with the UI picker. Let me move this. This
one is a little bit bigger, but it is this one here. So I press Ctrl and click with the mouse. Now I have my drop-down. I'll not clear all options; I'll take Select options by name, and then I'll say Books, since I want to look in the Books section. Then I can click Save. Let's just give our UI element a proper name. If I go over to UI elements, you can see we have one called select All Departments. I prefer to have my naming clean, so I right-click, choose Rename, and say Department drop-down. This is just for naming purposes, so it's easy to see what this UI element is; nothing else.

Then I can go back to my robot. Now that I have selected Books, I can do the search for the topic. I find Populate text field on web page and drag it in. In the browser, I am in Books, and now I want to write something in this search field. Back in Power Automate for Desktop, we will create a UI element for the text field. So click the drop-down, choose Add UI element, and here you can see it says Input text. Ctrl + click with the mouse; there you go, we have a UI element.

We will now define what we want in the search field. We just want the topic, so we're referring to the current data row: I click the {x} and take topic. Since there could be more columns in this data row, I want to make sure that we pick the Topics to scrape column. So I go in here and add square brackets with single quotation marks, and then I type in the header of my Excel table. We only have one column here, but as I said, we could have more. So I'll say Topics to scrape. This means that we look in the topic row that we're iterating through (that would be Excel, SQL, and so on) and say: give me whatever is in the cell of the Topics to scrape column. Then we populate that into the text field. I click Save.

Finally, when we have written something in here, for example Excel, I want to click this search button over here. So
similarly, we add a new UI element for that in a Click link on web page action. I drag it down here. We pick the browser, click the drop-down, and, you know the drill, choose Add UI element, find the search button, and Ctrl + click it. We will just do a left click; that is fine. I click Save. Now let's rename these two. I'll press F2, and here I'll say Input field. The other one is the search button, so again F2 and I'll say Search button. I do this only because I want to easily differentiate them. I can also see the picture down here, but with names it's easier to maintain in the future.

What do you think about the quality of this lesson? Please post it in the comments below; that will help me a lot. Thank you.

So now we're just searching for the topics; the actual scraping has not begun yet. But let's see if this works. I'll do the searches not in this browser window but in a new instance. Let me go back to Power Automate for Desktop and start the automation. Let's see if it works; we can easily debug now. We open up Amazon, we set the drop-down to Books, we type Excel, we click search, and these are the results. Then Books, SQL. It looks like it worked. Now we can iterate through each one of the topics and get to the data that we want to scrape. This is very important in web scraping, because we want to make sure that we are on the right page.

There is one other thing we can do to make it more robust. As you saw, we just did the searches one after another: we cleared this field and then typed in the next topic. Usually what we want to do is navigate to amazon.com for each search and select Books and so on each time. Then we know that each time we do something, we start from the same page, and nothing or little can go wrong. We saw it's not a problem here, but it can be. So find Go to web page and drag it inside the For each. For each topic we will go to a URL, so I'll say https://amazon.com. This is just to make it more robust. Similarly, when we automate in
applications, we also want to move to the start page each time, or at least to the same page. This is to make sure that we start at the same place for each transaction, which is a topic in this case. So I click Save.

Before we scrape, for each one of these topics we will add a new worksheet to our Excel instance result up here. That is, we want a separate Excel sheet for each one of the scrapings: Excel, SQL and all that. So I'll find Add new worksheet and drag it in at the start of the For each. The instance we want to target is the result one, and the worksheet will have the same name as the topic. So I'll click the {x} and pick topic. Do I want to add it as first or last? I want it as last, and then I click Save. So now I have a new active worksheet, named after the topic, that I want to store my results in.

Now let's do the actual scraping that we saw before. Again I'll find Extract data from web page, the one under Web data extraction. Now that we have done the search, we can do the extraction. To make this work, as you remember, we need to bring up a result page, and here I don't have the results, so let me do the search manually. If your robot has run, you can just use that window. You need to have the results, and you need to be in the Books department. Now, just as we did in the beginning, I'll take the span of this title, right-click, choose Extract element value, and take the text. That took the first title. Similarly, down here I'll right-click, choose Extract element value, and take the next one. So now I have all the titles. Then I need the authors. Again I'll take this div. I know I get the "by", then Michael Alexander, and then the date with it as well; I'll deal with that later. So I right-click, choose Extract element value, and take the text. Now we have the author as well. Then we just have these prices here;
we'll take this one and have the prices. So now we have easily extracted all the data that we wanted, and we can click Finish. We're scraping from the browser instance, and we have a timeout of 60; that is fine. For the variable produced, let's rename it to Amazon results, and then I click Save.

So now we have extracted the results. For each one of these extractions we will write the data back to our Excel instance result, where we created a new sheet with the right topic name. To do so, I'll find Write to Excel worksheet and drag it down here. The Excel instance is Excel instance result. What value do we want to write? I want to write the Amazon results, so I click the {x} and pick Amazon results. Where do I want to write it? On the specified cell, and that is the first cell: column A and row 1, the beginning of the sheet. Then I click Save.

Now I'll close the Excel book and make sure that I save it under the right name. So at the end I find Close Excel and drag it down after the Scraping ended comment. I will close down Excel instance result, and I want to save it, so I choose Save document as. I have the file path up here, so I'll choose file path, the folder for my data. Then I'll add a backslash and say Results.xlsx. Then I can click Save.

Now let's try to run the automation. I can close down this Amazon window so we don't have a lot of things going on. Then I run the automation. It will take a little bit longer now, because we do five searches and scrape each one of the results into our Excel result book. We're doing the searches; we can't really see that we're scraping the results, but we are, trust me. I'll show you in a few seconds. That's it. Let's go to our desktop and open up the results. Here you can see we
now have five nice scrapings. That's how easy it is to do multiple scrapings. But we need something more. We need headers on each one of these, so we have a title, an author and a price. Then we want to scrape the first three pages; this is only the first page. And finally, you can see here we have this "by Eric Matthews", then this part here, and then "May the third"; we only want the author. Let's fix these things one by one.

The first thing I want to do is add headers. We could have done it in the data scraping wizard, but since this will be a little bit more advanced, because we will have three pages, we will do it with a variable. So I go back to my Amazon web scrape flow, and right after I add the new worksheet, I'll find Set variable and drag it in. This also shows you how easy it is to add a data row to a data table, which is very important. Up here I'll give it a name; I'll say headers result. Then the value. To add a data row, we start with the percentage sign. I can put some spaces in between; nothing happens with the spaces, it's just easier to read. Then I have the curly brackets, and inside them the square brackets, at the start and at the end. So now I can write my data between the square brackets. The first one was the title, so in single quotation marks I say Title, then a comma. Similarly, I say Author in single quotation marks, then another comma, and then Price. Then we can click Save.

So now we have a variable with the headers. Let's add it to our Excel sheet. I find Write to Excel worksheet again and drag it in after the Set variable. The Excel instance that we want to target is still Excel instance result, so I'll pick that one. Since we just added this topic sheet, it is the active one; that is fine. We will add headers to each one of the sheets. So for Value to write, I pick headers result; that one is
this one here. Then I will write it in column A, row 1, and click Save. So now we have headers, but we don't want to overwrite them further down. So before we write the results, we find the first free row and column in Excel instance result, so we know where to write. Let me show you. I'll find Get first free column/row and drag it in underneath. This is again for Excel instance result, and I'll rename the variables to first free column results and first free row results. Again, we do this because we want to easily see what's going on.

Now, here we are writing the Amazon results back. Since we already have something in the first row, we could of course just write the number 2 here, but we will use this Get first free column/row again in a little while, so that's why we use it. So I go in here, and instead of row 1, I delete that, and now we simply write the scraped results at the first free row. I click the {x} and find first free row results. So now we write at the first free row. We only have one row in the sheet so far, which is why I said we could have written 2 here, but we will reuse this approach in a little while; you'll see why it comes in very handy.

Now I will close this result sheet again and run my automation. We're doing the exact same thing as before; now we're just adding headers to our results. So we are scraping the results and writing them back, and we will see in a few seconds that we have added nice headers. SQL is one of my favorites, and also a good one to learn as an RPA developer. VBA is also great, because you can automate Excel. Java, well, we don't use that a lot as RPA developers, so I can't really recommend it, at least not for the books to read, though it's always nice to know a little bit of coding. Python is great when we want to do scripting. So let me go back here and inspect the results. Now we have Title, Author, Price, and we have them for each
one of these sheets. So now we fixed that problem.

Now I will scrape the first three pages. Let me close down the results. For each one of these searches I'll scrape the first page, then click the Next button, scrape the next page, and so forth, just for three pages. So I will have a loop that runs three times and extracts the results. If I go back to Power Automate for Desktop, let's talk about where we want the loop. We only want to do the search once; that is here, where we click the link. After the search we want this Extract data from web page, and then we want to write the results to Excel. After that we will have a click on the Next button, and then we go back up and extract the new results. And now our earlier work comes in handy, because we know where to write them; we actually already solved that. We store the new results in Amazon results, and then we know where to write them, that is, 18 rows below or something similar. We write them and click the Next button.

So let's solve it together. Find Loop. You want the ordinary Loop, not Loop condition. Drag in the Loop right after the Click link on web page. We want to start from 1 and end at 3, so that is one, two, three, and we increment by 1. Then I can click Save. Now we just need to drag the Extract data from web page into the loop, along with the Get first free column/row and the Write to Excel worksheet. And we will have a Click link on web page here, so I'll find Click link on web page and drag it in. Make sure you are inside the loop. Here I'll add a UI element for the Next button. So I go back here, click the drop-down, choose Add UI element, and find the Next button. I can pick this one or this one; I'll just take this one, so I Ctrl + click with the mouse. Now we have that link. Let us click Save. One thing that we will
do is to go to the UI elements and see if we have done it right. Here we have an anchor called Next. First I'll just rename it, so right-click, Rename, then say Next. We can also inspect the HTML behind it. So if I double-click here, you can see that this is the address of the UI element of the Next button. All this we don't want to change if it works, but if it doesn't work we'll go back here and fix it. So now I can just go outside. We have it here. Let me close down these results again and start our automation. So now we should get three pages with results in our Excel book. I run it again. So we launch, just as before, a new Chrome, we read the topics, we now do the searches. Now it will take a little bit longer because we will do the Excel one, then we will extract it, go to the next page, then extract that. You can see up in the address bar that we are on page 3 now, page 4. Then we will have the SQL one; we will also do the results from that one. [Music] Here VBA. So we will only do the three; that is because we click the link on the web page at the end, so we will not scrape the last one. So we will get the first three pages even though we navigate to the fourth. That's it. Let us go back and inspect the results. There you go. We now have 48 results from each one of the topics. Wasn't that easy? Now we could do a little bit of dynamic naming. That is, we don't want to call it results, because then the results will get overwritten from time to time. Let me close this one here. So I will go back to Power Automate for Desktop, and then in the initialization I will have a Get current date and time, because I want to use this in the naming. I know that the current date and time, if I go down to seconds or milliseconds, is quite unique, so it will not get overwritten. So I will use this one here, Get current date and time. I'll find it up here and put it in the initialization. Here I will get the current date and time and store it in current date time.
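The naming idea described here, putting the current date and time into the file name so that each run writes a new file, can be sketched outside Power Automate as well. The Python below is only an illustration and not the flow itself: the function name, the folder, and the `Results.xlsx` base name are assumptions, and the compact year-month-day-hour-minute-second layout is just one possible choice.

```python
from datetime import datetime
from pathlib import Path


def timestamped_name(folder, base="Results", now=None):
    """Build a file path whose name starts with a timestamp.

    `folder` and `base` are hypothetical; the timestamp goes right
    before the base name, as in the lesson.
    """
    if now is None:
        now = datetime.now()
    # Year, month, day, 24-hour clock, minutes, seconds, with no separators.
    stamp = now.strftime("%Y%m%d%H%M%S")
    return Path(folder) / f"{stamp}{base}.xlsx"


# Two runs started in different seconds get different file names.
print(timestamped_name("reports"))
```

Going down to seconds is enough as long as the robot is not started twice within the same second; adding `%f` (microseconds) to the format would tighten that further.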
Fine, I'll click Save. Then I will convert this to text, so I'll find a Convert datetime to text here and drag it in here. This is so I can use it in my file naming just below. So the datetime to convert, that is the current date time, so I click here and scroll a little bit up; I'll say current date time. The format to use: I will use a custom format. I can choose this one, the standard, but I want to customize it, so I'll say Custom here. I will say first I want the year, that is four y's. I will come back to this code in a few seconds. Then I want the month, that is two M's. I want to have the day, two d's. Then I want the hours, that is two h's, or if I want it in 24-hour format I will take two capital H's like this. Then I want the minutes and the seconds. I don't think my robot will run twice within the same second, so I will just leave it here, but you can add milliseconds if you want. These ones are very easy to understand if you know what's going on, and you can probably also see it just by looking at it. But if you want to know more, you can go to Google and search for .NET custom date time format like this, pick this one here, and scroll a little bit down. Because these ones here, like the two d's, that is the day of the month in this format, 01. If we had only written a single d, then it would look like this for the first nine dates of the month. Similarly, you can scroll down; here are the months and so forth. Play around with it a little bit. This is important; you'll use this page a few times in the beginning, at least until you start remembering these codes. It's very easy, don't worry. So now I have my formatted date time. I click Save here. Then I go down. So now I will use this formatted date time where I save the Excel document. So if I go in here, then right before the results I'll put in the date. So I'll click the X here, then I'll pick formatted date time, that one is here, double-click. Now you can see this variable is right before the results. Then I can click Save. Now let's try to run
the automation again. We will do the same thing; the only thing that we changed here is the current date and time, which we will use in the name of our results. I'll fast-forward this so we don't have to look at it again. We are done, and let me minimize this page here again. Now you can see that we have a new Excel book with all our results. Nothing has changed; it's just the naming, so we don't overwrite any results. We want a full report from each time the robot runs. Finally, we want to solve this. This is not very nice to look at, and we can solve it in different ways, but I want to show you how you can solve it in the data scraping wizard. This one is by Eric Matthews; we only want Eric Matthews here. So let me copy this one here first, then I'll go back to my browser. You should now go to regex101. Regex, that is a series of characters that define a search pattern. What we will use here is advanced, but it's not complicated, and I want to introduce you to it here at the end of this lesson, since this is a very important concept in RPA development. You will use this a lot. Just follow me. So go to regex101.com. Here's my test string, that is this Eric Matthews one. I want to define a pattern that only takes this Eric Matthews out, and I can write the pattern up here. The first thing I'll do is to look after the "by", and then I'll pick everything up. So what I'll do here is have a parenthesis, then a question mark, then this one here, then an equal sign, and a closing parenthesis. So now I haven't really defined what I want to look behind. You can also see a little guide here if you want to know what's going on. So if I go in here, I now want to say I want to look behind the "by", so after the equal sign I'll say "by". Now you can see that I look behind the "by", by this purple one here. Then I'll add a space. Right now I don't pick anything up; I just say where I want to start. So if I jump outside the parentheses, then I'll have a dot; that means that I can have any character. Now I'm picking up things. Then I'll say
a star; that means I'll pick everything up right after the "by". But I only want to pick up the author, so I want to stop here, and the way I can do this is similar to the first one: a parenthesis, then a question mark, now only an equal sign, and a closing parenthesis. Inside this, after the equal sign, I'll say I want this one. I'm not sure what it's called in English; English is not my first language. So I want this one here, and then to escape it, because this actually means something in regex, I will have a backslash like this. That's it. And I also want to remove the two spaces, so I go over here and add the two spaces. Now this is a regular expression. But let me just move a little bit ahead. This one will work here, as you can see, but if I jump to the second one, you can see that we now have two here. Let's see if it also works here. So if I paste it in here, it does; it takes the author here as well. And let me just show you where it doesn't work. If I go down here to Francois Chollet, copy this one here, paste it in here, we are not picking it up. That is because we only have one space after the author. We could trim it in Power Automate for Desktop, or we could make an or. So if I just put parentheses around my first expression here, then I'll have this one here, and now this actually means an or, so now I can write another expression. So I can say look here or here. And if I just mark everything here, copy it and paste it after the or sign, then I'll just remove one of these spaces, and now I will also pick up the second author. So I'll use this pattern for now. This is just an introduction to see how you can use regex in the extraction wizard. It's very advanced, but it will be very helpful that you just got introduced to it; now you know that it exists. We can also use it to extract emails; we can extract every string that we want with these patterns. It's not that complicated, and let me show you how we'll incorporate it. So you need to have an Amazon result page open like this, then I go back to my
Power Automate for Desktop. I open up the Extract data from web page here. Now the wizard will open in a few seconds when I open up the results. Here, if I go to the advanced settings, you can see the CSS selector for each one of the extractions. So the first one, that was the title, then we had the author with date and the CSS selector here. Here you can see something called Regex. Bingo. So if I go over here and just paste in my regex pattern and click OK, you can already see that we have the nice author down here. So if I click Finish here, I'll click Save, I can run my automation. Let me close down this one; we don't really need it. Then I can click Run. This regex pattern we could refine even more to make it 100 percent robust, but this was just an introduction so you can see how easy it is to work with regex here in Power Automate for Desktop, this time in the extraction wizard. There you go. Let's just inspect the results again. These ones are the latest ones. Here you have them, and we have nice authors. We have them without dates or "by" or anything; we have them nicely formatted. In this lesson you learned about advanced web scraping. We saw how we could do dynamic scraping, we saw how we can use regex, and we did some advanced Excel work. The next lesson is on the screen; you can just click it.
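The author pattern built on regex101 in this lesson can also be tried outside the extraction wizard. The sketch below uses Python's `re` module rather than the regex engine that Power Automate for Desktop uses, and the pattern is reconstructed from the spoken description (look behind "by ", take everything, stop before one or two spaces and a pipe), so treat it as an approximation. The sample strings, including the dates, and the `extract_author` helper are made up for illustration.

```python
import re

# First alternative: author followed by two spaces and a literal pipe.
# Second alternative: the same, but with only one space before the pipe.
AUTHOR_PATTERN = re.compile(r"((?<=by ).*(?=  \|))|((?<=by ).*(?= \|))")


def extract_author(raw):
    """Return the text between 'by ' and the pipe, or None if no match."""
    match = AUTHOR_PATTERN.search(raw)
    return match.group(0) if match else None


# Hypothetical cells as they might come out of the scrape.
samples = [
    "by Eric Matthews  | Jan 1, 2020",    # two spaces before the pipe
    "by Francois Chollet | Jan 1, 2020",  # one space before the pipe
]

for cell in samples:
    print(repr(extract_author(cell)))
```

A single pattern such as `(?<=by ).*?(?=\s+\|)` would cover both spacings without the "or"; that is a simplification of mine, not something shown in the video.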
Info
Channel: Anders Jensen
Views: 78,213
Keywords: web scraping, web scraping tutorial, anders jensen, anders jensen power automate, web scraping power automate, web scraping power automate desktop, scraping power automate desktop, how to web scrape in power automate, how to scrape in power automate, how to web scrape in power automate desktop, power automate desktop, power automate for desktop, microsoft, power automate desktop tutorial, power automate save web page, save web page in power automate
Id: WXK0u2yXLrU
Length: 51min 30sec (3090 seconds)
Published: Wed May 25 2022