Working with Large Data in Power Automate through Pagination

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey everyone this is steve woodward and today i want to dive into power automate specifically when you're working with really large sets of data inside of the dataverse which was formerly the common data service recently i had a customer reach out to me about kind of an issue they were running to with power automate where they were not able to loop over all of their records inside of the dataverse so i wanted to put this video together to really deep dive on working with large sets of data with power automate and the dataverse and how you can kind of overcome some of those issues out of the box with powerautomate when you have the seated version of power automate which comes with office 365 there is a limit that you can only loop over 5000 records in a single action now if you get a paid version of power automate you can actually increase that from 5 000 to 100 thousand but the challenge here is that if you have data sets that are larger than a hundred thousand you still need a way to be able to page over the data so you don't hit these limitations inside of power automate so this video will actually show you how you can get around that and i've also put together a solution that's going to be posted on my github repository so you all after this video can take a look at this exact sample and play with it on your own and if you have any questions please feel free to post them on this video and i'd be happy to to elaborate on that for this video i put together a solution in the power platform to make it easy to kind of show how you can work with pagination inside of power automate i've got a sample entity that i've created in the dataverse and then i've also got a few different flows here we're going to walk through each one of these in this video the non-paginated flow is just going to be working with data in the dataverse using an action in power automate to get data back and i'm going to show you what the kind of the default behavior is and what are some of the different settings that you can apply there and then the next example i'm going to show is how you can actually manually do pagination inside a power automate to kind of work within some of the constraints of the limitations that are in the power automate platform namely you there is a default limit of 5 000 items that you can loop over in any given for each loop depending on the licensing you can kind of extend that but at a high level if you're working with large sets of data and you have to one by one loop over the data this may be the only way to do this so i want to kind of highlight this in case you run into that scenario where you need to be able to do that so first i want to show the custom table that i created in the dataverse it's called demo paging entity and i kept it pretty simple there's a name there's a value and then there's a comments field and then in this case in the view you can see when it was created um but really for the purpose of this demo we're going to be just looking at the name the value which we're going to be updating with powerautomate in the comments we're not going to be touching from a power automate perspective but hopefully this gives you a little bit of context of the data that we're using in github where i post the solution to this example i'm also going to publish a sample excel file that you can easily use to bring this into the dataverse so you can get some sample seed data to work with right away so the first thing that i want to show is a power automate flow that does not use paging but i just want you to see some of the options you have available to you in power automate so i'm going to go to the non-paginated flow and in here i've created a pretty simple power automate flow it gets manually kicked off and then in here i'm using a call to the dataverse to list all the records in this case from the demo paging entities now when i go to the advanced options i don't have anything specified we'll see in a second that i'm actually getting a prompt because typically you want to apply some filtering but there are reasons why you'd want to get all of the data and in that scenario i'm going to show you how you can kind of optimize your power automate flow to work within those constraints in the list records right now the default is that it's going to get 5000 back there's actually an internal threshold on this and i'll show you how you can change that if you go into the three dots here under settings you'll see that there's an option for pagination if you turn that on what's going to happen is under the covers power automate is going to go against the dataverse api and do paging across the data set for you automatically you don't have to worry about it the end result is that you'll get a data set that's larger than the 5000 limit threshold and you can specify that here i want to show what this looks like before we do that so i'm going to go back here and say done i then have another action that will just simply get a count of the number of records back so we should see in this case i've got over 5000 records so we should just see 5 000 records get returned from that operation and then inside of here i have an applied to each on every single record in that data set that gets returned and then i'm just simply doing a compose on the name property of that record one other tip i want to highlight here is that in the apply to each just like with the list records there are some properties that you can configure here if we go over here to settings there's actually a concurrency control by default this is turned off which means that when you loop over for example 5000 records it's going to go one by one so it's going to go record one record two record three however you can actually turn this on you can actually do operations in parallel and the real benefit here is that if each operation is item potent meaning each operation does not affect the next operation that's a great opportunity to actually do that in parallel the default is 20 but you can actually um set this to 50 if you wanted to and this is a real big time saver if you're able to do that with your data set now let's take a look at what this looks like so we'll go ahead and run this and now notice here i am getting a prompt basically saying that when you're using list records it's best to use an odata filter we're not doing any filtering here because we want to work on the entire data set but just once again if you ever get this prompt it should be just a warning to you to make sure that you fully understand that you do want to get all the records back more likely than not when you're working with data you do want to do some filtering on it but in this scenario when you want to get everything you will get this kind of prompt here we'll close out of that i'm going to kick off the flow and let's go ahead and take a look at what this looks like okay so the list records operation returned and then now at the initialize variable step where i'm just doing a count what we can see here is that um there's 5 000 records back anytime you get a result set that's like an even number like that where it's 5 000 or 10 000 you should think in your head that there's probably some filter or some cap that's going on with the data that's going to return back and you probably have to do pagination of some sort to kind of get around that now the apply to each is going to go through all 5000 records here the other thing i want to call out here is with the seated version of power automate that's the version that comes with office 365. there is a limit of you can only do and apply to each step of 5000. if we got more than 5000 here this would error out so that's just another thing to kind of keep in mind now let's go back to this flow and let's update the list record so that it does paging for us automatically so here on the three dots i can say settings i'm going to turn pagination on and i'm going to make the threshold higher than 5000 so let's do 10 000 and we'll say done and then let's run this again now in this case i believe there's around 5 500 records so that's what we should see this time around when we run this okay so what we want to see now is just the number of records that return from the list records operation okay now we look here now we get a result set of 5500 so that's the total set that we have in the dataverse for this entity and that's how you can use pagination just out of the box by just simply toggling a switch on list records keep in mind that the downside to this is if you need to do an apply to each operation that could fail based on the licensing that you have for power automate so that's just one one important thing you want to keep in mind so now i want to walk through the paginated flow example now if you look in this solution there's actually two different flows one is a parent flow the other is a child flow in this particular case because i'm doing the same update to an item in multiple places i want to be able to kind of modularize that and create a child flow that just contains the updates to a record and then in the parent flow when i do my 4-h i'm just going to call that child flow now in order to do this you do have to make sure that your flows are part of a solution if you don't put your flow inside of a solution you will not see the action to to call a child flow from power automate so let's look at the parent flow first and here's the setup the power automate flow gets manually triggered and i put an input value of update value here and this is going to be we're going to loop over all the records in the dataverse for this table and then there's a field that we're going to update and i'm going to allow whoever is running this power automate flow to specify what that value is i then do a list records like we showed in the non-paginated example and if we look here i'm not doing any filtering i also have not set any kind of pagination in the properties here so we're just using it kind of out of the box and then i am going to do an apply to each on this first set of records where i'm calling the child flow and i'm specifying a few things the name of the field the value that's going to be updated that was specified when this was ran and then also the item id so that this child flow knows exactly which item to go ahead and update these values for the next thing that i do is i'm actually getting a property called the next url this is something that's specific to odata when you do an odata request which is what this dataverse call is under the covers for list records um by default there is kind of this you know 5000 limit of records if there are actually more than 5000 records back there's going to be a new property that says oh data next link and that's how you can actually um start to do paging inside of power automate when you go back to here and you go to the settings to set the pagination to true under the covers this is exactly what power automate's doing but in this example you actually need to be able to do this yourself you can't let power automate do that for you so we're going to store what that is then i'm also going to leave a placeholder variable for the skip token and inside of this property i'll show what that looks like in a second but you can parse that out and then when you call the list records operation you can actually specify that to then get the next page of data so right here i've got a do and till loop which says that as long as the next url property is not null we're going to go ahead and then do another call to list records and if we look at the advanced options here you can see that the skip token is now specified as one of the parameter options there and that's exactly how we're going to go one page the next to get all of the data regardless of kind of the limitations that we have inside of powerautomate here and then i do another apply to each year where i'm once again calling that child flow and because i was calling this twice i could have just hard coded an update in here but i just decided to break out that logic into a child flow you could decide how you want to do this for me it felt better to do it this way but you could have just hard-coded the action twice in this particular power on mayflow and at the end here i'm just getting whatever that next url is if it's populated we'll go ahead and loop over again now one last thing to call out here in the do and till there's a couple of limits that you can actually change by default count means how many times are we going to loop until we stop because you could you know invariably create just a infinite loop here this is kind of one of those ways to kind of prevent that by default it's 60 i think you can update it to like 5 000 and then there's also another limit which is the timeout now the timeout in this case is set to one hour you can change that i believe to a duration of 30 days and we have our documentation that kind of highlights all the different configuration values you could do for that but you want to look at both of these to make sure that you're not going to hit one of those based off the amount of data that you have so let's go ahead and run this one so we're going to go ahead and test we'll kick this off manually i'm also going to give it an update value because this is going to be what we're going to populate that particular record in the dataverse with and we'll go ahead and run this now let's go ahead and take a look at what this flow looks like as it's running so we can see here that the flow is kicked off and then we did the first list records action and i want to actually look at the payload for this so we're going to go ahead and download this and this will take a second because there's going to be 5000 records in json so we'll let this load okay so we're at the bottom of the payload for that operation and i'm going to highlight here the section that we were parsing out in the power automate flow to be able to enable the paging so we can see right here that there is the odata.next link and that gives an entire url where you could then do another odata query to get the data back starting from where you finished the last time now specific to the power automate flow that i've created here i'm actually uh parsing out the skip token property and if you look at kind of like the logic inside of that power unmade step that's exactly what's going on there we're just getting what's in that property to be able to then do another query now with the way that this paginated flow was set up with a parent and a child flow the nice thing here is that you can actually step by step look at each child flow operation in this case each child flow operation is each individual record that's getting updated in the dataverse so i can actually in real time go to the process and see kind of where we're at and the flow here is that i had some input variables which was just the name the value and then the unique id to the record in the dataverse and then i'm doing a single update record operation here and we can see that we're at uh number 284 i had numbered all the records in the database one to 5500 so of 5500 were we've progressed to 284 in this particular case so what we're looking at here is this is a completed paginated flow that's gone through both getting the records with the initial 5000 and then doing a do until to complete the operation here so we can see here that the list records um completed we could look at the output here we initially did and apply to each over the first 5000 set where we called the child flow each time we processed the next url to get the skip token and then we did the do until loop same kind of setup here we can look at the resulting set was then 500 in this case the total was 5500 for all of the records and it was able to get all those individual records that way so hopefully this video was helpful in the comments below i'm going to put a link to my github repository where i will post the solution for this power platform demo that i just showed in this video if you have any questions about this please feel free to put that in the comments also if there's anything that you're struggling with or you're curious about from a power platform perspective feel free to put that in the comments too and i could take a look at that maybe i could do another deep dive video like i did for this thank you for taking the time to watch this video and uh until next time we'll see you later
Info
Channel: Steve Winward
Views: 19,854
Rating: undefined out of 5
Keywords:
Id: 5NtzcfmSGes
Channel Id: undefined
Length: 14min 38sec (878 seconds)
Published: Sun Jan 03 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.