Power Automate Tutorial: Detecting Duplicate Rows in Multiple Data Sources with Power Platform

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hi there folks in today's demonstration we're gonna have a look at how to build out a power ultimate flow that will detect duplicates in a data source so whether that's a dataverse table or a SharePoint list or maybe some data from SQL or a third-party API ultimately we're working with arrays we're going to use select we're going to repurpose some data and we're going to be able to identify and detect any duplicates based on specific columns in that data source once we've detected those duplicates we'll be able to keep the first example and delete any subsequent duplicates now if that's something that interests you don't forget to like And subscribe and without further Ado let's jump into the demonstration so I'm going to kick things off from Excel and I have a data source here where we have an ID column which represents the ID of the item or the row or that particular record within your data set that you've retrieved from an API and then I've got a couple of columns the owner in the description and it's based on these two columns that I'd like to try and detect duplicate rows I finally got a date time at the end which is relevant to me and that I want to delete or keep the most recent record or one of many records and so I'm going to explain via expressions in Excel how we can Implement that in power automate using some conditional Expressions within a select action so first up if I was to order the owner that we've got here we can see now that Alex Wilbur actually has three duplicates for the right documentation for the API if I look at Henrietta Mueller we can see that she has two duplicates for the configure and deploy the application and those two examples they have different times so 1509 and 1505 and of course I might want to get rid of the 1505 and keep the most recent 1509 in order to make this data now more useful if we use this sort functionality and do a custom sort I can add in both the description and also the date time albeit with the date time I'm going to sort in descending order and we've gone ahead and hit OK we can see now that when we look at Alex Wilbur and that particular write documentation for the API that we have the most recent record for each of those examples at the top and again if we look at Henrietta moella you can see again that the 1509 time is at the top there for the configure and deploy so ultimately I'd like to be able to keep that record and delete the one beneath so in order for me to detect this what I would do is I would use a concatenation so if I go ahead and type in concat I can join both the owner and the description and create a new string and you can see now I have this string based on Alex Wilbur and his description if I copy that down again I've extended that to now include all of those individual owners and descriptions as an expression what I'd like to do now is detect whether or not the row above equals the row below and if it does then I know I have a duplicate so here F2 does not equal F1 F3 does not equal F2 F4 does equal F3 and so at that point we know that we have a duplicate and so we would aim to delete this one and ultimately this one in terms of formulas all we need to do is type in if and then if F2 equals F1 then it's a true as in we want to delete it otherwise it is a false as in we don't want to delete it and if I copy that formula down my screen right now we can see that the t's highlight any duplicates that we'd want to remove from our data source next up I would like to introduce you to my very large list which is what I'm going to use for my demonstration within power automate I have over 5000 items I know there are a significant number of duplicates based on the task description and task owner columns and I have a couple of other problems I'll be ignoring for this duplicate detection as the focus is these two particular columns I also have a created date which I'm going to use to sort the data so that I can keep the most recent records for all of the duplicates that have been detected this is very much like the data source that I've just demonstrated in Excel next up we're in power automate where I have a manual trigger and I'm getting those items from my large list and it's worth just remembering from that Excel demo that I did I'm also ordering the data so if you can order the data from your data source fantastic today in this demonstration I'm going to assume that I can't even though technically I could just to demonstrate how to be done so first up get items then I'm going to use a select action and with that select action I'm going to repurpose the data from my data source and I need to specify the columns that I'm looking to detect duplicates on so in this case it was the owner and also the description of the task if you remember and then I had the created date time which I'll add in there and then finally the ID so how am I going to delete these duplicates it's always based on an ID the final part is a concatenation so I would like to concat both the owner and the description to create a new column within my data source and then because I'm also doing a sort by created date time I'm actually going to do a second concatenation and I'm going to call that sort by and we'll use that later on in this solution so when it comes to the owner I'm going to pick out the dynamic values for owner in this case not the task owner but the task owner display name then when it comes to the description and everything else I'll go ahead and pick out those values or grab created and I'll also grab id2 right with the concatenation it's all based on an expression and you can type in concat then you can jump back across the dynamic content and you'll note that actually from my situation anyway I'm not seeing those Dynamic values so this is when you need to know a bit about expressions and if I hover my mouse I can see the expression there for the task go in our display name so I can simply type in task owner and display name in the single quotes so we'll type task Boon app for slash display name and I want to concat that with the task description so I put in a comma and then I can type in my second expression so in my single quotes again and in with task description so you would do exactly the same based on your column names now at this point if I had a third of course I could put in a comma put in the third but for this particular concatenation because I'm comparing the owner and the description I just want these two column names I am however going to copy this expression Ctrl a control C hit OK and then if I jump back into sort by I can control V to paste back in and I'm going to add in the third value which is for my created date time so I can type in item open close brackets square brackets and singles and that is created so that now gives me a second concatenation with the created date which is just going to allow me to sort the data based on those three columns in that order so next up I want to sort that data and I'm going to use that second concatenation I did so I'll go into my compose remember it's important that these rows run in order and if I go into the expression tab I can type in sort with the open close brackets into my Dynamic content choose my select and then I do specify the the name of that column to sort by which in my case I called sort by if I hit OK just to remind ourselves if I pop open that select it's based on this value here which is my concatenation so I'm sorting the select above based on the key sort by that'll Now give me all those rows in a nice sequential order a to z and in the order of the owner description created date time next it's all about detecting those duplicates and we saw again that in the Excel demo where I compared the row above with the row below to see if they matched if they did we output a t if they didn't and output an f and what we're going to do here instead of using one of the arrays as an input we're going to use range and so range allows you to select a number to start with and then how many numbers so a length in this case so I'm going to use the Expression length which is based on the length of the array so if we had 10 items we'd have an array that runs from 0 all the way through to 9 and it just so happens that we can use these numbers as integer index values to select each of the objects in our array or all the items or all the rows within our data source and this is the clever bit that will allow us to compare row 0 against Row 1 or Row 2 against Row 3 using another expression subtract where we literally take away one from the number that's generated by this range expression jumping over to the map we're now going to turn this into text mode and then jump into the expression editor and we can use the add property expression and it allows us to add a new property into an object so the object we're looking to add a property to is all based on if I go into the dynamic content R compose our sorted array and like I say we can use the integer indexes so question mark square brackets rather than having a fixed name we're using a number and that number just happens to be item with the open and close bracket so that be the zero the one the two the three Etc based on that range when it comes to adding in the key name we have to think about what we want that key to be called I'm going to call it is duplicate so we'll have a new key or a column within this array to indicate is that a duplicate or not and then the final piece is an expression so if the row above equals the row below output a t otherwise output and F and that will be based on if the open close brackets equals with the open close brackets and then the two values that we want to compare so we'll first of all want to compare the output from the compose and I'll just move my cursor to the end here the output of compose so the first closing bracket there and it's based on that item so that number that we're getting as part of that range and then we want to get the concat value so we want to get that concatenated string to compare it against the row above and the way we do that a comma and we put in exactly the same expression again with a subtle change so again the output of that compose I'll go to the end of that first bracket I'll put in a question mark with our item open close brackets again the end of that square bracket another question mark with the word concat which will get our concatenated string we're currently comparing ourselves against ourselves but with this item which is our number our zero one two three we simply want to subtract one from it so on Sir sub go to the end of that closing bracket put in a comma put in a one putting a closing bracket and we're now comparing one row against the other row moving along this expression if we go to the fast closing bracket at the end of this equals we're now into the if again I can put in a comma and I can put in a t for True another comma and an F for false and now we know that any that match will output a t and any that don't match what output an F exactly the same as we had in that Excel demo okay everything's been accepted and now all I need to do is to filter this array for any that are equal to true or any that equal to false and we'll go ahead and insert this filter array action so filter array is going to be based on the output from the select two and all we're going to be doing is looking at this property is duplicate so this new property that we've added so in this choose a value we can go into expression we can type in the infamous item open close brackets and single quotes and type in is duplicate and with that we can then see is it equal to a t because if it is we know it's 1 I'm going to delete so we can rename that to delete me and if I go ahead and copy that and we can paste that into our browser into our clipboard I can call this one keep me and we can change the condition so that it instead filters based on the F for false so I'll pop that open and I'll change that to an F so that's it delete me that's to keep me based on the filter is a duplicate equal to the true or the false and just last of all as a wee check I'm going to add in a compose here and we can get some counts so number of rows we'll go number to delete and then number to keep and of course the number of rows should equal the number to delete plus the number to keep the number of rows is based on the length expression and we can go into Dynamic content and get the length of our original sorted array here the number to delete is again based on the length and it'll be based on that output from the delete me filter number to keep if you haven't guessed if I go and do the length again into the dynamic content keep me hit OK and when it comes to actually physically deleting the items if I was to add and apply to each I won't go ahead and actually clear out my list because I need it for another demo but if I insert our delete me array so as the input here the delete me I can add an action we'll just go ahead with a compose and we can output the IDS of the items that we're looking to delete so the ID is just going to be based on the expression item open close brackets square brackets single quotes and ID now having put the flow through test mode if I have a look at the outcome you can see we have run the get items in three seconds and then the rest of the actions have taken again a handful of seconds to do this comparison and crunching with the arrays and the columns Etc the appliance each took a bit longer it was a loop it did go through 4 391 records from the 5000 that apparently should have been deleted but you can see the ID being output and of course you would pass that ID into your delete action for whatever data source that was if we start from the select at the beginning if remember we took the input from the get items and then with the outputs we were selecting specific rows or specific keys and values to the owner the description the created time and the ID and then we did a concatenation both with the owner and description and then for the sort by that included that date and time if we have a look at the compose it was then based on a sort so this is now sorted by that vaccination and at the beginning of my data I have a lot of null values it just so happens but as I scroll down a bit further you can start to see some examples there of how things have have started to order based on that owner description and created time being appended in that concatenation and in a specific order and then if we look at this select two you can see that range has gone from zero all the way to the length of our array and there happened to be 5000 so it stops at 499 and then it's using that that we've been able to create this next array that includes this property is duplicate and it's set to either false or true depending on whether or not the row above matches the row below so the first one is false because there's nothing above it and then the one after that and so on is equal to true and if we scroll down a bit further and get one with a bit of text let's have a look at this one here you can see we've started off with an F this is the first instance of Alistair analyzing website traffic it's in a specific date time order so we have actually the oldest one here had we look to keep the newest we might have used the reverse expression rather than the sort expression and then we would have kept the newer record But ultimately we can see that there is duplicate for this first one here is set to false and then everyone after that is equal to true and as we scroll down again we have the next example where there's a collaborate with the design team again we have one set the false and then all of them after that's set to true and so on the final bit is looking at the delete me again if I pop that open that's going to have all the ones that are set to true and of course the keep me is all the ones that are set to false and if I pop open that compose two we have our summary there are 5000 Rows 4 391 to be deleted which matches are apply to each Loop and then 609 to keep and so that marks the end of the demonstration lots to learn now we've seen a use of Select again with repurposing data the add property so you can add new key values we've got concat so we can bring together some strings and we're also sorting our arrays using the sort or the reverse expression so if that's something you enjoyed please make sure you like And subscribe and I hope to see you again sometime soon thanks for watching thank you
Info
Channel: DamoBird365
Views: 9,076
Rating: undefined out of 5
Keywords: addproperty expression, concat expression, data duplication, data management, data processing, duplicate row detection, power automate, power automate expressions, power automate flow, power automate tutorial, power platform, repurposing data, reverse expression, select action, sort expression, workflow automation
Id: SGXAqAzYUSM
Channel Id: undefined
Length: 17min 53sec (1073 seconds)
Published: Sun Jun 18 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.