Compare different files QUICKLY using Power Query in Power BI // Beginners Guide to Power BI in 2021

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
in this video i'm going to show you a quick way to verify the files that you migrated using power query we're going to go through how you can find missing raw data from one source to another and i'm also going to show you how you can look through each of the different columns if you have a lot of columns to look for the discrepancy between these two data sources all of that and more so without further ado let's get started [Music] hi my name is fernan and welcome to the solutions abroad youtube channel where i focus on teaching beginners the wonderful world that is power bi i upload new videos every week so make sure you hit that subscribe button and the bell icon to get notified when a new one is out so this problem was actually inspired from a work scenario that i faced so imagine a scenario where you're in a business and the business has completed its data migration so that means moving the data that they have from their legacy systems into the new systems where the business will be using now the migration process is done and the business is asking for verification that the migration was successful and the only way to do that is to compare the tables that you have the data that you have from the legacy system into the new system and you imagine you have thousands and thousands of rows and also a number of different columns and you don't want to be doing this manually right and now i want to show you how you can do this quickly and easily with power query so let's go to the example that i prepared for you today so here is a power bi report that has some data sample data that i prepared for you today we have two tables here let me just expand it here you see that we have two types of people data people data legacy and people data migrated now both of these tables both of these sources have almost the same data and what we're trying to find out is what are the differences and if there are any in preparation for this demo i've changed a couple of different columns just so that we can see how you could compare these two together but just imagine that these could be from anywhere right so at the moment i pull these data from excel because that's where i created them but it could be coming from a sql database or from a different online source now if we quickly create a count on each of them so let's do a count of the migrated data we'll do that migrated and then we'll create another one for the legacy you'll see that on our legacy system it says that we have a thousand rows whereas in our new system on the migrated it says that we have 995 so we are missing five rows of data in our new system now how do we find out what those are now let me show you how you can compare these two using power query now let's open power query from this window let's create a reference to any of these two it doesn't really matter and we'll we'll name it delta something recognizable so now what i've done is i've created a reference so we don't touch these two and i will show you why in a bit why i do that but for now we are doing all of our transformations here in this reference so this is the legacy system the quickest way that you'd be able to compare the legacy against the migrated is through merging so what you'll need to do is click merge queries from here we'll need to merge it at the moment to the migrated because this table is or their legacy and now we will select the id so these would be the same for across these two so we will match them through their ids and what we'll do will change the join kind to a full alter this makes sure that if there are any missings on either the legacy table or the migrator table they will be shown as a new role in our results now in the previous video i covered the differences between these different join kinds in power query so if you're interested in why we're using full outer join or how you can use the other kind of joins check out that video it's really helpful so now we'll click ok and we will expand this by clicking the arrows here and we'll just leave these like that for now and hit okay so now you'll see that it puts the legacy in the migrated rows side by side based on their ids but now you will see that the people migrated the people data migrated has 995 uniques and if i hover over that you'll see that we have um it's not the same right so these this is what what is missing so you'll see at the bar here that there we have five less than one percent that is empty so if we filter those empty ones you'll see that now it brings us up the values that are missing in our new migrated system so i can see that um ids one five one two one five five is missing in our migrated data so let's verify that that is the case that we're missing these ones so if you remember one five one two one five five let's go back to our migrated query so let's scroll through that here we go so you see that we are definitely missing that data in the migrated table and that shows you the quickest way that you can compare two data sources together and finding some of the missing values on either side of these sources however you notice that we have a lot of other columns that we have inserted and we want to make sure that for every single column each of these row values are exactly the same between these two now how do i do that very easily and it's also quite easy so let me just show you so let's go back to our delta first and let's remove these queries that we did so let's just delete until the end so that deletes all of our steps that we have created so far so what we'll need to do we'll need to unpivot these two tables so let's do that so from here we'll need to highlight the id so whatever we want to use so this is our primary key so we'll unpave it from here and you right click and you unpivot other columns what that does is for all the other columns in your table it puts them into a single column attribute and the values between those two between all of those columns are put in a single column as a value and i'll show you why you need to do that in a second we'll go to the migrated one and we'll do the same so on pivot other columns and at the moment the data themselves here are already preset so that means the name of the columns are the same but you just need to be aware that if you want to compare two data sources against each other they need to have the same column names so now what we'll do is we'll go to our delta and you'll see that it follows everything that happens in our query here and i will do another merge and we will do merge again against the migrated data and here what we'll do is we will merge them not just with id but against its attribute as well so what i'll do i'll click id and i'll hold ctrl to select multiple columns you'll see that we have two columns selected on the people data for the legacy and we'll do the same here for the migrated so id control click and what that will do is it will match these to these two tables against their ids and attributes we will hit now we'll do enough full outer as well and we'll hit ok now we'll expand this migrated column that we have to adjust the value and you'll see why in a second so now firstly what we will need to do is go to add column create a conditional column and now what we'll need to do um we'll create a new column called is match and we'll compare the values against these two columns so value is equals to the migrated value so what we're saying here is if it's a match so if these two values are the same then it's a yes and then if it doesn't it's a no if we hit ok here we are that's our new calculated column that we have created now we will filter this to just show us those non-matching values we hit ok you'll see now that it shows us all of those values that we have at the beginning so we found out that they don't have a value on the migrated table so we can see they are from one five one two one five five and we know that because uh they were missing um they were missing from the beginning so that's why they're not matching so they are showing us now on migrate because it doesn't exist now if we filter this even further and we just we didn't want to see the nulls because we know we already know what they are if we're just looking for strictly any values that aren't the same between these two tables here's what they here's what we have so we can see that for this id and we have a last name on our legacy called the name is barley but it's different in the migrated table it's it's actually barnet so let's check this one right so let's check the 171 so if you remember that let's go here and let's uh let's see if we go back up before we unpivoted so 171 the last name is barley and then if we go back to the migrated if we now go one step up let's compare that one seven one here we are so it's completely different it's barnett actually so that shows us that there is some differences um between these two tables and the beauty of this method is that no matter how many columns you have or no matter how many rows you have the process stays the same so that means if you want to verify the integrity of your migration data as long as they are pulling from the same place you can just reuse this power bi report and you don't have to do any rework at all which is fantastic and that's it for this video i hope this helped you understand how you can use power query to compare two different data sources against each other i'll leave a download link down below if you want to use the files that i used for this demo if you like this video or you have any further feedback let me know in the comment section box below get in touch using the social media links that i included in the description box below and thank you so much guys for watching see you again on the next one
Info
Channel: Solutions Abroad
Views: 35,854
Rating: undefined out of 5
Keywords: solutions abroad, power bi, powerbi, power bi tutorials, power bi for beginners, beginners guide to power bi, data analytics, dax, data modelling, data visualisation, business intelligence, power bi 2021, power query, power bi power query, power query compare, power bi compare, power bi merge, power bi full join, power bi merge query, power bi compare files, power bi delta, power bi reconcile, power bi reconciliation, reconciling files, reconciliation, reconcile
Id: 294BH6vOpcs
Channel Id: undefined
Length: 12min 9sec (729 seconds)
Published: Wed Mar 03 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.