Append vs Merge in Power BI and Power Query

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello everyone this is Reza from Redick at and today I want to explain to you what is power query merge versus append you have two different data sources to different data tables and you want to combine these two data sources and the two data tables together you have two options merge or append what is the difference and this rule apply in power core in Excel power query in power bi and everywhere else that you use power query let's check this option power query has two options for combining data sources together option number one is called append which is similar to the way of combining data sources together in sequel or relational data sources called Union this is the scenario that you use when you have two data source 2 data tables with exactly the same structure same number of columns same data type same column names but the data is different let's say budget for 2018 budget for 2019 instead of bringing those two tables separately and then writing lots of calculations to combine these together in tax a lot of writing lots of measures in power query you can actually append them together same rule applies when you have students of course one students of course - you have to combine them together with an option called append let's check it out here I have a excel data source and this data source is explaining students of one course these are information about the students ID of each student the first name of student last name of student and the course that they are enrolled I also have another sheet in the same excel file this can come from another data source as well this may come from sequence server may come from web data source or anywhere else which has exactly the same structure ID first name last name course but these are students of different course and the first one were the students of English course this is students of the math course I want to bring these two data into power bi here I have a power bi desktop report which doesn't have any data I'm going to choose getting data from Excel and I choose that data source to get data from and in this data set okay let me close the file in this data set I have three tables but I am just selecting two of these which is course one students and course to a students for now and then I click on edit if I load these tables exactly as is in power bi I end up with two tables this is my power coordinator I end up with two tables course one a student score twisting across to a student's exactly the same and then there would be no point because every time I want to calculate something give me all the students of all courses I have to write a measure give me something that is across both tables I have to write a measure it is always best to combine these kind of structure together and because they are the same structure I will be using append so let's see how I can use a pen I can select one of these tables it doesn't matter which one let's say this one and then in the Home tab if you go to the far right of the home tab you will see this option append queries or append queries as new when you choose append queries as new it will keep the existing queries as it is it will not change them it will keep them as is and it will create a new query result of combining those two if you use normal append queries it will change the existing query to be result of appending the two or three or whatever it is if you want to keep your existing queries intact and as new is usually better option I'm going to choose append as new it will ask what is my primary table which is course one students and secondary table you can also choose so you can choose two tables like that or if you have three or more tables go to three or more tables option and add tables into this section here I have course one and course two students only I click on OK and this will combine them together so this is now result of combining those you can see I have five rows here which three of these is coming from query one and two of these is coming from query two let's say this is one this is two these are coming from record one and query one this is coming from query two because it's exactly the same structure it will append only these after each other now if I load it into power bi I can easily put a slicer on course and depends on the course that I select I will see only courses in that area if I don't put a slicer I will see the total number so one single table will serve all requirements that I have one of the things that you need to be careful about append is that append will not remove duplicates what that means that means if you have a row exists in students of course one and the same row exists in students course - for some reason exactly the same row same number same values in all columns the result of append will include both shriek records it will not remove that you have to go to the option of remove duplicates to remove it this behavior is similar to Union all in T secured for those of you who may come from database background so append is a good option for combining data sources when you have two similar data structures but different date so the important option after append is that it will not remove duplicates in case that you have to placate it will have those duplicates but then you can use options like the remove duplicates remove roots remove duplicates remove duplicates will remove duplicates across all columns but first you have to select all the columns the ctrl or shift and then after that choose remove duplicates because remove duplicates it depends on which column you select and so a combination of a pen and remove duplicates will give you a result appended result which doesn't have any duplicates ok you want to do append but your data sources are slightly different one of them has one column the other one doesn't one of them has a column with different name the other one has different name columns what happens if you append if you append the append result will not give you an error you will not have an error but you will have one column for the data source a which has one column extra for the data source B it will be all null in that section or if they have two two different names like a student name without a space the other on a student name with space then you have two columns as the output student name without spaces student name with space one of them has null for the other data source another had null for this data source so it is always best practice to do a cleaning data before appending those data sources otherwise you might get into issues of having lot of extra columns another important tip after combining data sources is to make sure that your original data sources are marked as disable load or let's say uncheck the enable load because you do not want to load those data sources however those data sources are needed to be loaded to to load the final query to load into power VI this rule helps the performance of your model and saving the memory engine of power bi another important thing about combining sources to combining data sources together not just append is making sure that you do not load extra data into power bi these two queries right now in power bi and they are consuming extra memory but what I want to be loaded in power bi is only the final coil and the other two queries are not required but if I go there and right click and say delete it will not allow me to delete that query because it is used in append 1 let me rename just happen want to create a better meaning for this let's call it course students ok so I cannot delete those if I load these I will have three tables loaded into power bi what should I do I can right click on these tables that I don't want them to be loaded in power bi and uncheck enable load when I uncheck this option you see it is grayed out it means this query will not be loaded in power bi but it is part of the refresh process and it will load the data into course students some people think that this is stopping the Refresh no it's not stopping the Refresh it will be still part of your refresh process but those queries individually will not be loaded into power bi ok you learned about append what about merge what scenario you use merge for let's say you have two separate data sources and and their data structure is totally different they do not have the same structure but you want to flatten that structure let's say students is one table and courses is another table you you want to combine these together students who took specific courses who are enrolled in that courses you want to have all in one table there are some scenarios that you want to flatten the structure you want to combine data tables together it's not good idea to bring all tables as easy in power bi and make the relationship because you may end up with hundreds of relationships sometimes you need to flatten that structure and that is a scenario that you can use merge merge is giving you ability to combine data sources with one or more than one fields of relationship which we call it link fields joining fields you can choose that field angry relationship let's take this example here I have this query that we just created which is all the students of all courses but we just have the course name and we have also another query I'm just getting data from that another let's say table which has course information and this course information has the title of the course which is the same title that we had over there some other information like timing of the course capacity of the course and things like that I want to combine these two together I'll load that here as well I want to combine these two together instead of having one table for course one table for courses students I want to combine them together now you have to be careful when you are doing it because you do not want to combine everything together end up with one big table also you do not want to load each table separately all of them like hundreds of tables and create relationship in power bi based on those there are always best practices designing based on star schema and things like that which is outside of topic of this video I'll cover that a little later in some other videos let's check it out how this works in terms of combining right now Oh select one of these tables let's say course table and I go to the home tab again at the end I choose append queries as new sorry it is not append this is merge so I go to merge queries and marriage queries as new append is only when the structure is the same this is not the same structure you can see courses students different column course different set of columns they have some column which is kind of the same like course title but that is the only column which is the same everything else is kind of different so I'll go to Home tab minute queries mesh queries as new and I choose the first query which is course in this case and the second query course student and I choose what is the merge field the join field it is important to choose the right field here title is my right field it doesn't need to have the same name you can see here it's called course and there it's called title doesn't matter I just select that column and then click on ok if I have more than one column as a join then I select them in order hold the control key and this will give you ability to select them in an order ok I select those and click on ok so what is the result of a merger result of merge is your first query which is course you can see these are all columns from the course table with a column at the end which we call it usually structured column structured column is a column that have a structure in every cell you can see it is not a single value it is a table if I click on a blank area here not on the table on a blank area here it actually shown me that table preview down at the bottom let me bring it up here so this table now means these are students of math course or if I click on this physics we don't have any students for this course you know data's at at least English we have two students of this course right now you can combine these together with clicking on this option because the structured column will not be loaded in the power bi power bi does not understand a table inside another table you have to flatten that structure if you want and when you are flattening it you can choose which columns you want from the other table let's say from the other table I want first name last name and ID of student ok first name last name and ID of the students I will call this student ID right so now I have all of these combined I have the first two rows for math course and students of that the first three rows actually then before four and five our students of English course and row number the last row is physics students you can see some of these rows are a little bit different some of them are matching between the both datasets some of them are not matching that is why you need to be careful about merge type you are merging data sources together one of the most important thing is to check do you choose the right join kind or not the right join kind or the right merge type is giving you the output that you want sometimes you want rows that are matching between table a and table B sometimes you have table a table B and you want only part of table B which is inside table a sometimes you want part of table B which has table A in it depends on what output you want you may choose the join type that you want sometimes you want table a without the matching rows of table B it will be a smaller section of table a different types of merge will be possible through different merge kind or join kinds in power query you can choose that merge kind let's check it out when you do the move you can actually select the right we'll start with clicking on the setting of the merger option you always have the join kind and you can choose what you want there are different options you can get rows that are only matching between two tables rows that are in the first table matching in the second table rows that are in both tables rows that are only in one of these tables and not the matching depends on what you want you can choose the right join kind similar to append merge is also a good option and it will give you ability to combine data sources together in this video you learned how easy is to use merge how easy is to use append what scenarios you have to use each I have another video about explaining every single merge type and details of that if you are interested to learn more about power bi and interested in these videos go and sub type in the button below and you will learn more free videos of power bi like this
Info
Channel: RADACAD
Views: 79,005
Rating: 4.9674134 out of 5
Keywords: Power Query, Power BI, Excel, Data Preparation, Append, Merge, Combining Data sets
Id: qXH4WjCykLc
Channel Id: undefined
Length: 17min 32sec (1052 seconds)
Published: Mon Sep 10 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.