Today, we'll cover a{} more complex data consolidation
scenario in Excel, because let's face it, in real life we usually run into problems that we don't expect, and they aren't covered in
a standard basic training, so here's a scenario. You want to combine and append
the data from different files that are saved in a folder, but the catch is this, you need a mapping table for the headers, the data in each file can
have a different order, and they can also have different texts, so they're not consistent. Luckily, you have a mapping
table which you'll use, and you're going to use Power Query to get it all automated for you. (upbeat music) Okay, so the files that we
want to append are sitting inside the info folder on my C drive, and I want to set this
up in a dynamic way, so that whenever I add a new file to this, and that file ends with underscore HR, and it's an Excel file, I want that data to be
automatically appended, but here's the catch, let me show you the
content of the files here. So this is from the company Blend, and they send their
information in this way, where the first column is name, then salary and then department, now, this is an Excel table, and the table is called TSalary, okay, so name, salary, department. Now, the problem here is that the other companies
don't follow this order, so let's activate the preview pane here, we have a table here, now these tables all have the same names, but the problem is that
Kryptis sends the information with headers in Italian, Misty Wash has it in German, and they also have a
different column order, what they have all in common, is that the tables have the same name. Now, we've told these companies many times to correct their names
before they send it to us, because it's going to give us a problem when we're appending the data, unfortunately, they don't listen, so our solution is to
create a mapping table for all these headers, and we've added the mapping
table to our final file, where we're going to have our report. This is the mapping table, it has the list of before
names and the after names, after names are the correct names, so we want to have name, department and salary in our final report, and this makes it also easier
for us to maintain this, so if anyone sends us
information that's missing, because they used another header name, all I have to do is add this information to my mapping table, and then I just need to
refresh my final report, and everything is there, but how do we do a mapping of the headers? So this is not mapping of the content but mapping of headers. First thing I'm going to do, is create a connection
to the mapping table, this table is called Mapping, let's go to data and send this to Power Query from table, range, and leave the name as is, all I want to do is
create a connection to it, so close and load to, and only create a connection. Next thing I want to do, is import the data from the folder, so let's go to get data
from file, from folder, the folder is called info, click on open, here I can see a preview
of the files included, that's the right folder, let's go ahead and transform the data. I'm going to leave the query name as is, one thing I want to do is
to add a filter to this, go with text filters, click on contains, and let's go with underscore HR dot xls, now, this makes sure that
only the files that are Excel, and they have an
underscore HR in the title, will be included, if there's a word file or a PDF file, those will not be included
in the final query, now, let's go ahead and click on these double down arrows
here to open up the content, Power Query's going to
go and do its thing, it's going to take a look
at the sample file here, so notice it picked the first
file as the sample file, and it's giving me the
content of this file, which only has one sheet and one table. Now, the table that all of these files have in
common is called TSalary, so I'm going to go with
that and click on okay. Now, Power Query is going
to pick the sample file and apply any transformations
it needs to that sample file, then use that to apply
the same transformations to all the other files, and then it will
automatically append the data in our final query here, so it did all of this on its own, but there is a problem, right? Because I have missing data because not all the files
have the same headers, so name, salary, department, is actually the correct
header that I want, but only one of the files
has all of this information, the second file, remember
that was from Kryptis, this was in Italian, so I'm missing the information because the headers don't match, then another file is Misty Wash, that had the correct header for name, but salary and department
information is missing. I need to map the headers
before I append them, which means I need to do the mapping on the transform sample file query and not at the end here, because the moment a file is loaded, it should already map the headers and then it should append the information. Now, just so that I can visually see that whatever function I
use is actually working, I want my sample file to be a file that has different headers, so let's go ahead and
change our sample file. Remember when we were
in the navigation step, Power Query suggested the first
file to be our sample file, if I go back to the previous step, we can see the first file is Blend, so in the navigation step
it's taking position zero, which is the first file, because remember, Power
Query is zero based, so if we want the second
file as our sample file, we need to change the index to one, because this is zero, one, so let's go ahead and do that here. Now, the moment I press Enter, Power Query's going to add
another step which we don't need, we need this to end in navigation, our sample file is now Kryptis, we should be able to see it in
the sample query, and we do, we now have the headers in Italian. Now, in info we're running into a problem, the column name of the table wasn't found, so remember, originally the
column names were in English, so I can't apply the
change type step here, but if I remove this, I should see the new column
names up here and I do, and I'm missing a lot of information because Blend is not there anymore, the only information
that I'm going to have, is going to be from Kryptis, everything else is missing, so your sample file is actually the driver of your final version, because it's going to take the
headings of your sample file. Now, we need to apply
our header transformation at this stage here, now, the way I'm going
to transform the headers in this example, is to use List.Accumulate, which is one of my favorite
functions in Power Query, it allows us to loop through this list, see if there's a match
between the existing header and any one in our before list, and then replace it with the after list, so let's go ahead and do that, in our transform sample file, let's add a new step. Now, this is the name
of our previous step, our previous step was navigation, and this is showing us
the table that we selected in the navigation step. Now, List.Accumulate is something that we're going to use inside a function that's going to help us do
the final transformation, that final function is going to be table.transformcolumnnames, that's the main function that we need. This function needs two arguments, one is the table and the second is namegenerator as function, so it expects to have a function here, that it can apply to each single column, this is where we're going
to use List.Accumulate, so I'm going to go with each, because I want it for each single column, now, let's go with Shift+Enter, and let's expand this, so that we can see better, now, don't worry about the error now because we're not done with this function, so we're just going to continue, and go with List.Accumulate, actually, it's right here, press Tab, now, open bracket and see what we need, first thing, is a list as list, now, our list is going to be our mapping, but our mapping currently is a table, so we need to transform it into a list, we can transform it into a list of records or a list of lists, as long as it's a list, so I'm just going to go
with Table.ToRecords, and transform it into a list of records, so that's the mapping table, and that was the first argument. Second, is the seed as any, so seed is the first item
that we're going to put through this function that
we're going to be creating, the function is called
the accumulator function, and this can be whatever you want to make, it's very similar to recursive lambda, if you're curious what
a recursive lambda is, (air whooshing)
I have a video on it, and I've added it to the
description of this video. So in this case, our seed is our header, so it's each header, and for this we can use an underscore, now, comes our accumulator function, so this function needs two things, now, let's press Shift+Enter
to go to the next line, the two things it needs
are state and current, now, you don't have to call
them state and current, you can call them anything
else that you want, but I prefer to use this syntax, because it's easier to distinguish between the two different states. Now, here we need the go-to operator because this is a function, our function is going to use Text.Replace, so let's go with Text.Replace, we want to replace our current
text, which is the header, this is something we have to keep in mind, is that because we're inside
the accumulator function, we can't use the syntax we have here, so I can't put an underscore here, instead, I have to use this vocabulary, state and current, so my seed is going to be my first state, but remember, I can't
just leave it like this, because my headers are not in upper case, and need to make sure
everything is in upper case, so I have to put my state
in the text.upper function. First, I transform it into upper case, and then, I use that in
the Text.Replace function. In Text.Replace, I need the old as text and the new as text, so the old text I'm comparing this to, is the first item in my mapping table, in the before record, so remember, I transformed
my mapping table into a list of records, the current is going to be the first item in my list of records, but I need to specify
which one, before or after, well, I want my old text to be before, and my new text to be current after. {}, we have to be careful that we have to have the
headings typed in correctly, now, I'm pretty sure I had
them in upper case before, so let me just fix that
before I press Enter, now, let's make sure we have
closed our brackets properly, and press Enter, and we get these transformed correctly. Now, List.Accumulate
is not an easy function to get the hang of, that's why I've dedicated a whole section to understanding this function
in my Power Query course, but basically what's happening here, is that in the first run of this function, it's going to take our previous header, which was departamento, that's going to be your state, it's going to compare it to
the first item in before, so if I go here, it's going to compare it to this, and then it's going to
replace this with this, but there is no replacement to be done, because full name is not
included in departamento, and then the new state
is going to be the result of my transformation, in this case, the new state
is the same as the old state, it remains departamento, and then it goes on to the next line here, and it tries to do that replacement, so the great thing with List.Accumulate, is that you can do multiple
name replacements in one go, because you change this
current state of your outcome, as you move through this list. I understand that this is very geeky and a bit difficult to understand, so you can skip through my explanations and just go with the final outcome here, or if you'd like to understand it better, consider checking out the full course, but ultimately this is all
we need to get this to work, well, let's make sure it really does work, so here we have department,
name and salary, we have the values of Blend, and if we scroll down, well, we have Kryptis, that was the original file
that we were working with, Misty Wash is there, it looks like everything
has been mapped correctly. Now, if anything is missing
from our mapping table, the data will be missing here, all we have to do is
update our mapping table and everything will work. So let's go and load this to our workbook, existing worksheet, I'll just put it right here and okay, and that's our consolidated data file, I think I was just too excited to do the final corrections to this, I forgot to just grab the company name, and also update the data types, but before I do that, let's test this whole concept, I'm going to get a new
file from another company, and one of the headers
is not going to match, so the file that I have is on my desktop, and I'm just going to drag and
drop it here into the folder, let's take a look at the preview, department is correct, for name I have first comma last name, and then salary is correct as well, so this one is missing
from my mapping table, well, what happens if I
just go ahead right now and refresh this? Let's see what we get. The information should be
appended, it's right here, and the name is missing, so now, let's go ahead
and add this to our table, so it was first comma last name, but notice everything
is in upper case here, so we don't have to worry about the different cases we might have, so first comma last name, and that should map to name, now when I refresh this, we should see our names
pop up here, and we do. Okay, so now let's go ahead and make the final corrections, just grab the company name by going to transform, extract, text before delimiter underscore HR, let's pick that as the delimiter and okay, change this to company and
also update the data types, Control+A to select everything, under transform, detect data type, all our steps are added and recorded, and we're just going to go and
load this back to our workbook. Now, in case you're
dealing with a lot of data, one thing you can do is to
buffer your mapping table, this is going to give
you a performance boost. So that's one way of solving this, but there are always different
ways to solve a problem, the more you learn, the
more choices you have, and then you can pick the better
one based on the situation. If you'd like to improve
your Power Query skills, so you have more options
to approach a problem, check out my complete Power
Query course on xelplus.com. As usual, thank you for watching, don't forget the thumbs
up before you leave, and if you're new here,
consider subscribing, and I'll see you in the next video. (upbeat music)