Today let's take a look at a common task you might have already come across, which is to combine multiple
Excel files into one file. So for example, let's say
you sent out a template to your colleagues to collect some data. You got them in separate files. Now you want to combine them. Basically, you want to
consolidate or append the data in one single file. One solution has always
been VBA, but this time we're going to use a simpler approach. We're going to use Get & Transform, also known as Power
Query, from the data tab. (upbeat music) (air whooshing)
(bubble popping) My aim is to combine the data from these files by directly
connecting to the folder. Now, there are a few requirements. I don't want to include any files that don't have _Data in it, and I also want to be sure to
exclude any non-Excel files. Let's quickly take a look
at the content of the files. I have cells information
for a specific month. The data is not in an Excel table. The structure of the files,
though, are identical. They all have identical headers. Now, I want to make sure
I exclude this file. Now, what want I want my end
result to be is a PivotTable that combines the data from these files. I want to get a summary sales report by a company and customer. Let's open a blank workbook. Go to Data + Get Data +
From File + From Folder. Browse for the folder. My files are sitting right here. OK and OK. Now, Power Query goes
ahead, creates a connection to the folder and checks
what's inside that folder. If all the data was in a proper format, if it only included the files that I need, I could go ahead and Combine & Load the data directly to the workbook, but I need to make some transformations on this. So let's go ahead and transform the data. Before I combine these files,
which we can see right here, what I want to do is
to filter out the files that I don't need. So I set that I don't need anything that doesn't have _Data. So for text filters, I
want to include everything that contains _Data and OK. I also want to make sure
I've restricted everything to Excel files, Begins With .xls and OK. Those are my two steps. Now I'm ready to go ahead
and combine the content. All I have to do is click
on this double-down arrow, Combine Files, and Power
Query is going to bring up the navigator, asking me which tab or which data I want to combine. The decision I make here
is based on a Sample File. So it's picked the first
file, which is Bere Kleid in that folder. From this first file, I get
to pick the tab that I want, and let's go with OK. Now, Power Query's going
to go and try to figure out what transformation it should
apply to that sample file, apply that transformation
to all the other files and append the data. This is all the work
that it's done for me. Now, unfortunately at this
point, it's not perfect because my data wasn't
in perfect Excel tables. So I need to make some
adjustments but it's gone ahead and added all these steps. My last step before I click
this button was this step. What it did is it added another filter. It filtered the hidden files. So if I happen to have
one of these files open in the background, I'm not
going to run into problems. It's going to be automatically filtered. Then it went ahead and created
all these transformations, which is not that much here. It tried to figure out what it
should do to that sample file to get it in a proper format
so that it can apply that to all the other files before
it appends them together. The function here takes this
transformation and rolls it out to other files
before it combines it here. That's what this step means. Then it went ahead and
decided to rename this column to source that name, and then it removed all
the other information about the file, and then
it expanded our data and it applied a Changed Type. So now, we know we need to
clean this before we load, but where do we do the cleaning? Do we clean the appended version, or do we clean the sample version? Now, it's up to you,
but wherever possible, try to do the cleanup
on the sample file level before Power Query appends them together. Now, some steps, though, you need to do on the appended level because
only on the appended level do I have the file name,
which I could actually use to grab the company name. So if I go back a step,
before we expanded the table, I could actually take care
of that right after this. I could also take care
of it in the end here. In my case, I'm going
to take care of it here, so I'm going to rename this
to Company + Insert a Step and I'm also going to
extract the company name from the file name. Transform + Extract +
Text Before Delimiter. My delimiter is an underscore, but I'm just going to add
_Data to this and go with OK. Now, once it expands it, I have the correct company name expanded. Now, because my header changed, the Changed Type step doesn't apply. I'm actually going to exit out
because it doesn't make sense to apply a Changed Type to these. I should clean these before. Let's take a quick look at
the cleaning we need to do. First off, the column
headers are repeated, and second, we have a bunch
of null rows at the bottom of some of the data sets here. Let's clean that by cleaning the sample. Inside the sample file, I don't need the Promoted Headers there. What I actually need to do
is remove the top two rows. Remove Top Rows, put a two and OK. Now, I want to promote the headers here, so Use First Row as Headers, and next, I want to take
care of these null rows. So Remove Rows + Remove Blank Rows. Now, another thing it did automatically when I promoted the headers is it decided to change the type. That doesn't really help me,
though, in the append side because I still need to do the type here. So what I prefer to do is
remove the Changed Type step from here, and get that
correct in the last step here. So I'm just going to press
Control + A to select everything, Transform + Detect Data
Type and just double check that everything is correct. 'Kay, I have clean data. Let's go and load this as a PivotTable. So Close & Load + Close &
Load To + PivotTable Report. Let's just put it in the
existing sheet here and OK. Now, let's analyze the sales values by customer and company. Let me just update the design here to show in tabular format. This way, I can see the
total sales by each company to the customers and see
which ones are selling to the same customer. Now let's say I get new data. Lucas Basics here decides
to send their data, so I'm going to grab the
file and drop it here. They've inputted data, they
just have put the columns in a different order. So Sales Value's no longer in
the end, it's in the E column instead of where the other ones have it, which is the F column. Is that going to give us problems? Let's check. Let's go right-mouse click and Refresh. We have Lucas Basics' data right here. The order of columns doesn't matter. What matters is to have
identical column headers, because check this out. If I open Lucas Basics_Data and instead of Sales Value "S" with a
capital "S," I have a small "s," I'm just going to save
this, and let's go back here and Refresh, the data's
not going to show up. Now, for problems such as
this, you need to decide where you want to do the cleanup. Do you want to cleanup the source file or is this a common problem that some headers could be
capitalized, some headers aren't? Is this something you want to take care of in the Power Query side? If you decide to take care of
it on the Power Query side, you could add a new step and
capitalize all the headers before you combine them But to do that, you need to
be familiar with M functions. This is something we're going
to cover in the advanced section of the course. If you're interested to find
out more my Power Query course, check out the description
below this video. Now, what if you wanted your
end result to be a table instead of a PivotTable? Well, all you have to do is
change the load destination. Right-mouse click on the
final query + Load To, select Table instead of PivotTable. Existing worksheet $A$1
is fine, and click on OK. It's asking us if we really
want to remove this PivotTable. We do, and that's our table. All the data appended below one another. So that's how you can use
Excel's Get & Transform feature to combine data from
multiple files into one file. I hope you enjoyed this video. If you did, give it a thumbs up,
(popping bubble) and if you like what you see, consider
(mouse click) subscribing to this channel
(dinging bell) if you haven't done so already. And don't forget to hit
that bell so you get updates when new videos come out here, and I'll see you in the next video. (upbeat music)