Welcome to MSPTDA
video number 6. In this video we're still
studying Power Query, and we want to see three
important transformations. We're going to see how to
merge, append, and UnPivot. Our first example,
we will see how to merge two tables to replace
either VLOOKUP in Excel or relationships
in the data model. Here's our end result. We
need this sales rep table with territory, not like
the original sales rep table where there's
territory ID and we're either supposed to do a VLOOKUP
to the lookup table or have a relationship
between the two. Not only that, but
the source for each are two different
files, text and Excel. So we're going to go from these
two files into a single table. Our second example,
we want to see how to append one,
two, three tables, stack them up on
top of each other, and get a single table
with date, product, region, and revenue. In this example, we'll
have some problems because there are different
columns in each table, and the first table
from the sales for 2017 doesn't even have the
revenue calculated. No problem. Power Query can take
all of these tables, transform them, and append
them into a single table. Finally, we'll
get to do UnPivot. Someone's keeping data in a
non-proper dataset, students down the rows, class
across the columns. So we need to UnPivot this
and create a proper dataset. If we go over to
the sheet merge, we could see our first example. Oh, wait a second,
there's nothing here. We got to go look
at our files first. Now I'm using Windows
Explorer, and you can download these files. There is an Excel file. If we open it up,
it's an Excel table. We can definitely import this. And there's our text file. In our text file is the
tab delimited text file. So we'll import
both of those files. Power cruising the data
ribbon tab, get and transform. I'm going over to the get
data button from file, and we'll do the Excel
file first from workbook. Navigate to wherever that file
is, double click to import. This will open up
the navigator window. We only have one file to
get, so we simply select it. Now we come down, and because
we're in Power Query in Excel, it's this transformed data. I'm going to click transform. Step 1, I'm going to come
over and name this sales representatives and enter. These steps are fine. This table looks like it already
has the correct data types. Now we're going to close
in load, close in load 2. We want only create
a connection. That means Power Query
will just be connected to the data in that Excel file. Click OK. There's query number 1. Now we go over to get
data from file and text. We'll navigate to wherever
that file is, double click. It's going to ask us to
file origin, the delimiter, and it's going to
ask a difficult judge on the first 200 rows. Yes to all those. It is tab delimited. That's delimiter, so I come
down and click transform data. I'm going to come over and name
this something like territory lookup table and enter. Now look at this. It did not interpret
the text column headers as column headers,
but no problem. We can come in the
upper left hand corner, click the dropdown, use
first row as headers. That's also up in the transform
group in the home ribbon tab. And then we have promoted the
first record to field names. The data types are OK. Our steps are OK. Close and load, close and load
2, only create a connection, click OK. Now we already have our
two queries over here, so now we can come to get data. Down to combine
queries, merge is when we replace VLOOKUP
or relationships. Later in this class we'll
see that it's actually doing a join, and
we'll get to see the different types of joins. A pen, that's one on
top of each other. We want merge. Click-- it'll ask us
what the first table is, sales representatives. What is the second
table, territory look up. Now we simply have to
click territory ID in both. This is the foreign key. This is the primary key. If we were in Excel, VLOOKUP
would look at that value, find a match, and bring it back. In the data model, we'd
simply create a relationship. And if we wanted the value
over here, we'd use related. Now we're going
to do a left outer all from the first
matching from the second. And get this. We definitely want that because
we want every single sales rep name but only matching
from the second because this company
does not sell in all of the territories
in this lookup table. So left outer will work. Click OK. Now we'll bring in a table, and
if you click off to the side to the right of
the word table, you can see down here
it is bringing back the single record that matches. There's MEX, a foreign key. There is the primary key, and
there's the territory name. I'm going to come over
to the expand button, uncheck use original,
uncheck all, and we only want territory. Click OK. Now I'm looking
at the data types. The data types all look fine. I actually do not want this
now that I have territory, so I'm going to
right click remove. I want all the rest
of the columns. Now look at this. I forgot step 1. Step 1 is always name the query. D sales rep, as
if this was going to be a dimension table in
a star schema data model, and enter. Now for this example, I'm going
to load this to the sheet, close and load, close and load
2, table, existing sheet, A1. There it is. That's on the merge sheet. Click OK. And there we have successfully
merged the two tables replacing VLOOKUP or relationships. Now I'm going to
click the filter, and filters are awesome
because they'll always give you a unique list of all the items. And I'm looking through--
these are all the areas that this company sells in. No USA. Well, we just got
a new sales rep, and they're selling
in the territory USA. So I'm going to save this, find
my source data, and open it, go down to the bottom,
control down arrow. Right here, I'm
going to hit tab. This is an Excel table,
so it adds a new record. Type the sales rep ID, the name,
today's the hire date, and USA. Now I'm not going to hit tab. I'm going to hit enter. Save Alt F4 to
close that workbook. Now I'm going to come over and
somewhere in our query output, I'm going to right
click refresh. And you could see
there it is, 135 rows. If I click the dropdown, it'll
be right in alphabetical order. This is just simply
amazing about filters. There it is, United States. If I control down arrow, sure
enough Sue Red Coolinator has been added. So that's our first example. We went from a text
file and an Excel file, merging the two
into a single table. Now let's go look at
our second example. Actually this is probably
going to be the most fun one. And the data is
right here in Excel. If we go over to 2017,
date, product, region, that's all good. Those columns match the
columns in the other tables. But units, net cost,
equivalent, and price, that's what we're given in this table. We have to multiply all
those to get our revenue. If we go over and look
at 2018, this table, if you look through it,
is exactly what we want. It has the four columns,
everything correct. 2019, uh oh, there's
an extra column. Now what we're
going to do first is I'm going to import all of these
and then try and append them because this is
a common mistake. We'll see what happens. So I'm clicking
in a single cell. This is already an Excel
table with a smart name. Now I go to data. From table arrange,
that name is OK. Those steps are OK. I'm thinking though I'm
going to change this to date. It'll ask me if I
want to replace it because that automatic
step there was already applied to data type. I'm going to say replace. The other data types look fine. Now we're going to leave
this for the time being, but later we'll come
back and fix it. Close and load, close and load
2, only create a connection, click OK. Now I go to 18. The keyboard to import an
Excel table as alt, A, P, T. I'm going to change the
data type to date, replace. The data types look fine. The name is OK. The steps are OK. Close and load, close and load
2, only create a connection, click OK. Now I go to 2017. A single cell is
already selected, alt, A, P, T. Date, I'm going
to change that and replace it with date replace. Everything over here is OK. Close and load, close and load
2, only a connection, click OK. Now we can go up to get
data and let's try append. Combine, append,
and this is so cool. I remember earlier
versions they didn't have this, three or more tables. I'm going to click Sales 2019. Hold shift, click Sales
2017, add, click OK. Well, it got date,
product, and region, but color, well,
it looks OK here. But definitely units, net
cost equivalent, and price, we can see what happens. These three columns were
only in the 2017 table. So for all the records with
2019 and '18, they'll be nulls. Somewhere down
below here, there's nulls because 2019 was
the only table with color. If we click the drop down
here, look at this load more. If your filter doesn't
give you a complete list, click load more. And there it is. There's some nulls down there. Click cancel. Now I'm going to come over and
name this something like three years data. Close and load,
close and load 2. I'm going to create it only as
a connection for the time being. It's a totally messed up append. Click OK. Now all we have to
do is go edit these. So sales 2019, I'm
going to double click. So for color,
right click remove. Close and load will
just close to the place it's already been loaded, so
it will close to a connection. Now I come over to Sales 2017. Double click and
our goal is to add an extra column that will
multiply price times net cost equivalent. Now a net cost equivalent means
how many pennies for every $1.00 of the price should
we charge the customer. Down here there's no
discount, so 1 times 19. Down here, the first record is
a discount of 9% or 9 pennies, so we're going to multiply
91 times every dollar. So we multiply these. That will give us the net
price times the units. So watch this. Instead of like we
did in earlier videos where we added a
custom column, I'm simply going to highlight units,
hold shift, click on price. My goal is to
multiply all three. And in the add columns,
there's this great feature, from number. Click the standard and I'm
going to click multiply, and it will add a column for us. Now a couple videos ago,
we went and edited up in the formula bar. Here's our inserted
multiplication step. We can see table
dot add columns. And I don't want
multiplication as our name. This should be revenue. So instead of adding
an extra step here, if I look up in the
formula bar, change type, that's the function acting
on the previous step. That's the name of the column. Each means for each row in
this table list dot product. That means it's
going to multiply whatever's in this list. And look at that. We learned the syntax last time. Curly bracket can be the
syntax for the positional index operator, but curly brackets
are also the syntax for a list. So what did it do? It used the lookup operator,
which is the square one, got the units for this
row, net cost equivalent for this row,
price for that row. The curly brackets
make it a list, and then we can use the
list dot product function, which means it multiplies. It even added a type. Now I'm going to come
over to the formula bar, double click multiplication,
and call this revenue and enter. And there we have our revenue. Now we do not need units
all the way to price, so I click one, hold
shift, click the last one, right click remove. Now for 2017, I have
exactly what I want-- date, product, region, and revenue. Now before we close
and load this query, there's two other
important points. Remember we're trying to append
these one on top of each other, and we have date, product,
region, and revenue fields. Now remember when we
had different fields, we ended up with extra
columns with nulls. Well, that will happen
also if you do not spell the field names
the same, and you can run into trouble if you
don't have the same data type. For example, if one is text,
then all of the columns will end up not
having the data type. And then for a revenue
column like this, you wouldn't be able to add. So when appending,
we actually have to have the same
columns including all the tables having revenue. All the field names have
to be spelled the same, and we need the same data types. Home, close and load. And now watch this when I
double click three years data. That is so beautiful
in Power Query. We have various queries over
here, and they're connected. And I can go edit any of them. And when I come back
here, it's like magic. It's all been cleaned up. Now, if we look at the formula
bar, this is a cool function. All it did is it knows
that there's these names-- that's the names
of these queries, which happened to be tables-- table dot combine. That's all it did. And then it lists syntax. It listed each one of the tables
that needed to be appended. Now I'm going to close and load. And I'm going to
come over and change the load location, right
click, load 2, table, collapse. And I'm going to try and
come over to append, A1. There it is. Click uncollapse. Click OK. And there it loaded it,
but it wasn't polite enough to jump over there. But sure enough, there it is. Actually if we come back
to wherever we were, all I have to do is single
click the query, and it jumps to that sheet. Our last example, UnPivot. Now our goal on this sheet is
to take this non-proper dataset and convert it to
a proper dataset. Now what someone
did here is they took raw data that should
be in a proper dataset and organized it into a
cross tabulated dataset. That means these grades
for these classes are associated
with that student. Well, if this was
a proper dataset, there would be three columns-- student, class, and grade. The next record would be
student, class, and grade. Now the process of taking
data structure like this and converting it
to a proper dataset is called UnPivot
because this sort of looks like a cross
tabulated pivot table. Now it's almost like a
cross tabulated pivot table except for that 1.7 grade
for Ernestine in Business 218 is not a summarization. It's a bit of raw
data, but it looks like a pivot table
because at the column header is a condition or
criteria and at the row header is a condition or criteria. So because this is
actually raw data, we're able to take
this in UnPivot it into a proper dataset. Another way to
think about this is if we did create a
pivot table, remember there's a unique list
of items in the row area and a unique list of
items in the column area. It is the column area that
we're going to UnPivot. We literally tell
Power Query, hey, take these columns in UnPivot. Power Query will know
to take these conditions and put them in a new column
and take all of the raw data numbers and put them
into a third column. Let's see how to do this. I'm going to click
in a single cell. This has already been converted
to an Excel table, which is required when we
take data from Excel and bring it into the
Power Query editor. So with a single cell
selected, I can go up to data from table
range or we'll just use our keyboard, alt A, P, T. Step 1, I'm going to name this. So final proper dataset
grade table and enter. I do not need change type, so
I exit out, click the red x. Now when we're
doing UnPivot, these are the columns I
want to UnPivot. So I could click on
the first column, scroll over, hold shift,
click on the last column, and then right click
any one of the columns. And down at the bottom,
there's UnPivot columns, which would work when we
select the actual columns we want to UnPivot. UnPivot other columns is
another option and UnPivot only selected columns. Now the problem with doing
it with UnPivot columns is we had to highlight a lot
of columns instead of just one column and then click
UnPivot the other columns. The second problem
with this method is if we choose UnPivot columns,
the code underneath the m code will always write it as
an UnPivot other columns. So it's much faster,
especially in this case where we have lots of columns
here and one single column, to select the single column-- that's faster-- and
then use UnPivot the other columns not selected. Now UnPivot is
over in transform. There it is right there. You could click the dropdown,
or we can simply right click UnPivot other columns. And there we go 1, 2, 3 columns. If we look over here
in our applied steps, there it is, UnPivot
other columns. If we look up to
the formula bar, there's the table dot UnPivot
other columns function. The functions always act on the
previous step, then it says, hey, which column do we right
click and want to UnPivot other columns-- there it is-- and then it named
the two columns. Now I'm actually going to come
over and x this out and watch this. I can select all the
columns, right click down to UnPivot these columns. And when I do that, I get
the same line of code. So x out right here right
click because it's faster to just right click
this one column and UnPivot other columns. Now want to rename
all three columns. So I'm going to double
click and call this student. If I want instead
of double clicking, I can hit enter and use
the arrow key to arrow to the next column and F2
to put it in edit mode. This will be class, enter
arrow, F2, and this is grade and enter. I also want to
change the data type. This will be text. That one's fine. This one will be decimal. There's our applied steps. There is the name. Now we want to close and
load this proper dataset, click close and load
then close and load 2. I definitely want this on a
table on the existing sheet. I'm going to try and
click in K4, click OK. And there it is. We have a proper dataset. We can do things with
this proper dataset that we can't do with a
cross tabulated data table. We can sort the grade
column largest to smallest then sort the class column. And now we can
see all of BI 348, the grades from
highest to lowest, and then go on to the next
class and there they are. So in this video, we saw
three amazing transformations that Power Query can do. We definitely can go from a
cross tabulated data table UnPivot into a proper dataset. We can definitely append
three different tables, but before we append, we
can fix them in Power Query and then append one
on top of each other into a single proper dataset. And in our first
example, we saw how to take two different files,
one with a foreign key that needed to look up to a
lookup table and a primary key, merge these two tables
into a single table, and replace VLOOKUP
or relationship. If you like that video, be
sure to click that thumbs up, leave a comment, and sub
because there's always lots more videos to come from Excel
Is Fun, including video number 7, our next video, where
we'll do lots of merges looking at all the
different join types that Power Query can do. We'll see you next video.