Welcome to Excel
Basics video number 25. Hey, in this video
we've got to talk about the amazing and
brand new feature called Power Query, also known
as Get & Transform to convert bad data
into a proper data set. Now, this is what it
means by bad data. Someone decided the
description column we would combine product,
name, date, and region. Separated each by a space,
forward slash, and space. Really, those are
three separate columns. And the reason this is bad
data is because, of course, we can't make a pivot table,
or use all other data analysis features like sort to
perform data analysis. So what we're going to
learn is that Power Query allows us to clean data,
transform data, and even import data. Now, we'll go from this bad
data set into a proper data set, which of course
then will allow us to create a pivot table. That will be example number one. Example number two, we're going
to import multiple text files. And many businesses
transfer data from one system to
another using text files. And so every month-- April, May, June and July-- we get a new text file
with the sales data. So what Power Query can do
is simply look in a folder and it will take all
of the text files and create a proper data set. Now, of course, proper data
set-- as we have seen, what 20, 30, 40, 50 times in this class-- field names at the top, records
in rows, and empty cells all the way around. Now, the reason that the proper
data set is so important, of course, is because
over on data analysis we've been defining data
analysis and business intelligence as going from raw
data into useful information. But guess what,
sometimes there's this step before
proper data set. Here we have bad data. The region and sales
rep is combined. So we've got to
go from bad data, clean it to become
a proper data set, and then do whatever we're going
to do from that proper data set to create useful information. So far in this class we've
done all of these amazing data analysis topics. And our last topic
for data analysis is Power Query, also
known as Get & Transform. Over on this sheet,
what Power Query does. Well, Power Query-- one, two,
three-- clean, transform, import. And we'll see all three
in our two examples. Now, we're got to talk
about something important. Because if you do not have
Office 365 for Excel 2016 the data ribbon tab
looks like this. And the group we're
going to be using is called Get & Transform. If you have Office 365
for Excel 2016 then that data ribbon
tab looks like this. Now, the buttons
we're going to need-- if you don't have Office 365-- and this is what the group
looks like in our lab here at Highline-- it'll
say Get & Transform group in the data ribbon tab. For our examples in
this class, we're going to use the from table
button, new queries, and show queries. Now, when you have Excel
data in an Excel workbook, you use the from table button
to bring it into Power create. When we're importing
external data, we use the new query button. And when we want to see
a list of our queries, we click show queries
and it opens a task pane. If you have Office 365/ Excel
2016 there's the button to get data from Excel. There's the dropdown get
data to import data for us. We'll be importing text data. And over in the queries
and connection groups, that's the button--
queries and connections-- you click to see a
list of your queries. Now. If you download the PDF
notes below this video, I have detailed screenshots
of every single step in this video. And when there is a
difference between those two, I have screenshots of both. All right. Let's go look at our
first amazing example. I'm going to click
on the sheet EX one. Now, here's our data set. And if I scroll down, we
can see what the problem is. This column really
is three columns, and so we need to
split it apart. Not only that, but once we
get it into a proper data set, this is the actual pivot table
report that we want to create. Now, that's a picture I saw
a minute dragging it out of the way. Now, as we look at our
data set over here, notice two videos ago we
learned about Flash Fill. So I actually could
come out here, give it an example or two,
and then Flash Fill. Now, that is perfectly OK
if it's a one time event. But if you have source
data like this and this source data might change-- or more importantly you add
new records to the bottom-- then you want to
use Power Query. Because Power Query-- once we
create our proper data set-- can be refreshed when
the source data changes. Now, this data set here is
how the data gets sent to us. So every single time
we get sent new data, we have to split it, and
then make our report. Now, if we scroll over to the
site later after we create everything-- the Power Query
and the pivot table report-- we're simply going to
get these new records, include room at the
bottom of our Excel table, and everything will update-- the table, Power Query,
and our final report. So that's an example
of when you'd want to use Power Query
instead of Flash Fill. Another important
bit of terminology we need to learn about is
when you get multiple columns all in a single cell. Or in our next
example we'll have all of the columns of data on
a single line in a text file. The character-- or
set of characters-- that separates one, two three
columns is called a delimiter. So our delimiter is space,
forward slash, space. There are all sorts of
different delimiters. Sometimes people
use tabs, sometimes a comma, sometimes all sorts
of different characters to split the different columns. They're called delimiters. Now, another important thing is
if we go up and look at data, anytime we use
from table or range or get data or get text
data it will open up a new window called
a query window. Also, in this
version if I wanted to see the list of queries,
notice I click the button queries in connection. So the word query has a
very important meaning. Now, this is the first
time we've seen a query. And all a query
means is I'm asking some question of the data. Now, our question-- or query-- is can you please
take this column and split it into three columns. Namely, can you split it into
product, date, and region. And the word query is
universal in all areas where data is used. Next video will be our first
video about the program Access. In Access we have a
whole section of Access that is just queries where we
ask questions of our data set. All right. So we are ready to build a
query to ask the question, can you please create a proper
data set from this bad data data set? Now, in order to get data from
Excel and click this button, you have to first convert
it to an Excel table. Now, the reason we have to
convert it to an Excel table is because they want you-- if you're going to use any of
these amazing query tools-- they want you to make sure that
it is a dynamic table of data. That if the data changes
or we add new records or take away new records,
everything will update. All right. So step one we've got to
convert it to an Excel table. We go Up to insert,
click on table, or of course we use control
T. My table has headers. Click OK. Now, we immediately
want to go up to table tools,
design properties, and name this table. We have the name
everything smartly, especially since there's going
to be lots of different names-- not only here in
the start table-- but in the query and in
the final clean table. So I'm going to call this
start sales table and enter. Now, with a single
cell in our Excel table we can go up to data. Get & Transform--
whichever group you have-- and select from table. Just like that it
opens the query editor. There is our table. As the data sits
in the Excel sheet, we're going to come over. And in the query setting we
want to give this a good name. That start sales
name is going to be the name of the table
that sits in the sheet. We want the query to
have a different name. I'm going to call this sales
proper data set and enter. sales this query is named
Sales proper data set. And once we transform
and clean this data set, when we load it back to Excel
it will be an Excel table which has this name also. Now, we have two columns,
and our goal is to split. So I'm going to click
on the description column up in the home ribbon
tab here in the query editor. I go over to the
transform group, and I'm going to
select split column. Now, it asks by a delimiter. That's something like a tab,
a comma or in our case space, forward slash, space. We can also split it by a
set number of characters, like if we only wanted state
and it was two characters. But I'm going to
select by delimiter. Split column by delimiter
dialogue box comes up. It thinks we want
a space, but I'm going to click the
dropdown and select custom. And I can type any set of
characters that I want. Space, forward slash, space. We can choose where to split. I want to split at each
occurrence of the delimiter. Now, when I click
OK look at that-- one, two, three, and four. And remember this
is totally dynamic. When I add new
data next time, it will split that data just
like it split this data. Now, we see that these
are not very useful names. So we're simply
going to rename them. Double click description
one, and we're going to call this
product, and enter. Now, I'm going to double
click description two, and call this something
smart date, and enter. Double click description
three, and call this region, and enter. Double click amount, and I'm
going to call this sales, and enter. Now, the other thing
we want to notice is that there's little icons
that tell us what type of data Power Query thinks this is. Now, ABC is the icon for text. The calendar is
the icon for date. And one, two, three is the
icon for a whole number. Now, you can click
any one of these-- the icon-- and it allows
you to change the data type. Now, the data type
is really important, because this is a
Power Query table. And when we dump
it back into Excel, it will actually be
Power Query output. Now, if we try to
create a pivot table and want to group to date,
if it was accidentally coded as a text data type,
then grouping wouldn't work. Similarly, if the sales number
was coded with a data type text, then we wouldn't be
able to do some calculations like summing. So data types are
very important. Just like we saw over in Excel
with the default alignment. But here we explicitly
say what kind of data type can go in this column. Now dat is the first time we've
seen an explicit data type. When we get over to Access
in the next couple of videos, we'll specifically have to
define types there also. Now, the reason that
data types are important is because it's
a way of assuring that the data in this column
is the correct type of data. All right. It interpreted all
of these correctly, but I want to change this one. So I'm going to click on
the one, two, three icon, and I'm going to
select currency. The difference between number,
currency, and decimal-- decimal, you can have
up to 15 decimal places. Currency, you can
have up to four. And a whole number you
don't have any decimals. Now, we're going to come and
I'm going to click on currency. Now, it's not going to show
any number formatting here, but that is the data
type that we want. Now, if we look over here it
has saved all of the steps, including the first
two steps which it did. You can actually
click on any step and look exactly what it did. So you can click through looking
at how you built this query. If for some reason you needed
to come back and edit later-- which we're not going
to do in this class. We'll do in the class after
this, Business 218 Spreadsheet Construction Highline class. You could click on this,
and it would actually open up that step again,
and you can change it. Not only that, but if you
make a mistake over here you can click the X. So if
you really messed up and you needed to
delete a few steps, you could just click,
click, click and start over. Now, we have our steps. We have our name. Our data is set. Now, we need to load
it back to Excel. I'm going to click in the
home ribbon tab, close group. I'm going to click the
dropdown, and we're going to choose close and load two. Now, when I select
close the load two, it's going to open
up a dialog box that looks different in the two
versions of Excel 2016. It says import data if
you have Office 365. Is load two if you have a
version that's not Office 365. But in either one of the dialog
boxes, it doesn't matter. You can choose to put
it in a table, which is what we're going to do. And you can choose to put
on an existing or new sheet. We want to put it
on a new sheet. Now, I want to click OK. It's immediately going
to load the clean data. Over here we can see
queries and connection. If you have a non-Office 365
it will just say queries. But there's our
query, and we can see that it loaded 365 rows. There is our table. If we click inside that table-- table tools design--
look at that. It has the same name for this
Excel table as the query. Now, really they're
the same thing. This is just a dynamic table
built with Power Query. I immediately want to come
down and double click. This is just the sheet
name, but I don't want to leave it sheet one. I'm going to name it
clean data, and enter. One last thing about
the queries pane. If we come back up to data,
if I click this queries and connections-- or in the
other version it says show queries-- it actually hides that pane. So if you come
back later and you need to find that page and
you click on this button. And then if you
want to edit it, you can double click to edit
it or right click it. And then you can edit, or
look through your steps. Also if you want to
delete your query, right click delete or you can
simply use your delete key. But we don't want to delete it. I want to click in a single
cell and make a pivot table. Insert pivot table, or
the keyboard alt NV. I want to put this on
the existing sheet, and I want to put
it in G1, click OK. Now, I have the queries and
the pivot table fields open. I'm going to drag the
date field down to rows. And in Excess 2016, it
will automatically group. You can see down here
it says months and date. I'm going to drag date off, and
there we only have two months. Now, I want to drive
product down below months. Come over to the row area,
right click expand and collapse. Expand entire field. That is so beautiful. April and our product,
May and our product. Now, I can drag
sales down to values. Right click the values area. Number format I want currency. Two decimal symbol is OK. Click OK. Now, here is the amazing moment. We're going to come
back over to EX one. We are hoping that all of those
steps and all of that process is automatic. That means if we
add new records, everything will update. I'm going to click in cell AA15,
control asterisk on the number pad. Control C to copy. I'm going to scroll over. Click in somewhere in the
first column of the data set. Control down arrow to
jump down to the bottom. Click in cell A380, control V.
Because this is an Excel table, it automatically
incorporates it. If it did not-- then as we saw in
our earlier video-- you can simply go up to
design, table resize. Now I'm going to
control up error. Now, notice that
was the start table. Over here I see the query. And over on clean data,
I see the pivot table. Now, you can right
click update there, or I'm going to come to my
Power Query created Excel table, and right click refresh. Actually I'm going to--
either one will work. I'm going to do it
over here because this will be more dramatic. Right click, refresh. And you can see it's loading. And what it did is it re-split
and added the proper names and types, but now for 481 rows. Our table is updated. I can control down
arrow, and look at that. Control, home. Now, I want to come to the pivot
table, right click, refresh. And just like that
it totally updates. The new data was cleaned
and transformed, dumped back into the table. And we simply refresh
our pivot table, and our report is updated. Now, I forgot one
of the columns. So now I'm going to go over
and add region to columns. And there is our
completed report. Power Query really is
an amazing new tool that helps us clean the data. And the beauty of it is
everything is refreshable. Now, we have a second example. I want to scroll
over to example two. Now, I want to go
to our website, our start file which
we're working in. But you have to download
this zipped folder with our text files. Now, I'm going to right
click save link as, or however you want
to download it. Now, I'm going to
download it to the desktop just to make it easy to navigate
to inside of Power Query. But by all means,
you can download it to your system of
folders, Excel folder. I'm going to click save. Now, on the desktop
I'm going to unzip it. And your computer may
unzip in a different way, but we can always
unzip it right click. And then some
option here will say extract all or extract here. I have seven sip. So I'm going to point to
that, and say extract here. That will give me
the single folder. Now, I want to double
click this folder, and go look at what's inside. Now, here is the folder
we just unzipped. There is a start
folder, and then a text file with June and July data. We're going to not
use those at first. Later we're going to copy
those-- as if we just got them by email or something-- and then
dump them in our start folder. I'm going to double
click this start folder. And these are the two files that
we're going to import first. Now, what we'll do is we'll
point Power Query to this start folder. It will actually see
the entire file path. And Power Query--
when we refresh-- will come back to
this start folder and try and import
anything that is here. Now, for our example
we are only going to have monthly text sales data. For us, our system is
clear, the start folder holds only text files. Now, lets double click one of
these files and look and see what a text file is. Now, I'm going to double click. Now, it might open in
Notepad or something else. I want to close this, and I
want to right click open with. And either find word or
choose default program and select Word. Because when we open
it in Word, we'll be able to see exactly
what the delimiter is. Now, when a file is saved as
.txt it's almost always a tab delimited file. Now, the reason that this
type of file is so t in data analysis-- and for
the entire world-- is our data is saved
mostly in databases. But there are so many different
types of databases and systems to store data that
they all agree that text files is
how we send data from one system to another. So it's very common that
your database wherever you're working will export
the data that you need to analyze in a text format. Now, sometimes it
comes with the tab. I have my non-printing
characters on so we can see. Other times it comes as comma
separated delimited data. Now, we can see that we
have product, date, region, and sales. And then there's a
bunch of records. So this is April data. I'm going to close this. And, again, our text file .txt
is a common way that systems transfer data out of
the current system. And then we can import it
basically into any system. All right. You ready? We're going to go over to
Excel and point Power Query towards that start folder. Here on example two
I've selected A11. I'm going to go up to data. Get & Transform-- in this
version it says get data. In the other version
it says new queries. So I'm going to click the
dropdown in whichever version, and point to from file. And then hover my cursor-- and this is so amazing-- from folder. I'm going click it. Folder dialog box
we click browse. Now, we browse-- and
you can browse anywhere on your computer. I made it simple. So I click the triangle,
and I click on start. You go wherever you saved
it on your computer. But when you point Power
Query towards start it knows the whole
file path, and it will try to import everything
in that start folder. I'm going to click OK. There's the file path. That file path is memorized
in the very first step of the Power Query editor. Click OK. Now, in the class after this
will click edit, and learn some advanced techniques. But if you only
have text files, we can use this amazing
button right here, combine. We click the dropdown
and say combine and edit. Now, when I click
combine and edit it's going to take a
while, because it's doing a bunch of steps. Actually before it does
anything, it's polite. It says, hey, I'm going
to use the first file. Here's the file origin. I'm going to base my
data combining on what I see in the first 200 rows. But what we want to make sure is
that the delimiter is correct. It even gives you a
little preview here. Our delimiter is tab,
so now we click OK. And now it builds
a bunch of steps. It opens up the Query Editor. Now, over here it
did a bunch of steps, including building a custom
function that it uses to combine all the text files. Over here there is our name. There's all of the
steps it's already done. We can see our
columns over here. Now, I'm going to click
in name, and we're going to call this something
like monthly text sales proper data set, and enter. We don't need to view these
automatically created steps, so I'm going to
flick a collapse. There's an uncollapse
and collapse. By the way, you get to
see all of your queries that you have in
this workbook here. There's the - we already
did in example one. Here the one we're
working on right now. I'm going to close this. There is the name of the
query, there's our steps. Now, here's a column. And we don't need that. It tried to be
polite, it thought we might need the file name. But we don't need this column. So I simply right click the t
header-- or the field name-- and point to remove. Product is a good name. Text data type is fine. Date is a fine name. Date data type is good. Region ABC text
type, that's fine. Sales, that's a fine name. I'm going to click
on one, two, three and change it to currency. Now, I've added
our last two steps. Again, this is quite beautiful. You can-- later if your folder
path for that start folder changes-- you can come back and edit this. You can come over to the
little gear and click on it and actually change
the file path, browse and find the
new start folder. Cancel. That's an example of how we
would come back here later and select a step and edit. And, again, if you really
mess up you can click. Click to delete those. In fact, let's just do that. I'm going to click
on the last one. That was our last step. I'm going to X this one out. Notice it goes back to
whatever it was before. And I'm going to
change it to currency. Just illustrating how easy
it is to edit this query. Now, we want to
come up and close and load, close and load two. In fact, if we want to dump it
straight to a sheet as an Excel table, then we can simply
click this icon at the top. I'm going to click, and now
it is loading to a new sheet. There's our table. There's the new
query at the bottom. I can collapse all of
these automatic steps. Collapse all of these
automatic steps. I want to immediately come
down here, double click, and call this clean
text data, and enter. Now, we can see that
we have 365 rows. If I click in the data
set, control down. Now we have 366 rows, because
we have field names at the top. Control, home. Now, let's go back
to Windows Explorer. And I can see up here start. I'm going to click in the
address bar on file downloads. Now, I'm going to click
on 03, hold shift. Click on 04. Control C to copy, double click. Control V. Now, that start folder
has four text files. When I go back and refresh-- I'm going to come over to
the query pane, right click refresh-- instantly our new
two months of data has been added to our table. Now, we could go ahead and
create our same pivot table we did before. But that is absolutely amazing. Data, get & Transform-- either get data
dropdown from table. Power Query to clean our data
and give us a proper data set. All right. So in this video we
saw how to import text files from a folder, and
then dump new files in later. And over in Excel,
everything updates. Example one, we had
our bad data set. We converted it to a table. Used the button from table. Did our steps in Power Query. If we needed to look at them,
of course, double click and it opens up the Query Editor. There's all of our steps. Then we loaded it
back to clean data. Built a pivot table. Then later we added new bad
data records to this data set. And sure enough, when
we refreshed everything the table and our pivot
table report updated. Right? That is our last video for
Excel Basics in this Office 2016 class. That was video number 37. Next video-- video number 38-- we'll actually have just a few
different videos, maybe three or four videos
about basic Access. All right. If you liked that video, be
sure to click that thumbs up. Leave a comment in sub, because
there's always lots more videos to come from Excel is Fun. All right. We'll see you next video.