Excel Data Analysis Class 01: PivotTables, Power Query, Formulas and Charts

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
welcome to excel data analysis video number one in this video we're going to learn about pivot tables and formulas now to follow along you'll need to download these files in the link below the video [Music] now our goals are to use power query to import the data use formulas to create the data model and then from that data model use a standard pivot table to create our reports and our excel line chart here's the four reporting goals the first report we need to create is a common report where we need total sales by year and month and then we need a line chart to show how sales are changing over time next is a region by year sales report that shows the change year over year then we have to calculate the average monthly sales for each one of these products finally we'll have to calculate a sales report which is a cross tab that shows supplier at the top of the column and years at the head of the row now with data analysis it always starts with the data now we have the product price table and the sales rep region table in the worksheet now very importantly our data is stored in a proper data set also known as a table a table has field names in the first row that's the name of the data in each column and then records and subsequent rows now if your data is not stored in a proper data set you can't even use the data analysis tools in addition if the table is stored in the excel worksheet then you have to have empty cells all the way around your table or else the data analysis features will not work now our sales data comes to us in an external text file text files are common ways to get data from one system to another system now we have a dot txt there's also csv let's go take a look at this file now here's the file if you click up in the windows explorer address bar that file path is going to be used by power query to get the data into excel now let's go look at the file and i don't want to open it in notepad so i'm going to right click open with word now i have my non-printing characters turned on because we got to figure out how this file gets a full table from one system to another and the way it does it is with something called a delimiter a delimiter is the character that separates each bit of data and that arrow tells us that the delimiter separating the bits of data is a tab character if you have a csv file the delimiter is a comma and that becomes important when we import data we can also see date product id sales rep id and units there are four fields in this table now when importing data into the excel worksheet and later next video into the powerpivot data model we go to the data ribbon tab in excel and then to the get and transform group we have a text or csv file so i'm going to click this button you have to navigate to the correct location select the file click import in this window very importantly it wants to know what the delimiter is it's a tab and we can see that it interpreted that crazy text file as a proper data set now we want to click transform data this allows us to look at the data in the power query editor before we import it so if there's any trouble we can fix it in power query so i click transform data this is the power query window over here we can see the name of the query this will become the name of the table that's imported into the excel worksheet there's our table over here on the right there's one two three applied steps in this query now if we look at the source and click the gear icon and this is where the file path is listed if you need to change it you can come here later when i close this in every query that you create the source step tells you the source of the data now notice when the file came in it was a text file so everything including the field names were just considered text notice everything is aligned to the left the next step it promoted the headers we need those because none of the data analysis tools work unless we have a proper data set and then watch this everything is text right now if we tried to make a calculation in a pivot table on a text number it wouldn't work if we select the change type step we can see it got the data types correct the dates and all the numbers are aligned to the right now from our excel skills we can tell that these are all numbers because numbers are aligned to the right but very importantly at the top of each field there's a symbol that tells you what the data type is and if it didn't get it right although it usually gets it right you can easily change it this is a data type icon you can click on it and there's your list of data types date is fine for these one two three is the symbol for whole numbers whole numbers for all of these so power query got all three steps correct now we can load it home the close group close and load drop down and we want to select close and load to this opens up the import data dialog box now we can load it as a table you can load it directly to a pivot table or pivot chart if the full table is complete then you don't even have to worry about loading it to the worksheet we want to load the data to the table because we have to look up prices and calculate revenue and things like that later we'll see how to use this option to load the data to the powerpivot data model i'm going to click existing i want to put it in a9 as a table click ok and there's our imported table over here in queries and connection there's the query if you hover you can see the load location to the worksheet there's the data source now this pane sometimes is not open and you need to open it to look at your queries getting transform is part of power query but also queries and connections is part of power query if i click the pane goes away click again i can see all my queries now step one of our data analysis process is done we used power query to import our text file and the other two tables we had here at the outset now if we're going to use the amazing insert pivot table we have to have one flat table and what that means is that we have to add an extra column and calculate the sales for each row and not only that we're going to have to go and look up product supplier and the region and bring it back to this table when you use the standard pivot table every field that you want to summarize in every condition and criteria for the calculations in the reports and visualization have to be in one table now this will be totally different when we get over to power pivot and power bi there we don't even need an extra sales column because the formulas we build and the pivot tables do it automatically and we can use multiple tables all right let's build this flat table now when we take our data and prepare it for the final reports and charts that's called data modeling now our first task is to take units times the price that we'll look up and calculate sales in an extra column now this is an excel table so when i type the field name and hit enter a new field is automatically added to the excel table equals and if you've been around a long time you know how to use the vlookup function to look up product id and go and get the price but we're using microsoft 365 and there's a better function than vlookup it's called x lookup now x lookup is different in a few ways one is the default is exact match in the old vlookup the default was approximate match and there's a few other differences we're going to look up product id comma and lookup array there's no more table like in vlookup the second argument requires that you highlight the column with the items you're going to try and match from the lookup value now i type a comma return array instead of highlighting a table like in vlookup we simply highlight the field that contain the items we want to potentially go look up and bring back to the cell now that's it for our formula because the default is exact match lookup but comma you could put a message or a value in case you get an error comma these are the different types of match modes in the old vlookup there was only exact match and approximate match comma and then you can actually tell it how to search the bottom two we don't use because that's the old vlookup way but you can search first to last and last to first now we don't need any of those we're going to keep the defaults for all of them so that's our formula when we do exact match lookup close parentheses and now this is an excel table so when i hit enter the formula automatically copies down to every row now this formula isn't completed so in the top cell i hit f2 times our units and now when i hit enter there's our sales column now if i click in the top cell and hit f2 i want to take a closer look at these references these are not cell references these are references from our excel table objects anytime you refer to an entire column you have the table name the name of that table is d product and then in square brackets the field name anytime you refer to an individual cell in the same row as the formula you have to use the implicit intersection operator the at symbol that means from this entire product id column please get whatever's in this row now we're going to need the product name so equals x lookup i'm going to look up using my arrow keys product id comma there's the column to try and get a match comma and this is the column that contains the items i want to go look up and bring back to the cell now why did microsoft change from a table to putting in two columns because what used to happen is people would insert a column into the lookup table and then the old indexes that we used to put in there would break but with this that'll never happen all right that's our formula close parentheses and enter now i need to look up supplier equals x lookup product comma get a match there comma return the values from there close parentheses and enter the last column we're going to need is region equals x lookup and this time we're looking up sales rep id comma trying to find a match there comma and we're going to return the region none of our reports are going to require sales rep so we're not going to look up that column when i hit enter and adjust some of the column widths we have our flat table and we're done with our data modeling now we can build our pivot table reports now the first report we want is a year and month sales report and we want to notice that in the date column we have the sales amounts and units amounts listed by day there's no year or month column that's no problem for the pivot table feature it knows how to automatically roll all the dates and the associated numbers up into months and years now to create your pivot table report you click in a single cell on the proper data set go up to the insert ribbon tab tables and there's the pivot table button the drop down has some options your options may look slightly different but the one we want is from table and range click this is the pivot table from table or range dialog box yours may say create pivot table but both dialog boxes have table and range and when you select a single cell it'll automatically put the full table into the table range text box then down here we get to decide where this report goes we want to put it on an existing sheet with location selected we'll click on year month sales b3 click ok now the pivot table fields task pane starts over here i'm going to move it by grabbing it at the top and what we have here is every field from the table and to create our report we drag whichever field we want down to columns filter rows or values values is the area where we make our calculation so that's where we drag our sales fields these other ones are the conditions and criterias that sit on the outside of the report now we want year and month down the rows and we're going to see this amazing feature in a pivot table we will with our move cursor click and drag notice the no symbol no if you hover in one of the four areas notice that's the columns area you see the green bar but look at the icon it'll show you that's the column area filter row and that's the values what we want is rows and watch this when i let go it automatically creates three new fields years quarters and for the moment the month field is called date and you can see up in our field list there's the new fields here's the pivot table report let's select in the row area right click expand collapse expand entire field now wait a second i don't even want quarter we can click and drag off like this or if you want to remove you can simply come up and uncheck and that is amazing now when you create a pivot table like this it actually takes the entire table of data that we have in the worksheet and it stores it in a temporary location called a pivot table cache and down in that cache it stored a new field called years and quarters now i don't want quarters so watch this let's come over somewhere in the row area and right click and group click this is the grouping dialog box and if we click quarters now the pivot table cash will no longer group by quarters also notice if you needed to run an hourly sales report like to look at peak sales for a website that's an option there are seven options for grouping and anytime you group that means in the row area it lists the conditions which can be used in the conditional calculations in your report let's click ok so that means in a standard pivot table anytime you have the date field in the row area you right click and you can choose what type of grouping you'd like to do now another interesting thing is notice that the months are sorted correctly january to december now of course we would expect that but over in power bi desktop and power pivot month names like this are sorted alphabetically so april will be on top now there's an easy way to deal with that and we'll see how to do that now why does this work in a standard pivot table because file options advanced and all the way at the bottom there's a custom list right there and that's what causes in a standard pivot table the month names to be sorted correctly now let's drag sales down to values and just like that with a few clicks we almost have our report two things you need to do in what's called a standard pivot table we will distinguish a standard pivot table from a data model pivot table that's what powerpivot creates when you use a standard pivot table you have to manually add number formatting and manually change the name of the calculation to add number formatting to this calculation it's not like the excel worksheet cells we don't have to highlight all the cells and add number formatting we simply select one cell in the values area of the pivot table and right click we do not want format cells we want the number formatting option when you open this it'll have a single tab the number tab now we could use currency or accounting but lots of dollar signs clutter up the report so we're going to select number use a comma separator i'm going to leave two decimals showing but if you want a less cluttered report you could show zero decimals click ok and now we've added our number formatting now we come up to the top cell i'm going to click up in the formula bar and indicate the unit and this is the name for this calculation column we also want to click in c3 and name this field month and that's our report now let's drag the task pane over to the right and now we want to create our line chart to show the trends over time i click in a single cell in our pivot table insert tab over to the charts group and there's the line chart drop down we're going to select the first option now to move a chart you don't want to click inside because if you try to move it there it will wreck it you gotta point to the outside edge and when you see your move cursor you click and drag now anytime you do a visualization the number one rule is no chart junk that means you look at each element in the visualization and decide does it help to deliver the message well right off the bat this is a legend and we don't need it because we only have one line if you have multiple lines then it's helpful to have a color-coded legend i'm going to use the delete key the word total as a chart title is not very good so i'm going to click on the chart title come up to the formula bar and type total sales and enter now we have total sales here and some of sales here that's unnecessary repetition also these are called field buttons and they govern how we can sort and filter this pivot chart but guess what the pivot table and pivot chart are connected and i'm going to do the sorting and filtering over here so right click any field button and point to hide all field buttons on chart that option is also up in pivot chart analyze and then field button drop down we can point to the little white circles and click and drag to make the chart wider now when you create a chart there's a green plus hovering on the right side you can click and use the check box to remove or add different chart elements i'm going to click escape also inside the chart you can click on various elements and right click to format but a better way to get to the format task pane is to select the elements you want and we want to reduce the decimals to remove clutter and then use the keyboard control 1 that opens up the task pane now up at the top we have different icons all of these icons are governing what's going to happen to our vertical axis we want this one here down under number we can change decimal places 2 to 0 tab and when you click on a different chart element the task pane adjusts all right i'm going to hold my alt key and use the move cursor that snaps it to the grid let's click inside the pivot table click on field list and we want to try to drag product down to filter whereas the fields dropped into rows to determine a different calculation for each row when we drag product down to filter we're allowed at the top of the pivot table to select a condition that will affect every calculation in the pivot table so when i select aspen all these calculations are for aspen and the chart updates clicking at the top selecting all click ok now all the calculations are for all products now there's a user friendly way to apply a filter now i'm going to click and drag this off pivot table selected up in pivot table analyze filter group let's click insert slicer i want to check product click ok and that is looking awesome we can resize it because the slicer selected there's a slicer ribbon tab you can change the colors add different number of columns but for us this is fine when i click carlota everything updates quad i can hold the ctrl key and select two items that means every calculation is showing for quad or the aspen product for example that calculation right there the numbers used to calculate that sum from the original data set had the year equal to 2022 and the month equal to october and the product equal to aspen or quad if i click on carlota now that cell has the conditions year equal to 2022 the month equal to september and the product equal to carlota fundamentally that's what a pivot table does it makes calculation based on conditions and criteria now that button right there you can use that instead of the control key to select multiple items and when you want to clear the filter you click the red x now every calculation is for all products our next report is a region year sales report now we already grouped by year so when we go back over to the data set to start a new report i click in a single cell and instead of using this button we're going to use the keyboard alt n v t hitting each key in succession we want it on an existing sheet location region sales we're going to put it in b3 click ok and here we can see it still has years that's because every time we create a new report it uses the same cache data now that's a good thing because if you had 10 reports and it re-cached the data each time which it used to do in earlier versions of excel then your file would get really big all right we want region in rows and i'm just going to check it because the default is to send a field like this to rows any time you bring an item down to rows it instantly gives you a unique list we can do the same for years when i check this it puts it below which is what we want and it gives us a unique list for each one of the regions now we drag sales down to values and each time we sum our sales we have to add number formatting and change the name at the top right click number formatting number use a separator click ok and we'll indicate the unit all right so that's sort of a hassle and a data model pivot table and a power bi report we won't have that problem because the number formatting and the name of the calculation actually follow the calculation around but this next calculation where we need to calculate the sales change as a percentage year over year between each year that type of calculation in a standard pivot table is much easier than any of the other tools we'll learn here's how you do it you drag sales down to values we have a duplicate and to change the calculation you click in any cell in the values area in the second column and there's two ways we can change the calculation summarize values by allows you to change the function or the aggregate calculation and show values as this is amazing percent of grand total percent of column total difference from that would calculate the difference from year to year this is the one we want when i click the show values as percent difference from dialog box pops up now notice it says sum of sale if we had a different aggregate calculation like counting then the percentage difference from calculation would work on the count now the base field that's the field where we're trying to make the percentage difference from calculation and the base item that means what is the starting amount for each calculation for us it's going to be previous so that way in this row right here here's the end amount there's the previous start amount now when i click ok not only does it make the calculation but it adds the number formatting the year over year-over-year percentage change from 22 to 23 is 28.89 now to illustrate how fast a standard pivot table can be i want to do the same report off to the side but instead of sum of sales i want to count how many transactions there are so we're going to go back to the data set click inside the table alt n v t we'll put it on existing region sheet g3 click ok check region check years and when we drag a number field like sales down to values it defaults to the aggregate calculation sum but i'm going to pick a text field and we're going to drag product down to values notice because the product field is text it defaults to count now this is not the actual count of product it's actually counting how many transactions were in california for the year 2021. so i'm going to come up to the top and rename this number of transactions and enter we're going to drag product down again somewhere in the second values area we click right click show values as and we'll do the same calculation percent difference from the base field will be years and the base item will be previous click ok and look at that down more than 10 percent that means this was the starting amount and 22 this was the ending amount of 23 so we went down by minus 10 percent but this is counting transactions over here this is the amount of the sales and for the same time period we went up in terms of amount of sales but here we went down in terms of number of transactions all right both of these calculation names are terrible we'll just call it percent change same over here all right so our second reporting task is done region year sales report now if we go back over to data sets our third report is average monthly sales by product and we're going to bump into the limitation of standard pivot tables here the problem with this calculation is it says average monthly sales and we do not have monthly amounts for every line in this data set that means we have a problem with the size of the sales amount known as the grain of the table the grain is at the transaction level so what we need to do when we have calculations like this is we have to make an intermediate table and calculate the monthly sales amount for each product to look at what one of these intermediate tables looks like let's go over to the sheet average monthly sales by products i had to calculate the total sales for each month for each one of these products in four different tables or we build one table where it lists each month and the product then we have to calculate using a formula the total sales for each product by month and then calculate the average now really this is not what we should do once we have this problem that means we don't use a standard pivot table we jump over to a data model pivot table now in this video we're not going to make these calculations we're just looking at it here as a preview for next video where we get to learn how to use a data model pivot table all right our fourth report is going to be the easiest one by far year by supplier sales crosstab report that means we need year and supplier as our conditions and the sales field in the values area select one cell in the table alt n v t existing the location our last sheet b3 click ok i'm going to check years to send it down to rows supplier over to columns sales down to values right click number formatting we'll add some number formatting change the name adjust the column width done now fundamentally when we do a cross tab report an intersecting cell is using an and logical test that means all the numbers used to calculate that sum came from transactions in the data set where the year is equal to 2022 and the supplier is equal to gel booms now that's our third finished report now notice we have data from the year 21 to 23 but what if we get new data we have three different reports and back here we have a data set these columns we imported using power query these we created with worksheet formulas but no problem if we get new data for example here's an updated file with new data we simply go back to excel and if your queries and connection pane is not showing you click that button we can go back into the query right click edit or simply double click this is the power query window and remember we said the first step we can click the gear icon you can use this button to go and navigate to the new file but let me show you if i close all this right in the excel data ribbon tab get transform we can click the get data and down to data source settings there's our connection we can change the source click so you have two locations to change the file i want to click browse navigate to the new file i'm going to double click it click ok click close and now look over here we have 606 rows i can click that refresh button or right in the table i can right click refresh and here it's updating it's got the new data if i come over to the table and click control down arrow to jump down to the bottom it got the new data and look at this up to the units all of this was part of the new data but those worksheet formulas all automatically got copied down control home to jump back to the top we updated this table but remember when you create a standard pivot table that tool takes the data and caches the data so if we look at the report it's not updated yet we have to come to any one of the pivot tables and right click refresh that updates the pivot table cache and we can see the new data and we can see the sales went up from 2023 to 2024. if we go look at our region sales report there's the new data and look at this 23 to 24 for california went up but 23 to 24 for washington went down all right in this video we started off by learning how to use power query to import data then we used excel worksheet formulas to create a data model we created a flat table so we could use the amazing standard pivot table tool to create our reports and then we created a line chart and importantly in this video we saw that the standard pivot table has a few advantages over the other tools and one of them is this amazing grouping ability where we can create a yearly monthly sales report with just a few clicks we also saw that the show value as calculation is similar a few clicks and the calculation is done and then we saw that across tab report is fast and easy now stay tuned to next video data analysis video number two we'll learn how to use power pivot create dax formulas and data model pivot tables
Info
Channel: ExcelIsFun
Views: 35,690
Rating: 4.9775281 out of 5
Keywords: Excel Data Analysis, Data Analysis in Excel, Analysis in Excel, Excel analysis, Excel Analytics, excelisfun, excel is fun, Mike Girvin, Michael Girvin, Highline College, BI 348, Microsoft Power Tools, PivotTable, Power Query, Formulas, Charts, XLOOKUP function, Standard PivotTable, PivotTable Slicer, monthly sales report, import data, Data Analysis basics, E-DAB, MSPTDA, Analyze data in Excel, Learn data analysis, free data analysis class, Data Modeling in Excel, Excel reports
Id: LpvCj9vCtRA
Channel Id: undefined
Length: 36min 4sec (2164 seconds)
Published: Mon Apr 19 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.