Excel Data Analysis Class 02: Power Pivot, DAX Formulas, Relationships, Data Modeling & Much More!

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
welcome to excel data analysis video number two where we get to learn how to use power pivot and create dax measures now we're going to create these one two three four different reports and a visualization now we already did this last video but in that video we used a standard pivot table and worksheet formulas now last video we learned some great importing data skills with power query will build on those in this video we also saw that the standard pivot table grouping and show values as features allowed us to create reports quickly the limitations of the standard pivot table were that the formatting and the naming of the formula had to be redone each time we did a calculation and we saw that some calculations like average monthly sales were very difficult with formulas and a standard pivot table when we jump over to powerpivot and dax you're not going to believe it the formulas can have number formatting and a consistent name and when we get to calculations like average monthly sales it's a much simpler formula [Music] now it's not just these cool dax formulas that we're going to get to learn we'll also see how to avoid using the x lookup function to bring attributes from the lookup tables or dimension tables over to our one flat fact table instead we build relationships between tables then we can have all four tables in our pivot table fields dialog box and drag and drop fields from any table in addition last video we had to calculate this sales column in our flat fact table well guess what you don't even need to do that in dax well in dax here's the one formula that calculates that extra column and then adds it for the sum calculation in the pivot table that means inside the formula we just say hey i need this table and instead of x lookup because there's a relationship we use related multiply it times units and then sum x knows to go through every row in the f sales calculate the line sales and then add it to get total sales this all happens inside the pivot table also this formula will have one consistent name and it will always have the right number formatting now that all sounds awesome and wonderful and it is but it does mean that you have to learn how to create a data model and build dax formulas here's the five steps we used in our data analysis process last video in this video step one is get the data we use power query step two we create our data model but here we use dax formulas relationships hide fields and power query if necessary step three we create the reports based on the power pivot data model we build our pivot table reports step four we create our visualizations and step five when we get new data we refresh with power query now step one is to get the data now power query and the data ribbon tab that's what we use to import data and clean and transform the data before we get it to our data model our lookup or dimension tables for product and sales rep are already in the worksheet but just like last video our sales data comes from an external file now last video we started off with this file and it had three years worth of data and then later when we got new data the file had all the old data plus the new data in this video we want to see a different scenario we're going to be given three text files dot csv for comma separated values we'll need to import all three years create our data model reports and visualizations and then later someone gives us just the new data for the new year we simply add the new file to the folder location click refresh and power query updates everything now one other important distinction we want to make here is this data set has a total of 800 records that's not a lot of data but as we look at the standard pivot table power pivot and power bi it's helpful to use a small data set as we learn all three tools but if you actually get big data millions of rows of data then you gotta use power pivot or power bi the data models in both power pivot and power bi are specifically designed to deal with big data and we'll see a great big data example in video number four the text files from last video were dot txt with a tab delimiter these are dot csv if we do our trick from last video right click open with but i'm going to open with word pad we can see the delimiter is a comma and that delimiter is splitting the data into four fields when you download the files below the video you'll unzip this folder and if we look inside that's the file we'll add later if we look inside the start folder these are the three files we're gonna import initially if we click up in the address bar that's the folder path that power query will memorize then power query will take all three proper data sets and append them one on top of the other to create a proper data set and later when we add the other file it'll just add the new data to the bottom now before we import our data make sure you have the power pivot ribbon tab showing go to file down to options over to add-ins in the manage drop-down select com add-in click go make sure that microsoft powerpivot for excel is checked now once we have our powerpivot tab and the xl ribbon showing we can't just click the data model button and start working on our data model because we have to get the data into the data model and the tool we use for that is the get and transform queries and connection groups power query in this way power query and powerpivot always work together to create a data analysis solution now to import multiple text files in a pen we go to data get and transform the get data drop down from file and down here from folder you want to navigate to the start folder so i'm going to double click that's the address that power query will memorize we click open power query is looking inside that folder and showing us all the files that it found we want to click transform this opens up the power query window the content column if we click off to the side not on the green binary word but off to the side we can see that each row is looking at a file in that folder these are the attributes for that file there's the name there's the extension and a bunch of other attributes over in the query setting pane in properties this is the name of the query it's also going to be the name of the table that we load to the data model now we want to rename it f sales the f is for fact table in data analysis a fact table holds the values that you want to summarize and since these are sales we'll call it f sales and enter now we only have one applied step and i want you to notice last video we didn't look at this but there's a formula bar and every step uses a power query m code function now we don't need to edit any of these because they will be created automatically but you can see this one is called folder.files and it contains our folder address if you move that folder or you need to change it you come to the source step gear icon and click you can browse and change the address when you click ok then that address is changed in the folder.files function now the content column has the files that we want to append into a single proper data set and we don't need any of these other columns now if we were in the excel worksheet we would highlight these columns and delete but in power query they make it easier we come to the content column right click the field name and down to remove other columns and just like that they're gone now over here it added a new step table dot select columns function now we have a column with our three text proper data sets and we can use this combine files button now this button works when you have text files that are proper data sets and you want to append them into a single table so we'll click now this combined files feature is going to build a bunch of code for us but it wants to know is it all right to use the first file and the first 200 rows from that file to build the code for us to append all the tables now when you get txt or csv data it's usually coming from a reliable source like a database so that's fine file origin and very importantly it got the delimiter right if it didn't get it right this process would not work now when we click ok power query will build a bunch of applied steps and queries for us on the right we can see these applied steps on the left we see a number of queries created and down at the bottom is the f sales query delivering an appended proper data set all of it done automatically now over here power query got a sample file created some code and then made a custom function and then down in f sales if we go back through our query and look here's the two steps that we created it then in the fourth step invoked that custom function and used it on each row and then completed one two three steps including changing the data types and there's our proper data set now what's so amazing about all of these queries and applied steps is if we go back to step number one remember we're pointing power query to a folder right now it has three files but later when we drop new files into that folder this first step will list however many files we have and then power query will rerun every step but this time the proper data set will incorporate all the new data all right now we need to load this to the data model we go up to home close close and load drop down close and load to this opens up the import data dialog box and down at the bottom is add this data to the data model we check and then immediately come up and select only create a connection this allows power query to run all the applied steps and queries to append the tables without loading the data to the worksheet if you accidentally leave the default option table selected and you load it to the data model it'll load it to both places the worksheet and the data model and we don't want to load it to both locations especially if you have big data because the worksheet can only hold a million rows so when loading to the data model check that click only create a connection and then click ok in the queries and connection pane we can hover and there it says loaded to the data model and there's our folder path up in the top all of these queries are loaded as a connection only they're used only as part of the transformation to append the tables we can collapse this folder of queries up at the top now we want to go have a preview of this data in the data model and we can get to the data model by going to data data tools the manage data model button or over in the power pivot ribbon tab we can click the manage data model button it opens up the power pivot for excel window we can see it has its own ribbon with tabs down here it says f sales and this is a preview of our data now when we load the data to the data model it gets loaded into a behind the scenes ram memory columnar database and this database compresses the data to a very small size that's how it can handle big data now the way the columnar database works is it takes each one of the columns in our table we have four columns it stores them as a unique list for the date column there are 426 unique dates product id and sales rep id would have four each and the units column has 187 unique items that's a total of 621 bits of data it's storing but the full table in the excel worksheet if there are 606 rows and 4 columns that means 2 424 bits of data so this columnar database even on a small data set like this has to store a lot less data but on big databases it's dramatic how much it can compress the data now when it comes to analyzing later on the columnar database creates a map that enables it to reconstruct records that are needed for analysis now later we get to come back here and create our dax formulas now to jump back to excel you can use that button or if the window is smaller you can simply click or you can use alt tab to jump back to the last active window now we need to bring both of these tables into the data model and although we could click inside the table and go to powerpivot and use the add to data model button it's much better to use power query to import all your data because the tools inside of power query for dealing with data are much better than in the powerpivot user interface so i'm going to click in a single cell and in the get and transform data group we click from table range this opens up the power query window that name is fine the two steps that are created including the data types are fine and notice the 1.2 is the data type icon that represents a decimal number this is a number with up to 15 decimals for almost all numbers we use in calculations we want to use the data type decimal number except if you have whole numbers then you use the 123 whole number data type the currency data type is not like currency number formatting this data type only allows up to four decimal places and i'm not even sure why they called it currency but luckily when we get over to power bi desktop the exact same data type is called fixed decimal number decimal decimal text text and whole number everything's looking good so we go to home close close and load close and load two we do not want table we want only create a connection add this to the data model click ok we can hover and we see that the data has been loaded to the data model we click in one cell and d sales rep click the from table range button the name is fine the steps and the data types are looking good close and load close and load two only create a connection add this to the data model and click ok we can hover and see that it's been loaded to the data model using power query we have completed step number one which is importing the data into the data model now we move on to step number two which is building the data model we can use the manage data model button to jump over to the powerpivot for excel window we can see we have one two three tables now we need to go back to f sales because we actually need a fourth table now whereas in a standard pivot table you can use the grouping feature for date calculations in a data model pivot table in both power pivot and power bi we use a date table to make our date calculations now just as we have to have a lookup table or dimension table for product and sales rep and these tables represent a unique list of all of the sales rep and all the products the date table is going to have to have a unique list for every possible day for every year from the fact table once we have this date table we'll be able to make more varied calculations and reports than we can with the standard pivot table now in powerpivot we can create a date table with a single click when we get over to power bi desktop we'll actually have to use dax formulas to create the date table all right here's the fact table to one click our way to a date table we go up to design calendars the date table drop down and click new when we click new this builds a complete date table it went through the fact table found every possible year and then gave us a unique list of every day in all those years it also added these attribute columns like year month number month and a number of others now we created this with one click but there is one disadvantage to a date table in powerpivot and that is when we get new data later it doesn't automatically update so every time you get new data you have to remember to come to the power pivot for excel window design calendars date drop down and click on update range and then manually change the dates now we'll see how to do this later now over in power bi desktop although we can't create the date table with a single click when we use dax formulas those will update automatically when we get new data now i want to come down and rename this table so i'm going to double click just like in excel and we'll name this d date our next step is to create a relationship from the fact table to each one of our lookup or dimension tables that way we can drag and drop any of these fields into the pivot table and make calculations between tables now home ribbon tab view we're in data view data view is where you view the tables and create your dax formulas when you want to create relationships you go to diagram view now each one of these represents a table table name at the top field names below later we'll have formulas listed here also we can move these field lists around here's the date table when i expand it notice it creates a date hierarchy that means we can drag one field into the pivot table and it will list multiple date attributes now each one of our dimension tables has a unique list of products sales reps and dates that means there's one of each on this side now if we have one of each product here of course in the sales table we hope that we have lots of duplicates of each product same here sales rep is listed once but hopefully that sales rep has many sales and also over here a unique list of dates but hopefully we have many sales for each one of those dates that means we have a one to many relationship between each one of the dimension tables and the fact table one side many side now to create the relationship we can drag the unique list of product id this is also known as the primary key we drag it over to the same column in the fact table that contains the duplicates product id to product id and there we see a one to many relationship this is known as the primary key this is the foreign key we can do the same for the sales rep table sales rep id from the one side over to the many side now it actually doesn't matter we can drag from the mini side over to the one side and powerpivot's smart enough to know which is the one side and which is the many side now we'll come back here later because we're going to hide columns or calculations that we don't want to appear over in the reporting area but for now our relationships are created now we can go over to data view and before we build our measures in the fact table let's adjust our date table now columns like month and year day of week and the name day of week those may be perfect for you if you drag this into a pivot table you can get the total for all periods for the days of the week and see which day generates the most sales you could also add extra columns like for fiscal periods but for us i don't need any of these columns and since we're building a data model we don't want to have extra columns because remember this is all stored in that columnar database and i don't want to waste extra space clicking on the first column i'm going to hold shift click on the last column and right click delete columns yes i want to permanently delete them now we're left with four columns the first column was automatically created when we click the button the remaining three columns here were created but they were created using dax functions if we click in the top cell of year we can see the year function the month function those are exactly the same as over in the excel worksheet but when we get to calculating the month name in excel we would use the text function but over here in dax they named the function format now these are calculated columns and we want to create our own calculated column one of our calculations we're going to need requires that we have an end of the month calculation to create a dax calculated column i double click where it says add column and we're going to name this end of month and enter now when you create a dax calculated column or later a dax measure you actually do not create them in the cells you have to come up to the formula bar and this will be the same in power bi desktop also now the function for end of the month is the same as over in the worksheet for the start date argument we need to select the entire date field and just like in excel we click at the top of the column and properly it puts the table name and then the field name in square brackets but in the excel formula we had to have the at symbol the implicit intersection operator for the formula to pick out the correct value from each row but in a dax calculated column it happens automatically you put in the table name and the field name in square brackets and it uses something called row context to grab the correct value from each row so at the end of our column used in the start date argument i type a comma and if we want the end of this month we put a zero end of next month would be one end of last month would be minus one so that's our formula and that formula is the same as over in excel except for we don't have to put that implicit intersection operator when i hit enter the correct value is calculated for each row in the table if we scroll down we can see february is correctly shown as 228. now by default a date shows up with date and time number formatting and for both of these columns it will have no effect if we use them in the pivot table or in our calculations but if you want to add number formatting to show just the date home formatting and here's the format drop down and you can do that if you'd like now one other crazy thing although we use the proper convention of table and then field name in square brackets and we will not deviate from that the automatic date table feature does not follow that convention now the reason why is this feature was built at the beginning of power pivot and they never updated it as we'll learn later when we create measures we reserve square brackets when we're referring to measures in other formulas but when we use columns we use a table name and square brackets for field names back in diagram view we have our four tables and our relationships now we need to create measures in the fact table we go over to data view f sales we can pull this up i'm going to expand one of the columns because in the measure grid we're going to build all of our formulas called dax measures that we're going to use in our pivot table now as we talked about earlier we don't even need to calculate a column for line sales we can simply have one dax measure that calculates the column and then totals it and that's the formula we use in the pivot table however since we're learning this for the first time i am going to show you two ways to accomplish the same goal which is a total sales formula in the pivot table the first one is two steps you create a dax calculated column and then you create a measure that sums that column and then we'll create the second measure that's the one step process it does both in one formula now we'll start with the two-step method and we'll add a calculated column called line sales and for every row in this table we're going to take units but we need the product price and in dax because there's a relationship between product id and the product id column in the d product table we'll see how to use the relationship to look up the product price we double click add column type the name of this dax calculated column and enter when we create dax formulas we always have to use the formula bar and the name of the lookup function to look up through a relationship is called related i hit tab and the only argument is the column name and i'm going to use down arrow of the field that contains the item we want to look up now i'm going to close parentheses and hit enter all of these are product id 2 looking up the correct price and if we scroll down we have lots of different product ids and the related function using row context to see each product id in each row and then the relationship it looked up the correct price for each row now this lookup formula is pretty amazing because we didn't have to put the lookup value into the formula like we did with x lookup because there's automatically a relationship between product id in both tables we just put the name of the column that has the thing we want to go get and bring back to this dax calculated column and through the relationship it automatically retrieves the right price now we come up to the formula bar and now we times and i can simply click on units and that's our formula for line sales when i hit enter the automatic row context in the calculated column calculates the correct line sales for each row now this is the two-step process so now we create the measure to use in the pivot table now when you create a measure you click into cell and type the name of the calculation that you want to appear in the pivot table field list so i'm going to type total sales and notice when i type it jumps me up to the formula bar and in excel and in power bi we use an equal sign before the name of the function but when you create a dax measure in powerpivot you have to use the assignment operator colon equal sign and then you type the name of the function i know that seems weird but that's the way it works in powerpivot now i can simply highlight line sales close parentheses and when i hit enter this shows me the grand overall total however when we drop it into the pivot table it'll get the correct amount depending on the row column and filter conditions now we need to add number formatting so i right click point to formatting select number use a separator two decimals is fine click ok and now that number formatting appears every time we use this measure and the name of that formula will also appear now this is the one two step method and i did it just so we can visually see the amounts for every row because now we're going to do the one step method and this is the preferred method when you create measures to use in the pivot table or over in power bi in your visualizations we're going to type the name it shoots me up to the formula bar colon equal sign and then instead of using the sum function we're going to use the sum x function tab it has two arguments table and expression the table well just as we had a table down here and added a calculated column we have to tell some x what the table is type f s and then tab the expression argument that's where we put our formula and we can use the exact same formula we used over here related to look up the d product price close parentheses times f sales and i'm going to get units now the sum x function works exactly like the table in this calculated column but instead it does it all in a single cell sum x takes that formula and iterates through all 606 rows using row context to pick out the correct units and to look up the correct price it then calculates all 606 line sales numbers and then the sum part of sum x adds them to get the total so when we close parentheses and hit enter we get the same grand total now we can add number formatting in the home ribbon tab formatting and any of these options or more format number use a separator click ok now you have a choice between creating a dax calculated column and summing or using sum x the convention is to use sum x now sum x is an iterator function because it iterates over a table and then makes an aggregate calculation and there are other iterator functions such as average x count x and concatenate x they all iterate over a table and then make an aggregate calculation now deciding on whether to use a dax calculated column or a single measure is based on preference but there is a difference when you create this dax calculated column the values are stored in the columnar database when you create a one-step measure and the values have to calculate in the formula then when you drop it into the pivot table or change the conditions or criteria in the pivot table report each time the formula has to recalculate my rule of thumb is if my formula starts to calculate too slowly when i drop it into the pivot table then i come and make a dax calculated column and make my aggregate calculation in the two step method otherwise i stick to the one step method now if we go over to diagram view we can see in f sales there's our measures now we want to build our first pivot table even though we're not done with our data modeling we still have to hide some fields and add a bunch more measures but we want to see the magic of dax measures now our first report i want year in month and actually in this hierarchy i want to get rid of dates so i'm going to come over and right click date remove from hierarchy remove and so now when we create our pivot table we'll drag date hierarchy and then our total sales measures now we can create a power pivot data model pivot table directly from the power pivot for excel window by using this drop down here and this drop down not only has pivot table and pivot charts but a bunch of combination options we're going to use the top option to create a pivot table now this jumps us back to the excel window with the create pivot table dialog box open we click existing then we have to click collapse i want this on year month sales b3 click ok click ok again and so now over here on this sheet in excel there's our pivot table area and in the pivot table fields task pane we can see all sorts of tables the cylinder icon means these four tables were created with power query and were loaded to the data model these two tables come from the excel worksheet they are not in the data model if you try to use them with your data model they will not work you can actually hover your cursor over each one of these and get a message about the source these tables up here say they're from a query this one up here says connection but that's the date table that was automatically created in the power pivot window now our report is going to come from d date and f sales and i'd like to move these over to the active tab so right click show inactive right click show inactive when we go over to active collapse more fields that is beautiful in our pivot table fields task pane we have two tables we can drag and drop for each tables and look at our measures the f of x icon means this is not a field from a table but it's a measure we can drag and drop now we're going to take date hierarchy and drag it over to rows we see our years and underneath there are months and we are eventually only going to use one of these but for the time being i want to show you that both of them work the same so i'm going to check and check they both go down to the values area and that is amazing we get the correct totals for each year and the grand total the measures already have number formatting and the labels at the top are exactly as we intended now for this two-step method formula the total sales without the spaces i'm going to uncheck this now we're left with the sum x formula and we got to figure out how in the world did that sum x formula which is the same in every row how did it know to calculate the correct yearly total and then get the grand total in the last row now the measure did it with something called filter context if we consider that measure calculation right there when we drop the measure into the pivot table as soon as it saw the row condition 2021 the entire fact table was filtered down to just the records for 2021 and then the measure had a smaller table that it had to deal with when it made its row by row calculation and then sum when the measure got down to the next row that measure is not dealing with the whole fact table it's only looking at the records for 2022. now if we compare how a dax measure works to how excel worksheet formulas work dax measures use filter context to always get as small a table as is necessary to make the calculation but excel worksheet formulas if you highlight a column it uses every single cell in that column every single time now the way that the filtering takes place if we consider that cell right there and that condition on the row and we jump back to the power pivot window and look in diagram view more specifically when we drag the year field from the d date table into the row area of the pivot table this entire table becomes filtered down to just the days for the year 2021 and that filter flows through the one to many relationship and filters the fact table every time you drag a field from a dimension table into the row column or filter area in a pivot table or over in power bi desktop into visualizations every time you do that the dimension table becomes filtered and that filter is passed through the relationship over to the fact table if we come back to our pivot table and expand and click in the cell that has the measure that calculates total sales for march this is how the filter context works here's the full date table with 1095 rows here's the one one-to-many relationship over to the fact table with 606 rows as soon as this measure sees march the date table is filtered down to 31 rows and then it starts to move that filter across the one-to-many relationship once it hits the fact table the fact table is filtered down to just the dates for march and in this case there are only four rows so then sum x has much less calculating to do because there's only four rows and that's the magic of dax measures and filter context now we have some more data modeling to do and we want to notice something about the fields in this field list we're never going to drag product id or sales rep id into the pivot table so if we want to be polite and build a user interface for the report creator we should hide those fields also line sales and units those are number fields and if you accidentally drag those to the values area they create what's called an implicit measure and there's a number of problems with implicit measures to see what these problems are let's drag line sales let's mistakenly drag line sales down to values sure enough it calculates the correct amounts but it doesn't have the number formatting it doesn't have the name we want and if i look over here i don't see a reusable sum of line sales measure also when we incorrectly create an implicit measure if we ever share this with someone over in power bi all these beautiful measures that we create show up but the implicit measures do not now the other problem is this implicit measure it actually shows up in our data model so i'm going to alt tab to jump back over to the data model but i don't see it here data view f sales i don't see it in the measure grid and this terrible button which should be on by default i'm going to go to advanced and there it is show implicit measure when i click this and you could see that little blue icon that shows up over here also that indicates that this was automatically created when a field was dragged to the values area now when we click on it we can see the problems you can't edit the formula you can't change the name down here you can't add number formatting down here and of course this will not show up when we send it to power bi so the rule is don't create implicit measures now i always turn this on just in case i accidentally drop a field then at least i can see it over here and right click delete delete from model now very importantly let's go over to home diagram view and we want to hide all of these fields these are number fields these fields we're never going to use we don't drag date from f sales we drag date or any of these date attributes from the date dimension table so i click on the first one hold shift click on the last one right click and hide from client tool over in power bi it says hide in report view we can see that they're hidden now cost and price we're not going to use those supplier and product those we want to be able to drag into the pivot table but product id we're not going to use so hold control click right click hide from client tool right click sales rep id hide from client tool we're also never going to use month number so right click hide from client tool and now the pivot table field list our reporting area is much easier to use and we're not going to make mistakes by accidentally dragging and creating an implicit measure now here's our year monthly sales report and last video i told you that month names would not sort correctly in a data model pivot table well when we created the automatic date table powerpivot created this month number column but then it sorted month using the sort by column feature and it said hey please sort this month column with text values by the month number column and 1 to 12 will sort january to december in perfect order now it's automatic in powerpivot when we get over to power bi desktop we'll actually have to invoke the sort by column feature to get our month names to sort correctly now we have our report and step four is to visualize now last video we created a line chart and guess what it's exactly the same with a data model pivot table we click in a single cell insert charts there's the line chart drop down we're going to select the first one and there's our chart we can move it we can get rid of elements that don't help deliver the message this is chart junk i can select it and use the delete key these field buttons we do not need right click hide all field buttons on chart click on the chart title up in the formula bar we'll type our name total sales click on the vertical axis and we'll use our keyboard and notice over here we have our pivot chart fields when i control 1 a second pane opens up format axis we come down to number open it up and we'll have zero decimals tab now we have our line chart that shows trends over time and the beautiful thing about the connection between a pivot table and a pivot chart and so if i come over here and collapse the chart updates all right now we get to create our next report make sure the pivot table is selected and over here i want to look at the pivot table field list and guess what i forgot to hide the dax measure without the spaces that was the two-step method we did just as we were learning dax and we don't need it in the reporting area so let's go hide it back in the power pivot for excel window data view we select our measure and then right click hide from client tool it's grayed out over here and the pivot table field list shows one measure now we get to create our next report so we're going to go over to the region year sales report i'll select cell b3 and because there's a data model in this excel workbook when we go up to insert tables and the drop down for pivot table there's an option for from the data model now if you don't see these four data connectors select table range i'm going to select data model we can also use the keyboard alt n v t for table d for data model now this is the new dialog box pivot table from data model it already has the cell selected now this dialog box is much easier than if we start the data model pivot table from the power pivot for excel window because it only takes one step rather than four steps if you see this dialog box then to create the data model pivot table select the dialog button use this workbook's data model in either case we click ok now we're going to use the number of the data model tables and you can move them to active if you decide to work in the all tab we need to select region from sales rep do not select this one select the one with the cylinder icon i'm going to drag region down to rows there's our unique list up and date i don't want the hierarchy i want to open more fields and i want to drag year down to rows and then our measure i can check and that's the total sales by region and year now the next calculation is we need to figure out the year over year change as a percentage now last video we saw the ridiculously easy show values as in a standard pivot table that's an amazing feature and it's why we want to use the standard pivot table when we get to a data model pivot table we're actually going to have to figure out a tax measure that can do the same thing now fundamentally what we need for a calculation like this is in that cell i need the total sales for this year but i also need the total sales for the previous year now if we create a worksheet formula notice we are allowed to just use cell references that's the difference between the two years and we compare it to the starting amount and that gives us the correct decimal or percentage change but we don't get cell references in dax formulas so what we're going to have to do is we still need the two different numbers in the same cell we'll use the total sales measure for the d5 part but for the d4 part because of filter context that means this 2022 flowing into the formula we're going to have to change it from 2022 to 2021 and the way that you change filter context in dax is to use the calculate function so we'll jump back to our power pivot for excel window and in the f sales measure grid we're going to create a new measure i'm going to call it previous period sales colon equal sign and there it is all you have to do is type c and then hit tab to get to the most powerful function in dax calculate now the first argument expression that's where we put the formula where we want to change the filter context so i'm going to type a square bracket and notice that even though we hid this measure it still shows up in our drop down so we want the top one tab now i type a comma to get to the second argument filter and this is where we put the filter that we want to change the filter context now remember we want to change it from the current year dates to the dates one year back and you're not going to believe this there's an amazing function called same period last year and all it needs for the dates argument i'm going to type dd is the first column of the date table tab and this function can take the 2022 dates that flow in from the row area of the pivot table and automatically throw them back a year when we close parentheses that means that same period last year for that 2022 cell is now delivering all the dates in the date table for 2021 to the filter argument those days are then used by calculate to change the filter context for total sales now we could type comma and there are more filter arguments and all of these arguments work in an and logical test but all we need one filter to help change the filter context so close parentheses and enter now we don't need to add number formatting because we're just going to use this in other dax formulas but we do want to try it to verify that it's doing what we think it'll do i'm going to drag it down to values and sure enough just like that bam it got the previous year's total sales and whatever condition or criteria that we drag into the row column or filter area that will work now we can uncheck this back here i can right click hide from client tool and right below we'll call this year over year percent change colon equal sign and there's a great dax function called divide now for the numerator we need to take the current year minus the previous year so square bracket total sales minus square bracket there's our previous period sales so in the numerator this will give us the difference between the two years comma and in the denominator square bracket p the previous year sales now the alternative result that's what you want the dax formula to put in the cell if there's a divide by zero error if we omit it it'll put a blank and it actually uses the dax blank function which shows nothing that's our formula close parentheses and enter now i want to add some number formatting so up in the formatting group i'll just click the percent now we can test it now when i check this it's going to give us the correct results for each year but the totals are not correct and we don't want them there so we have to amend our formula now notice the year column here there's exactly one year in the filter context one year in the filter context but in the california total washington total and grand total there's more than one year so we're going to use the if function and say hey does the year column have one value if it does run the formula otherwise show nothing so up in the formula bar we can use the if function and then here's another great dax function has one value now we want dd and there's the year column close parentheses so if there is only one year in the filter context comma we want our formula otherwise and this argument will use the blank function to show nothing if we omit it so we'll leave it out close parentheses and enter now it shows blank here because in the measure grid there is no filter context it's the same as the grand total cell in the pivot table it has all the years in our pivot table now we have the correct show values as calculation but using dax and if we come over and drag a new condition to the row area our dax measure works perfectly that 42 change is between aspen 22 and aspen 21. if you come off to the side and take end minus the start and compare it to the start we get exactly the same thing now i'm going to uncheck product and there's our finished report just like we got with our standard pivot table last video but now we're using dax now for the next report we're going to go to the average monthly sales by product sheet and i've already started the data model pivot table in the active tab there's our two measures and we have the date and product table with product and year in rows now if we're going to calculate the average monthly sales for whatever condition we drop into the row area we first have to calculate the total sales for each month before we make the average calculation now last video we saw how difficult this is we had to create an intermediate table before we made our average calculation so for the carloto product we actually had to create a unique list of all the months and calculate the total sales then make the average calculation but with dax formulas it's easy to make that type of calculation because dax has iterators like some x and average x that have a table in the first argument now remember what sum x did it took the formula and iterated over every row in the f sales table to calculate the transaction or line sales and then it added when we get to average x there's an amazing dax function called values and it generates a unique list in the current filter context so if we put end of the month column into the values function for each row in the pivot table it'll always generate a unique list of months and then the formula total sales will iterate every row in the table which is all the months calculate the total sales and then the average x function will average those monthly amounts all right in the f sales measure grid we'll start the formula average monthly sales and we'll use average x down arrow and it needs a table in the first argument and luckily values delivers a table and read the screen tip returns a one column table that contains a unique list and that's exactly what we want so dd and we'll down arrow to the column that we created specifically for this calculation tab close parentheses that's the table that comma the expression will iterate over square bracket and there's our total sales close parentheses and when we hit enter we can come down right click format i'm going to add some number formatting change the name up here to indicate the unit there's no filter contact so that's the average monthly sales for all years now with our pivot table selected we can check and there you go the average monthly sales for aspen in 2021 2022 and all three years now if we look at the carlotta 2021 measure calculation although the formula is simple to create there's a lot going on here first this measure sees the conditions and criteria coming from the row area of the pivot table here's the pivot table row conditions the year 2021 flows into the formula hits the values function and filters the d date table now the end of the month column is just one column in that table here's what happens to the date table all rows are filtered to show just dates for 2021 there's a unique list in the first column but look at the end of the month column the month is repeated for every day in january and then february and so on so it's this column that the values function is looking at to generate a unique list so it's the values function that allows us to create a table that has the correct grain or size to summarize the monthly totals then in the second argument of average x we have our measure and the carlota condition flows in and filters the fact table for every single measure so that only carlota records are showing then as the measure iterates down the end of the month table the month condition flows in to each one of the calculations so in row one only records for carlota in january 2021 are showing in row two carlota for february 2021 are showing in the third row the march carlotta records are showing and that filtering process happens all the way down so that the total sales measure can calculate the correct monthly sales for carlota and then the average x takes those numbers and creates the average now there's a special name when you take a measure and iterate over a table like this and allow the row by row condition to flow into the measure and affect the filter context it's called context transition and the cool thing is and we'll simulate what the dax formula does here if the condition is 20 2022 and the product is aspen our one dax average x measure creates a completely new table to iterate over and then the total sales measure calculates a new set of monthly totals and then calculates the average monthly sale now here's a warning about context transition if you try to skip creating the measure total sales in the measure grid and then using it in your formula and you try to type the formula out in the second argument of average x you will not get the correct results and the reason why is the measure allows context transition to happen allowing the row condition to affect the total sales calculation when you use the straight formula that context transition doesn't happen the number we get is actually the total for aspen for the entire year 2022 because none of the individual month conditions can flow in it's not allowed to get the monthly total for the particular product it gets the grand overall total for every row so instead of aspen for january it just gets the grand overall total 12 times and of course if you add the same number up 12 times and divide by 12 you get the grand overall total and that's the incorrect answer here now if you really want the formula in the second argument of average x if you wrap the calculate function around it that forces context transition because the calculate function is the only function that can change the filter context and in this case it's converting the row context into filter context and in fact every measure you ever create has a hidden calculate function wrapped around it and that's why all measures automatically do context transition and this pivot table illustrating this is in the finished excel workbook that you can download below the video now let's take a look at what happens if i change the year to 2021 there are some zeros in our data set and that average is including those zeros in the calculation the dax formula we created would not include those zeros because there are no records in the data set that's the average with zeros that's the average we got however it's not hard to amend the formula if you do want to include zeros in the expression argument since whatever we put there is going to iterate over the table we use the if function and say hey if the measure is equal to zero then i want you to put a zero there otherwise run the measure and that will give us the averages including those zeros now for our fourth report we're on the total sales by supplier and year sheet let's click in cell b3 alt n v t if you do not see this drop down d if you do and we're going to put our data model pivot table in cell b3 click ok now i want to show you what happens if you have one of these rogue tables that's not in the data model if you accidentally check supplier it does add the field but notice it added a one to the name of this table as soon as you realize you did this you have to go over to the data model and delete the table that was automatically added so there it is product one and it will not work because it doesn't have a relationship and it's a duplicate table so if this happens you right click delete say yes most of the time it's safer to bring things over to active right click show inactive we'll do that for date and the measure table since i don't have any fields showing is automatically inactive and now we get to witness the beauty of a data model pivot table we can drag fields from multiple tables and use our pre-made formulas check it's already got number formatting in the name my cross tabulated report is done now that's our fourth report i want to go back over to the region year report we want to create a similar report to this where we calculate year over year change but we want to do it by counting transactions and in this formula we'll learn about dax variables now here's our formula for year over year change and notice we used previous period a measure twice if we click in the previous cell in the measure grid we created it and then we hit it and used it in our formula rather than creating two different formulas if we have repeating formula elements like this we can use a variable now the formula will be similar to this however instead of total sales we're going to create a measure called frequency so down in the measure grid i type the name of the measure colon equal sign and you're not going to believe this getting a count like using countifs or the frequency function is even easier in dax because we get to use a function called count rows and what makes count rows so amazing is that it counts rows and if we put the fact table into count rows it doesn't matter what condition or criteria we drop into the pivot table because filter context filters that fact table and count rows counts how many rows there are so that's the supercharged countifs frequency function here in dax when i hit enter down in the measure grid we get a count of every record in the table 606 and now we'll create our year-over-year change for the frequency measure now this is going to be a long formula so i'm going to pull the formula bar down and we're going to try and format this also formatting the dax measure can make it easier to read now we're going to start with the name colon equal sign and i'm going to try and format this so i'm going to shift enter that adds a new line we type v-a-r that means there's a variable a space and then the name and then an equal sign and i'm going to shift enter and create the variable on a new line and the convention for formatting dax formulas is to hit tab and the variable has to get the frequency from the previous period so we type calculate and the expression that's our frequency comma and we need to change the filter context and just as we did before we'll use same period last year that needs the first column of the date table close parentheses now wherever this formula is even though it's a variable filter context will flow in same period will throw it back a year and then frequency will be calculated for the previous period close parentheses now we shift enter and i'm going to backspace and now after the variable where we want our formula we type return shift enter and tab we're going to use if shift enter tab has one value for the year that's the logical test comma shift enter then we need divide and it's a new function so the convention is shift enter tab and in the numerator we need frequency we need the difference between whatever this period is and our variable and look at that in the drop down it says x y there's the numerator comma shift enter previous so we'll put previous period count in the denominator now to close off the parentheses for divide the convention is shift enter backspace and you put the parentheses directly below the function and result if false if we leave it out it'll put a blank so we're done with if shift enter backspace and there's our formula now when i hit enter it'll actually in powerpivot jump the variable up to the top line so when i hit enter there's our formula we get a blank down here because of course in the grand total or measure grid there's lots of years now i want to format this so i'm going to click that percent now i've got to show you a hot tip this daxformatter.com website has been around for years created by the smartest dax guys in the world alberto ferrari and marco russo you paste your code here click format and it formats it for you and now with these two measures we can create our final report in cell g3 alt n v and d and ok is highlighted so i hit enter now i'm going to be careful here and in date more fields i'll drag year down to rows sales rep there's the region i'm making sure that that cylinder is there region down below year and then our list of measures i'm going to check frequency and year over year and there's our finished frequency distribution now we're all done with our reports and our data modeling if we go back and remind ourselves oh yeah we created a query and we have a new file now inside of start there's three files but let's copy this and paste it now remember power query is looking at start so now it's going to see four files and now we want to refresh and this will be different than last video last video we had to use power query to dump a table in the worksheet and then create a pivot table from the table we had to refresh twice here power query and the data model working together all i have to do is refresh the query now these tables didn't change so all i'm going to do is click that refresh button and it's broken and the reason why is and we mentioned this before in powerpivot that date table does not automatically update so we have to go back to the data model we have to go to the date table sheet design date table drop down update range and the end date is going to be 2024. click ok so now this is updated and our reports are updated also we can go and look at every one of the reports that is absolutely amazing even back here on our line chart there's our updated line chart and in our queries and connections pane we can see 858 rows if we double click this query it opens up the power query window and if we go back and click on the source step we can verify that sure enough it got all four files now the last topic we want to cover is once we have a data model here in powerpivot i want to show you how to automatically send it to power bi desktop which is a free download and how to publish it to powerbi.com which requires that you have a power bi pro or higher license now next video will learn how to use power bi but here we just want to see how to share in power bi desktop you go to file down to import and here it is power query power pivot power view here's our file i'm going to double click this is warning you that it's not going to import everything from excel but it's going to import just what's in the data model click start i'm going to copy data you could keep a connection if you want and when you click close and go over to model and after you move things around that's the same model that we created over in powerpivot but now it's in power bi desktop and next video we'll learn how to use power bi desktop now publishing to power bi.com is even more amazing than sharing it to power bi desktop and here's why if you create this model all these beautiful measures and relationships and you need to share it with people on your team you just click the publish button inside of the excel workbook and then anyone on your team can access this through the pivot table drop down and make pivot table reports from a single source of data truth stored at powerbi.com this is how easy it is file and you always wondered what publishment publish you have to select a workspace that people have access to i'm going to select this workspace and it's the second option when we click export it loads the data model to power bi.com down here you can see it working you can click this button to jump over to power bi online and making sure you're in the correct workspace there it is so now this is stored online and when you're logged into excel the pivot table drop down will offer this as an option i'm in a blank workbook i go to insert pivot table drop down or the keyboard alt n v earlier we use t for table d for data model but now we use b for power bi online and when i click that b it opens up power bi data sets we type the name of whatever that data set is and there it is now when i click this it is not going to download the data model it's just giving us access through the pivot table user interface so i'm going to click and there it is we can check fields from different tables we can select our measures and bam our report is complete and the beauty of this of course is that we have a single location for this data model that everyone can access no more sharing files changing things it's just in one universal location now we want to compare a powerpivot data model pivot table to a standard pivot table with the data model pivot table you can have number formatting and a consistent name attached to the formula you can avoid the two-step calculation process where you have a helper column and then an aggregation with a one-step measure like we saw with sum x some complex calculations can be easier with dax like iterating over a table to pre-aggregate numbers before making an aggregate calculation we saw that with average x relationships can replace vlookup and x lookup you can have multiple tables in the pivot table field task pane you can deal with big data you can share and send the data model over to power bi with the standard pivot table they're just flat out fast and easy when you have less than about fifty thousand rows of data now of course you can have a hundred thousand rows of data in the excel worksheet and make a standard pivot table but that columnar database is so amazing at compressing data that that 50 000 rows is a rule i loosely follow the grouping feature in the standard pivot table is fast and easy for dates times and numbers and the show values as feature has many calculations that are easier than the parallel dac solution you don't have to spend time building the data model with dax formulas and a date table and some reports like a frequency distribution report for text data are super fast and easy with a standard pivot table all right in this video we saw the five steps for power pivot data analysis the first one use power query to import the data the second step is to build the data model adding the relationships creating the date table creating measures and hiding the fields that we don't need in the reporting area then step three we created our data model pivot table reports step four we only had one visualization but we visualized and step five when we had new data we refreshed and everything updated all right stay tuned for next video where we'll get to apply our data modeling and dax skills from this video but in power bi desktop where visuals and interactivity are amazing
Info
Channel: ExcelIsFun
Views: 33,932
Rating: 4.9574919 out of 5
Keywords: Excel Data Analysis, Data Analysis in Excel, Excel Analytics, excelisfun, excel is fun, Mike Girvin, Michael Girvin, BI 348, Data Model PivotTable, Power Pivot Data Model, E-DAB, MSPTDA, Learn data analysis, free data analysis class, Filter Context, Context Transition, What is Context Transition?, SUMX Function, AVERAGEX function, DAX Formulas, Learn Power Pivot, Comprehensive Power Pivot, What is Power Pivot, How to create DAX Formulas, CALCULATE function, DAX Measures
Id: eIaKC6zLmb0
Channel Id: undefined
Length: 76min 24sec (4584 seconds)
Published: Mon Apr 26 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.