DAX Summarization Tricks in Power BI

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
let's talk about summarization tricks in parbi now what exactly is summarization if you're coming from the world of Excel and if you have built any pivot table on top of a pivot table you would understand what summarization would mean I mean you have built a pivot table on top of pivot table haven't you yes or no well if you haven't you're going to see what summarization is if you have you're going to love this video no further Ado let's [Music] start in order for me to make sense you'll have to first take a look at the data model understand the data a bit and then we'll take a look at some Nifty summarization tricks so I'm working with a simple star schema right here we have the sales table on the many side of the relationship on the one side we have the calendar table and we also have the products table which is connected through the product ID if you were to quickly glance through the sales table it looks like this we have the transaction ID column the date of the sale the product ID the number of units sold and the channel through which they are sold now I have built a simple visual right here in the visual I have Channel and the three channels that we are selling through through and here is a question that I'd like to know please tell me the average units sold per day now how are you going to do that let's take a look at the measure that we have been able to build to be able to calculate the average so if I do this average of the sales table and the units column if I happen to drag that in the visual I'm going to get a number now how do I know or not know if that is the right number or not let's first take a look at the granularity of the sales table because if you just take a look at the sales table our sales table is to the granularity of the transactions and if you were to take a look at the days right here the days are duplicated second January's duplicated third January is duplicated so on and so forth because on those dates there were multiple transactions happening and if you were to take this vanilla average of the sales units what you have received is not the average units sold per day but the average units sold per transaction because the granularity on which the base calculation is is running is to the granularity of transactions not to the granularity of the day against which you're wanting the calculation so the correct calculation is going to be this calculation which is going to be the daily average units right here which is where I take the total units which is the sum of the units of the sales column and divide that by unique days which is nothing but the distinct count of the sales table date column and if I were to drag this calculation in my visual I now get the correct average units the point that I'm trying to make is that whenever you're trying to do these average calculations please make sure against the table which you're doing it is at the same granularity against which the calculation is asked for now let's raise the bar and let's just take a look at a few more interesting nuances and tricks around summarization of the data all right quick interruption in the video this Interruption is going to be absolutely worth its while we have launched the M course on power query people have been asking me that when are you launching the M course and I have been dodging those questions for a very very long time but finally we have an M course that is going to teach you m right from scratch and take you up to a level where you start to solve very sophisticated problems on your own but if you want to know the fundamentals of how the M language Works what are lists what are records the different concepts of power query how does the looping work what is each keyword in power query how do you create a custom function in power query and all of the good things that come around with the language that will take you from the user interface level from solving basic problems to solve even more advanced and more sophisticated problems using the M language in powerquery I've got a full course that is open for now there are a few last eight considerations that you have to keep in mind if you would like to enroll into the course all of those good details are in the description of the video I highly recommend if you would like to learn the M language with me this would be a great time to get yourself a seat into the course here is another question that I'd like to solve let's say a against this pivot table that I have against the year and the month I'm trying to solve for what is the sales of the best selling day of the period so if the month of Jan is presented I'd like to know the sales value of the best selling day in the month of Jan what is that now I don't really want to just calculate the max sales I want to first of all take a look at daily sales numbers so I would want to draw up a pivot table in my mind or a summarization in my mind which is where one row is going to be one unique date so all the days of the month of January 1 through 31 against that I will write some sales number whatever sales has happened against that month and then finally I will pick up the max value of the month of January and show that number right here what calculation can we do that now if you take a look at the table of the sales right here the sales table is not at the granularity of the day that means I need to First summarize the table up to the granularity of the day then find sales and then give me the max value of that whenever you're dealing with problems of summarization the very first thing to do is that you take a look at your data model and you ask yourself a question that do you already have a table in the data model which is at the granularity at which you want the calculation for instance I am looking at the max Day level sales and I already have in my data model a table which is a calendar table which is where one row or the granularity of the table is Day level one row is one unique day so if I were to write this particular calculation best selling Day sales I'm going to say that hey please step inside every single row of the calendar table which is where one row is one date and then against that please calculate total sales whatever is the max value please pick up that and present the number right here let's take a look at the result of this so I just drag this to my pivot table and I do get the number right here how do I know this number is actually correct well what I can do is I can go over to my calendar table in the calendar table I can create a pseudo column just for a moment just to check my answer and I can just reference total sales right here so I can say something like total sales press enter I can apply two filters one for the month of Jan the other one for the month of year of 2011 so 2011 and the month of Jan filters applied I'm going to sort the data in the descending order and the highest value should be 192 plus minus a few decimal places which are not showing up right here but that is also what my answer is right here which is 192 that I can be sure that this value is absolutely correct let's take a look at a very similar example again from the concept of summarizing the data up and then trying to find the value let's say against this year and the month I would like to find out that what was the sales of the bestselling product again in my mind I'll have to summarize the data so I'll have the list of all the skews or the products listed down down in my mind against that I will calculate some sales number and then for whatever the sales has been I will pick up the max sales of the bestselling product and display that number right here to be able to do search calculation I obviously cannot do that in the sales table because my sales table is at the granularity of transactions and if I do that the calculation is going to give me the max transactional value and not the max product value let's first take a look at do we already have a physical table in our data model which is summarized Ed already to the granularity which is where our calculation is running our calculation is at the product level we have the product master or the product Dimension table which is at the granularity of one row means one particular unique skew and that is where we will do the calculation so our calculation is going to look something like this I'm going to say something like best product sales why don't you step in inside every single row of the products table which is where one row is one product against that please calculate my sales value once you have calculated the sales value against every single products please give me the max of that now let's just drag that to our Visual and what we get is the best selling product now obviously you can build another pivot table at the product level and you can verify the calculations are right or not but the point that I'm trying to make is that whenever you're trying to do summarized calculation your first attempt should be to take a look at the data model and do you already have a physical table that is present at the granularity at which you are doing the calculation or not if that is there use that but what really happens if you do not have like a dimension table or a master table which is not already there in the model consider another calculation let's just say that I want to find out in the year of the month of Jan the sales of the best Channel like whatever channel was there please give me the sales amount of that now if you take a look at our data model we do not have any Dimension table or a master for the Channel all that we have is the channel present in our sales table and that is when I will start to summarize my data which is where one row is going to be one channel against that I will calculate my sales and then present it but how do we do that are we going to create a physical table right here probably not but we are going to make a virtual table let's just see how can we do that I'm going to go ahead and start to write this particular calculation which is where I'm saying that hey why don't you go ahead in every single row of the values function sales table Channel column what does the values function do if you have a column which has duplicates and sure enough do have duplicates in the channel column of the sales table please remove the duplicates give me a unique stable and then in the unique stable which is where all the channel are listed just once please calculate the sales the first sales number the second sales number and the third sales number whatever that might be and then eventually find the max of that and lay down the answer on the pivot table let's just take a look at the result of this so I'm just going to drag that right here and that is going to be my result obviously you can make another pivot table to verify is this result right or not but I'm pretty sure it is right some sometimes the summarization is going to be tricky that means you may have to summarize it by multiple Columns of multiple tables and then try to find out the answer consider this let's just say that I would like to find out the channelwise bandwise maximum sales value that means for the month of Jan in 2011 please tell me what band band means the category of the product is that a premium product is that a low segment product or a kind of free product that is listed in the products table please tell me the bandwise channelwise best selling sales value so if you have multiple crisscross columns so if you take a look at our data model the band is actually coming from the products table and the channel is actually coming from the sales table we'll have to create a customized summarized table and then do the calculation that we would want to be able to find the max sales value how do we do such calculations I'm going to write this very simple measure right here which is where first of all I will summarize my sales table because for the nth time that I'm saying the sales table is at the granularity of the transactions however we would want it at the granularity of the channel and the band so I'm saying hey summarize the sales table please bring it up to the level of Channel please bring it up to the level of product band once that is done uh in the summary table for every single row that has been created which is nothing but the concatenation of unique's band and the Channel please calculate total sales and then just find the max of that now let's just take a look at the results right here if I just drag that and put the results right here actually gives me the answer right here and again if you are obviously a pivot table ninja you can fire up a pivot table try to do these calculations and validate if these answers are right or not all right that's been it those were a few tricks on summarizing the data first before you carry out your calculations take a look at the granularity of the transaction tables that you have and obviously if the calculations don't match that you now have a few ways of summarizing the data up to the granularity that you would want and then carry out whatever calculations that you would like to do obviously they can be more complex examples but this was a good head start for me to give it to you should you have any questions please put them down in the comments and before you go I have a brand new course on M although we spoke about DX in this video but I have a brand new course on the M language and if you would like to join that the link is there in the description of the video there are a ton of people who have joined the course they have liked it a lot and I would be glad to welcome you inside of the course all the details of the course are there in the description of the video please check them out and I'd love to see you on the other side thanks so much for watching this and I'll catch you guys in the next one bye [Music] now
Info
Channel: Goodly
Views: 13,378
Rating: undefined out of 5
Keywords: Chandeep Chhabra, Goodly, Desi goodly, Power BI tutorial, data summarization, data model, granularity, average units sold per day, best selling day, sales of best selling product, M language, Power Query, M course, Power BI tips, data summarization techniques, Power BI calculations, Power BI techniques, Power BI tricks, Power BI tutorial for beginners
Id: r6rGD1JQvKk
Channel Id: undefined
Length: 12min 51sec (771 seconds)
Published: Thu Nov 09 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.