Filter top 3 products with a row for others using DAX in Power BI

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
- Ciao, friends. One of the most interesting features of Power BI is the ability to show the top three or top N products in a report, sorting them by one of the measures that belong to the report. One of the most common questions, once people start to use the feature, is what are the sales of all the remaining products? If you're showing only the top three, then you also want to know how much all the remaining products sold. And in a previous video, I've shown how to create a query using DAX studio that solves the problem by adding an additional row, others, the result of the query that contains the sales of all the remaining products. Now, the goal of that video was to show how to write complex code step by step. That said, most of the comments to the video were, "that is cool, if I can use DAX Studio and build a query, "but what if I want to obtain the same result in Power BI?" That's why in this video, I want to show you how to create this. This is a Power BI report that is showing the top three products. So, the top three ordered by sales amount, and it also shows an additional row, others that contains the sales of all the remaining products. It's a report, so it's dynamic. It will work with any slicer, any filter that you apply to the report will work automatically and update the report automatically. And you can also choose whether you want to show the top three, the top five, the top N or whatever number. So it's completely dynamic and works straight in Power BI. We will need to write a lot of code. And I want to show you in the video step by step, how to solve the entire solution. That is why it will take quite some time to solve the entire problem. And I will not be able to go into a lot of details about how to optimize the code or how to understand some of the details about the filter context and the behavior with different, with filters on different columns. If you are interested in the topic, my suggestion is first look at the video to get a general feeling of the technique and understand what we are going to solve. Once you have seen the video once, go to the SQLBI website, read the article, download the file and maybe, follow the video a second time, doing step by step, the same operation that I'm doing in order to re-create the solution on the demo file. Once you have followed all these steps, now you will be ready to implement the same solution on your data model. Now, because we will need to write a lot of code, we start by analyzing the requirements and build a list of all the steps that we need to carry in order to obtain the final solution. So let's start by analyzing the details of the report, searching for the important points. Let me clear everything. And the first thing we need to worry about is how to limit the visualization to the top three. Here we have only the top three products. We will also have the row containing others, but that will be a next step. For now we only want to limit the top three, ordered by sales amount. And because we want to do things the right way, I also created a slicer. So what we will do are the dynamic top 3. And this is the first step. Limit the number of rows in a visual showing only the top three. Let me start writing down the different steps. Show only the top N dynamic. So this will be the first of our tasks. The second step we need to carry, let me again clear everything, is find a way to show this row, others. Others needs to be an additional row added to the data model and you see that others appears several times. We have others here, others here and again others here. So, others is a row that we need to add to the model, but it can not be a product. Because we cannot just add one additional product to our product table named others. That product would have a category, would be sold in a year, would have a brand. So it will not be visible in all those different places. Whereas, what I want to have is a new row others that appears everywhere I want. And I want to control how it happens. So that drives us to the second operation that we need to do which is, add the others row. And I just write out of thin air. Because it will not be a product, it will be some different entity that we need to add to our model. That is not enough, because once we have the row containing others, we will need to provide values. You see that we have a value here, a value here, and a third value here for the row containing others. This value is not... it is not a product, we said that already. So the sales amount that we show in the first rows is the sales amount of one product. But the sales amount that we show for others is a different value that we will need to compute. That drives us to the third point, which is compute a value compute a value for others. So we are now at three points. We need to understand the fourth and the most complex of all these points, which is sorting. If you look at the first three products, they're sorting one, two, three. And the sort order is defined by the sales amount. So the product that has the largest value for the sales amount appears as the first value. But the final value, the value for others, appears as fourth, even though it has the largest of all the numbers. So despite having the largest number, it always appears fourth. And this requires additional code and additional attention to details. We will want to use the correct sort order where by "correct sort order" I mean, a sort order that shows the top three ordered by sales amount and the others row as the last. So the final point will be compute, not compute but sort rows the right way. Where by "right" I mean, the way that we want the data to be sorted. Now we have the list of all the details that we need to worry about. And we will solve the problems one step at a time. So we start from the first one and then we move to the different ones. But we need to pay attention to one important detail that is, all the requirements all the requirements need to work together. And this is the most complex part. Because we will need to find a solution for every individual step, taking into account the fact that we have all the other requirements. I mean, there are different ways you can use to limit to the top N products the elements in a matrix, but not all the features are available to us. Because we will have the other points to solve. Now that we have our shopping list and we know exactly what we need to do, it's time to go to the demo and start writing code together. Now, let's start by solving the first problem, which is filtering the visual in order to show only the top three products. I have different ways of solving the same scenario. I need a bit more space here. The first one is just using the features of Power BI. I can use a filter on this matrix and say for the product name, that I want a filter which is top N, showing the top three by sales amount. I drag the sales amount into the value and once I click on apply filter, as soon as Power BI finishes, OK. I now see only three products, but this is not exactly what I want. I don't want to see one, two, three products overall. I want to see the top three products for each individual category. Now to obtain that, I cannot use just this top N. So let me get rid of this filter. Instead what I need to do is create a new measure, that computes a ranking and then apply the filter to that measure. So we can do that. We can create a new measure that I will call ranking. And ranking does a RANKX over ALLSELECTED product name, based on the sales amount. Let's look at the result of this measure, I can simply add it to my report and you see that it shows one, two, three, four, five for cameras and camcorders. But if I reduce cameras and camcorders, it shows again, one, two, three, four, five for home appliances. So if I limit this measure to show only the three, the value which are less or equal than three, then I will be happy. I can go on ranking and show that I want to see all the values which are less or equal than three. I apply the filter and now I am in a better position because now you see that I'm showing the top three products for each category. I can also get rid of this filter. So now I have one, two, three, and I'm able to show the top three. But what I want to do is a bit more complicated than that. I want that number to be dynamic. So I don't want to show the top three and hard code the value of three in my filter, which right now is here. I want to control everything through a measure. So what I do is I create a new parameter table that contains values from one to 20, and I will use that number dynamically in another measure to limit visibility of my rows. First of all, I need a new parameter that I call top N. It's a number that goes from one to 20 with an increment of one. And I also add automatically the slicer. The engine created a new table top N and I also have a slicer let me put it here. Okay. Now with the slicer I can select a number or use the slicer to change the value. Let's go for three. And once I have selected three, I will change the filter. Instead of filtering the ranking, I will create a new measure called row visible that defines whether the row needs to be visible or not. So let me build a new measure here. We call it row visible. What I need in a variable: the value of the ranking. And I just call it ranking. I need the value of top N value. And I use the measure which has been created automatically by the parameter. A third variable result that basically checks that ranking is less or equal top N value. Result will be true if the ranking is less or equal than that value. And finally, RETURN result. Once I have my variable, I can add it to the report and you see that it shows... let's enlarge it a bit, true for the first rows and then false for all the remaining ones. So I only need to filter true for this row, and I will be able to filter exactly the row that I need. So I add a different filter to this report. I get rid of the filter from the ranking, which is already gone. Okay. And instead I add a filter for row visible that says, row visible need to be equal to. Oh, I cannot change it because right now it's a Boolean value. That's my bad. Row visible cannot be a Boolean value. I need to convert it to an integer. So that zero is false and one is true, so that I will be able to say the value need to be one. And when I apply the filter, you see that I still have the ranking column, which is no longer useful, but most relevant, I have only three rows visible. And if I change this number and I increase it to four, the report changes showing more rows. At this point I can get rid from my visual of the ranking and the row visible column. These are no longer useful. And I solved the first of my problems, which is creating that dynamic ranking row. Now let's see how we are. So far we solved the top N dynamic. So we can mark the top N dynamic because we have been able to show the top N with a slicer. The next step is to add a row containing others because we are able to show the top three. We want to show an additional row containing others. Let's do that together. We have values here. And we already said during the introduction that we do not want to add a new row containing the product because that row will be a product. And so it will have a given brand, a given category. We do not want a new product. We want to add another entity that will be visible depending on what we choose. To do that, we create a new table that contains all the values of the product name plus this additional row. To do that, we need to create a new table. Let's call it product names, where we use ALLNOBLANKROW of product, product name. And this returns all the product names, but I want to UNION to this, a new table that contains only others. Now I have all the values of the product plus an additional row others. If I use this table it will not work because this table is not linked to the remaining part of the model. So adding the filter, adding this column to the report is not going to change anything. I need to build a relationship between product and product names. So I want the product name here to go to product name. So that when I apply a filter on this product names column, it will actually filter the product table. But my product names column also contains the others row that does not correspond to any product. Now by default, Power BI created the relationship as a one-to-one bidirectional relationship. That's not what I want, I want product to be the many-side and product names to be the one-side. So this is a many-to-one unidirectional relationship. Now it's much better, the many-side is product and the one-side is names. I can go here. Now I need to remove from the report, the product name and instead I use this new product name. It's gonna take a while because right now the code which is executing the calculation is using the product name column. And you see it's not working either. Because my ranking column is working with the product, product name, not with the product name, product name. Here I need to use ALLSELECTED product names, product name. And now that I use the correct column in the ranking, it is working. The thing is, I do not see the others row. So I created the others row, but I don't see it. I want to make it visible. To make it visible, I need to force the visibility using the row visible column. Row visible forces a row to be visible or not. And right now it is true whenever the ranking is less or equal than the top N value. We can add farther logic to it, that says well show the row, or we can actually do that with the ranking. It's probably easier. We change the ranking and we check. First of all, we compute the ranking and that is fine. Let me also format the code in a better way. But we also check if the others row is visible or not. So in order to do that, we need to check if is other selected and we check if the SELECTEDVALUE of product... product names not product but product names product name. I need to reduce the font a bit. Product names, product name is equal to others. And then for the result we check IF is other selected so if the row other is the one selected, we return minus one. So it will always be less than three. Otherwise, we return the ranking. Now because the ranking of others will be minus one, it is less than three. And if I did it correctly, we should be able to see no we have an error somewhere. SELECTEDVALUE IF, the syntax for... There is a missing parenthesis somewhere, this is fine, this is fine. Oh yes, I'm missing the RETURN part. RETURN, result. Okay, much better. Now I have my ranking, which should show others. Let me show, let me see what happens I have others here with minus one. I wasn't seeing it because sales amount was not showing anything. But now you see that with the ranking column, I see the row containing others. So I have created the row others. Now I see others in different places. I see it here, I see it here and I see it also here. And it's in the wrong position right now, because if I sort by sales amount, it appears here, because it does not have a value. But at least we have been able to go one step farther. So what we did right now is add a check mark to this point. We now have the others row that is visible in the matrix. The next step is to compute a value for others, because others right now shows a blank. And there, we do not want to show a blank, we want to show the sales of all the products which are not in the top three. So to do that, we will need to update the measure we use, we will no longer be able to use sales amount. Instead we will need to create a new measure that computes the sales amount for all the products, except for the others row. For the others row we will use a different calculation. Let's do that. So the next step is fixing this column sales amount. I cannot use at the sales amount column. I will need a new measure. But let's start by writing a new measure. Let's call it sales amt, which at the beginning is just sales amount. Let me get rid of sales amount and we replace it with this new measure. That's the same number. I only want to format it as a decimal number. I have two things to fix here. The first one is the total. The total right now is the sum of the individual rows because the filter is being applied to the entire calculation. Instead at the total, I want the sales of all the products, because later I will change others to show the sales of other products. So let's start fixing this. I want to see the grand total of all cameras and camcorders here. So I go here and I can define a variable that is sales of all, which is CALCULATE sales amount, and I need to remove the filter from the product name column product names, product name column. Okay, RETURN, now here I need to check if I'm filtering one individual product, I return the sales amount. So IF ISINSCOPE of product name, product name. That means I am at the product level. And then I show the sales amount. Otherwise I show the sales of all. Okay, that should work. And that should fix the total. It is not. Why not, here I have the product name, product name, let me see sales amount. Oh, here I should use product names, product name. Okay. Now I have the grand total here. I have individual values here, which is what I want. I need to change this number. When others is selected, I need to compute the grand total, which I already computed minus the sum of these three values, which are the top N computed, which are the sales of the top N. Let's do that, that requires some time. So we are here; instead of this sales amount we will need to start computing more variables. Let me reduce the font a bit because we need to write a ton of code. First of all, I need to retrieve how many products I want to grab because I already have the sales of all. I need to compute the sales of the top three, but I don't know what this three is. So I need a variable products to rank, which is my top N value. That will be three in my case. Then I need to check if the others row is selected or not. Because if the others row is selected, I will do one calculation. If an individual product is selected but not others, then I will compute the standard sales amount. So is other selected this is the same technique we used earlier. So we check that the SELECTVEDVALUE of product names, always product, product name, product names, product name equals to others. And that I will use later. Now I need the sales of the top three. In order to compute the sales of the top three, I create a temporary table containing the product name, the sales amount. I will extract the top three. And finally I will sum the sales amount of the top three. So I need another variable, which I call products with sales amount, where I use ADDCOLUMNS to ALLSELECTED product names, product name. I need a new column amount, which is the sales amount. Now products with sales amount, contains all the products with the sales amount. I need to extract the top three. So I can take the top three products, and I do a TOPN of my products to rank out of products with sales amount, ordered by amount. Now top three products at this point contains the top three products. And I can compute the sales of the top three, which is a SUMX over top three products of the amount column. Now sales of top three contains the sales of the top three. I already have in a previous variable here, the sales of all. So the last thing that I need to do is compute the result which is, IF is other selected, then it's the sales of all minus the sales of the top three. Otherwise it's sales amount. This is not needed, no it is because I need a RETURN result. And then a comma because the next argument is sales of all. Okay. Now if I did everything correctly, this computes the correct value for the others row. Let's check it. Now you see we have the sales amount for the correct rows. We have a value for others. And we have fixed also the total. So that solves the third of our points. And we can check it. We have shown only the top N dynamic, we added the others row, we computed a correct value now for the others row. We have the last step, which is sorting rows the right way. Now sorting rows the right way is not that complex, but it is kind of tricky. Let's see with the details. Let's see with the demo, the reason why and what we need to do. We are already showing the value here. You see, we're showing one, we're showing the row and we have this minus one. Data right now is not sorted. And if I sort it by sales amount, you see that others appears at the beginning, but this is by chance because of the number. If I sort by ranking, sorry. If I sort by ranking, it will, I sort by ranking the right way descending, it appears also and always at the top because I'm using minus one for the ranking of others. I could use for the ranking of others a different number. I could use four; if I use four, then the number will appear after number three. The problem is the number four is already used by a regular product which is ranked four. So I need to add space to my report, shifting all the products which have a ranking greater than three to a higher value. So that four will be a whole and I will place my others row at the fourth place. To do that, I need to tweak the ranking measure. Okay. Ranking is already nearly correct because it's computing the RANKX. It's checking with is other selected, but here instead of minus one, I will return four. And instead of the ranking, I need to do farther calculation. Because I want to leave the fourth place free. So if I do that, I need to say, well if the ranking is less or equal than the products, I don't have the products to rank. So I need to add a variable, variable products to rank, which is my top N value. Okay. If the ranking is less or equal than the products to rank, then I return the ranking. Otherwise, I return the ranking plus one. Why do I do that? Well, if the ranking of the current product is less or equal than three, I return three. If it is four, I return five. This leaves space to four to be used for the result. So here I can now use four, if the other row is selected. If I hit Enter and I go here, you see I no longer see the others row, it disappeared. And the reason is, I'm using in the rows visible I'm checking that the ranking is less or equal to the top N value. That worked earlier because I was using minus one for the ranking of the others row. Now I'm using four, which is actually greater than three. So I need to leave space for an additional row which I created and I use top N value plus one. And now I have my others row that appears here, which is fine. I have data sorted the right way. And it is being sorted by ranking. The last detail is that seeing ranking here in the report is not really cool. Because I'm showing a column, that is there only for the purpose of sorting data, but it's not really needed. This is not a big deal because I need the column in order to sort the data correctly. But I can simply go here and reduce it so that it disappears. And the result, the final result is that I now have, where is it? Ranking need to be sorted. Okay, one, two, three, four. Now it's nicer and now I can get rid of it. Okay. So I no longer see the column. It's still there but I do no longer see it. And I see three products, Others with the value correctly computed and everything is dynamic. You see that if I change the year, everything changes, the product will change. And if I change of course any value in the slicer, everything works, including the number of products which is visible here. Why I don't see anything here? Oh, I don't see anything, that's my bad. You see if I show two I no longer see the value, because I made a mistake in the ranking. That should not be four, that should be products to rank plus one. Otherwise when I select two products that will be four and it will not be visible. Now, even with two or also with one it still works or four or whatever the numbers are correct. So now we can go to a conclusion and say that we marked also the fourth place. So we have been able to show only the top N products add the others row, compute a value, and sort the rows the right way. As you have seen, quite a ton of DAX code, some mistakes here and there. And as I told you at the beginning, the code we authored together is not super optimized. If you go to sqlbi.com, you will find the article, the optimized code, and the example that you can download and use to practice. An important detail here is that writing this code is not easy. It's kind of hard and requires you to understand several details. The reason why I wanted to show you all the steps to author the code is basically to demonstrate you that once you master the basics, and once you have the right understanding of how to make DAX work, then you can really make Power BI do whatever you want. Power BI offers you building blocks that you can use in order to build gorgeous reports. DAX is one important building block. Now, if you are already familiar with DAX that's fine. This is a new technique that you can start and use starting from tomorrow. If you are new to DAX or if you didn't understand all the details, then jump to sqlbi.com. There we have free introductory videos to DAX that help you start with the right step. So that you do not spend a lot of time trying to understand concepts, which are actually easy if you learn them the right way. Then if you want to move farther, we offer a lot of different courses that can bring you to the next level. That's enough for now. And I hope you enjoyed watching the video. Remember to hit subscribe button, just to tell us that you like these kind of videos and you want to see more. And for now enjoy the article and enjoy DAX.
Info
Channel: SQLBI
Views: 33,520
Rating: undefined out of 5
Keywords: DAX, PowerBI, Power BI, CONT, C0120, Alberto Ferrari
Id: nVvlEHKr_0o
Channel Id: undefined
Length: 36min 46sec (2206 seconds)
Published: Mon Jan 04 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.