When to use KEEPFILTERS over iterators

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
- Ciao friends. In this video, I want to show you how to use the KEEPFILTERS function in order to correctly compute values when you need to handle arbitrarily shaped sets. Now, what the heck is an arbitrarily shaped set? It is a specific shape of the filter context that you can either create by writing DAX code, or because you use a slicer on top of a hierarchy of columns instead of a single column. So as soon as you have arbitrarily shaped sets in your code, you need to pay extra attention whenever you work with them. Because if you do not fully understand how the filter contexts are merged together, it is likely your measures might compute wrong results. And fixing them requires understanding where the problem is and how to solve it. Keep in mind that this is not one of my usual videos. In most of my videos I spend most of the time writing DAX code in order to show you how to solve a scenario, this is not one of those. Here, in the entire video, we will write only one line of DAX code. The rest of the time is spent in understanding why that single line is so important, and will solve our problem. With all that said, let's start by looking together at the scenario. I have a report, which is slicing by year, the sales amount, I actually have the year and the month. And I have a slicer. Now, the slicer is filtering the hierarchy year and month. So out of year and month, I select November and December in 2007, and January and February in 2008. So I have four different months, but in two different years. For the sales amount, everything is working as expected. But then I want to create another calculation which is the monthly average. Monthly average needs to compute the sales amount month by month, and then provide the average of all these values. So I'm using AVERAGEX over the values of the 'Date'[Month], and then call the measure, [Sales Amount]. Because of context transition, [Sales Amount] computes the value month by month, and AVERAGEX takes care of computing the average. If I take this measure, and drop it in the report, you see that the numbers of the month levels are correct. And also at the year level, this number is a reasonable average of these other two values. The problem is of the grand total, 1.7 million cannot be in any way the average of values which are way below one million. So, that is preposterous. I mean, this looks like one of those DAX shenanigans that are carefully designed in order to drive developers crazy. At least, that's one way of looking at the problem. There is another way of looking at the same scenario, which is that maybe, just maybe, we are not fully understanding how the filter context is being built inside the iteration, and some issue is happening because of the code we wrote, and not because of DAX. So, we will need to understand exactly how this filter context is created, and how the calculation happens in every individual part. That requires some time and some theory. Yes, I know, boring theory. But it's the kind of theory that is actually needed in order to understand how to fix this specific problem. And, by the way, also to understand a bit better how iterations and filter context are merged together. We will need to start from the very basics. That is, we start to understand what happens when you apply filters on individual columns, and then we will move on our specific scenario, where we have the arbitrarily shaped set, and we will understand the details of how the filter context is built during the iteration. Let's get started. Let me draw first some details about it. As I said, we need to start with two filters on two different columns, and this is exactly what I have here. I have one slicer that is filtering the year, and another slicer that is filtering the month. Now, these two slicers place filters in the filter context. The first slicer applies a filter on the year column, and filters 2007 and 2008. This is a filter applied to the filter context. The same happens for the month. I have the month column, for which I select January and December. Again, another filter placed in the filter context. Now that I have two filters in the filter context, which roles will be visible in my date table? Well actually, when DAX needs to compute this value, it does a CROSSJOIN, so it builds the CROSSJOIN of all the possible values. So, this will generate a filter context that contains both the year and the month. Actually, it does not generate a filter context, but it makes visible in the filter context roles from the date table that contain 2007, January, that is a good role. And 2007, December, again, that is a good value. The same happens for 2008, both January and December are good values. So, these two filters are equivalent, the green filter and the blue filters, they are actually the same. And this is a scenario that happens when you have simple filters. Whenever you have filters on one column only, this is the scenario you have, and everything is simple and easy to understand. The problem is, in our scenario, we do not have a simple filter. We actually have an arbitrarily shaped set. Let's see that in more detail. This is the report we have, and you see that the slicer is not filtering two columns separately. It is filtering the year and the month together. So, November and December are visible for 2007, whereas January and February are visible for 2008. This is what is called an arbitrarily shaped set. And arbitrarily shaped sets creates this problem. In order to understand where the problem is, and to debug it, we first need to focus on the cell that contains a problem. These numbers are correct. All these numbers as we said are correct. The problem appears here, 1.709. Here is where the value is wrong. So, in order to understand exactly what is happening there, we first draw the filter context that is active in that cell, then we follow the code step by step in order to understand how the code is being generated. So, let's start by creating the filter context. I need to show you the details. The filter context as we said is generated by this slicer. Let's draw it. So the filter context contains the year, and it also contains the month, both in the same filter. Now, for the year we have 2007, and for the month, November and December. So we have 2007 November, and 2007 December. Whereas for the 2008, we have January and February. We have January and 2008 February. Okay, now, this is the filter context that is active in the cell where we have our problem. Now let's look at the code. First we have AVERAGEX. AVERAGEX is an iterator. Being an iterator, it iterates its first argument, and its first argument is VALUES of 'Date'[Month]. Let's highlight it. What does VALUES of 'Date'[Month] return? Well, we can just look at here, we have one, two, three, four different values for the month. Therefore AVERAGEX will iterate four times, once for each month. Let's say that we are positioned here, so the row context is pointing to the first month, November, and we start computing our measure. Now, our measure is [Sales Amount]. Because it's a measure, we know that CALCULATE is there. Inside the measure we are using CALCULATE, just because it's a measure, and CALCULATE is always there when we use a measure. CALCULATE among the many operation it does, it performs context transitions. So it will take the row context which is pointing to November, and generate a new filter context that contains November. So we have month, November. And this is the regional filter context, and this is the filter context... after context transition. So, we have one filter coming from the slicer, a new filter coming from CALCULATE, and the engine now needs to merge the two filters together in order to generate the new filter context under which it computes sales amount. How does the merging of two filter context happen? Well CALCULATE says you are filtering the month in the new filter context, remember that this is the new filter context, you're filtering the month in the new filter context, and it happens that you are filtering the month also in the old filter context. So in order to merge the two contexts, because the new filter context has the option of overriding the old filter context, CALCULATE will remove any reference from the old filter context and then merge the two together. Now, what happens when the two filters are now, what is the final result of the filter context? We have a filter on the year, which is what remains of the old filter context, that filter 2007 and 2008. Any reference to the month disappear from here. And this filter is copied as it is, we have the month, which contains November. Now, this is the new filter context which is being used when [Sales Amount] is being computed. So during this calculation here, we are using this filter context that filters two columns, the year and the month. Now, what happens when you filter two columns? Exactly what we have drawn at the very beginning. You have the CROSSJOIN of all the possible values, which in our case, means we are seeing 2007 November and 2008 November together. The original shape of the filter context that was filtering November only in 2007 is completely lost. What actually happens during the calculation is that the engine goes on the first month, let's say now it's January, and it computes this value, the sum of 2007 and 2008 together. That leads to this value. Then on the second iteration, it does this operation, then it does this other operation, and finally computes this operation. So the numbers which are being aggregated are these numbers, which lead to an average which is 1.7 million, that is the same value we have seen at the beginning. So the problem here is not DAX. The problem is that we started from an arbitrarily shaped set, and when calculated during context transition, override operator to replace the month part of the old filter context, it destroyed the arbitrarily shaped set and transformed it into a regular set. So we no longer have information about the relationship between the years and the months. That is the problem. Now how do we solve it, how can we make sure that the original shape of the arbitrarily shaped set is not lost, does not get lost during the calculation? Let's go back to our original scenario. This is just what we wrote earlier, it's just more readable. Let me enlarge it a bit. This is the original filter context, let's say original filter. And this is the new filter. What we need to do is to tell CALCULATE, "Hey, when you will merge the two filter context together, please do not override, don't do that. Avoid doing this operation, but instead, just sum together the two filters, so that these filters is used as a single filter that keeps the old filter context and only adds the new filter context to the filter." So what we want to do is to keep this filter, and add this filter. There is a specific function that does it, which is KEEPFILTERS. KEEPFILTERS tells CALCULATE that we do not want to override the old filter context, we want to keep it, and then add the further filters without removing the old filter. As soon as we use KEEPFILTERS, as you will see, the problem will disappear. Let's do that. I still have my monthly average measure here. Now I don't want to override this measure, so let me copy the code and we create a new measure, that we call Monthly Average KEEPFILTER. And here is the first problem. Where do I add CALCULATE here? Where do I add KEEPFILTERS? There is no CALCULATE anywhere. Actually, we know there is one CALCULATE very well hidden here, but we would like to add KEEPFILTERS here, but KEEPFILTER is a CALCULATE modifier, and CALCULATE is not visible here, it is hidden inside context transition. That is why KEEPFILTER can be used as a CALCULATE modifier, but also as the top level function in iterators. If you use KEEPFILTERS that way, this KEEPFILTER will change the way context transition happens when iterating over the values of the 'Date'[Month], and this KEEPFILTER will be actually used here where we have the measure reference. Let me just hit enter, then we need to select the measure, and format it, I just want to use the decimal format. Now I can put my measure here, add commas, and you see that now the numbers are correct. Let me get rid of this filter. Now the numbers are correct. I see 768, which is the correct average of all the values which are shown month by month. So KEEPFILTER solved our problems. Now, I know, you are asking yourself when to use KEEPFILTERS and when not. Answer to this and a lot of important details about using KEEPFILTERS with arbitrarily shaped set is present in the article on the SQLBI website. So if you find this topic interesting, make sure to go to the SQLBI article, read it, download the file, and start playing with it. In general, think that whenever you have a problem with DAX, or whenever the numbers are not correct, there are no dark forces in action. It just that, you probably have not understood well how the filter context has been created, or the specific calculation that is creating your issues. So whenever that happens, you need to do exactly what we did here, slow down, take a long breath, maybe a cup of coffee, and then write down the filter context at the beginning of your form, and then step by step, follow how the filter context is being created. The answer to your wrong number is likely to lie somewhere in the process of building the filter context, and fixing is only a matter of knowing how to make a filter context do exactly what you need. Enjoy DAX.
Info
Channel: SQLBI
Views: 12,226
Rating: 4.9850469 out of 5
Keywords:
Id: bGVLguWf4Ls
Channel Id: undefined
Length: 18min 56sec (1136 seconds)
Published: Tue Apr 27 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.