- Ciao, friends. In this video, I want to show you how to create a calculation that uses an OR condition instead of an AND condition with slicers. Before we dive into the details, let's review together the requirements and the calculation
that we want to obtain. If you look at this report, we have two slicers. One which is filtering the
occupation, only professional and the other one that is
filtering the brand, Contoso. Now, these two slicers
are working together on the report. And the number you see here, 204, represents the number of products sold that are both of the brand Contoso and they have been sold to professionals. So the condition we are using is professional, professional. And Contoso. And there is an AND
condition between the two. This is the default
behavior of any slicer. What we want to obtain is
a different calculation. We want to build this
calculation, the units or. Units or you see shows
a much larger number and the reason is units or, instead of using an AND condition, uses an OR condition. So we are showing the products, which are sold to professional or that they are Contoso. Now, this is not the
default behavior of slicers. So in order to obtain this goal, what we will need to do is, write some DAX code that gets rid of the default behavior of Power BI and replaces it with our own logic. Before we go farther with the video, a big warning for everybody. In this video, we are going
to show all the details of how to write the DAX code. So if you are in a hurry and you are just interested
in looking at the code and implement it in your model, well, you can just go to the SQLBI website, download the PBIX file, look at the code and you are just happy. If you are interested instead in seeing and learning the whole process of writing the different
versions of the same code and how we obtain the last piece of code, then follow me and we
will start from scratch, build a calculation, do a lot of reasoning in the meantime. We will see good solutions, bad solution, intermediate solution, a lot of DAX code. Ready? Let's get started. Let's start by writing the first solution. The first solution is easy to understand. It uses SUMX, iterators, and comes with a lot of limitations. But it's easy to understand and it's the first step in
obtaining the final version. So let's get started. We need to build a new measure. I already called it or number one and what we need to do is first of all, retrieve the values of the occupation, retrieve
the values of the brands because we will use that in our conditions and then we iterate over sales in order to compute our value. So we start saving the brands and we take VALUES of product brand. That is the slicer coming with the brands and then we also need the occupations. Occupations. VALUES that is in customer. Customer occupation. Then we need to go for the RETURN. As I said, we are going
to iterate over sales and retrieve all the sales that either have one brand
that is in the selection of brands or an occupation that is in the current occupation. So we do a SUMX over sales of quantity. This just returns the quantity but we want to filter sales where I need to access
the customer occupation. So I need RELATED customer occupation IN occupations. And I need an OR condition. So OR customer occupation IN occupations or the product brand IN the brands. So I scan sales and I
retrieve all the rows where either the customer occupation is in the slicer occupation or the brand is in the brand. That should do the trick. Let's give it a try. I need to take the measure,
drop it in the report, get rid of this and it doesn't work. It doesn't work, it
shows the very same value as the AND version, and the reason is, I'm iterating over sales, so this reference of
sales is still working under the filter context
coming from the report. I need to get rid of it. I could use ALL sales. Let's do a first try. If I use ALL sales, now the number is wrong,
because by using ALL sales, I also got rid of the filter from the year and any other filter from the report. So ALL sales is a bit too much. What I want to do is get rid of the filter from occupation and brand, which is coming from the slicers and keep any other filter. So instead of using ALL sales, what I can do is embed this filter into a CALCULATETABLE that gets rid of the
filter from the brands, ALL customer or the occupation first ALL customer occupation and ALL product brand. So these two ALL are
needed in order to get rid of the filters coming from the slicer, which I will then replace
with my OR condition that appears here. And if I did it correctly, now I see values and these
values should be the correct one. Let me just check that they are indeed. Yes, they are the same
value that I have here. So we have a starting point. Now, this formula works just fine and there is nothing bad
about the formula in itself but it has a very strong limitation. Let's look at it into more detail. In order to compute the sum of quantity, which was my primary goal, I basically had to embed my condition by filtering inside this CALCULATETABLE and this CALCULATETABLE
is inside an iterator. So I had to rewrite sum of sales quantity in order to push the filter inside it. It works, there is nothing
bad with the formula but the strongest
limitation is that it works with sum of sales quantity. What if I want to apply the same logic to another measure, for example? I have another measure, which is the DISTINCTCOUNT
of sales product key. If I try to obtain the OR with a DISTINCTCOUNT or
the OR with any measure that cannot be aggregated
with a simple sum, this pattern, this way of
writing code, doesn't help me. It will not go very far. So I need something different. And that something different
is to replace the iteration, this SUMX with a filter. Think of that. If I take this CALCULATETABLE and instead of iterating over it, I use it as a filter, then I will be able to
use whatever measure because the filter will
be applied to the model and any measure will work just fine. So another way of
obtaining the same result is to copy this entire code and write a new version of the same code. Let me paste everything
and call it or number two. With or number two, we do not want to use SUMX but we want to keep everything else. So this CALCULATETABLE is fine but I will not use it to iterate. I will use it as a filter. So I add a CALCULATE around it. And I compute number of units, which happens to be exactly
the sum of sales quantity and this entire CALCULATETABLE now works as a filter, which is added to the filter context, replaces
the entire filter on sales with the new condition. This version is very
close to the previous one. It has its own drawback but first of all, let's
check that it actually works. We can put or number two here. The number produced is the very same. What is the problem, the main
problem of this solution? Well, the main problem is
this filter over sales. I'm filtering the entire sales table and when I mean the entire sales table, I mean the entire sales expanded table that is extremely large. So the optimizer might be able to work on my code and
product something good but it is very likely
that it will not be able to optimize it the best way and so my code will be suboptimal. Suboptimal basically means slow. Filtering sales, filtering an entire table is quite always a bad thing. It would be better to restrict the filter to only the columns
which are actually needed for my model. And if you look carefully at this code, I actually don't need
the entire sales table. What I need is a subset of the sales table that only contains the customer occupation and the product brand. So I can replace this filter sales with a filter that
retrieves only the brand and the occupation. Now, brand and occupation, they belong to two different tables. So it will not be easy. I need to build a table that contains the occupation and the brand but I can do that. Instead of filtering sales, I can use SUMMARIZE. And I use SUMMARIZE to
group it by customer occupation and product brand. Now, this SUMMARIZE returns
only occupation and brand and by the way, I just noticed that I'm creating a new, I'm modifying the measure. What I wanted to do was
creating a new measure. So I just copy the entire
code in a new measure. This is the same code as before, but I need to call it or number three. So I SUMMARIZE sales
by occupation and brand so that the result of this SUMMARIZE only contains two columns, occupation and brand and I can still work with my condition. Everything is embedded
inside CALCULATETABLE. So this code is very
close to the previous one, apart from the fact that it doesn't work. Yes, because I'm still using RELATED, which is no longer needed. Neither here nor here. Earlier, I was iterating over sales, so I needed RELATED to
go through relationship. Now I'm iterating over a temporary table that I built on the fly. So there is no need to use RELATED. Now, this code, let me
reduce the font a bit because the code start to be longer, this code should produce
the very same result. We can get rid of or number two, put or number three. And it no longer has the drawback of iterating over sales. This is or number two, or
number three uses SUMMARIZE. Again, this code is not perfect. I mean, not perfect doesn't mean it's bad but it has one drawback. It requires one scan of the sales table,
which is happening here. The group sales by occupation and brand. And another scan of the sales table to compute the units. So at least I will need to
scan the sales table twice. Once to build the temporary table containing occupation and brand and a second time to compute my units. This might take a lot of time depending, of course, on
the size of the table. I can actually optimize it or at least, change the
way the code is computed because this SUMMARIZE
scans the sales table but I do that because I
want occupation and brand. What I could do is different because what I want to
do is replace the filter on occupation and brand, so I do a SUMMARIZE
inside a filter over ALL. I can actually build a temporary table that contains the cross
join of occupation and brand and then use that instead
of this SUMMARIZE. This will save me both the CALCULATETABLE and the SUMMARIZE with the
scan of the fact table. So I can copy again everything and build a new measure. It is or number four. Instead of iterating over SUMMARIZE, we iterate over ALL customer occupation. Actually the CROSSJOIN
of customer occupation and product brand. Missing a comma here. And now that I have put ALL here and here, I can actually get rid of
the entire CALCULATETABLE because I no longer need
to get rid of filter, I'm just ignoring them
using the CROSSJOIN. Now, this code is shorter than earlier, so shorter is typically good. I don't know if it's
gonna be faster or not but at least it's a different way of writing the same expression. It is or number four and it might work. Let's give it a try. Let's get rid of or number three, or number four. You see it still produces the same result. So we moved from the first version that uses SUMX, bad, because using SUMX I have to rewrite sales quantity. And we created a second version that removes the usage of SUMX
and instead it uses CALCULATE with the table iterated as a filter. Then we use SUMMARIZE in order to reduce the number of columns which are used in the filter in order to speed up the calculation and make the code better. And then we finally ended
up with this version, which is shorter. It has the advantage of
working with a smaller table. And it's kind of a good solution. But we can actually do
something different. I don't know if better
or worse, but instead of using this version of the code that embeds the OR condition, you see that we are
basically building the table that is used as a filter and we can build a table
also in a different way. To understand that, let's go back to the drawing and let's
reason a bit on top of it because what we want to obtain
is professional or Contoso. So let me clear some of this stuff. Now, how can I obtain
professional or Contoso as a table, as a set
just using set functions? Well, actually, if you think about that, you can think of the
problem in a different way. When occupation is professional, let me clear everything. It's probably easier if
we just write it down. Okay. When the filter on the
occupation contains professional, any value for the brand is fine. So my condition can be occupation equal professional. And if the occupation
equal professional is true, then regarding the brand, I can use any value. So if the occupation
contains professional, then the brand can have any
value, it doesn't matter. And at the same time, if the brand contains Contoso, so if the brand equals Contoso, then regarding the occupation, I can use again any value because if the brand is Contoso, any of the value for
the occupation are good. I can create this set, the one containing occupation,
professional and brand any by creating a CROSSJOIN that contains the selected brand, cross joined with any occupation. And I can build this other set, this one that contains the brand
Contoso and any occupation by creating another CROSSJOIN. And finally, I can union the two together. Doing it this way, I can build the table in a single step, using table operations. Let's do that together. What we want to do is build the table with a simpler step. Let me copy again everything and we build a fifth
version of the same code, which only uses sets. We no longer need those variables. And what we need to do is cross join not ALL customer occupation but VALUES of customer occupation. This is our occupation
that was professional. Cross joined with all the brands. Then we compute the CROSSJOIN of ALL the occupation with
the VALUES of the brand. The first CROSSJOIN builds
professional, any brand. The second CROSSJOIN builds
any occupation, Contoso. And then we union them together. This code is the same, produces the same set as the previous one but only uses set operation. This is or number five. Let's give it a try. We can get rid of or, put five. Again, we still have the same result. So you see that we have
created several versions of the same code and we
ended up with this version that just uses set operations. Whether this is the best solution or not, well, it really depends. It's hard to tell whether this solution is better than the other one. If we want to take a decision and we want to understand which one is better, what we need to do is work a bit more and do some further analysis. So let's draw some
conclusions from all the trip that we did through this DAX code. We started from a first solution that uses iterator with
a lot of limitations. Then we moved to more set-based operation using CALCULATE and the table as a filter and then we start refining
the same code again and again until we ended up with a version that is very elegant because it's short, it
only uses set operators. But the real question is
which one is the best? As of now, it's a matter of personal taste because apart from the first version that clearly had limitations, all the other ones, they kind of work and choosing the best one takes time. So if we want to understand
what is the best one, we will need to start a different trip and understand the characteristics and the features of all the measures in order to choose the best one, depending on the model, the size, the number of brands,
the values of the slicer. But that would take a lot longer and it's probably the
content for another video. For now, go to SQLBI.com
download the file, start to play with the code and maybe, you come out with a different version that you can share in the comments. Enjoy, DAX.