- Hi friends. In this video I want to show you how to analyze a slow query in a report. In this example we have
many visuals and with Performance Analyzer I can
capture the information for the execution time of each visual. This way I will capture the slowest one so I start recording and I
click on refresh visuals. At this point an important
thing to do is highlighting what is the query that has
the biggest execution time because it's what is causing the slowest visual in the report. And in order to retrieve
this I have to sort by DAX query time and I have to
sort by in a descending way. Doing that, I know that this matrix is what is the slowest DAX
query in the entire report. And of course there is
also other time to complete the rendering and other but now
we want to ignore it because our focus is we want to test
the execution time of this DAX query and we want to
possibly to optimize the code. So in order to do that I
can copy the query here and once I have copied
the query in the clipboard I can paste the query in DAX Studio. I connect DAX Studio to Power BI Desktop and now when I paste the code here, this is the very same query
executed by the visual that is displaying the
matrix in Power BI Desktop. At this point you can click run and you run this query in DAX Studio. You can see that this query
shows the result which is the first thing you can
see in this pane but another important
consideration is how much time we wait for executing the query. This important information is
contained here in the output. You can see here that this query has been executed in 340 milliseconds. First of all we have to
execute the same query multiple times in order to make sure that we are not measuring
some noise caused by some other program running on
my computer for example. So in order do that we can
click run multiple times. One two and now I go back
to output and I can see that the duration of the query is between 275 and 340 milliseconds. At this point I can also consider, wait a minute, am I executing the query in the worst possible condition? In order to do that, we have to make sure we don't have any data in the cache and this can be obtained,
and it is a best practice when you analyze the query in DAX Studio, to run the query checking this option which is Clear Cache and then run. So with a single click, each
click will clear the cache and immediately after
execute the query again. Let's try to do this another three times and see what happens,
so one, two, and three. Once we complete this and
we go back to the output, we can see that this time, the execution time is considerably slower. We are at one second and a half, 1,500 milliseconds, 1,577 and 515. So you see that executing the
query in the worst possible condition also stabilize
better the execution time. Which is important when we want
to analyze the performance. At this stage we know that
this query is considerably slow especially when it is
executed with a cold cache. So the next thing to do is
gathering more information about the query plan and
the execution timings. These two options, query
plan and server timings are what I want to use in order
to capture this information. And so I click on query
plan and I click on server timings as soon as
this operation completes and when you see here in the output, query trace started after
you clicked on the button, you can start your next
execution with run. Now as you see I don't look
too much at the result, at what I get in the
result, I'm more interested into the execution time in the output. But at this point you see that
we have other two panes here, query plan and server
timings that contains additional information about my execution. So let's see the query plan. The query plan is the actual
query plan of the query. We have a physical query plan here and a logical query plan here. As you can see, this information
is relatively complex. If I scroll down just in
the physical query plan which is probably the more
interesting part of the entire of the two query plans we have available. It's very long and believe me, I don't use this information as the first way to optimize the query. If I can I try to move to the other pane server timings here and
server timings has a number of information that are
related to the execution time of the query with some number
that is important to know. So let me describe what we have here. Total is the total execution time, how much time we wait. This is the same amount
of time we have seen in the output window
but this time it's split between formula engine and storage engine. And the formula engine
is the time spent waiting for the formula engine and the SE is the time spent waiting
for the storage engine. So the sum of these two numbers
correspond to the total. Now usually a DAX expression should be evaluated as much as possible in the storage engine and
then in the formula engine because in theory the storage engine is faster than the formula
engine but these details are something that you need to understand looking at how the VertiPaq engine works. There are other videos and
other courses for that. Now let's just focus
at the very high level about what is the problem of this query. Once I realize I have a
problem between formula engine and storage engine I can also
look at the other information like the number of storage engine queries, which is the number of requests made by the formula engine towards
the storage engine. And here you have the
details about each single storage engine request
made by the formula engine. So you see that we have
already a complex situation, a complex query plan
and a lot of information about the storage engine. So let me spend some time looking at the query that we are executing. Now if you look at this query, you can notice that this is some DAX code that you might not have seen before because this is the
syntax of a query in DAX, whereas you might be used
to writing DAX measures. And what is the DAX measure used here? Actually I know that this is customers, the customers measure that
we have in the report, but the easiest way to
find this information is going back to DAX Studio, clicking on the matrix here. This click highlights this visualization and because I selected this visualization I have here in values, customers, which is the only measure I have. What can I do at this point? I can include this
measure here in my query by searching for the
measure Sales Amount here. You see that I can click here, I can click on sales amount in this list and I can right-click and
click on Define Measure. When I do this, the definition
of the measure is included in my DAX query and I
can change, for example, I can format the query so
that it is more readable. This is a good idea. And once I have this I can take
a look at the sales amount. I should have used customers
so let me do this again. Customers is the measure
I want to include. I right-click here and I include the definition of customers. I click on Format Query
again and now I can focus my attention on this definition
of the customers measure. Why I want to have the
definition of the measure here? 'Cause now if I want to test
the behavior of the measure, changing something I will
just change the code here, I click run again and
I test the same query with a different version
of my measure customers. And this time I can tell you this measure is easy to optimize. Why? This query is violating one
of the best practices in DAX. Filter columns don't
filter tables in CALCULATE. And what are we doing here? FILTER sales. So I can tell you this is
an error following the best practices in DAX and the
right thing to do is to filter just the column
Quantity greater than zero. But in order to not change
the semantics of this query, I have to make sure because I
was filtering a table before, I have to include this
condition in a KEEPFILTERS. This way I'm sure that I
will obtain the same result, but in case you're not, pay
attention before executing the query, you should
go in the results pane. You should take a look at
the results of the numbers you see here and in the next execution we have to make sure we
have the same numbers. Let me try. I click run now, remember I am clearing the
cache and then run, and I have an error here because
I probably forgot something. Yes because I removed the
filter before and I still kept the ending bracket of the
FILTER so now I fixed the code, I run this again and if I go to results, you see that I have the same
number we have seen before. You can go back to view the
video to check the numbers are the same but believe
me, they are the same. Now if I go to the server
timings I discover that all of a sudden, my
query has been executed in 12 milliseconds instead
of one second and a half. And why this? Because my best practice application to the formula, reduced the
time for the storage engine, for the formula engine, the number of queries and
the overall execution time is now down to 10 milliseconds. So what should I do at this point? I just have to select this
measure here in DAX Studio, copy and paste this measure as
the new version of customers, of the customers measures in Power BI. Enjoy DAX!