Rolling average with working days in DAX

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
ciao friends and welcome to a new video from sql bi in this video i want to show you how to compute a rolling average considering only working days we have already created videos and articles about how to compute moving averages they are not hard to compute because you can rely on dax predefined functions like dates inferior to compute the set of dates that needs to be considered for the rolling average but if you want to add further conditions so you want a rolling average over the last 30 working days then the predefined functions do no longer work you will need to author everything by yourself and so the code becomes much more interesting to author because not only we will need to find those working days but we will also need to solve several small details in the process of writing the code i'm going to show you the full process starting from the beginning because the code is quite intricate so at the end we will write a lot of dax code but hopefully starting from the beginning it will be easier to understand the complete flow and then adapt the code to your specific model and we will see two variations of the same code one that relies on a calculated column in the date table to compute the working day number it works it's fast but it has the limitation that you need to create a calculated column and if for example you are working with a live connection then creating a calculated column is not an option so we will adapt the code to work just in a measure let's start from the very beginning with some examples this is our regular contoso model i just removed some columns to show you how to compute them using dax and i have this report that is showing by date the day of the week and the sales amount we have sales on saturday and sundays but what i want to do is compute the sales amount the average sales amount day by day considering a moving average of the last 30 working days so i'm not interested in computing the last 30 days but the last 30 working days and the first ingredient that we need in order to decide whether a day is a working day or not is a column in the day table that tells us whether a day is a working day or not the column needs to be computed somewhere because uh it's not only that you need to get rid of saturdays and sundays but you also need to get rid of any holiday and the holiday depends on a lot of factors on the country on the region on the year on the business on a lot of different details that we are not interested in worrying about using dax so we assume that you have in your day table a tape a column working day so that tells you whether a day is a working day or not so for example here we have a sunday and sunday is not a working day then we have saturdays and sunday that are again not working not working days and the same here saturdays and sundays that are not working days so we are going to use the working day column in order to decide whether a day is a working day or not now let's focus on the problem by looking again at the content of the table what we need to do is being able on any individual date for example well not on this sunday but on this thursday what if i want to go back five or ten working days well i need to start counting back so from here i need to start counting one then two three i do not need to take into account sunday and saturday so i gone four five six the six six working days before that day was this wednesday and this counting operation needs to get rid of saturdays and sundays the easiest way to accomplish this is to add a new calculated column to the date table what i want is the working day number the working day number is just a number that increments of one every working day and stops incrementing during saturdays and sunday so that when i'm when i know the working day number of a date i can just subtract the number of working days that i want to go back and i will be able to compute the date that i want to use as the starting point for my moving average let's do that together an easy way of creating this calculated column is to rely on the rankex function so we can create a new calculated column and with a larger font we call it working day number we can use rank x to perform a ranking over date based on date date if i compute a ranking this way i obtain a number that ranks the dates from the smallest to the largest one actually it's going to work the opposite way you see that it starts from the very beginning with a large number and then it decreases by one because i want to do the ranking by date but i want the order to be ascending by default it's descending and now i have my working day number that is not yet the working day number because it grows also on saturdays and sundays you see that on saturdays and sunday it goes 7 8 and what i want to do is stop counting seven and eight so the technique to do that is just to get rid of non-working days i do not perform the ranking on data but i filter the data where date is working day by using date working day i have only the dates that are working days and my number will stop growing on saturdays and sundays let's hit enter to see the result and now you see that here i have saturdays and sunday that repeat the value 5 and then it goes on increasing on monday twice and wednesday i still have a small problem here you see that the value of saturday and sunday is not the same value as friday what i want is to stop counting i want four four and then continue increasing with five so what i need to do is just subtract one whenever the current date is not a working day so i can just subtract one using if date working day i subtract 1 otherwise i subtract 0. by using this if statement now i have my number that let me see uh it's not working why if oh yes if not date is working day one otherwise zero okay so i have here saturdays in sandy that repeat the number four that is the number of friday and the same happens here i have friday saturday and sunday that repeat the number friday nine and they repeat the number nine this monday is not a working day so it's not increasing again despite working this is using the if statement and it would be better to avoid using if statements in a calculated column but in general reducing the number of if statement increases the speed of your code so i can rely on a different technique because i know that if not day working day is true then it is considered as 1. index 1 is true and 0 is false so instead of using if i can just subtract not working day and the code results are identical so i still have a value that repeats the value of friday to saturdays and sunday both here and here but it is slightly better optimized and this is going to be important not now because in a calculated economy it's not so relevant to have optimized code but it will be later because when we do everything with a measure having code that runs faster is better because that code will be executed all the time now that i have my working day number i can easily compute from the current date the previous working date the pre the date to be used as the beginning of the set of dates that i want to use because if i know for example that the current working day number is seven and i want to go back five working days i need to go to the working day number two which happens to be this wednesday so by simply doing a subtraction i can discover the working day number of the date that i need to use as the lower part of my rolling average and it's now time to start writing the code of our rolling average let's start writing the rolling let's go for a 30 working day average rolling third working day average first of all i need to find the upper boundary of the set of dates that i want to use as my filter and the upper boundary is just the maximum working day so i store in a variable the max working day which is the max of date working day number return and let's return the max working day just to check that we have a good number we place it here and we have the rolling i'm it's not over rolling 30 day average this is just the maximum working days 96 97 97 97 because it's saturday and sunday then 98.99 then i want to subtract 30 from this value so let's create a new variable that is the number of days 30 so we retrieve the minimum working day which is the max working days minus the number of days number of days let's check that we have a good number also here and we do this is the current maximum working days minus the minimum working day the next step is to find the set of dates that i want to use and i can just use these two values min and max working day to find the set of dates so i can create a new variable dates to use and i can filter all day date where well let's go for a table filter date is a small table or date where date working day number is greater or equal than the minimum working day and at the same time the date working day number is less or equal than the maximum working day this table contains all the dates that happen to be between the two boundaries and the last step is just to compute the sales amount in this period and divide it by the number of days 30 days so i can create the result by just dividing what i can do just calculate sales amount sales amount using the dates to user let's make it a bit shorter calculate sales amount calculate sales amount in the dates to use and then i divide this value by the number of days so to compute the average sales by the number of days and then i just return this time the result let me decrease the font a bit because the code is going to be much longer than this so now this is the rolling 30-day average which is not yet perfect it's gonna work it produces a number but this number is not exactly what i want let's format it as a decimal number and with two decimal places and with commas because i want to compare it with the values so this is the sales amount and this is the rolly 30-day average that after 30 days will start to be a better number it's working but you see that it's not perfect look at the value between friday saturday and sunday here we have the value of the sales amount the rolling 30-day average but the same value is repeated here here and here for friday saturday and sunday so i do not expect these numbers to be identical i would like them to be different and the reason is i want the i don't want to count the sales of saturday on friday because friday is a working day saturday is not a working day but we still have sales it cannot be that the value of friday saturday and sunday are the same saturday and sunday it makes sense that they are the same because we do not have sales in sunday but friday and saturday they should be different by only considering the working day number we are adding the sales of saturday also on fridays what we need to do is change it a bit we cannot just use a filter for the working day number the working day number needs to be greater than the minimum working day but the working day number needs to be less than the max working day and the date needs to be less than the max date so we need a new variable to compute the max data let's call it the max date which is just the max of date date and we change at this condition to say i want the date date to be less or equal than the max date max date by doing this small change i now have a better picture you see that the value on friday is a value the value on saturday takes into account another value and the value on sunday is the same as saturday so it's going to be a bit better than before still this calculation has a small glitch because being a rolling average it will continue computing values also at the end of the period and slowly decrease it's not evident if we use just the matrix but it becomes much more evident if we use a line chart so let me move it here let's get rid of the day of the week and transform that into a line chart you see the effect the rolling average is computing a good average and it's showing a behavior over 30 days in a better way but we do have here at the end of the period we no longer have sales but the rolling average is slowly decreasing that is an effect that i want to avoid so i want to blank the measure after the last date with sales that requires changing again my rolling 30-day average a bit i want to compute the max dates ever let's do that here let's call it max date ever which is calculate that the max of day date removing any filter from the model so that is the last date ever in the not max day date max sales order data sales order data removing any filter and once i have the max date ever i just check if the max date is less or equal than the max date ever i return the result otherwise i blank out the measure so to avoid having the tail of decreasing values and by doing that you see that now my chart stops when there are no sales so this code is now working better it's computing the rolling average over 30 days taking into account that you might have sales also in non-working days that might be wrong but the database we have works like this and we want to take into account also this aspect and we got rid of the tail of the rolling average after the last date the problem with this code is that it needs a calculated column so creating a calculated column is not a problem if you are working in import mode with vertipaca but if you are about to create the same calculation in a live connection then calculated columns are not an option and what we can do is create the same algorithm but this time in a measure without relying on a calculated column that requires us to use add columns to create the calculated column as a temporary calculated column in the measure and then perform the entire calculation using the uh using the temporary calculated column instead of using the calculated column inside the model so it's going to become a bit more tricky because we need to change a couple of lines of code and add a temporary table to our measure let's do that together the code is very close to the rolling 30 day average but we will need to change this part because the maximum working day cannot rely on the working day number and the same will happen here we will need to create a temporary table for the working day number and we will need a table containing the working day number computed on the fly so let's start by copying the code and create a new measure that we call rolling 30 working days live right now it's the very same as the previous measure as a decimal number we need a temporary table that has the same code as our calculated column so i copy the code of the calculated column and then in my rolling 30-day average i create a new variable data that contains basically a copy of the date table but this time with the new column so we use add columns over all date data and we add a new column that we call working day let's use working day number and here i place the same code as before that i will need to format a bit rank x or filter ascending rank x okay now i have a date table a dates table that is not working because i'm referencing also the working day so i need to add this date working day now i have my dates table that contains the same structure of the date table actually it contains only the date and the working day column and then it performs the ranking number of days is fine max working day can no longer be computed as the maximum of working day number because we cannot rely on the working day number so what we can do is compute before the max data and the max working day becomes a max x over dates of my working day number column not over all dates but a filter of dates where the date date is less or equal than the max data and we also need to format this uh max x is need to go down okay so now i have the number of days i have the max date the maximum working day everything is computing on top of the dates table so i'm no longer relying on the working day number column then the minimum working day is still correct the max data ever is still correct and these two user is not going to fill the old data but it filter dates where the working day number column my temporary column is greater than the minimum working day and the date is less than the max date these dates to use contains a table that has the date column inside so i can use this as a filter in calculate and this is just going to work i just need to get rid of any filter on the data in case i didn't mark it as a date table and that's it now this code no longer relies on the working day number and if i did everything correctly that should produce the same number as the other measure let's check it it's already a decimal number we can go here take the rolling 30-day average live and you see it's a bit slower than the previous one because it's not relying on a calculated column but it produces the same number and it works just by using a temporary calculation so it does not rely let me see that's quite a lot of code but it does not rely on any calculated column quite a complex formula but having followed from the beginning you should be able to understand it and maybe adapt it to your specific needs as you have seen once you learn docs and once you get used to use tables and do calculations solving a problem like the rolling third the rolling average over a given number of days considering only working days requires some effort but it's a calculation that is not that complex yes at the end the code was quite complex but having seen it step by step from the very beginning you can appreciate that by using dax you combine simple functions to create complex calculations and the version with the calculated column is going to be fast and efficient but requires you to create a calculated column if that is not an option you can go for the measure version that works nicely in a measure it's going to be a bit slower not that slow because it only needs to compute the dates table that has the cardinality of data and data is not quite often a very large table once you have a small number of years in the day table the table is quite small and the calculation can happen on the fly quite easily if you have a lot of data then it's going to be slow and for sure the calculated column version works much better enjoy dax [Music]
Info
Channel: SQLBI
Views: 12,994
Rating: undefined out of 5
Keywords:
Id: mOVDGFyfWYg
Channel Id: undefined
Length: 25min 44sec (1544 seconds)
Published: Tue Jul 19 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.