Configuring Incremental Refresh in Power BI

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello everyone in this video today we're going to look at an exciting feature which is incremental refresh in power bi so if you're into that stay tuned [Music] in today's video i want to talk a little bit about incremental refresh so what is incremental refresh to begin with incremental refresh allows you to refresh the data set in power bi and instead of refreshing all the data that you have you can actually refresh a part of that with of course as a benefit that your refreshers will be quicker your refreshers will be more reliable because the power bi refresh doesn't have to work for such a long time and on top of that you don't use as much compute resources because of course you only need to refresh a part of that now before you can work with that you need a pro license at least you also need to make sure that the data that you work with supports query folding so for example a sql database any sources that support query folding that in fact support transforming the steps in power query to a sql statement for example now if you have those requirements you can actually get started with incremental refresh and let's just have a look at how that could work so let's imagine we're working with a file like we're looking at right now this file if i go to my screen here is a file that's 150 mbs and i actually don't have so much data in this one i have a lot of rows but if i look at the data set there's just three columns here and the size is mostly because of the unique sales keys that are in there so let's say you find this too big maybe it's not 150 b's maybe it's even 10 times as much then you could look into incremental refresh if you fulfill all the requirements that i just spoke of and how can you configure that to configure that you go to uh power query first of all and in power query let's see in power query you need to have a table that contains a date because incremental refresh needs to have a date column to actually apply the policy on now the table we have here has a date key which is actually a date time key so that works just perfectly fine and if you want to get started with incremental refresh you're going to need some parameters and the parameters need to be with a certain name so that incremental refresh recognizes this so let's start by creating some parameters so you can go here to manage parameters then you click on new and the first one can call you can call it range start and you can't just call it rainstar you have to call it rainstart because otherwise it won't work and make sure this one is of the date time type and then you can just type something like uh maybe the start of your period should be the first of 2022 the first of january then you want to make a new one which is going to be called range and and your range and one is to also have the data type datetime and perhaps this one is going to be up to today but it doesn't really matter because power bi will later on when we configure the settings will decide themselves what dates to pick so you can just fill in the date here that you like so i could just say like maybe around today the first of may perfect now if you click ok here you'll find that these range parameters are created and even though i only fill in the dates you'll find that there's actually a date time here now so that's all that you need now what's left in power query is that for all of the sources you're going to use you need to apply a filter so the incremental refresh function will actually use that filter later on to make sure your dataset gets smaller so the easiest way to do that is to click on the column on the arrows take a daytime filter and then in the bottom it says custom filter in your custom filter uh you can do something here with the pop-up box so you could say that first of all the date key has to be after or equal to and then you select your parameter right there so it needs to be after or equal to your range start and it needs to be before and then you can select the other parameter which is range end now you can press ok exactly so the step that we created here actually performs the filters and one thing i recommend you to do even if your data source supports query folding double check if the step that you just filtered on also still is folded to the server and you can do that by right clicking on the step and seeing if that view native queries tab the two before last one is still available i'm not grayed out so if i click on this one i actually see that all these steps that i have here were translated to sql query and the last step right there is going to help help us with the incremental refresh so make sure this works and actually if it's grayed out perhaps you can move that step a little bit earlier because the order doesn't matter as long as the query folding works now with this in mind we can press ok and refresh the dataset let's fast forward this a little bit and with that we now have a data set that only has the data within the date range which is the the whole of 2022. okay sure that works and if i would save this file right now one of the benefits of course is that our file size will decrease and if i look up here it's now 17 mbs that already helps if you want you could even reduce the data more now the next step is to actually configure the incremental refresh for that you can go to the data tab and in the data tab there is a table here and that table if you want the the schedules refresh to be configured on this one your right click on it and you go to the incremental refresh part and this will make a pop-up appear that shows all the incremental refresh policies okay now the first step is to select the table which you want to apply your filters on so if you have multiple tables where you want to configure your schedule refresh you can select all the tables here and apply a different refresh policy on each one of them now let's see how it works for just this one so we click on incrementally refresh the table and the first step looks for us at where we need to start archiving our data so how long do we actually want to get data in the model so let's say we have five years of data and actually we just need to see three years of that because everything all that we don't even look at then what you could do is you could write three and then say that you want the last three years here so it now says in the bottom right there the three years before your refresh date that's how long it will archive everything and if you want to see the exact dates then hold on because we will see that right now now first we're going to do the next step and that is how which data do we want to incrementally refresh so we actually archive up to three years ago but when do we actually need to refresh our daily data so maybe you only need to see data that is newer than two months ago because you know that everything is older than two months will never change anymore in your database well this is the part where you specify that so i could say we just need two full months and after clicking this you will find that power bi now shows that your archive is the first of 2019 until today and uh or did and then at some point down here you will find that there is a period of two months after that archive that will be refreshed on every refresh so this is very helpful that this date series shows you what happens these are the fundamentals okay now there are some more options down here some optional settings and i won't go into too much detail about them but this first one is about using direct query together with incremental refresh so that there's a bit of history that's archived and then there's some direct query which is always fresh but you need a premium per user license for this or you need premium capacity so that's for another video then something else you could select is to refresh only complete months you might be working in the financial administration and only after the months are fully booked it makes sense to show the data so this helps with that and lastly there's another option that says detect any data changes and this works in a fun way as well because some databases actually record what the mutation date is so let's say that you have a data set and you can historically change your records you might have a column that shows when a record was changed now with this option right here you can actually refer to that column and say like okay i only want everything with the mutation date that was yesterday or at least the last three days only those to be refreshed and that way you can actually reduce the amount of data that you refresh even more but your data just needs to support this so on a side note this is never the same column as you implement the other incremental refresh on this is a separate column okay i don't have this as an example now but if you're interested in this make sure to have a look at the learn more section and then last but not least in the picture below here you can actually see what refresh policy you have applied and with all this you have actually set it all up and you're good to go so you can click on apply right now perfect now with this set here if you now upload your data set you can upload it to a pro or premium workspace so i can go to home publish save your changes and in this case i just want to publish it to my premium per user workspace but know that you can also do it to your pro workspace there we go let me fast forward this for you there we go so now we can actually open our workspace there we go and up there in our workspace we can now see that only the following months are showing so there's january until april 2022. now what is happening in the background so the first time that you upload your incremental refresh uh power bi will actually set the partitions so if you have a very big data set it's going to make some smart partitions so you can actually see hey this year's is going to be refreshed and these years don't have to be refreshed so it might mean that your first refresh that you're putting on on there it might take a little bit longer than you expected but after all the partitioning is set and the environment is built for you incremental refresh all your refreshes will be quicker so in this case we all just saw that there were only four months of data and let's see what happened after we refresh our data here okay so at this point our data is refreshed and let's have a look at what that looks like in our in our report so we refreshed all of this and if we now go to the the report that's connected instead of just showing 2022 it now shows the history that we configured now my data set only contains data until 2020 but even if you configured it differently for your set you should be able to see it here okay that's how it works so every subsequent refresh from now on will only refresh the part that fits within the policy of your incremental refresh now there's a few limitations that you need to think about because if i go to my workspace now the first big disadvantage is the following if i want to download my file it's not possible anymore with an incremental refresh policy so if i click here you'll see the pop-up in the top here that our file cannot be downloaded anymore because it uses incremental refresh so have a good look at what that means for you in the future if you want to make a change you actually need to have saved this file somewhere otherwise you won't be able to make a change and after you make the change locally you'll have to re-upload it and it might have to apply the incremental refresh policy over and over again now that's mostly happening if you're working with a pro workspace because in a pro workspace it's difficult to directly connect to your data set maybe if you work with the api it can work but not with something easy as tabular editor now in a premium per user or a premium capacity workspace on the other hand you're actually able to still change the data set and i think for those purposes it's much easier to work with incremental refresh let me show you how that works so if you want to work with data sets that have incremental refresh and at some point you need to edit the data model or create some measures you can actually go to the settings pane of that page you can click on let's see premium and it shows a workspace connection down here so the workspace connection that shows on the settings of your workspace you can use that to connect to tabular editor so if i copy this i can go to tabular editor click on file open and then i click on model from database and i just paste the link in the top here for it for the server now i'm going to click ok here log into my account and after doing all that and selecting my incremental refresh we're actually able to see the model that we have here and any changes that we want to make we can actually do it here like delete this measure that doesn't work and i can then up here to save the changes back to the currently connected database and this will then be reflected in the data set online in the power bi service so this is a downside that you should be very aware of before you start going into incremental refresh because not everybody is comfortable with these tools now can i give some other tips here actually when you're working with incremental refresh you might want to apply this to a data flow so you can apply your incremental refresh to a data flow so that is the one that actually gets a little bit of data all the time then you can create a separate power bar report that is your data set which will connect to that data flow and in the future then if you need to make any adjustments you can just do that in the in the data set whereas the data flow can remain the same for that for them then even if you wanted to you could even have a lean report that just connects to the data set so that you have the data flow separate from the data set and the report separates from the data set as well that's the choice you have to make now with all that that's all i wanted to say about incremental refresh it's also for the the data platform 500 certification so have a good look at that and for any questions or remarks let me know in the comments i'd love to hear about it and i'll see you in the next video [Music]
Info
Channel: BI Gorilla
Views: 35,027
Rating: undefined out of 5
Keywords: power bi, power bi tutorial, power bi tutorial for beginners, incremental refresh examples, incremental refresh, power bi incremental refresh, configure incremental refresh power bi, incremental refresh dataflow, incremental refresh without premium
Id: Kui_1G6kQIQ
Channel Id: undefined
Length: 16min 1sec (961 seconds)
Published: Tue Jun 07 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.