Incremental Refresh with Power BI Premium

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
yo Adam Saxon here we guy in a cube and today we are gonna talk about incremental refresh inside a power bi you guys ready let's do it if you're finding us for the first time be sure to hit that subscribe button to stay up to date with all the videos like I said we are at Microsoft built conference and I ran into Christian Wade you guys may know him he is the clicky clicky draggy droppy master and he said he was he was kind enough to say that he was gonna teach us the clicks for incremental refresh and we're all gonna learn the clicks for this it's Christian how are you doing I am so well thank you so much for having I'm glad we could find the time to do this this is great people are gonna love this so incremental refresh yeah where do we start I mean where do we start it's a huge feature yes you know where it it's huge cuz you can get to huge models rightly it's huge in so many ways yes so many dimensions so you know incremental refresh is something traditionally reserved for analysis services for high-end scalable large models in analysis services and we're bringing it in to power bi premium and in the backend we actually use the same technology but we've condensed all of the complexities of how you set up incremental refresh you no longer need to write two thousand lines of code it's a simple dialogue with a handful of dropdowns that's how we roll that's how we roll I love it love it all right do you want to show us how this work to show you let's get into it let's get into it the deal with incremental refresh I mean it's pretty simple right we have a large data set potentially beyond a gigabyte potentially even beyond the 10 gigabyte limit that we have empowered the AI premium today because we intend to lift that and we intend to have power bi premium limited only by the premium capacity size excellent which will then bring it in line with Azure analysis services level scalability yeah and the largest capacity is 400 gigabytes so you know on the azure analysis on our generic services and we will get comparative sizing in power bi premium once we lift that limits now even over a gigabyte it's very difficult to perform a refresh today when you have to refresh all of the data right every time you do a refresh right yeah because we it's all or nothing right we just gotta read fits 10 gig model we got to refresh all of that right which can take a while and it's just inefficient it's like a every time you refresh a data warehouse you run an ETL job in a data warehouse you don't reload all the data right doesn't makes no sense whatsoever we we're not lazy no we're just efficient we're if we're efficient as a yes we're efficient so if you have maybe a five-year data set and you only need to refresh the the last 10 days because that's all the data that can change it but why refresh all five years I just want I just want the 10 days or the one day yeah much better much more efficient and more reliable does it require long-running connections to volatile because they could timeout on exactly great we don't want a timeout you know especially you know over HTTP you could have token refresh issues and all sorts of things yeah it's more reliable it's more efficient and it uses up less resources love it yes love so I have an example here this is a subset of the full New York taxi data set which in its entirety is about 2 billion rows and here you can see that the first time that the refresh was performed on this data set it took about an hour and 10 minutes because it loaded all the history didn't have any of the history so the first time you perform the refresh in this load all the history right yeah yep a subsequent refresh then took 15 minutes instead of loading years of data it loaded say five days of data which is too way better that would cover us for any data changes anything I look older than five days is set in stone so we don't need to refresh it so as you can see we're getting much faster refreshes as in this example here that's after you publish it to the service right before you publish it to the service we got to do some so you need to do some stuff right you need to define that incremental refresh policy where do we do that we do that in desktop so when can you get it Adam you can get it today it was in the May release of power bi just right right that's right so you know we're very excited about this this is a calm nation of a lot of and work you've been losing sleep a rabbit lost lost a lot of hair and getting close to me and you know but anyone who has ever set up incremental refresh in analysis services and written 2,000 lines of code this one is for you right this is gonna make your life so much easy it's pretty insane all right okay I'm here in power bi desktop and the first thing we want to do if I have a large data set like five years of data I'm not gonna fit it all into desktop anyway right not many of us have between fifty and a hundred gigabytes of memory on our laptop that's 64 gig in an exception respect so but it's not practical to do that no it'll not practice it up right absolutely and even if you could load it it's probably we really slug good so we need to filter the data as it comes into power bi desktop as we had to before incremental refresh and we would use power query to do that now the difference here is that we're going to filter the data using some predefined parameter names they're going to that are going to be picked up by the incremental refresh process downstream okay so I'm going to come in to manage parameters and we have the range start and range end parameters these are the predefined parameter names so far I'm showing you nothing new in an desktop this is we've been able to create parameters for a long time and then we use those parameters to perform the filtering so I'm going to select bake time filters between this is where we learn the clicks and here we say parameter range start and here I need to select before and then I say primer to range end and that's it this will filter the data coming into desktop because I only need for example one day you have a beta here in desktop it's only once it's published to the service that I want to load all of the history right so having done that and I can then define the incremental refresh policy all right so the policy is for a particular table so you access the context menu for the table say incremental refresh if you don't see incremental refresh right now it's a preview feature so make sure you enable the previous feature French is very much otherwise you won't see it you'll be like I don't have it exactly okay so there are a couple of points here on this dialog that are worth pointing out first off this is only supported in premium hence the diamond and the the message as a message says if I perform a refresh here in desktop it's gonna make no difference if I've only got one day of data filtered by those parameters it's still gonna have one day filter still in effect it's in the filter is still in effect in the desktop this only takes effect once it's published to the service right and then the first time it will load all the history then it will do incremental refreshes the other thing that points out here is that currently we can not round-trip this dataset back to desktop so we can not download the pbi X from the service we may introduce that at a later date but it wouldn't be very useful anyway unless you have a two really big a laptop like yours once we remove that 10 gigabyte limit some of these models will be easily 50 gigabytes so you wouldn't want to be opening that in rich on your your desktop PC anyway that's a bad day right there yeah exactly exactly so let's say we want five years of data in total okay all right and so if so we're saying five years total and data so if we go into with a sixth year that other year falls off yes so the first time you refresh it in the sixth year like on January 1st the the oldest year will drop off right so it'll maintain the rolling window for you and it will manage all of that for you this is the type of logic that people use to coalesce themselves for Azure analysis services so the first time we refresh it will load all five years and subsequently it will only load one day right so if I'm performing a refresh daily it will just load the new day and I'm saying in this particular case I'm saying the data doesn't change historically right so I don't need fresh any of the historical data now if the data can change so let's say I need to cover the last ten days in case of any data changes anything older than 10 days is set in stuck well I want to at least refresh the last ten days then I say okay let's make this ten days got it does that make sense that makes that makes sense to me it doesn't make sense you leave a comment down below and we'll try and clarify got it great okay and then we have a couple of other advanced features you know this may be all you would need in most cases this is all you would need to know to specify but then we can also detect data changes so we've gone from refreshing the whole five years for every refresh to just refreshing the last ten days which is obviously much more efficient if but what if I don't even need to refresh the last ten days maybe the data hasn't even changed in the last 10 days but I would like to just check whether it has or not okay right and I can do that if I have an auditing column like last updated date and it will use that and it will store the output of that column to detect where the data has changed in that incremental range this makes it that's great okay that's an advanced feature but you can get ultra efficient yeah we've gone from somewhat alike a like ultra yes some more efficient to super efficient yes we're now we're only refresh in ten days but then we can get this ultra efficient refresh potentially zero days or two days and taking it to the next level the next level of the okay and then lastly I only bring incomplete periods like financial systems might publish a month at a time or for oil and gas for barrels per day it makes sense to to not include partial data because I've read charts where we have the partial date so it's like hey we're going to then yeah it's like well what is that last day look really weird so that's what the other check what's the spin on it so having defined this I can then publish to the service and again it will only work on a premium workspace and it will limit publishing to a non premium workspace it'll be grayed out yeah and that's it then you publish it you refresh it first time it takes a little longer subsequent refresh is a freshing much faster you're meeting your SLA is much more efficient and your users are happy everyone's happy and then we can just define schedule refresh in the service for saying when to run it and if it's on-premises we need to use the Gateway yep and but outside of that in the service there's not a whole lot you got to do you just set up refresh that's it or hit refresh now and it just works yeah you don't need to tinker with the parameters the service will use those parameters you know generate the period one thing I noticed is it normally when you have parameters defined if you go into the settings of the data set you can change those parameters in the UI but because of this like the parameters will be grayed out you don't have access to those exactly you don't have because incremental refresh is using those and a write exam by are reserved exactly and those values that we set for the parameters here in desktop have no effect yes once we run incremental refresh in the service a service will parameterize those partitions or periods using those parameters makes so that's why we need them and that's why they've got those reserve all right all right guys what do you think of this is this great have you struggled with a s before but now you're looking forward to using increment incremental refresh here let me know what you think leave it down in the comments below we will answer as best as we can also we'll have links down below to the documentation and blogs or you can go find out more info about this as well if you liked this video be sure to give it a big thumbs up smash it if you so desire if it's your first time here hit that subscribe button to stay up to date with all the videos and from Patrick myself and Christian thank you so much for watching keep being awesome and we'll see you in the next video
Info
Channel: Guy in a Cube
Views: 87,313
Rating: undefined out of 5
Keywords: Incremental Refresh with Power BI Premium, power bi premium, power bi premium demo, Incremental Refresh in Power BI Premium, incremental refresh, power bi, incremental refresh in power bi, power bi demo, Power BI Desktop, power bi desktop parameters, power bi desktop tutorial, Power BI Desktop update - may 2018, power bi refresh, power bi refresh data, power bi tutorial, microsoft power bi, microsoft power bi demo, microsoft power bi tutorial
Id: CajQjq70Kpg
Channel Id: undefined
Length: 12min 31sec (751 seconds)
Published: Tue May 15 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.