Deploy Power BI dataset schema changes WITHOUT refreshing!

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

- Yooo! What's up? It's Patrick from Guy in a Cube. And in this video, I'm gonna show you how to employ, Schema changes to your Power BI model without actually refreshing the entire model. Stay tuned. (bright music) If you find this for the very first time, be sure to hit that Subscribe button to stay up to date on all the videos from both Adam and myself. Okay, Patrick, what are you talking about? Publishing changes to my model without refreshing the entire model? That's the only way you can do it. Well, it is. Well, no, it is not. I've been having this conversation with a lot of people. Actually came up in a session Adam and I were doing was like, Hey, I've configured my model, set up Incremental Refresh, I haven't refreshed and I have several years worth of data Gigabytes worth of data and now I need to add a measure. I need to add a calculated column, but I don't want to completely refresh that it takes me so long. That's why I'm using Incremental Refresh. Is there a way to do it? Absolutely. Using the ALM toolkit. If you don't have the ALM toolkit, if you go out to the website, you can find it. I'll put a link to it where? Down below. Okay, so go ahead and get it downloaded and install, then open up the desktop. Open up the model that you created and then I'm gonna show you how to do the Schema Only Deployments, alright? So instead of all this talking, you guys know how I like to do, let's do what? Let's head over to my laptop. So let's say you've created this model, and you've configured Incremental Refresh. Tick save Incremental Refresh, configure, and someone says, hey, I need to add a measure to this model, I need to add a calculated column or I need to add another table or something like that. And just to verify, right, just to verify that all my data is there, this model, you can see the model, if I live connect to the model that's deployed out to the service, you see that there's multiple years, not just 2020. You can see I'm live connected to the data set out in Power BI, which is the exact same data set that I'm using right here, and somebody says, Hey Patrick, I need to, I need this new measure. So what do you do? You kind of go, (sighs) Well, if I deploy it, then I gotta refresh the entire model, which could take a bit of time, so hang on, hang on, come back in a couple of days, don't use that model anymore. Because it's gonna be out of sync and it's gonna take me a little bit to refresh it. And they go, what? what the French toast are you talking about? No, no, no, no, no, I need this pretty quickly, because we got some end of the month reporting coming up, and you go, Oh, wait, I remember, Patrick did a video, he showed me how to do it. So let me show you. The first thing you need to do, like I said, is download the ALM toolkit. If you just go out to the internet, use your favorite search engine of choice. And if you just type in ALM toolkit, you'll see here's the homepage and just click Download, Download Latest Version. And then once you do that, it'll appear in external tools, click on External tools, it's gonna open up. When it opens up, you'll need to go out to the service where you have deployed your data model and go to that workspace and however you normally go grab your endpoint, go grab your XMLA endpoint, because you gonna need it. So once you have the endpoint, you paste it in right here. You can see at the top, you can connect to the other data sets, you can connect to a BIM file, but I'm connected to my local Power BI Desktop and I'm connected to my endpoint. You click OK, It may require you to go through a series of authentications. So I'm gonna go ahead and get authenticated. Once you get authenticated, it's actually gonna run a Schema compare for you. And what you'll see, let's go ahead and maximize this out. There's no difference in my Schema right now. So we say, Hey, we need this new measure. And so they say, we need Gross Profit Margin Percent. So let's go over to our model and I'm gonna add a new measure. So click Measure, and we go ahead and paste it in there, and Bam! Voila, alright. And then we'll head over to the ALM toolkit, there's an Options button. Always go here, get this Options button. And what I want you to do is make sure that process as an Option is set to Default. There's other options there but be sure to choose Default. Always check this table, I like to check this table because, sometimes there's other tables and other things that it'll pick up and try to process even though I haven't touched those tables. So I like to choose this process only affect the tables. So it only, processes the tables that I actually need to change. Alright back to my laptop. And then I like to choose this one because if I don't need to change any data, this for table updates retain partitions. And when you're working with Power BI, Power BI has these refresh policies when you set up Incremental Refresh, this is more specific to Power BI. But when I am deploying something that where I don't need to process bring new data into my partitions or affect the scheme of the partitions, I choose this option also. Because I'm minimizing what's going back to the data source and what's actually being processed on the bottom. So I click OK. And then it's got to do another compare because I've changed some things, let it finish the compare up. And then what you can do is, you can actually hide some objects and things that you only want to see in this section right here. So I can say hide things that's not affected, skip them to only can see what's actually affected. I'm gonna add this new measure over here to the service, to my data set in the service, alright. I'm gonna use Show Everything. You can go play with those and test those out. And the next step is you have to click Validate. Validate is gonna say hey, this is what I'm about to do. So I'm gonna click OK. And then all I'm gonna do is click Update, right? And I say, Are you sure? Now you need to make sure, on your capacity for the XMLA endpoint, that you have read, write, enable. And so how do you do that? Need to be a Capacity Admin, go over to your Capacity, click on that particular Capacity, and then go to workloads and scroll down and look right here, XMLA endpoint and set it to read write. Make sure you've done that, and then we're gonna choose Yes. And watch this. Boom! Just like that it's done. It didn't read the database, it didn't do anything to Analysis Services to my semantic model, just deployed out the Schema. Let's verify this. So let's go over to the one where I'm live connected to the data set. You can see I only have three measures here. I'm just going to click Refresh. I'm waiting anxiously. Whoa! that was fast. Let's copy this core and let's drop this new one into the value. And wow! There's the data just like that. I could format it up if I need to but there you go, I didn't, I didn't process partitions, I didn't reach back to the database. All I did was a Schema Only Deployment using the existing data that's out there. That's the first thing you can do. Deploy Measures, is a piece of cake. But then somebody kind of comes and says, Hey Patrick, got something for you buddy. I need to add a Calculated Column. And I know what you're thinking, right? That's a row by row operation. So maybe I need to bring the data back in. Not really, not really. Unless you've added additional data to your partitions, or you've added additional data to the tables, yes, you'll have to process but if the data that's out there, it's okay, and you just need to do whatever Incremental Refresh it normally needs to do. And all you need to do is accept, just check a few boxes are in, check a few boxes. So let me show. Let's head back to our model and what I'm gonna do is say New Sales and I'm going to add a column and we're going to call this Sales Order Line Total. And we're gonna say this is my Sales Amount, plus freight, plus tax. So that's my new Calculated Column. And what I'm gonna do is head over to the ALM toolkit, and I'm gonna go to Options. Let's not compare yet, go to Options. What we're gonna do is we're gonna make sure we check this, right, 'cause I don't want it to reach back to the database or affect my partitions, I'm gonna choose Default, and make sure you always check Process only. Very similar to what you did with the measure. And then I'm gonna do a compare, exciting, exciting, hey, there's the difference. There's the difference. So if you click on this green right here, it'll show you, Hey, there's an addition, Patrick's adding something new. And then I'm gonna go ahead and validate this. So it says, Hey, I got to update all these partitions. Don't worry, don't worry. So what I'm gonna do just to, just to give you a little bit of confidence, I have the profiler running against my SQL Server. So I'm going to clear this out and I'll bring it over after I do this update. Just so you can see exactly what's happening to the database, I'm going to click Update, and say Yes, it's doing something. Oh, it does say processing. Ah! it did process. But actually, I hit the profile while running against the database, nothing happened to the database, so now I'm concerned, right. So let's go to my Power BI Desktop that's live connected. I'm gonna do a quick refresh. There's my column, click in the white space wonder they did have my data. Boom! My data didn't reach back to the database, because I've checked that box and maintained my partitions. I don't need you to reach back to the database, to go and get new data 'cause my data is fine. When I run my normal refresh in the Power BI service, go ahead and do whatever you need to do to bring new data in for whatever I've configured my policy that I've configured. But beyond that, I don't need you to go back to the source, because what I have in my model is perfectly fine. That processing what you saw was a processing on analysis server on the tabular model. Behind the scenes it did a process calculate. One of the operations of process calculate is to create those calculated columns. That's one of the items that it does, alright. So now we got a bigger request, we got a bigger request. And they said, Patrick, I need to add a new column to my Fact table. Doesn't exist, not a Calculated Column, an Actual Column and I need to add another table. I'm like, (exhales) in this case, it does absolutely need to process every partition in that table because it's got to reach back and grab, reach back and grab the column, right, that new column has got to grab the data for that column. And whatever new tables that you pull in, it's got to reach back and grab the data for those tables. And more importantly, it's gonna do a process calculate, because it's got to create that relationship, and part of that calculate is where it creates the relationship. Like I said this calculated columns relationship, and some other items. Let me show you what I'm talking about. So let's head back over here to this model. And what I'm gonna do is I'm gonna go to transform and I'm gonna go to my sales table. And I have a step here where I remove some columns. So I'm just gonna click that step. We're gonna add the Promotion Key Dimension. And so we're gonna click OK. And then what we're gonna do is, we're gonna choose Recent Sources, and we're gonna add the Promotion Dimension. Let's go find it, Dim Promotion right there. Click OK. And then what we're going to do is click Close and Apply sure Import. I'm getting a little ahead of myself and click Close and Apply. And so what Power BI is gonna do, it's gonna use my Refresh Policy Parameters and pull the new data for the sales table. Sorry, I was pretty fast, and then pull my dimension in. Got to make sure it created the relationship. Let's bring Promotion over so you can see. So it definitely created the relationship. So then we'll head back over to the ALM toolkit and we're gonna do some options here. And so I don't want it to retain. I unchecked that because I needed to process those partitions in the affected table. I needed to reach back out to the source and pull that data in. I also needed to process that Dim Promotion and that's why I have it set to process default. So any to any tables that's been modified, you should pick it up. Most importantly, though, I also said, process only affected tables. That way it's only going to process the two tables that's affected, let me show you. All right, so we can click OK, and do a quick compare. And you'll see, right, if we go here and say, hide skip objects we can see it's added Dim Promotion, it's got to add something to my sales table, my new column. And then here's my relationship that it's got to create. And so we're gonna choose Quick Validate. And you can see, hey, it's saying, this is what I'm gonna do. What I'm gonna do over here is on my SQL profiler, I'm gonna stop just for one second, clear this out, I guess I could have just hit pause and stop. And let's see what actually happens when I push this out. So I click this, I'm going to choose Quick Update, say, Yes. While it's publishing it out, you'll see it's saying Hey, I'm doing some processing. And so actually what it's doing is, it's reaching back to the database and it's saying, I need the data for Dim Promotion. It's gonna pull the data in for Dim Promotion. And then for each one of my partitions in my sales table, it's actually got to reach back and pull that data in. I missed the one for Dim Promotion. But it started in the process each one of my partitions, you can see here's 2017, here's 2011, 2015. So it's part of the process in each one of my years. And then somewhere in here, you'll see the months. How to start doing the quarters and months for 2020. At some point that little mingle them in here. So you can see right here, there's a quarter for 2020. Alright, so it's all done. Now, let's head back over to the model where I'm live connected. Let's click Refresh, we should see a bunch of things happen over here. All right, that's nice. So there's my Promotion Key. And there's my Dim promotion. Let's drag this up. Let's click in the white space and let's go ahead and choose the English promotion name. I didn't do any massaging of the data. Look at there, that's all of the promotions and let's see if the relationship was established. There you go just like that. It did have to process every partition in the table. And so if you have lots of data and you're adding a new physical column with a relationship, that can take a bit, but it's easier than trying to process the entire model because you may have multiple fact tables that have a significant amount of data A more meticulous approach, where you're doing one partition and one table at a time, you can use Management Studio to connect to the model and then you can process each individual partition using some XMLA, you can process each table individually if you want to, or each part of the partition individually. Alright, what do you guys think? I'd love to know who's using this? Who's not using this? Who saw this for the very first time today? You guys know how I like to do, let's continue the conversation where, in the comments below. Your first time visiting Guyina Cube channel, hit that Subscribe button, like my video, big thumbs up. As always from Adam and myself, Thanks for watching and we'll see you in the next video.

Info

Channel: Guy in a Cube

Views: 17,526

Rating: 4.9546313 out of 5

Keywords: power bi, power bi dashboard, power bi data modeling, power bi dataset, power bi desktop, power bi desktop tutorial, power bi for beginners, power bi premium, power bi pro, power bi service, power bi training, power bi tutorial, power bi tutorial for beginners, power bi xmla endpoint, alm toolkit power bi, alm toolkit, business analytics, business intelligence, data model

Id: s0j6d3UAw9U

Channel Id: undefined

Length: 13min 9sec (789 seconds)

Published: Wed Oct 21 2020