Unboxing new Power BI composite models

Captions
- Ciao, friends. I almost never record videos about Power BI news, but this time I am going to make an exception. Microsoft got us used to a new release of Power BI every month, but not all months are equal. With the December 2020 version of Power BI, they are introducing DirectQuery against Analysis Services: the capability of connecting to a Power BI model in the cloud and extending its features with your own tables, columns, and calculations. They are setting a historical milestone in the development of business intelligence.

Historical milestone: I am not saying that lightly. I am old enough to have seen a lot of new features appear in the business intelligence world. Some were nice, some were cool. This is neither nice nor cool: this is huge. It is fair to say that as of today, we can finally seal the marriage between corporate business intelligence and self-service BI using Power BI. If you want to know why I am so excited about this feature, follow me: I am going to show you a few demos of this new feature and how you can take advantage of it.

Let's get started. We start with an empty Power BI file and we create a live connection to another Power BI model that is stored in the cloud. This way, we will be able to use all the measures and tables that are stored in the cloud. Then we will take a look at the limitations, and at how DirectQuery over Analysis Services helps us overcome them.

I have this empty Power BI file that I am now going to connect to the Contoso model on the service. I search for Contoso, connect to my schema, and right now I am creating a live connection. You see that I have the measures defined there. I only have Sales Amount here, and I can slice it by product brand, for example. I transform that into a matrix and, finally, make the font a bit larger, otherwise we will not be able to read anything. Let's go with 20 points.

So, if my only interest is using the features which are already available in the model, that is fine: it works. But what if I want to do more? For example, I might want to analyze this dataset by price range: I want to slice products by price range, creating a new column that categorizes my products depending on the value of their price. This requires creating a calculated column, and the problem is that I cannot create a calculated column. If I right-click on Sales, I can create new measures, but I cannot create columns. The reason I cannot create a column is that there is no place to store the information: the data is in the cloud, and there is no space where I can actually store this new calculated column.

What I can do is enable DirectQuery over Analysis Services by adding a local model to the current one. To do that, I only need to click on Transform Data. Power BI warns me that this model currently has a live connection and that, if I go ahead, it will add a local model, which enables DirectQuery. That is exactly what I want to do, so let's add the local model. Now I have a local model and I am connected with DirectQuery to Analysis Services.

Look at what happens when I right-click on Sales: I can create a new measure, but I can also create a new column. If I click on New Column, I now have the usual user interface where I can create my price range column. I call it Price Range and I just use SWITCH over TRUE: if the net price is less than or equal to 100, the price range is Low; otherwise, if the net price is less than 1,000, it is Medium; otherwise, it is High.
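Written out, the column looks more or less like this; the table and column names (Sales[Net Price]) are my best guess from the demo, not names confirmed on screen:

    Price Range =
    SWITCH (
        TRUE (),                          -- conditions are tested in order
        Sales[Net Price] <= 100, "Low",
        Sales[Net Price] < 1000, "Medium",
        "High"                            -- default branch
    )

SWITCH over TRUE is the idiomatic DAX pattern for a chain of range tests: the first condition that evaluates to TRUE wins.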
This is a new calculated column that I am adding to a table stored in the cloud, but I am adding it only for this Power BI file. I just need to hit Enter, and the column is there, ready. Let me copy this visual and put it here. I can get rid of the brand, and you see that I have my Price Range column. I can take it, drop it in the report, and it shows the sales amount sliced by price range. That is incredibly cool: I can extend the features of the cloud model with my own calculated columns.

And I can go further than that. Look at the price range: it is being sorted Medium, High, Low. That is not what I want; it is using alphabetical sorting, so I need a sorting column. Again, that is not a big deal: I create a new column, copying the code of the previous one. Let's call it Price Range Sort. I want Low to be sorted first, Medium second, and High third. I have created another calculated column. Then I can select Price Range, use Sort by Column, and sort it by Price Range Sort. Now, if I sort by price range, it shows Low, Medium, High, as I would expect. You can increase the analytical power of your model by adding new calculated columns, once you have a local model.
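A sketch of the companion sort column, again with assumed names; it mirrors the previous column but returns numbers instead of labels, so that Sort by Column can order Low, Medium, High correctly:

    Price Range Sort =
    SWITCH (
        TRUE (),
        Sales[Net Price] <= 100, 1,       -- Low
        Sales[Net Price] < 1000, 2,       -- Medium
        3                                 -- High
    )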
Now, the thing is that, once I have a local model, I can actually store more information in it. I can use it not only to create calculated columns, but also to add new tables that I can then use in my DAX calculations. Let's pretend, for example, that I have some budget information that I want to add to the corporate model, which contains sales for the last few years. With DirectQuery, I can do that. Let's give it a try.

I have an Excel file that contains budget information. Let's look at it: for each country and brand, I have Low, Medium, and High, three different scenarios, with the value of the budget for each. This is Excel, and I want to load this data into my model. I go to Get Data, choose Excel, click on my Budget table, and click on Open. Then I transform the data, because I want the scenario to become a row instead of a column. We load the data from Budget and transform it: we have the full power of Power Query available. I unpivot the scenario columns, so that one column contains the scenario and another the value of the budget. The data types are correct, and I can click on Close & Apply.

Look at what happens: it warns me about a potential security risk, because now I am mixing data on my computer with data in the cloud, and the queries will go back and forth, exchanging data. We are fine with that, so I just click on OK. Now it loads the Budget table from Excel into the local model, and I have it here, with all the data loaded from Excel.

Let's get rid of this visual; we no longer need it. Now I can take my budget and give it a meaningful format, like a decimal number with commas. If I drop it in the report and slice it by brand, you see that it just does not work, because right now there is no relationship between the brand, which is in the Product table in the cloud, and my Budget table, which is here on this PC.

We can look at that in the model view. The blue tables are the ones coming from the cloud service, and the Budget table is unrelated to the rest of the model: Product has no relationship with Budget. Because there is no relationship, the number produced is always the same. But I can create a relationship: I just take the brand from here, drop it there, and I am creating a relationship between a table in the local model and another table in a different model in the cloud. I want it to be many-to-many, with Product filtering Budget, and I click on OK.

Here we go. Now I am slicing by brand, and I have both the sales amount and the budget. The budget is too large, because I still need to slice it by scenario, so let's filter only one scenario: say I am interested in Low. Now I have the low budget (it could be Low, Medium, or High) next to the sales amount. I have created a table coming from Excel, and I am mashing up its data with another table in the cloud. Beware that the data is still in the cloud: I did not have to change or download anything. I am mixing these two environments together.
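In the demo the scenario is filtered through the report; as an alternative, a measure in the local model could bake the scenario filter in. A sketch with assumed names (Budget[Budget] for the value column, Budget[Scenario] for the unpivoted scenario column):

    Budget Low =
    CALCULATE (
        SUM ( Budget[Budget] ),           -- total budget...
        Budget[Scenario] = "Low"          -- ...restricted to the Low scenario
    )

Thanks to the many-to-many relationship created above, slicing this measure by brand filters the Budget rows of that brand.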
Why do I find this so exciting? Because now I have what I call the really needed real time. Let's say I think that the budget of Contoso, for whatever reason, is not correct. I can simply go back to my Excel file and open it. Contoso is at one million right now; let's use a much larger value, like eight million. I save the file, close it, and here I only need to click Refresh. The refresh operation loads the data from Excel again, but it does not touch in any way the data that is in the cloud. Now the value for Contoso is larger, because it has loaded the new data coming from the Excel file.

This is what I call real real time, because that, in my opinion, is what is really needed. Real time is not about the sales that happened three minutes ago. It is about being able to quickly update your data, refresh only your data, and see the new figures and projections immediately, mixing data coming from the corporate data model with your own data.

So far we have looked at how to create calculated columns, how to create a relationship between tables, and how to refresh the data. But it is interesting to see how far we can go. What kind of calculated columns are we able to compute? Only simple calculations like the ones we have seen so far, or can we actually do more? I wanted to test it, to play with it and create more complex calculated columns. Let's look at what I did.

What I wanted to do is an ABC analysis: I wanted to categorize products depending on how they contribute to the total sales amount. The products I sell the most are in class A, the products with medium sales are in class B, and the products I sell the least are in class C. ABC analysis has a simple solution based on four calculated columns. So you can go for a very simple implementation of ABC, but you need to be able to create calculated columns in the Product table, and these calculated columns are not trivial: they require some DAX code. The question is: can we build them with DirectQuery over Analysis Services? The answer is yes. Let's do it together, in a new page.

Let me get rid of the budget. Here I only have the sales amount, and I want to slice product by product, so let me add the product code. This is already a matrix, sorted by sales amount, so you see that this product is the top seller, then we have a second product, a third product, and so on. I want to cluster the top sellers together until I reach 70% of my total sales: that will be class A. To do that, I need to create one column containing the sales amount (I already have a measure, but I need it as a column, because I need to work on it), then the running total of the sales amount, then that running total as a percentage, and finally the class. These four calculated columns are quite heavy.

The problem is that I got used to creating calculated columns in the Data view. I go to Product and create columns there but, as you see, I cannot see anything, because this table is using DirectQuery and its data cannot be shown in that view. So, if I want to see the result, I need to do everything in the report.

I have the product code and the sales amount and, as I said, the first thing I need is a new column containing the sales amount. Let's call it ABC Amount, and I use CALCULATE of Sales Amount with ALLEXCEPT over Product and the product key. This computes the sales of the current product only, as a calculated column. I can then drop the column here: the value is the very same as the Sales Amount measure, but this time it is a column. Let me get rid of the filter that is no longer needed; at this point, I can also remove Sales Amount. I have the product code and the ABC Amount, sorted alphabetically.

Next, I need the running total of the ABC Amount: another column, ABC Amount Running Total. This requires a variable, the current amount, which is the ABC Amount of the current product. Then the result is a SUMX over a FILTER of Product, keeping the products whose ABC Amount is greater than or equal to the current amount; out of this filter, I sum the ABC Amount. If I did it correctly, this shows me the running total of the ABC Amount. Let's put it here: for the first row it shows the first value, and then it grows, using the ABC Amount as the step.

Next, I transform that into a percentage with a third column, ABC Running Total Percentage, which is just the division of the running total by CALCULATE of Sales Amount with ALL: the sales amount of everything. I put it here and format it as a percentage. Now this tells me that the first product alone is responsible for 4.26% of sales, the first two together for 6.64%, and so on.

Finally, the last column is the ABC Class, which is just an IF: if the running total percentage is less than or equal to 0.7, it is class A; otherwise, if it is less than or equal to 0.9, it is class B; otherwise, it is class C. Now I have my fourth column. I can drop it here; oh, let's move to a table visual. Now I have the product code, the amount, the running total, the percentage, and the ABC class. Most likely, I do not want all these columns together with the product code: I just want to slice by ABC Class. I move back to a matrix and add, for example, the sales amount, and now this shows the sales amount for each ABC class.
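As a recap, here are the four columns as a DAX sketch; the names ('Product', [Sales Amount], 'Product'[ProductKey]) are assumed from the demo:

    -- 1. The sales of the current product, materialized as a column
    ABC Amount =
    CALCULATE (
        [Sales Amount],
        ALLEXCEPT ( 'Product', 'Product'[ProductKey] )
    )

    -- 2. Running total: the sales of all products selling at least as much
    ABC Amount Running Total =
    VAR CurrentAmount = 'Product'[ABC Amount]
    RETURN
        SUMX (
            FILTER ( 'Product', 'Product'[ABC Amount] >= CurrentAmount ),
            'Product'[ABC Amount]
        )

    -- 3. Running total as a percentage of the sales of everything
    ABC Running Total Percentage =
    DIVIDE (
        'Product'[ABC Amount Running Total],
        CALCULATE ( [Sales Amount], ALL ( 'Product' ) )
    )

    -- 4. Class boundaries at 70% and 90%
    ABC Class =
    IF (
        'Product'[ABC Running Total Percentage] <= 0.7, "A",
        IF ( 'Product'[ABC Running Total Percentage] <= 0.9, "B", "C" )
    )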
If I want to know how many products I have, I can just build a new measure. Let's call it Number of Products, which is just a COUNTROWS of Product. I put it here, and here I have my report: 493 products are in class A, and they produce this amount of sales.

You have seen how easy it is to create new columns and new measures, add new tables, and increase the power of your analytical model. And by doing that, I have not moved a single byte of data between the cloud and my PC. The data is still in the cloud: it is managed by IT and refreshed all the time, while I am responsible for refreshing and managing my own data. I am extending the power of a model that is stored in the cloud.

Now, you know how I am: I typically want to go further and understand how things work. So let's take a look at the query: how is the engine running it? We can check by clicking on View and enabling the Performance Analyzer. I start recording and refresh the visual. The matrix now shows a query, along with the time needed to execute it, but I am not interested in performance right now: to me, the amazing fact is that it works at all. I want to understand more, so I copy the query, launch DAX Studio, and paste it to see what has been executed.

It is interesting to see that the query is not run as if the model were a local one. We see what look like SQL queries, but these are not really SQL: they are DAX queries executed against the Analysis Services engine running in the cloud. You see that the DEFINE section contains MEASURE and COLUMN; DEFINE COLUMN is new, a statement added specifically for this feature. Then it does all the calculations. So we are seeing new DAX statements, and there are likely to be new DAX functions and new tools.
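To give an idea of the shape of such a query, here is a hand-written sketch (not the query captured in the demo) of a query-scoped column defined next to a query-scoped measure; 'Product'[Unit Price] is an assumed column name:

    DEFINE
        COLUMN 'Product'[Price Band] =
            IF ( 'Product'[Unit Price] <= 100, "Low", "High" )
        MEASURE Sales[Number of Products] = COUNTROWS ( 'Product' )
    EVALUATE
    SUMMARIZECOLUMNS (
        'Product'[Price Band],            -- group by the query-scoped column
        "Products", [Number of Products]
    )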
Now, let's draw some conclusions. As you have seen, it is extremely easy to extend the power of an analytical model stored in the cloud with your own model. I have never been so excited about DirectQuery. If you have seen my videos and conference sessions, you know that I am not a big fan of DirectQuery against SQL. Why? Because DirectQuery against SQL is DirectQuery against a storage model: SQL is a storage model, it stores information. Analysis Services is different: DirectQuery against Analysis Services is DirectQuery against a semantic model. This means that IT can build a semantic model containing the entire business logic of your company, handling the huge amount of data that is likely to be present in your data warehouse, creating all the calculations, and publishing it as an Analysis Services model on a Premium instance or on Power BI. If you want to connect to it and extend its capabilities, now you can: you can use all the measures stored there and add your own calculations.

This is exciting. This, in my opinion, is really a historical milestone. With that said, beware: it is a new feature. It just hit the market, so we still need to understand it. Do not run too fast; do not rush to redesign your entire data warehouse infrastructure based on this. We need time to learn it, to study it, to understand exactly where it works well, what the limitations are, and how to use it the proper way. Once that is understood, I think this is really a game changer that can completely change the adoption of Power BI, especially in large companies.

So hurry up: do not change everything today, but start learning, because this is really going to change the way you look at your business models. Enjoy Power BI.
Info
Channel: SQLBI
Views: 63,716
Id: g-nRxDVt3To
Length: 23min 38sec (1418 seconds)
Published: Thu Dec 17 2020