Creating real time dashboards in Power BI with push datasets

Captions
Ciao friends, and welcome to a new video from SQLBI. In this video, I want to talk about real-time dashboards and push datasets, introducing a tool that SQLBI recently published that makes it easier to create push datasets. But first things first: what is a real-time dashboard? It is a dashboard that updates in real time. Let me show you an example. As you can see, we have a dashboard, which is not really a cool dashboard (I am totally not good at building dashboards), but it has the interesting feature that it updates its values continuously. You see that the numbers change without any human touching the keyboard or doing anything, because this is a real-time dashboard that detects changes in the database and updates its content based on the latest data available in the database. This is the kind of dashboard that you are likely to create and use, for example, in the hall of an office to show the sales of today, the number of open tickets, or the number of tickets that have been serviced during the day: the kind of information that is useful to see in real time.

Creating real-time dashboards is kind of a complex topic, because Power BI with the VertiPaq store is not designed to create real-time dashboards. If you use VertiPaq, Power BI is designed to work in pull mode, meaning that at data refresh the model pulls data from the database, populates its internal VertiPaq store, and then all the reports are created based on the VertiPaq store. The VertiPaq store is only as current as the last data refresh, so you do not have real time if you are working in import mode, which is the default way of working of Power BI.

If you need real time, the first technology that comes to mind is DirectQuery. You could use DirectQuery, and DirectQuery is real time: whenever you run a DAX query, that query hits the original data source, so you always have the latest information available. The thing is, DirectQuery provides real time, but it also comes with limitations and problems. Performance is the first problem you need to face; let's face it, DirectQuery is slower than VertiPaq. Then, from the DAX point of view, you have some limitations, and if the entire model is designed using DirectQuery, you will have to find modeling workarounds in order to overcome these limitations, and the entire development of the model becomes quite a problem. Besides, is it really worth moving the entire solution to DirectQuery just to build one real-time dashboard? Again, it is unlikely that your entire data model needs to be real time. If you have ten or a hundred reports as part of your overall solution, maybe one or two dashboards need to be real time; the remaining part of the project can work in regular import mode, because when you analyze the sales of the last 10 years, it is not really relevant what is happening today. Some dashboards require real time, while the vast majority of your dashboards and reports do not need it. That is why there is a better alternative to moving the entire project to DirectQuery, which is using push datasets.

Push datasets work in a very different way compared with VertiPaq models. A VertiPaq model works in pull mode, meaning that at data refresh time the model pulls data from the database, populates VertiPaq, and then starts answering queries; until you send another refresh, the model stays as it is. Push datasets are very different: a push dataset just sits there, waiting. It has its own database, but that database is empty by default.
The data model does not do anything: Power BI just sits there, waiting, until somebody pushes data inside it. Power BI is there waiting for data to arrive, and as soon as data arrives, it is inserted into the database of the push dataset, and all the dashboards linked to that database are refreshed to show the new data.

The technology is very different, and it is not new at all: it has been there for a long while, and it has not been used very widely. The reason is that push datasets are different in their implementation, but they are also different in the way you create them. You cannot create a push dataset just by creating a Power BI Desktop file and uploading it to the service, because the technology is different. A push dataset handles all the interactions through REST APIs. If you want to create a push dataset, you cannot just create it with Power BI Desktop: you need to call a REST API to generate the model, and the same goes for pushing data. If you want to push data inside the push dataset, you again need to call a REST API to send the data. And then there are some limitations for push datasets; not too strict, but there are some.

The interesting thing is that we recently published a tool that makes it easier to create push datasets. So what we could do is think about a different architecture: instead of moving the entire solution to DirectQuery, we build the entire solution using import mode, the default, safest, and fastest way of developing your tabular solution, and then we pair it with a push dataset whose only goal is to serve the real-time dashboard. The tool helps you create this push dataset out of the original import model.

Let me show you this with a couple of diagrams. Here on the screen we have the architecture of a regular Power BI model. We have the Power BI dataset, which loads data from a database by running SQL queries; the database, in turn, is populated from the OLTP system. The problem is that these steps happen during data refresh, and data refresh typically happens once a day. It might be more frequent, but this means that the part of the architecture that contains reports and dashboards can only access data that is one day old: you never have real time.

By using a push dataset, you rely on a different architecture: we pair the currently existing model with a new model that is a push dataset. The architecture with a push dataset is slightly different. The bottom part is the same, but then we add the Power BI push dataset. This is a model that serves the purpose of being real time and provides information to real-time reports and real-time dashboards. Remember, not all of your dashboards and reports need to be real time; you need to identify where real time is needed and where it is not. Once that happens, you build your push dataset. The push dataset can be created as a copy of the original import model. Then, as we said, a push dataset just sits there: it is empty at the beginning. That is why you need these operations: whenever a new row enters the system, you need to move the row from the OLTP system into the push dataset, and this requires developing some code that sends the information from the OLTP system into the push dataset.
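As a concrete illustration, here is a minimal sketch of what that code could look like, assuming Python with the requests library, a push dataset that already exists, and an Azure AD access token obtained separately; the table name, columns, and ids are placeholders for the example, not part of the tool:

```python
# Hedged sketch: forwarding a new OLTP row to a push dataset through the
# Power BI REST API. Table name, columns, and token handling are placeholders.
import requests

ACCESS_TOKEN = "<AAD access token with Dataset.ReadWrite.All>"
DATASET_ID = "<push dataset id>"
BASE = "https://api.powerbi.com/v1.0/myorg"

def push_rows(table_name, rows):
    """POST rows into a push dataset table; dashboard tiles linked to the
    dataset refresh automatically when new rows arrive."""
    resp = requests.post(
        f"{BASE}/datasets/{DATASET_ID}/tables/{table_name}/rows",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        json={"rows": rows},
    )
    resp.raise_for_status()

# Example: a new sale just detected in the OLTP system.
push_rows("Sales", [
    {"Order Date": "2021-10-23T10:15:00", "StoreKey": 120, "Quantity": 2, "Net Price": 99.90},
])
```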
Then there is another detail we need to worry about. As we said, the push dataset starts empty, so it does not contain any data, and if you load everything from the OLTP system, you end up loading a lot of information inside the push dataset; push datasets are not designed to contain huge amounts of data. Therefore, you typically have a push dataset that starts empty at the beginning of the day and contains the data for the day, so the OLTP system sends information only for the current day.

What if the dashboard does not rely only on the data of today? Maybe you need to make a comparison between the previous month and the current month, or the previous year and the current day, or you need some statistical information that can be gathered only by accessing historical data. Therefore, in your push dataset you will need not only the data of today but also some historical data. That is why the entire architecture is designed in a slightly different way: we have the push dataset that, as we said, starts empty, but then at the beginning of the day we run a script, a DAX script, that loads data from the Power BI dataset and uses it to populate the push dataset.

So this is the overall flow of the system: we create one push dataset that starts empty; at the beginning of every day we empty it, and then we load it with the historical data that is needed to populate the dashboard. Then, during the day, the OLTP system sends new information to the push dataset, so that the push dataset stays up to date with the data of today. Doing this, you have two different models, with the advantage that the historical model, the real model, only relies on import mode, so it is easier and faster: it is the standard way of developing tabular models. The real-time dashboards, on the other hand, are based on a push dataset that is empty at the beginning of the day, is initialized with the historical data that is needed, and is then ready to receive new information.

This is where the tools we are publishing come in helpful, because you can use them to take the model.bim file, the description of your model, and create a push dataset out of it. The tool does a lot of small operations that are rather important. First of all, it checks that your model obeys the rules of push datasets: there are some limitations in the way you can write DAX code and in the features you can use, nothing complex, nothing hard, but there are some, so the tool takes model.bim and checks that it is compatible with a push dataset. Then it has the option of publishing the model: once model.bim complies with the limitations of a push dataset, the tool gives you the option of creating the push dataset. The set of tools also gives you the option of running the DAX queries we have seen earlier, the update script: the set of DAX queries needed to populate the push dataset at the beginning. You need to write those DAX queries the right way, because you cannot just take the entire model and dump it inside the push dataset: there are limitations on the number of rows, so you need to reduce the number of rows to a number that works with the specific push dataset you are creating. Finally, the tool also provides a demo mode that gives you the option of pushing data inside your push dataset using random values; this is extremely useful whenever you want to build a POC to demonstrate the technology to somebody.
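For the start-of-day reset described above, the REST API also offers a call that empties a push dataset table. A minimal sketch, again assuming Python, requests, and placeholder ids and names:

```python
# Hedged sketch of the start-of-day reset: empty the push dataset tables before
# reloading them with pre-aggregated historical data. Ids and names are placeholders.
import requests

ACCESS_TOKEN = "<AAD access token>"
DATASET_ID = "<push dataset id>"
BASE = "https://api.powerbi.com/v1.0/myorg"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

def clear_table(table_name: str) -> None:
    """DELETE .../rows removes every row from one table of a push dataset."""
    resp = requests.delete(
        f"{BASE}/datasets/{DATASET_ID}/tables/{table_name}/rows",
        headers=HEADERS,
    )
    resp.raise_for_status()

# Empty the example model's tables; the DAX update script then repopulates
# the historical rows (see the pre-aggregation sketch later on this page).
for table in ["Sales", "Store", "Dates"]:
    clear_table(table)
```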
Let me show you some more details; we can go back here. First of all, how do you retrieve model.bim, and what is model.bim? Well, model.bim is the description of a model. Here we have a regular Power BI file that contains some tables with their own diagram: this is a regular import-mode model. If you want to move this to a push dataset, you need to extract model.bim, and there is no way you can do that from inside Power BI, but you can use Tabular Editor. You launch Tabular Editor (here it is), and Tabular Editor gives you the option of looking at model.bim, but most importantly, it lets you save model.bim. We can call this one push dataset, and then we can just inspect it. Let me open it with Visual Studio Code: you see that the result is a JSON file that contains the complete description of your model. It is not important now to understand the details of model.bim; what is important is that you can extract model.bim from any Power BI Desktop file.

Once you have your model.bim, you can run the tool, but before running the tool you need to download it. That's right, I think I forgot to show you: you can download the tool from the PowerShell Gallery; here you can find the SQLBI push tool 1.0. Now, the SQLBI push tool is a PowerShell cmdlet, which is only one part of the entire tool. The entire tool is published on GitHub, and you can download it from there. It is open source, meaning that if you need any features, you can implement them yourself, contributing to the development of the tool, and there you can find the entire documentation that explains how it works. You have the SQLBI push tool, the cmdlet that you can also download from the gallery, but you also have a Power BI push tool, which is a command-line tool that makes it easier to run commands without having to change anything, and you also have an entire .NET library that you can use if you want to embed the push dataset code inside your own solution. You can find the documentation both on GitHub and in an article we published on SQLBI that provides all the details needed in order to build the push dataset. This is just an introductory video, so I am not going into all those details; I just want to show you the overall architecture.

Anyway, once you have created model.bim and run it through the tool to check that it complies with the limitations, you need to publish the new model, and again, the tool gives you the option of publishing a model.bim file as a push dataset: it connects through the REST API for you and generates a push dataset starting from model.bim. It is worth remembering that push datasets do not expose XMLA interfaces; because you do not have the XMLA endpoint, you cannot use the regular tools.
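For reference, this is roughly what the publish step amounts to behind the scenes: a POST to the datasets endpoint with defaultMode set to Push. The sketch below uses a deliberately tiny, hand-written schema rather than a real model.bim, so it only hints at what the tool automates; all names are placeholders:

```python
# Hedged sketch of what publishing amounts to behind the scenes: POST a dataset
# definition with defaultMode "Push". The tiny schema is hand-written here; the
# tool derives the real one from model.bim.
import requests

ACCESS_TOKEN = "<AAD access token>"
BASE = "https://api.powerbi.com/v1.0/myorg"

definition = {
    "name": "Contoso RealTime",      # placeholder dataset name
    "defaultMode": "Push",
    "tables": [
        {
            "name": "Sales",
            "columns": [
                {"name": "Order Date", "dataType": "DateTime"},
                {"name": "StoreKey",   "dataType": "Int64"},
                {"name": "Quantity",   "dataType": "Int64"},
                {"name": "Net Price",  "dataType": "Double"},
            ],
        },
    ],
}

resp = requests.post(
    f"{BASE}/datasets",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=definition,
)
resp.raise_for_status()
print("New push dataset id:", resp.json()["id"])  # the id used by all later push calls
```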
Once you have your first, empty model, you need to start populating it, and populating a push dataset requires you to obey some further limitations, which are not modeling limitations: they are limitations in terms of size. The number of rows that can be stored in a table of a push dataset is smaller than the number of rows you can store in import mode. I honestly do not remember all the details, you can find them in the article, but you need to build a model that obeys those limitations, and that requires you to reduce the number of rows in the tables. The thing is, it is unlikely that you need the entire historical structure. You can reduce, for example, the number of dimensions present in your push dataset by avoiding populating them, and by reducing the number of tables you need to load. Moreover, from the fact table point of view, you can reduce the cardinality by pre-aggregating values: it is unlikely that you need the details at the product level, at the day level, and at the customer level. So you build a set of DAX queries that load data from the original data source and populate the push dataset with a smaller number of rows. You build DAX queries because DAX is the preferred language to query your import model.

Let me show you an example of these DAX queries; I have them here. This is the set of DAX queries used to populate the dashboard we have seen earlier. We load the Store table, and we load the Date table doing some renaming, because one of the limitations of push datasets is that you cannot use Date as a name; that is why we call the new table Dates instead of Date. These renames are needed for this reason. Then, when it comes to loading the Sales table, we do not want to load the entire Sales table, so we only load the data at the day level and the store level, and regarding quantity, net price, unit cost, and unit price, we need to compute them by pre-aggregating values in such a way that they still produce the same results. For quantity this is simple, we just sum it; but for the net price we need to compute a fake net price that, when multiplied by quantity, still produces the correct number. This part is important, because you want to keep running the same code, the same measures, the same queries that you were executing against the original model, on top of your push dataset. If you build the DAX queries and pre-aggregate values the right way, you have the advantage that you do not need to change anything in the code.
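As a hedged sketch of this idea (not the actual update script shown in the video), the pre-aggregation can be expressed as a DAX query run against the import-mode dataset through the executeQueries REST API, with the result pushed into the push dataset; all dataset ids and the Sales schema follow the Contoso example and are assumptions:

```python
# Hedged sketch: run a pre-aggregating DAX query against the import-mode dataset
# (executeQueries API) and push the result into the push dataset. Ids and the
# Sales schema are placeholders following the Contoso example.
import requests

ACCESS_TOKEN = "<AAD access token>"
IMPORT_DATASET_ID = "<import-mode dataset id>"
PUSH_DATASET_ID = "<push dataset id>"
BASE = "https://api.powerbi.com/v1.0/myorg"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# Aggregate Sales at the day/store grain. The "fake" net price is the
# quantity-weighted average, so that Net Price * Quantity still adds up
# to the original sales amount.
DAX = """
EVALUATE
SUMMARIZECOLUMNS (
    Sales[Order Date],
    Sales[StoreKey],
    "Quantity", SUM ( Sales[Quantity] ),
    "Net Price",
        DIVIDE (
            SUMX ( Sales, Sales[Quantity] * Sales[Net Price] ),
            SUM ( Sales[Quantity] )
        )
)
"""

resp = requests.post(
    f"{BASE}/datasets/{IMPORT_DATASET_ID}/executeQueries",
    headers=HEADERS,
    json={"queries": [{"query": DAX}]},
)
resp.raise_for_status()
rows = resp.json()["results"][0]["tables"][0]["rows"]

# Column names come back qualified ("Sales[Order Date]", "[Quantity]");
# strip the brackets before pushing. Large results should be pushed in
# batches, because the REST API caps the number of rows per request.
clean = [{k.split("[")[-1].rstrip("]"): v for k, v in row.items()} for row in rows]
requests.post(
    f"{BASE}/datasets/{PUSH_DATASET_ID}/tables/Sales/rows",
    headers=HEADERS,
    json={"rows": clean},
).raise_for_status()
```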
The tool gives you the option of running these queries and updating the model. Again, let me show you. If you run the tool, you see that we have the dashboard, which starts with some data; while the tool is running, the data changes, and some tables start to empty, because the tables are first emptied, and then, during the execution of the code, some data starts to appear. We end up with a data model that contains the historical data but still has no data for today, because the tool loads the historical data from the original Contoso database into the push dataset, but it is still missing the data of today. Remember that the data of today comes from the OLTP system and is pushed into the push dataset.

At this point, when you have the model with the historical data pre-aggregated but without the current data, the data of today, you need to start pushing data, and this is where development is needed: the data needs to be pushed inside the push dataset through the REST API, and this requires a developer, somebody who can work either on the OLTP system or on some software that connects to the OLTP system, detects new data, and sends the new data into the push dataset.

But what if you are not a developer and you still want to build a proof of concept, because maybe you want to show the technology to your boss or to your team in order to investigate whether it is worth doing or not? The SQLBI push tool also provides the feature of pushing data inside a push dataset by setting some parameters. It is dummy data, not real data, but it gives any user the capability to build a proof of concept that shows how the push dataset works. You run a specific command in the tool, providing some arguments, and it starts pushing data into the push dataset. Again, let me show you this. As you see, we start the tool: we have the dashboard, which is empty; the tool starts pushing data inside the push dataset, and the dashboard responds by showing up-to-date information. Numbers are changing while the tool is just sending fake numbers into the push dataset. This is extremely useful, because these steps can be accomplished by any user: you do not need to be a developer in order to build a proof of concept, and you can demonstrate the technology to anybody interested in maybe sponsoring the development of a push dataset.
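To give an idea of what such a demo mode does, here is a minimal sketch that pushes random fake sales in a loop; the interval, schema, and ids are assumptions, and this is not the tool's actual implementation:

```python
# Hedged sketch of a demo mode: push random fake sales in a loop so the
# dashboard visibly updates. Interval, schema, and ids are assumptions.
import random
import time

import requests

ACCESS_TOKEN = "<AAD access token>"
PUSH_DATASET_ID = "<push dataset id>"
URL = f"https://api.powerbi.com/v1.0/myorg/datasets/{PUSH_DATASET_ID}/tables/Sales/rows"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

while True:
    fake_sale = {
        "Order Date": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "StoreKey": random.randint(1, 300),
        "Quantity": random.randint(1, 5),
        "Net Price": round(random.uniform(5, 500), 2),
    }
    requests.post(URL, headers=HEADERS, json={"rows": [fake_sale]}).raise_for_status()
    time.sleep(2)  # one fake transaction every two seconds
```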
So, as you have seen, using the tool you have the option of creating a push dataset out of an existing data model, populating it at the beginning of the day with the historical data, and then running a demo that pushes fake data inside the dataset. All this is useful to build a proof of concept, but if you want to build it for real, you need a developer: somebody who is able to create the piece of software needed to communicate through the REST API and send information to your push dataset. It is not a lot of work, it is a few days of development, but you need to find a developer with the right set of skills.

The advantage of obtaining such an architecture is that you do not have to move the entire solution to DirectQuery. You build two models: the regular import model and the push dataset. The push dataset needs to be created and then updated over time; if you update the original model, you might need to refresh the push dataset too, again using the tools.

Now, from the cost and maintenance point of view, there are some important considerations. If you plan to move the entire solution to DirectQuery, that means you will have a slower model that is harder to maintain, and you will need to work closely with the database people in order to make sure your queries run smoothly. You pay a high price in terms of performance and in terms of the relationship between the original data source and your Power BI model: you will always need to work together in order to obtain good performance. That is why we do not think that using DirectQuery for such a solution is a good choice. If you go for the push dataset architecture, things are quite different: you build your model in import mode without having to worry about performance, because VertiPaq is super fast. Then you have the additional development time, a one-time cost, to build the architecture that pushes data inside the push dataset. After that, you only have to update your push dataset if any change happens in the original source that requires you to update the real-time dashboard. To be honest, I think that if you balance these two options, everything is in favor of the push dataset.

As you have seen, this was an introductory video, so I have not been able to explain all the details, but if you are interested in push datasets, you can find the article at SQLBI where we provide much more information and a walkthrough to learn the details. If you want to know more about the topic, we also have the Mastering Tabular training, where we explain in greater detail what is needed in order to build push datasets as part of your architecture. If you need a real-time dashboard as part of your entire solution, my personal opinion is that moving everything to DirectQuery is not your best option, because you are paying a very high price over time in terms of performance and maintenance, a price that you can avoid if you just pair the existing import model with a push dataset. By using the tools we introduced in this video, you can actually do it in a simple way. Yes, you have to pay some developer to build the overall architecture, but once it is there and set up, you have a very fast import model for your historical data and a very efficient push dataset that contains the data of today, and you obtain a better architecture that can last longer. Enjoy push datasets!
Info
Channel: SQLBI
Views: 29,063
Id: vpD_PKtcTj8
Length: 28min 51sec (1731 seconds)
Published: Sat Oct 23 2021