How to use Dataflows in Power BI // Beginners Guide to Power BI in 2021

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
in this video i want to show you you can start using data flows with your power bi reports in order to reuse or share the same transformation and cleaning steps to your team members or to use for multiple reports we're going to go through the setup step by step and also why you should be using it in the first place all of that and more so without further ado let's get started hi my name is fernando and welcome to the solutions abroad youtube channel where i focus on teaching beginners the wonderful world that is power bi i upload new videos every week so make sure you hit that subscribe button and the bell icon to get notified when a new one is out so to explain it better let me show you an example here so i have created here two reports in the same workspace the northwind workspace and it just has some different types of reports coming from the same place i have a sql database that i've set up which contains the information that we use here in these reports and these reports themselves are quite self-explanatory they are just analysis of the data itself it just gives some different calculations total sales month-to-month change things like this and we have some you know basic abilities here to filter the reports and the visuals and the employee analysis report is pretty much the exact same thing except that it's looking in the lens of the employees how the employees are doing our sales and if we go to the workspace itself and i change the view and let's look at it on the lineage view you'll see that we have a couple of things going on here we have these two reports that we have here in blue and we have two separate data sets for both of them and on the left hand side here you can see that we have two connections to the same database so this is the sql server that i've set up for this demo and although we have two different data sets here that are pulling from the same place we actually have two separate connections to the same database so what do i mean by that so here we have the employee analysis data set right if i hit the refresh button here to refresh the data set that we have here it will make a connection with the sql server and say hey give me the data that you have from your server from your database into my data sets okay so now we go to the customer analysis and we hit the refresh button here and customer analysis does the same thing it gets to the sql server and says hey give me the data that you have on your database so i can import it in my data sets and the problem arises when you start creating multiple reports pretty much pulling from the same place but with different connections it means that you'll need to hit the sql server multiple times just to get the same values and this is where the data flows comes in instead of your data sets connecting directly to your data sources the data flows comes in between the data flow will act as a mini etl between your source and your data set so that no matter how many reports you have in your workspace as long as they're pulling from the data flow itself not directly on your data source your data flow will be the only one pulling data from your sql server so you're only hitting the database once so let's start by creating a new data flow first so you can do that by clicking the new button here from your workspace so you'll have a couple of options here you click data flow here and you'll have some options here don't worry about these i just need to click the define new tables and when you get to this page you will kind of start to understand that it looks a little bit like the power query option that you have in power bi desktop and you'd be pretty much right it's data flows just think about the data flows as a way for you to transform and clean your data i.e power query except it's on the service itself so you can see that you can connect to different places here but what we're interested in is the azure sql database because that's what we have so i'll click there and i'll just put some server details here so this is the one that i created on azure and then we'll put my database password right here so that we can create a connection now you'll see we have the um the views and the tables available for us in the sql database so we'll just click the tables that we want to bring in so i'll click in all these tables and then i'll hit transform data so now if you are looking at this it looks exactly like the power query window that you have in power bi desktop and yeah it's exactly that you see you have some different options for you to you know customize and clean your data the same way that you could do in power bi desktop you'll have the queries on the left here and the query settings and the the steps that you have on the right hand side so now it's created a data flow for us which contains the tables and the columns that we want uh to use for our reports so we'll name this northwind data flow for now and let's hit save now ask you to refresh uh the um the data flow which brings in the data that you have from from the sql database so we'll just hit refresh on it so let's do refresh now oh yeah it's refreshing now and before we do anything else let's go here and let's see what's happened so now you'll see that it's a different color it's a data flow now we have to the sql server instead of the data sets and now what we want to do is instead of hooking them up to the source itself we'll need to change the source of the data set to pull from the data flow itself and that part is pretty simple but it's quite long so let's go through it now so here i have the same reports in my power bi desktop i've downloaded this from this service and if i hit transform data here you will see that if i go up here you'll see that the sql database we're pulling directly from the sql database we don't want to do that anymore just think about this we want our reports to pull from the data flows instead so what we'll need to do we'll need to do new source hit more and then from here we need to type data flows and you'll need to select power bi data flows and if you have access to that workspace where we created that data flow it should show up here as a list so you can see now we can see the north wind workspace the data flow that we created northwind dataflow and you'll see all of the tables that we have imported are now here obviously they're empty because we're still refreshing but we have the schema so we can start importing them into our data data set now so now you'll see it's kind of appended the the names with the two because it's got the duplicate query there um so this is the bit that is tedious and i haven't found a better way to do it but for now this is what i did that worked i'm open for ideas if you have but essentially what we need to do is we need to replace the old queries into the new queries so for example for the categories we need to replace this first step to pull from not a sql database but from a data flow but you can't just replace the string here you actually need to replace a couple of different steps so from here what we can do if we go to the categories the data flow category click advanced editor copy and paste this part and then paste it in the categories so hit advanced editor here delete everything and then paste it so you'll see now you have uh you know the using data flows for that query now and we'll do the same thing for every every other queries so now that we've done that let's delete the duplicates that we have here because we don't need them anymore delete and now we hit close and apply now it will probably break the report which is perfectly fine um while we wait for that to load let's have a look at how the data flow is doing the refresh just refresh this to see you'll see that it's still refreshing um it's taking a while but don't worry about it the whole point is that it's supposed to be our pipeline what's interesting about this refresh history is that it shows you how long it's taking to refresh which is normal but you will have the ability to see each entity speed this is interesting because the normal schedule refresh don't give this much detail so you can see it gives you how long every single table uh is taking in terms of refresh so it gives you the duration for every single table and here as i said it's destroyed our report because the data flows don't have any data yet but don't worry once it kicks in we hit the refresh and this should come back up so now let's look at this refresh history it's already refreshed it took about nine minutes and now that we have that refreshed hopefully when we hit the refresh button here it will pull the data that it we have in the um data flows and you'll see that we have the data back here so all of the functionalities are back so hit publish and see what the difference is well actually what i want to do i want to just save this as a different report and then publish it so we can see the difference against the previous report so publish it in the same workspace here and then let's open it from here so you see it's the exact same report obviously but if we go back to our workspace and the lineage view so this is the report that we've created pulling from the northwind data flows so you see now that instead of the data set pulling directly from the data source you have it pulling from the data flow itself which has its separate refresh independent from the data set itself and that's what you need to bear in mind when implementing data flows is that these are separate entities the data flows and the data sets so if you want to have scheduled refreshes like you did before you need to make sure you have a scheduled refresh set up for your data flows so if you want to refresh it every day just make sure you schedule it as such and then you need to have a separate refresh here in your data set to schedule to refresh to pull the data from the data flows into your power bi report data set so although this lineage is a little bit longer what it means is that if you have to create new reports in the future that needs north wind data you don't need to connect to the sql server directly you can just instead select the northwind data flow as your source let's say i'm a new team member and i want to build new sales reports based on my own data i have some imported excel files that i want to use but primarily i also want to use the data from the northwind database now instead of the team having to give me a server credentials database logins if i have access to the same workspace all i need to do from the power bi desktop is to connect to the data flow itself so if you click power bi data flows here so you'll see you will have the option to select uh you know different tables that you want to use here maybe you know i just want to get the order details and the products for my own report you see i have access to that straight away without having to connect to the sql server directly what's more is that if my team members have already done the prep or transforming the data in the data flows itself i don't have to recreate that with my report i just inherited that because my reports are connecting to the data flows itself this ensures that your entities and how you use your data is standardized across all your different reports making it you know easier to build new ones and make sure you can manage multiple reports and that's really it for this video i hope it helped you understand how easy it is to start using data flows with your power bi report leave a like in this video if it helped you it's the best way to let me know that you enjoy this type of content get in touch using the social media links that i included in the description box below and thank you so much for watching guys see you again on the next one
Info
Channel: Solutions Abroad
Views: 1,395
Rating: 5 out of 5
Keywords: solutions abroad, power bi, powerbi, power bi tutorials, power bi for beginners, beginners guide to power bi, data analytics, dax, data modelling, data visualisation, business intelligence, power bi 2021, dataflows, power bi dataflows, how to use dataflows, data flows power bi, how to use power bi dataflows, etl, power bi etl, power bi scheduled refresh, power bi mini etl, extract transform load, how to use data flows, how to use dataflows power bi
Id: 3mox85RpySU
Channel Id: undefined
Length: 13min 48sec (828 seconds)
Published: Mon May 31 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.