Best tips when working with Power BI dataflows!

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
so hello everybody and welcome to another power bi video in today's video we're going to talk about data flows i've been setting quite a few data flows lately and the experience is not really quite what i expected so i've created a set of routines that help me work with data flows and you know ease some of the problems that you will encounter when working with dataflows so hopefully they will this will make your experience a lot smoother than it's been for me uh i've i'm i've recorded this two separate days so you will see two different clothes just so you know and mainly because i forgot three tips that i think are important to mention okay so with that said let's get started here are my top data flow tips to get that thing working so the first thing when you you know are working with data flows you obviously need to connect to the source and then you know a data flow moves data from the source to a data lake i found and this is a new experience for me because if i remember correctly i might not remember correctly but if i remember correctly you could connect to a lot of data sources especially cloud data sources without a gateway now it asked me for a gateway all the freaking time all the time you need to get winning though and gateways you know it's another help so i really try to avoid gateways as much as i can and put things in the cloud because it's just they're expensive if you have them on the cloud and they are a pain to manage so i had sources on sharepoint in the cloud and i was required to use a gateway and i was thinking why so the way i solved it that really works for me is that instead of using the shared data flow i created a data lake or the shared data lake i created a data lake and use that instead so you create a data lake and then you move in your workspace you say i want to use my data lake instead of the default one in order to do that though you need to have a clean workspace you cannot have anything in there it won't work so be aware of that and once i set up that then any online source that did not require a gateway by itself i could actually move it to the data lake without issues beautiful thank god because that was really really nice it took me quite a while to actually figure things out on that so we've managed to connect to our data sources and now we're working on the power query online experience the power query online beautiful and i know that a lot of the development has been doing done on power query online on data flows so i thought oh it probably works better than power bi desktop it does not it does not so number one slow i had a million rows excel file a million rows it's not that big a million rows and it was so slow it was like oh my god while working with it on power bi desktop it was oh so if your data is a little bit big it's always going to be slower for what i found no matter the size is just going to be slower and once you start working with it you're going to get a ton of errors and they won't tell you what the error is example i had some queries that i already had on power bi desktop i just wanted to move them to the cloud copy them and then you know runs i said you have an error okay which one you have an error tell me which error do i have it just have like a warning inside and says the data flow has an error how is that useful like really so terrible and tip important tip i do that now for all my data flows i have a power bi desktop where i do the development and then i copy the code to the data flow afterwards once another is working because when i had errors on par on the data flow i could actually copy back to the desktop and then the desktop would tell me what the error actually was and then i could work right because it was i cannot go over 20 steps every time i have an error it just doesn't work that way so far desktop development copy paste and now you may say oh good well not really not as easy as you think here's the thing for some reason the syntax for power query online is not the same as the syntax generated for power query desktop and that flew my mind like what the heck so when you're copy pasting from power bi desktop to power query online you will get a big pop-up like red lights everywhere saying hey this syntax is not allowing power query online and go figure what the sickness is because it just do you think that it highlights the thing that is different no it just gives you the entire row and then you see like a bunch of red it's like okay let's just do it and check the data afterwards come on tell me where another thing that i experienced for coping from power bi desktop to power query online is that if you copy instead of the entire thing in that one city that's when you get the big pop-up if you copy only a line like a step and then you paste it in the data flow because the syntax is different you're going to get an error and if you don't realize that you wait on the web boom boom boom and then it says again you have an error like where it doesn't change it it doesn't change it which is so annoying one of the things just as an example you have step name in power query line the step name is capital letter um capital letter in power query on the desktop it is capital name lowercase everything and i will give you an error on the other line when you copy just the formula oh my gosh i mean these are the small things that it's like why don't they have the same syntax they shoot the same syntax another thing and this is an understandable thing is that for example dates i was copying data that i have with dates on i copied online and then on power bi desktop i could just change type date and it worked but when i put it online depending on what the settings are for i think it's a browser maybe you need to do the original settings so that's another thing it will give you an error it won't tell you that what the error is you have to figure out so if you have this type of errors check dates because that's normally one of the errors another thing that i notice is that all fields how to have data type in power query desktop you just load everything and then no data types it will give you trouble but in data flow that's not allowed so you have to have a data type and it will give you an error and it will say actually that's nice at least i think it will always say that hey you need to have all data types and if you ignore it it will create itself a step and you will try to guess what the data types are and you might want the issues you can have with that so better that you control the data types yourself and you actually you know specify them so you don't have to let power query do that by yourself another thing that i don't think is a good thing at all is that if you get errors on your rows when doing the data connected you know the data type change in power bi desktop power query you actually get an error and then when you load it it says hey we have 23 errors or 2 000 errors whatever and then you can go and say okay what was the issue data flows do not do that though what they do is they convert all errors to null and that is an automatic step that you can't not remove it will do it by itself and that's not good that's not good at all because sometimes there should be no sometimes i did something wrong with this step and i want to know and that's once more do your data flow transformations on power query desktop because you have full control of what's going on those automatic things won't happen and if you need to fix the errors you can do that you'll know that there's an error you have no way to know if there's an error or not that step gets generated always so was it an error or not who knows who knows okay um next thing so we're actually doing the steps we're succeeding we have two tables we want to do some merge good thing good case use case for data flows actually do the merge there and then import it into power bi desktop so it is already computed and nice so when you're doing merge say you have your fact table here and you have a lookup and you want to add that information to the main table so if you load those tables in the data flow is going to say ah this is a computed entity it requires premium so you need to disable the one of the merged queries and then it will work right if it is possible so if you needed both tables you could actually load these disable that and download this again and load it separately right so there are ways to go around that premium thing just so you know there are ways around it so now we have done all the transformations that we need there is a beautiful button down at the bottom that says even close you say i've made it i've made it you click there and then he starts running some tests he says oh we're testing the schema and something something whatever it takes forever just to begin with that no know what they are doing but takes forever and then most often you you won't pass that test and you are not able to save until that thing does whatever the heck is doing which is super annoying i mean you should do you should be able to save maybe it saves i don't think it saves me to save and then download right he says no i'm not going to let you load because but let me save it because i don't have time i have a team school now so i don't want to lose this work so it's super annoying and it's very i don't know what he's doing in that check if somebody knows just let me know down below it's like some schema whatever whatever it's a big test and just a lot of stuff but i don't know what it does so okay so another thing that has been quite annoying is the refresh so in order to refresh once you set up your data flow you press save and close it successfully loads you get a pop-up that says refresh now it lasts for like two seconds if you don't capture that you have to go back and then you have to go forward and you know you guess you want to check something else and i wish there was a refresh button right on top so make sure you catch that re-first button before it disappears otherwise you will be finding yourself going back and forth all the time again this is a small thing but hopefully they'll fix it soon let's imagine that we actually passed a step that test you get a table in the data flow good you are not done yet you're not the end you still have to pass the last refresh right and that last refresh is i would expect that if we've gone through all the steps no error was found you did this whatever test you do when i press save and close it will load but no you can still get another error once you click load and the error message is as horrible as the other one so first of all you have to go to the refresh history there is a button it downloads a file a csv file for all files that you have to you know now i've learned how to read this i don't have to convert to anything but and the error most of the time is not understandable so you're left again like okay what went wrong most of the time that thing happened when i was not using power query desktop as my development tool for data flows once i've done that things have worked a lot better so if you are going to remember anything from these use power query desktop for development that's my tip okay so the next thing is once i was done with the data flow obviously at the moment there's just one person that can edit at the time so i'm done i want my customers to take over and i couldn't find it anywhere uh the information online is a little bit outdated it says that it's not possible but it actually is if you click dot settings or they click they have to do it they there will be a button that says take over and that will allow them to take over the data flow it's a shame that can only be one owner but i think they are fixing that though so on in time it will go okay here is another thing that is actually quite annoying in general for power query and it's the privacy settings i know that it's meant there to protect the data i absolutely get it but just the execution is quite annoying so you are going to have privacy settings issues with dataflow this is the same experience the the thing is that where are the privacy settings where it took me forever to find it so it is actually on the power query editor on the home button is the third button where it says options so options and then things like local settings there you will find the privacy settings and then you can there is not that many options but at least you can fix the privacy settings if you run into those travels now incremental refresh you have actually managed to refresh but now you want to refresh only new data it is a premium so incremental refresh is available for pro nowadays but not for data flows why why i mean the it is microsoft resources that have been wasted because we are not able to incremental refresh it's a waste of energy it's a waste of it's a waste of everything so why just let us do incremental refresh in a pro license as we can do already with pro okay now you have your data flow everything is running beautiful and you are going to connect to it using power bi beautiful now the first thing that you are going to notice is there are two connectors there is a power bi data flow connection connector and there is a platform power platform data flow connector good what is the difference well one of them is in beta which is a power platform one and when you click on it it will give you this big message that says hey we're still developing this second break and i would say like ah here's the thing the power bi connector the the the one that is not in beta or there is not under development you need to be able to you need to log in on the connections data connections you need to be able to log in on the account that you're doing the development for in order for it to work so if i um if i develop a data flow for myself and the data flow has kerbal as an organization account on my user and then i try to connect the data flow from another customer it won't work because it will give me curves so i need to go into data connections and change the login which is it's an annoyance i mean it's not the end of the world but it's an annoyance so the power platform connector fixes that hallelujah and i thought oh how neat so obviously i'm going to use the power plating connector obviously well it broke and um it i it was so frustrating to troubleshoot because when it broke it actually i went to the data flows everything was working everything was refreshing i had no errors anywhere like why the heck is this not working but i noticed that all the data flows have failed all of them like what are the chances that all my data flows suddenly have changed so i tested the old connector and then i could connect so from that you use the old one still because at some point it will break and it took me a while just to change all the connections just maybe like 15 minutes but still okay so i hope these few tips and tricks and things to watch out for will make your data flow experience heaven and i will see you again on the next video on thursday now that we're changing the schedule probably not sure exactly what it will be but hopefully something good so i was here wednesday as always so let me know if you still have any data flow questions or you have some tips and tricks that i haven't covered and i'll see you again on thursday bye
Info
Channel: Curbal
Views: 11,401
Rating: undefined out of 5
Keywords: Power bi, powerbi, Curbal, power bi desktop, power bi tutorial, power bi training, power bi for beginners, ruth pozuelo, power bi dataflows best practices, dataflows best practices curbal
Id: 5fHsRwEM7SY
Channel Id: undefined
Length: 18min 35sec (1115 seconds)
Published: Tue Apr 27 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.