Create custom keys for your Power BI relationships

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
yo what's up Patrick guy in the cube and in this video we're gonna talk all about keys right how you use them how you create them power bi desktop stay tuned [Music] talked a lot about data modelling and I get a lot of questions about and people go Patrick you're pulling your data from a data warehouse it's already nice and pristine you always talk about oh you should separate your lookup tables or dimensions into one table and keep all your facts you know you're additive value semantics two values in another table the problem is when people get you know their data in Excel flat files CSV files however they get their data doesn't have the nice keys to perform the joins and they go Patrick well how do you create those keys well in this video I'm going to show you exactly how I do it alright so instead of all this talking you guys know I like to do let's head over to my laptop I have this flat file that I received and there's lots of different ways that I can you know aggregate my data I look up my data or perform analytics against my data by airport name by airline name claim type claim type plain sight item lots of different attributes here but if you scroll across this is just one fat flat table and when I load it up into the power bi desktop I'm gonna get a single table with lots of fields you guys know I like to I like to group you know group things together and so what I like to do is I'll go and you know I'll analyze this this flat file and I'll look for you know common things that belong together in a single table for example Airport code and airport name and this one's kind of nice because it looks like it has unique identifiers but that's not what I want to talk about today the ones I want to talk about are the ones that don't have unique unique identifiers the strings like this one this is a great example airline name right it's just the airline name and if I made it a separate query and then joined it to back to this table I would just be joining those strings really but let's let me show you how you can use these keys how you can generate and create a key to kind of insulate you from you know all the caveats using strings in your joints and just use integers alright that's what this video is focused on so the first thing I do is I start massaging I start getting intimate with the data alright so the very first thing I do is I go to transform I right-click on that column go to transform and I trim it out right so I remove the white space at the beginning and the end just get rid of all the white space then the next thing I do is I go to view in the ribbon in the query editor you know click on column quality and notice there's lots and empty columns there mmm need to fix that because I don't like blanks in my in the list of values I'll do column distribution and show me that's a good distribution of data here and then even my column profile all right show me some things here right so it looks like the only thing that I really need to clean up is blanks but I do one more check I'll do one more check before I'm done so I'll turn all this stuff off right and what I'll do is I'll go home and I will sort this all right I'll sort that value Aysen there's a very first thing I'll do now see all my black value so I'll see all my blank values I'll right-click I'll go to replace values and I'll decide on what do I want to replace that blank value with so I'm gonna say that they didn't supply right the air airline name was not supply BAM there we go so now it's not supply the last thing I do before I get ready to get ready to build that table is I want to take a look and see if I can get a good idea of all the distinct values I've done that with my column profiling but sometime it doesn't reveal you know some like characters in that list right so let me show you what I'm talking about I click on the drop-down and right now it says hey only not supply there but I'm gonna choose load more and it'll reach back grab all that data and give me a distinct list and just as I thought right there's a - I don't want to use that - either I'll probably do some research on and trying to figure out what - mean but I found out that - and blank mean the exact same thing so guess what I'm gonna do right I'm gonna do a right click here replace values and I'm gonna say replace this also with not supply right and so what you'll notice is now if I go here and do this load one more time all right I don't have any blanks so I don't even need dashes or anything like that and my data set it's a nice clean list of values and what you'll also know this is if you take a look at the little profile bar here I don't have any blank values right it's a 100% valid data which is great so the next thing I do to get to get this data set prepared for my keys is I'll right-click on this column and I'll say add is a new query if it's multiple columns you'll need to duplicate that set right duplicate that query remove the other columns and then start didn't take do the steps of cleaning up from there fortunately I'm only looking at one column so my steps are a little shorter so I'll right-click on it and then I'll choose add as new query all right there we go and it's a new query airline name it provides it as a list I'm gonna say to table really simple click OK alright and I'm gonna give this a name I call this airline clean and then I'm gonna rename this table it's an airline two more steps right the next thing I'm gonna do is once this finish it once the query editor finish is doing all its work I'll right-click and then I'll say remove the duplicates right so now I have a complete distinct list of all the airlines that I want to analyze right distinct lists and then finally what I'm gonna do is go to add column and there's a feature people don't talk about this to use this feature it's called an index column I'm gonna click the drop-down and you'll have a choice right you can start from 0 from one I can do your own custom like I've calling an auto-generated number in column ok so I'm gonna go ahead and say from one there you go so it gives me a unique index a unique ID for each one of those values so I'm gonna take that move it over here call this airline ID and so before before I move on right a lot of people why are you doing this it's just easier for me to join in on those integers I could be pinned on the integer a little better since I'm generating it I control it I'll have to worry about anybody changing things mucking up things it can slow your refresh down doing a lot of these because the mashup engine has to do all this work especially if you're going to get to CSV file if you have a data warehouse if you have access to a relational database system you should try to do this you know on the back end or inform the people that's developing the repository the central repository that you're pulling from hey can you guys do this for me this way it makes my life so much easier ok alright so the next thing I do is I'll go back to the main table where everything is that and I'll do a merge sort at home and I'll do merge queries and I'm gonna merge on airline name and I choose my new airline one airline name right and you can wait for the estimator to tell you hey this is how many matches up my own right and we'll just give it a second here through the magic of YouTube Adam will probably cut all of it out so we could quickly see what the match is there you go so 100% match man am I good all right so we'll click OK and then the very last column that you'll see here is one that says airline it says table there's two little arrows that's pointing out like that right click it and what I want you to do is uncheck airline name and uncheck use original column name is prefix and click OK now let's give me my airline ID so I can match back on the ID in this table to the ID in the new table that I created and then the final step and the query editor at least I'll go over to airline name and I'll remove it right sorry cuz I don't need it anymore because I'm gonna look it up from the other table then I'll close and apply Part B I'll do this thing pulling all the data from the CSV files for both of them it has to pull it all do all the work of mashing doing all the preparation of that data like I did there we go this is the important part is to detecting my relationships so if I go over to the model now you can see that I have a one-to-many relationship between air line ID so instead of having all those fields clumped into one table right I can separate them out using this method and then my end users can look at them individually in the field list and say oh okay so I want to get a list of Airlines and I want to filter the claims by those Airlines instead of scrolling up and down this list trying to find you know the column that they're looking for the attribute that they're looking for give it a try let me know what you think are you doing this today or do you have a different method I'd love to know let's continue the conversation we're in the comments below it's your first time visiting a guy in a cube channel hit that subscribe button if you like my video a big thumbs up as always from Adam and myself thanks for watching we'll see in the next video
Info
Channel: Guy in a Cube
Views: 97,212
Rating: undefined out of 5
Keywords: power bi, power bi dashboard, power bi data modeling, power bi data modeling best practices, power bi data modelling tutorial, power bi desktop, power bi desktop tutorial, power bi excel, power bi for beginners, power bi index, power bi index column, power bi pro, power bi relationships, power bi reports, power bi reports tutorial, power bi training, power bi tutorial, power bi tutorial for beginners, power query power bi, power query
Id: Zlu99RUtMRY
Channel Id: undefined
Length: 9min 44sec (584 seconds)
Published: Wed Feb 05 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.