Power Query Power BI: Transform 2 Fact Tables to Star Schema Data Model (Invoice Data) EMT 1498

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] welcome to a video where we'll see how to use power query inside power bi desktop to go from two fact tables into a proper star schema now this is actually Excel magic trick one four nine eight but we're really not going to be using Excel proper even though the data model is going to start in Excel we're gonna be using power query inside power bi desktop now that said whatever we do with power query in this video can also be done in Excel power query and power pivot so everything we do here can be done either in power bi desktop or over in Excel now the problem is we have two fact tables and we have dimension tables or Excel people think of these as look-up tables these are what we use for the final reporting for filtering and this model will not work we have a filter that can work on this table filters that can work on this table and we have numbers at different granularities so this model is not going to allow us to in our final reporting pull criteria from all of the tables if we can transform this model into a proper star schema model where we have one fact table all the numbers are at the same level or granularity and we have many to one relationships to all of our dimension tables then we can build whatever reports we want with the measures we want in our reports and then use any of the conditions criterias or filters from the dimension tables so our goal use power query as we're importing all of these tables to transform them into star schema and then dump it into the data model now I happen to have the tables here in Excel if we go look at the power pivot data model over here in diagram view that's our starting point now for this video I happen to have the data model here in Excel but you easily could have this in an SQL database and text files online wherever your starting point is it's easy and power query to connect so for us we're going to connect to this excel data model now Before we jump over to power bi desktop and import this let's just think about what we have to do we want to bring two numbers shipping cost an invoice discount from the invoice level over to the line level so we'll have to allocate those will first calculate a percentage invoice discount rate and then use that to allocate the discount down to each line then we'll figure out how to allocate shipping cost based on weight and then use that to allocate shipping down to each line then we're gonna have to make sure and pull sales rep ID and date over from this side to the line side so that then we can connect on this many side over to the two dimension tables over here so it'll be a two-step process allocate the two numbers then bring the many date and sales rep IDs over and connect these two dimension tables then we'll have everything we want all of the numbers at the correct line level or granularity and our dimension tables surrounded in a star schema now I'm going to close this now I opened up a blank power bi desktop model we can see up here it's not named for those of you that have not used power bi desktop it is a free download just download it now I want to save this so I'm going to hit the f12 key to open up save as I'm going to save it to the desktop call it something like that that'll be the final name if you want to download this final file now over here in power bi desktop where as power query it's the home external data this get data button this is actually over in Excel dad our ribbon tap get and transform you can connect to whatever data source you want before you do all these power query steps to transform we're gonna connect to Excel so I click that I find the file I have it on the desktop the file you can download this the link is below the video double click navigator opens up now this is looking at a data model so we see that little line above each table these are their tables in the data model those are Excel sheets I'm going to select each one and we do not want to click load we want to click Edit because we want to transform it before it gets dumped into the data model so I click edit here's our five tables now we're going to start at the F line item invoice detail table and actually most of the steps we see in this video we actually have seen the last few videos but we're gonna have to have some extra steps to make sure we end up in that star schema all right the very first thing we need to do is we want to calculate line item sales so I'm going to click on quantity hold control click on unit price up to add column over two from number standard and I'm going to multiply I can see the table dot add columns it's important to realize that table dot add column function is power query M code that's a power query function not a Dax function there's four arguments here I want to edit the second one so I double click multiplication and this will be called sales now in the last few videos I called this line sales because we wanted to distinguish between invoice and line sales because we had lots of steps but this is going to be part of our final data dump and I don't want it to say line sales I just wanted to say sales and there it is it calculated the line sales now we need to look up weight over in the product table we go up to home over to merge merge queries here's the product this is the many side we're gonna select D product this is the one side we're gonna do a left outer join to bring in all of the proper weights for each line click OK we want to expand because the only thing we want uncheck that uncheck everything is wait click OK now that we have weight we're gonna click on that hold ctrl click on quantity back up to add columns and we'll use multiply double click and we're going to call this line weight and enter now our next step is we need to add each line sale and line weight that means for each invoice we're gonna have to add so for one two five four seven seven we're gonna have to add sales and line weight now once we have the total sales for an invoice and total weight for an invoice we can use those numbers as the denominator to create an allocation rate to allocate the shipping and discount that means we're gonna have to come over to the invoice number column and use group buy to make an aggregation on line sales and line weight and guess what we're gonna need the line sales and the line weight and the product column later in our query so we're also when we group I gonna group the actual rows or records so we'll have three group buys aggregate some aggregate sum and please take all of the records and save them for later calculations so I come over to invoice number right-click group by advanced down to new column and the first column named invoice sales will sum and will add up the sales which is the line sales and aggregation this will be called invoice wait some we want line wait and now add aggregation and this will be something like invoice records and we're gonna take all the rows click OK for each invoice we now have exactly what we want invoice sales wait and if I click off to the side not on the yellow table there's the record saved up where I can expand this column later in the query and use these columns now this table is currently sitting at the header or invoice level we're saving these up later because later we're gonna have to go back to for the final fact table back to the line level but now we need to join this header invoice level table to the header table because we need to pull the two numbers at the header our invoice level and the D and sales rep ID on the many side so we can connect later to sales rep and date so I go up to home over to merge merge here's our invoice number that's gonna connect to our header table invoice number this is in essence a one-to-one it's going to pull one record for each one record left out our joint is fine click OK now we have two tables this one we're saving up for later but down here we can see date sales reps shipping cost and invoice so I click to expand uncheck invoice number we only need date sales rep and these two numbers click OK date and sales rep ID these are the foreign keys or the many side we'll use those later to connect to our dimension tables now look at this we have if we click on invoice discount that's going to be used in the numerator I hold ctrl click as the second click on invoice sales I want to calculate invoice percentage discount add column standard and we're going to divide because we clicked on invoice discount first it's the numerator double click then the second argument of table dot add column or we're going to call this invoice percent discount and enter now if we scroll over here that percentage can be used that's how many pennies for every one dollar sales was given as a discount so we can use that later when we expand back to the line level and we'll use that on each line the weight however if I click over on invoice records notice if we're going to allocate this shipping cost of 98 7 for this invoice on this row if we're gonna allocate that we're going to use 743 in the denominator and then each one of the weights a different weight for each line in the numerator so we'll have a different percentage for each line and the total percentages will add up to of course a hundred percent and that will be what we use to allocate shipping cost all right so we're ready to expand we're gonna click the expand' uncheck everything we definitely need line weight sales and we're going to need product that'll be our foreign key on the many side that will connect us to the product dimension table now we have to choose we might want quantity if we're gonna analyze that and by the way this unit price sometimes unit price is stored as a fact which it is here because it changes so often if it didn't change a lot then we'd store it over in the product table now if you're gonna analyze this like the max or min or something like that you might want to bring that in we're not gonna bring this one in and we're not going to bring weight either because we don't need that at all now I click OK and if we scroll over we can see there we're back to the line item level now product will actually use as a foreign key to connect to the product dimension table quantity and sales will be in our final fact table we I want to analyze quantity we're definitely going to use sales and a calculation in just a moment and we'll use it as a final column weight will only be used in a calculation before we get to the final fact table now I want to change the data types here so I'm gonna click ABC icon for data type this is going to be text notice it added a line here we'll do the same thing for quantity as this whole number sales this will be fixed decimal or currency notice they all got consolidated into one step we can see up here in the table not transform column types product quantity and sales with the correct data type now one advantage of doing this step right here is sales we're actually going to use it in a subsequent multiplication calculation and because this is currency table dot add column will correctly apply currency when we calculate discount so yes the next two steps we need to allocate shipping an invoice discount all right let's do invoice discount that simply is invoice percentage discount hold control over to sales add column standard drop down multiply we can see table down add columns they got currency at the end we definitely want to double click and this will be a final column called discount and enter if I scroll over there we have allocated one to three lines in the line item table to split up that hundred and forty four now we need to add a custom column to allocate our shipping cost so we're gonna go up to add columns custom columns will call this shipping and our formula actually I forgot to round will go back and around here we're gonna around so we're using number dot round open parenthesis in our calculation first is the allocation rate so I take line wait double-click there it is that's the numerator divided by invoice weight that's the denominator that will give us the proper percentage for each line multiply it by shipping cost which of course got properly repeated three times when we expanded the records for each invoice now comma two because we want to round to the penny closed parenthesis now when I click OK we can look up in the formula bar there's table dot add column there's the previous step that this line acted on the name there's each calculation for each row but it left the type out so at the very end I'm going to just type it in comma and I notice from the last step we should have looked at it but when we change the type for currency its currency dot type now hopefully I spelled it right and got the big C and the big T and I hit enter there we are rounded with the correct data type now let's come back to insert multiplication right here which is really just the discount I forgot I wanted to round so I'm going to edit table dot add columns after the each number dot round open parentheses comma to close parenthesis and enter one difference between doing this over here in power query is that round function actually does bankers rounding or halfway even rounding which is more accurate over large columns of numbers then is the round function inside of Excel and DACs so if we did the rounding over and Dax and compared it to this column here in power query we be off by a few pennies maybe even up to a dollar all right we click right at the bottom we can see we have one two our ala cated shipping and discount for each line now we want to only have some of these columns in the final fact table now invoice number you may or may not want that I'm going to leave it invoice sales we definitely don't want we don't want invoice wait product they do so I'm holding troll clicking on product quantity sales don't want line date sales rep ID still holding ctrl still holding ctrl I don't want any of these but I do want discount and shipping any one of these columns right click remove other columns click and there is our final line item fact table we went from to fact tables down to a single table and allocated discount and shipping and calculated the sales now I don't need when I load this into the data model I don't want this invoice header to be loaded so I'm gonna come to the left right click and I'm gonna uncheck enable load that way that little italic right there means that was just used as part of this power query transformation now here's the moment of truth we can come up and close and apply we can see the four tables are being loaded and if we go over to relationships view there is our almost correct model there's our one fact table there's sales rep there's product those ones were picked up correctly product and sales rep but this one we're gonna drag date over to date and there we have one two three dimension tables and we might have more also and our one fact table now I want to go and add some measures before we create a visualization selecting line item table model a new measure up in the formula bar total sales equal sum and we're gonna do the F down arrow to sales tab close parenthesis in inter modeling we're gonna add and attach some number formatting so whenever we use this measure number formatting will be applied we can see over here in our list of items in this table that icon that's a little calculator means measure those Sigma's are different than power pivot does not mean a measure it just means that column has numbers in it now we're not going to use quantity in our report if you wanted to add it we'd add another sum here but we definitely are going to add discount and our shippings so I come up to modeling new measure so our total discount sum and we're gonna add up there it is our discount close parenthesis enter add some number formatting a new measure total shipping equal sum F and then down arrow to shipping tab close parentheses and enter add some number formatting now I'd actually like to compare discount and shipping to sales so we're gonna add two more measures which basically divide each of those items against sales so something like percentage discount on sales and I'm going to use the divide function in the numerator now we've created measures for our aggregation so I type a square bracket so total discount I see it there so tab comma denominator square bracket total sales tab close parenthesis and enter new measure something like percent shipping on sales divide square bracket I'm going to get total shipping as a measure comma square bracket and there's total sales tab close parenthesis and enter now I definitely want number formatting for both of these so I'm going to click the percentage and over here in power bi desktop it'll add to decimals by default click over on percentage discount add the same percentage number formatting now I'm going to come over to line item sales and right click date and point to hide and report view I do not want to use that field same with discount hide and report view for product to quantity sales sales rep ID I wish I knew how to hide all of these except for the measures right click hide and report view someone please post a comment below about how to highlight all of those and just right-click or where it is up in the ribbons or wherever because doing it one at a time is a hassle now we have built our model if we go over here star schema now we can use these measures and any one of these dimension tables conditions criterias filters will work on our measures now we go over to report view and we want to make a simple visualization now here's our tables we can expand and notice that we only see our measures in our line item fact table and then here are each one of the dimension tables now we're going to select table and I'm going to pull a condition or criteria from product selecting product we see a unique list there now I'm gonna select with the checkbox total sales total shipping percent shipping total discount and percent discount we can come over and pull the edge here's our simple visualization we're gonna come up and there's the paintbrush for painting the wall of your house or adding formatting to our visualization table style I'm going to click the drop down and select flashy rows the grid we're gonna increase the font maybe to 12 pull down the edge here some more formatting perhaps you can format it high however you'd like column headers background color I'm gonna pick some dark green and then font white alright now we want to surround it with some slicers so off to the side I click on the white and I click on the slicer let's go to the date all I want is year I do not want that slider come over to the paintbrush slider off click and drive now you would think that over here somewhere it would be the ability to turn this into a list but sometimes you have to hunt around especially SXL people if you click the drop-down oh there are some options and list is the one I want that looks like a little slicer I'm gonna drag year over here click in the white slicer from product I want to see manufacturer and you could format these however you'd like and one more slicer down to the sales rep sales rep Chuck this is a big list I'm going to drag it over here click and drag and now here are all the measures from our one fact table I can select the year instantly that works select the particular manufacturer and some employees if I hold ctrl those are the employees on the particular team I want to see 2018 19 gel boomerangs and our one fact table with all of our measures can be sliced and diced with any of the dimension tables alright so in this video we saw how to go from two fact tables into this star schema we learned all about and we can actually access and edit the power query transformation at any time clicking edit queries over here we did all of these steps transforming from two fact tables and dimension tables into our star schema then we created some measures and built a simple visualization alright that was a lot of fun with going from to fact tables into star schema using power query inside of power bi desktop and be sure if you like that video to click that thumbs up leave a comment and some because there's always lots more videos to come from excel as fun alright we'll see you next video [Music]
Info
Channel: ExcelIsFun
Views: 29,310
Rating: 4.9307361 out of 5
Keywords: Excel, Excel 2016, Highline College, Mike Girvin, excelisfun, Excel Magic Trick, Power BI, Power BI Desktop, Star Schema, Two Fact Tables, Excel Magic Trick 1498, Transform Two Fact Tables to Star Schema, Power Query in Power BI Desktop, Power Query Transformation, Data Model Star Schema, Convert Two Fact Tables To Star Schema, Invoice Header, Invoice Detail, Invoice Line Item, Power Query Power Pivot Data Model, Power Query Power BI Data Model
Id: iK0uKo2G8tA
Channel Id: undefined
Length: 26min 10sec (1570 seconds)
Published: Sun May 27 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.