The How and Why of Power BI Aggregations

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
yo what's up it's Patrick from Guyana cute and in this video I'm gonna talk about why do you want to use aggregations where do you create aggregations and how do you create them stay tuned okay so aggregations been out for a while a lot of people are using them a lot of people are not using them because they just don't quite understand when when do I use aggregations where do I create them and how do I create them so I decided to do some videos to kind of demystify this whole concept of aggregations all right and there's a lot of blog blog posts out there shopping on Watson has written the blog posts Kasper de Jong Phil C mark there's tons of videos I like I mean blog post Kasper wrote a great one about why right you probably should start there then move on to the other blog posts alright and so I'm gonna start I'm gonna start talking about why you should create aggregations right why you should use it if you have large volumes of data hundreds of millions billions of rows and I'm sure you've seen some of you have seen Christian waves demonstrations that clicky click you drag it drop it a demonstration that he does over so much data right you have lots of data can help you unlock the insights right you can start doing analytics over these large data sets but what aggregations also do is they also help speed up your refresh because instead of trying to refresh hundreds of millions of billions of rows you refresh a smaller data set or write a smaller amount of data then they can also help reduce the size of your models right so keep them small keep those models at a smaller size alright so spring clear reasons and the last reason that a lot of people don't think about I was just talking to a friend buddy of mine Josh Kaplan about this and it's if you're working with a data set today it may be 50 million rows it may be 10 million rows but that data set is gonna grow right he's gonna get larger and larger and larger your refreshes are gonna start slowing down maybe even your queries or your query slow down a little bit and your model grows right if you use aggregation it's like a proactive step to have mitigate against potential performance problems with refreshes and just overall query dry and a lot of people don't even think about that all right so that's why you should use them where do you create them well before you think about creating mean you think about at what grain do you want this aggregation and this is really critical because you need to know make sure that the grain that you select is going to accommodate the queries that are generated by yours by those visuals and we'll talk a lot more about that you know in some subsequent videos all right but today we're just talking about where do you create them so the first thing you need to think about is at what grain right at what grain maybe you want to look ensure that you're aggregating down to the product or to the territory or to the date a date is a really common one and that's the one I'm going to use in this video that's the scenario I'm going to use in this video so let's pretend let's assume that the only aggregations we care about at the date level right the grain is at the day level but I have a date table that has you know month year and quarter and so if I establish a relationship early enough of all this talking you guys know I like to do is head over to my laptop alright so let's take a look at this so I have a table here in my day 2 warehouse that has a hundred and twenty eight million rows right on my fat online sales it's using a control so DW database I just kind of exploded it out all right and I decided that I only want to create visuals that are associated that go down the the lowest grain is at the date level alright and so what you would do is to create your aggregation so I'm gonna write a query that aggregates down to the date level so I'm saying give me the date key and then the measures right that I'm gonna create are gonna be based on sales amount quantity and a count of the number of rows in the table all right so I'm gonna say give me this sum this sum this I really don't need this I was thinking about that you know just a little bit ago because everything I need is contained in that one table and then grouped by the date right so instead of having 128 million rows when I run my refresh and I run this query against my synapse server not my data warehouse anymore query shouldn't take that long to run and then it shouldn't take that long to refresh but what's gonna be great about this is look how many rows right only four thousand five hundred and eighty two rows so I went from 128 million rows to around for four thousand rows four or five thousand roubles are you kidding me so instead of running the Refresh to pull all that I only have to pull four thousand five hundred and eighty-two rows so now I've identified migraine now I needed to decide how I'm gonna create these aggregations the first way is you can persist the table to your database of data warehouse using some ETL process maybe you're using the SSIS maybe using data factors are some of the ETL tool right and so you can use this in synapse you can do a create table ass and then just it'll persist all that data to a table for me and maybe I'll do a drop table before and then I run my create table at right you can persist it to your database which is easy then I can just use import and import it you can create a view really simple right write a query create a view these two or if you have access to the database right if you have access to the database or you know the database developer or the data DBA you can say hey can you create an aggregation for me as part of the ETL process or add a view to the data warehouse that looks like this alright but what Patrick I am access to the database well if you don't have access to the database then guess what you got two choices you got two choices what you can do you can write your query up the way you want it aggregate it go over to your database go over to power bi desktop sorry connect to that database and then use native query remember if you use native query no query folding is gonna happen so make sure you do all your transformations in that native query if you haven't watched the query folding video I did go take a look all right okay you can use that you can do a native query or you can use power query and so what you would do is the first thing you do is only select the columns that you need to build that aggregation so click Choose columns uncheck all I need is the date key the sales quantity and the sales amount and click OK right those are the two columns I need let's check let's see if query folding is taken in the fact right so there you go right view native query there is my query folding happening and then what you'll do is you use the group by function so I'm gonna say group by this date key group by my date key and then I'm gonna have fact online sales count because that's the one I want right and then I'm gonna click advanced and I'm gonna say add an aggregation and I'm gonna say sales amount sum right I'm gonna choose sum and I'm gonna choose sales amount then add one more aggregation because I need to work with both quantity and sells them out and choose sum and then change this to quantity right so now I'm aggravated this down to the grain of the date key and if I click OK right it's gonna take a little bit for it to go out and run the query to pull my data back just be patient just be patient I'm gonna wait BAM and just like that there it is right it's aggregated down for each individual day and it's giving me the count and the sum for each one of those values he's it folding back let's see right click here view native query and the query that it generates it's not as clean as the one that I would have written in T sequel myself but it's essentially doing the exact same thing all right and now if we click close and apply let's see how many rows is this gonna pull back right let's just take a peek and do a refresh and let's see how many rows this is gonna pull back 4580 - if you paid attention fast enough it was four thousand five hundred eighty two which is equal the same number of roles that I ran when I ran there with that rich will return when I ran that query what this is bananas right so you can create it you can persist it as part of your ETL you can create a view you can use a native query or you can use power query alright this is just getting you started priming you up with aggregations in subsequent videos I'm going to show you how to configure them and I'm gonna talk about different other options when you're using aggregations all right what do you guys think are you using aggregations today you have any questions comments let's continue the conversation we're in the comments below it's your first time visiting a guy in a cube channel hit that subscribe button and if you like my video give me a big thumbs up as always for mattering myself thanks for watching we'll see you in the next video
Info
Channel: Guy in a Cube
Views: 53,992
Rating: 4.9656816 out of 5
Keywords: power bi aggregations, power bi, power bi aggregate table, power bi desktop, power bi for beginners, power bi group by, power bi group by month, power bi summary table, power bi training, advanced power bi, azure synapse analytics power bi, azure synapse power bi, dax power bi, how to learn power bi, learn power bi, learning power bi, azure synapse, azure synapse analytics, aggs
Id: EhGF372t0sU
Channel Id: undefined
Length: 9min 1sec (541 seconds)
Published: Wed Mar 04 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.