Advanced Group By Tricks in Power Query

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello people welcome back to the channel in today's video i'm going to talk about the group by feature in power query i'm going to talk about some interesting and some advanced applications of the group by feature if you already know the group by feature we'll try to take your knowledge to the next level and try to solve some interesting and some sophisticated problems let's start all right quick introduction in the video later this month i am actually going to be doing a live training session on power bi and we're going to be focusing on the hard parts of power bi we're going to be focusing on power query data modeling and dax if you've recently started with power bi and you have struggled to learn data modeling power query and dax in a structured way and you need my help you've watched the videos you've liked the way that i teach and you'd like to enroll for the session this is going to be a great learning opportunity note that this is going to be a live training session and if you're the type of person who would like to sit along with the trainer and you'd like to get your doubts sorted um you know get your doubts clarified as you move along in the training get some assignments right after the training session it's going to be a phenomenal learning opportunity now just two more things there is definitely going to be a last date for the training because we are starting later this month so check that out and also uh there are 15 seats available and of which five seats have been filled up so in case you want a spot for yourself do not wait and just fix up a spot for yourself that's all about it and uh we'll just get started with this video all right fellas i'm an excel power query that's where i have loaded the data i have already done a few transformations on the data i've also used group by i'm assuming that you already have the working understanding of how the group by feature works i'm going to build on top of that and share one or two interesting and advanced applications using the formula editor in power query please take a look so i go over to the source step and i take a look that i have four columns the year the sales rep the customer and the sales pretty standard nothing complicated there's a change type step nothing complicated again and we group the data using the group by feature and we say that i would like to summarize the data by the sales rep by the customer and i would like to show my total sales against every single sales rep every single customer that's pretty standard nothing that complicated now if i just happen to sort the sales rep just for a quick second i'm gonna see that appears entry is repeated multiple number of times because abhay has sold the products to all of these customers that are mentioned right here now i'm trying to do something different right so i'm trying to capture all the customers for away in a single cell separated by a comma i'm not really interested in the total sales all that i would want is sales rep after sales rep after sales rep names of all the sales reps and in the next column i would like to take a look at all the customers being concatenated uh with a comma in a single cell that's what i would want how do we do such a thing in power query let's just start all right let's just delete a few junk steps that we have done start to work with the user interface and some work with the formula editor as well so i'm going to get rid of the sorted row step and i'm going to start to work with the group by step that we have already created if i go over to the gear icon right here i'm gonna see that i have two steps right here the sales rep and the customer which is uh the two categories by which i have grouped my data and i can also see that i have a total sales which is nothing but a mathematical calculation of summing up the sales column this is not what i need i instead want a list of all the sales rep that i have against which i would want to concatenate now the problem is that in this particular operation box there is no concatenation option which is available it just does the mathematical operations some count average that sort of stuff you can't really concatenate anything and then separate them out with the comma with the user interface so what i'm going to do is start to play with this a bit and then for further ahead i'm going to use the formula editor the first thing that i would want is that since i just want the name of the sales rep i'm going to get rid of the customer from here so get rid of the customer from here and this is where i'm wanting to have the customer so against the sales rep i would want to have the names of the customers concatenated so for now let's just at least call this as a customer column and we'll do some work later and i certainly don't want to sum i would want to get all rows of the matching customers and say okay now if you haven't really worked with the all rows you will see that what does the all rows operation do i'm going to click on ok this just produces a two column there table first is the sales rep obviously you understand that and next is table table table these tables are all the matching records for the particular sales tab so if i peek into the table i'm going to see that since we have sales at persha here these are all the matching records for russia now from this particular table that we have been able to get using the user interface i would want to extract the names of the customers from this column and remove the duplicates and combine them using some concatenate function and separate them without with the comma how do we do such a thing now i'm going to start to make changes to my table.group function which power query already created for me once i was doing this particular step now once i'm working with the table dot group function i will also try to explain to you the syntax how this works and what changes can you make in this formula so let's just start so table dot group the first part right here is the name of the table and if you take a look at change type it is nothing but the table on which you're trying to initiate the grouping operation change type is nothing but the step which was previous which was giving us the table on which we were trying to build our grouping next is nothing but the names of all the columns that you would want to summarize your data with so i just wanted to summarize my data with sales rep in case i wanted to summarize my data by more number of columns i would have written those columns right here remember that these columns need to appear in a list format and hence you see the curly brackets right here okay now and after that you write all the calculations that you would want to do on multiple columns perhaps but in a form of a list of a list so if you take a look uh after this you can see that i'm trying to create a customer column and there is some piece of code right here but there are two curly brackets so all the columns that i would want to create in terms of calculations or summarizations or whatever that might be they need to be in a form of a list of a list so every single column that you create is going to be a list and all of those lists are going to be packed in an outside list we'll see once we create more columns okay so for now i'm just going to get rid of all of this data type settings so that you have type table number and things like that these are not really the compulsory inputs we can just get rid of these things so for now i'm just going to get rid of all of this and just keep the operation that we have done so now if you take a look at this particular simple function right here i have created a customer column which is where i'm keeping the table as it is which is nothing but the underscore uh right here in this context now what do i do with this table from this table what i would like to do is extract all the customers right here now the underscore is giving me a table from the table i would want to extract the customer how do i do that so i'm just going to open up the square bracket right here and start to write the name of the column that i would want to extract from the table so if you have a table uh and if you would want to extract any particular column of the table you can just write the name of the column in the square brackets and you would extract the column in a form of a list now if you take a look at this list these lists are nothing but all the customers that belonged to our shop pretty sweet now what do i want to do i don't really want to have duplicated customers right here i don't really want to have that i want to have unique customers perhaps so i can remove the duplicates now this produces a list and i have to work with a list so there happens to be a function called list dot distinct i believe and list.distinct function asks you hey could you give me a list from which i can give you all the duplicates removed and a unique values sure enough this actually produces a list the underscore produces a table the customer column produces a list and this when fed inside the list are distinct is going to give you all the unique customers and the duplicates have been removed now what do i want to do with this list i want to combine all the values and put them out in a single cell how do we do that i'm going to write something like text dot combine i want to combine all of these text first text second text third text fourth text and if you take a look at text dot combine you can see that it is asking you that hey can you give me multiple texts that you would want to combine but please provide those texts in a form of a list sure enough we have lists right here and it is also asking you for a separator which is nothing but a comma a pipe symbol anything that you would want to have so i'm going to say hey here is my list and please take this list and separate every item with a simple comma and a space and i'm going to close the parenthesis right here come into the formula press enter and voila what do we have we have all the customers right here in combined in a single cell separated by a comma and this is something which is not available in the user interface of group i but you can certainly go ahead and tweak the formula and carry out more advanced operation all right ready for the second one let's go all right let's just continue the problem further and try to solve for another interesting problem that we cannot solve it using the user interface let's just say that we have been able to create these two columns sales rep and the customer the next column that i would want to create is i would want to find the year in which the sales representative got the maximum sales i repeat myself once again i don't really want to find the maximum sales of the sales representative that is something that we can easily get in the group by future but instead i would want to get the year in which the sales representative got the maximum sales how do we do that so what i'm going to do is i'm going to start to write more code in this advanced editor and hopefully you'll get the understanding of how you do that now the first thing that that we have to kind of keep in mind is that every single column that we create needs to come in the form of a list as of now we just have one column created which is nothing but the customer column and as soon as we are creating one more column right here that also needs to come in a form of a list so let's just do that i'm just going to move this list down so that you can see it clearly let's just format the code a bit and now what i'm going to do is i'm going to put a comma and i'm going to create a second list on which i'm trying to work a list starts with a curly bracket and let's just first write the name of the column that we would want so i'm going to call this column as max sales year the next thing that i would want to have in this particular column is the entire table once again i just want to take a look at it get comfortable and then start to write the code so i'm going to write something like each underscore i saw that that underscore actually gives me the entire table that matches all the records for versha that's what we saw in the last time so let's just commit to this formula and let's just see what does give so i'm going to press enter i get that new column which is max sales here there is a table right here which is nothing but the table for the entire data for version in the entire data set now i want to find out the max sales once i do find out the max sales from this table i want to get the corresponding year and then put that here right here as a text value how do we do such a thing now let's just start to work in the same approach that we have worked in the past so the one thing that i understand is sure enough i would want to extract the sales column because that's where i would want to do the mathematical operation of finding out the max so i know that if i am getting the table within underscore if i happen to write the name of the column in the square bracket i'm gonna get a list and what will the list have all the values of the sales right here but in a single column as a list now if i commit to this particular thing that i have written right here i get a list and i get all the sales values which belong to varsha but the problem with the list is that as soon as you committed to the list you have lost the year column that you would also would want to have if you want to fetch the corresponding year so this approach certainly is not going to work because i want to have the max sales but i also want to keep the corresponding year so i would want to do something else so i would rather not write this i would rather write something like table.max in the table.max function the first part is hey which table are you trying to find the max for and i'm saying that hey here is a table underscore actually gives me a table and for all of those tables i'm trying to find a max now you can't really find a max for the entire table you have to find a max for one of the columns of the table okay sure enough so i'm going to say that in this the underscore gives me a table and the column for which i'm trying to find the max for is let's say the sales column so i'm just going to maybe write the sales column in the uh in the inverted commas and i'm just going to close the bracket this is a list everything is good i'm just going to press enter and we get a record that means that what we have been able to get is the record or the row item which is where the sales value was the largest the benefit of getting the record is that the record is going to contain both the values it's going to contain the sales value which is the maximum value and the corresponding all the columns that we have right here not only do we have the sales rep and the customer but we also have the year and from this record what we want we want to extract the year right here okay so what do we do i go over to my function right here and i'm gonna say that this this little thing that i have written produces a record from the record what do i want i want the year column how do i write that i'm again going to just write the square bracket and write the ear and that's it if you extract one column of a record record simply means one row of data what you're going to get is one single value i commit and that is the max sales year that we have been able to receive now this is a conditional lookup so find one value and then get the corresponding value which you're able to get using the group by and a bit of tweaking around in the formula editor that is not available in the user interface all right that's been it uh in the end i'd like to give a big shout about my upcoming live power bi training program that i'm doing for a bunch of 15 folks only if you haven't checked it out i'm gonna give a link in the description below and you should definitely take a look at that particular program it's a live program in case you're the type of person who would like to sit along with the instructor the trainer and then get your doubts sorted get clarified get homework and get hands-on you know solving real-time problems in power query dax and data modeling it's going to be a phenomenal learning opportunity not from just the perspective of learning but also meeting some great people around the world that are going to join in the training i have a few spots left yet please do sign up for the training in case you are interested and of course if you have any questions around the topic that we discussed today feel free to drop in a comment and i'll be glad to reply thanks so much for sticking around all this while and i will catch you guys in the next one [Music] you
Info
Channel: Goodly
Views: 78,435
Rating: undefined out of 5
Keywords: Power BI, Power Query, Excel, DAX, M Formulas
Id: jLpgt-wptH4
Channel Id: undefined
Length: 14min 36sec (876 seconds)
Published: Wed Jun 01 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.