4 METHODS to Filter by a List in Power Query | Excel Off The Grid

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
filtering is a common task in power query we might filter by a single value or maybe a limited number of known values however whenever we do that it always hard codes the value into the M code which means if we want to change that filter we've got to go back into the query and edit the query but what if you want to be more dynamic and what if we want our filter to be based on a list of items how can we do that well that's what we're looking at in this video we're looking at four methods that we can use to filter power query with a list so if you're ready let's get started for all of the examples in this video we're using the following data set we have a table here from cells A1 to E17 and then we have a list this is the list that we want to filter the item column by then finally for another example we have a column list and we'll look at how we can filter The Columns that we have in our output now for the example file I've loaded all of these into Power query already but if you wanted to do this you would just click inside your table go to data and then click from table slash range if you did that on all the tables that would load them into Power query so once you've done that let's head over into our first example so here we are in power query and let's work through this first example of seeing how we can filter by a list to start with I'm going to click on the item column and then I'm going to filter this so that it includes Alpha and Charlie and then I'll click ok now let's have a look what's happening in the formula bar so we're using a function called table dot select rows then the previous step is change the type so that's the table that we're working with and then we have H so this is to show that what's happening is going to happen for every single Row in our table and then we're filtering where the item equals alpha or the item equals Charlie and this is what we see in the preview window below What's Happening Here is actually it's a true or false calculation so if we just say that each is true click away it will return everything in that list if we say that each is false it returns nothing in that list so that means that as long as that section of the function evaluates to true or false we can determine which rows are returned well let's start by creating a list here I have my filter list query I'm going to select that query and go to transform and convert to list the alternative option is to right click on the header and then go to drill down that has now converted this query into a list and you can see that in the icon in the queries pane okay now back to our data and to create our true or false result I'm going to use the list dot contain function list dot contains then Open Bracket we can see the arguments there the first argument is a list then the second item is the value and then the third item is the equation criteria our list is our filter list that we just created the value is whatever value we have in the item column so let's get the item column now the equation criteria is optional so we'll come back to that in a few moments time there you go that query has now run and it's filtering to show the records that have either alpha or Charlie as their values currently our filter list has alpha or Charlie and they both have uppercase characters at the start by default power query is case sensitive so if we wanted to include uppercase and lowercase characters this is where the equation criteria comes in now there's lots of options around the equation criteria such as how we deal with various country specific special characters but for this we're just going to think about uppercase and lowercase so if we want to treat uppercase and lowercase characters as the same we can use the comparer ordinal ignore case in that case uppercase and lowercase letters would both be treated exactly the same we don't have that issue in our example so let's just delete that okay let's test this out so from home close and load now select close and load I've already got this loaded as a connection only so I right click on that go to load 2. and then let's load this as a table on a new worksheet click ok so now if we change our list so it includes Delta and Charlie you can come to the result of our query go to data and click refresh all and that query Now updates to just show Delta and Charlie and equally if we have more items in our list they will also work with our data set so there we have Charlie Alpha and Delta all included in our filter okay let's just head back over into Power query here we've drilled down into our filter list but what if we don't want to drill down into this list but we want to leave it as a table in that scenario we can drill down inside the list dot contains function so filter list there is the name of our table in square brackets we can put the name of the column that we want to use and that column in this scenario is also called filter list we'll click away oh actually I need a space in there filter space list fantastic there we go so now we've got our filter list as a table but we've drilled down into that list inside our table dot select rows transformation if we want the list of items inside the filter list to be the items that are excluded we can use the word not so in there we've got where each is not the list dot contains so we click away you should just see the items that are listed as Bravo so that's how we can use list.contain to filter by a list if we have a reasonably simple filtering scenario we don't have to mess around with the m code at all let me just delete this filtered row step instead we can use the merge transformation so from the home ribbon I'll click merge queries the first table is the data table and then the second table I can set that as the filter list I'll select the item column in the first table and the filter list column in the second table and then change the join kind to a right outer and I'll click ok so that now filters so that it only includes the item that were contained in that list so you'll see that Bravo is not included in that item column we don't need this filter list anymore so we can simply delete that column so you might be wondering which method should you choose list.contains or merge well it depends if you've got a reasonably simple scenario merge is significantly faster on a hundred thousand rows filtering by two values when I tested it it turns out that list.contained took 18.2 seconds but using the merge method took 1.6 seconds so it's over 10 times faster but if you're using merge it does mean you can't create any more complex filtering scenarios but you can only filter by The Columns that are there so it depends ultimately merge is significantly faster but might not be complex enough for your scenario so we've looked at how we can filter rows by a list but what if we want to filter columns by a list so filter the output of our query here in our colonist query you can see we have the word date item region and value and here in our data query we have date item model region and value so let's say that we want to filter to only show the columns that are contained in our column list how can we do that well let's start by selecting some columns and then from the home ribbon you can go to remove columns remove other columns in the formula bar you'll see that this retains a list of the columns that we want to keep So based on what we've seen so far we can call our column list query and the column name in our column list it's column space list there we go I'll click away and now we just see date item region and value let's close and load that into Excel and let's change our column list tables we've got date let's say we then want model then region then value come back to our query click refresh and as you can see those columns then update and they also update in the same order as the list that we provided to it now I would recommend performing this as one of the steps towards the end of your query because any other steps that are based on column names that no longer exist will cause an error so we want to make sure that we maintain the common names until the last moment when we can then filter down to just show The Columns that we want for our final example let's suggest we want to return the rows but only the rows for the three models that have the highest values so first we need to group by model work out what the top three models are and then only return the rows for those models now over time as our data changes so will the list that we want to filter by so this needs to be completely Dynamic I'm going to start by duplicating my data query I'll call this dynamic and then I'm going to remove the previous steps let's begin by calculating the top three items so I'll select the model column then from the home ribbon I'll click Group by our new column name I'll call that total the operation will be a sum and the column will be based on the value then I'll click ok to show the summarized values then with the total column selected I'll sort these by clicking z2a so we now have a descending order for this example we said we only wanted to keep the top three so we'll go to keep rows keep top rows select three and then click ok then from the model column go to transform and click convert to list so that gives me a list of the top three models so in our applied steps change type is the table that existed before we started the group and find out what the top three models were and model is the step that shows us what those three models are so using these two pieces of information we can write a formula that will filter to just show the top three models I'll click on the FX icon and now let's write our formula so if you remember the formula to filter a list was table dot select rows I've just been caught out by the by the bug there that repeats the word table then I'll enter an opening bracket the step I want to use is to change the type step then we use that each keyword to indicate that we want to do this for every single Row in our table and then let's use the list dot contained and the step name for our list was model so that was the previous step let's model and then the name of the column from our change to type step was also called Model we put that into square brackets to indicate that it is the column fantastic we've got two close brackets and click away and now our model column shows only Picasso Rembrandt and Picasso Pro because they were the three top models well that's it that's how we can filter by a list in power query we've seen that we can use list.contained we can use the merge transformation we can filter column names and we can also create a dynamic filter now if you want to take your power query skills to the next level why not check out our training academy over at exceloffthegrid.com forward slash Academy where we've got a complete power query course to help you take your skills to the next level thanks for watching and I'll catch you next time
Info
Channel: Excel Off The Grid
Views: 30,485
Rating: undefined out of 5
Keywords:
Id: B4i8XHkcAfU
Channel Id: undefined
Length: 13min 13sec (793 seconds)
Published: Thu Apr 06 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.