They're coming, and they're here to stay!
Two new functions called GROUPBY and PIVOTBY that are going to change the way you write your
formulas. These are functions for everyone, so make sure you watch this video,
no matter what your Excel level is, so you're prepared once they
show up in your spreadsheet. So, imagine you get to the office, and you
start to enjoy your cup of coffee. Then, the boss comes along and says, "Hey, I have
this Excel table. It's called TSALES. I want you to quickly give me the total sales for
each sales manager by division right now." You take a look at this and you think, "Well, I
could create a pivot table." If you know how to create a pivot table, if not, don't worry;
I have lots of videos on the channel that can help you out. Your other option starting
now into the future is to use a function that gets this done. That function is the
new GROUPBY function, and it's super easy to use. All you need to do is type in =GROUPBY.
It only needs three mandatory arguments. The first one is the Row fields. These are
basically your categories, so the boss wanted Division and Sales Manager. I'm just going
to select these columns. The next argument is the Values. In this case, it's my sales. If I had
multiple values, so if I had Profit as well, for example, I would just select that as well. Next is the Function.
How do we want to aggregate these numbers? Take a look at this; we have a lot of pre-built functions
that we can choose from. One is the SUM function; this is probably going to be the most used one.
Then we have AVERAGE, MEDIAN, COUNT, MAX, MIN, and so on, right? Lots of things to choose
from, plus you get the ability to write your own Lambda function if you can't find a function
here that gets the job done for you. But for that, you have to be a bit more advanced, and I'm
going to show you an example in a little bit. So in this case, all we need is the SUM.
You can either type it in or select it, close the bracket, press Enter, and that's
it. You can go back and enjoy your coffee now. Another great feature of this, though, is
that we get a total automatically as part of this formula. That's the default behavior. The
total right now comes with it. As you can see, the sum of these values is this number,
which is our total value right here. Now, let's go and take a look at our optional
arguments. We know these arguments are optional because they're in square brackets. So, the
first optional argument is the ability to show the Field Headers. In my case, I didn't select
any headers, so there's nothing to show. But if I had selected it, I could select
3 here to show the field headers. Now, just to demonstrate how that looks, if I
had selected the headers as well, where... So, I'm just clicking and clicking again to make sure
I select the headers. I have to be consistent, and if I select the headers for my Row fields,
I have to also select it for my Values field, right? And I choose a 3 here to show
the field header. This is what I get: Division and Sales Manager, right?
But I could also just put in Division, Sales Manager. I could change this to
Manager or Employee or whatever I want. You can just always hardcode this, but that's
one of the optional arguments that we have. Personally, I would have preferred if this
argument comes all the way to the end or somewhere later because I find the next two
arguments much more useful than this one. But let's just move on to the next argument. This is
Total Depth. This allows us to choose whether we want to show the grand totals on top or on the
bottom and also if we want to show subtotals. So in this case, I'm showing two different
columns, right? I could select Show Grand Totals and Subtotals as well. So here, I get Health. I
get a total for Health, total for Productivity, Utility, and then a grand total. You have
the ability to reverse that as well. So if I put a minus 2, I show the grand total on
top, the subtotals on top, and so on. Let's say boss changes their mind and they say,
"Well, I also want to include Product in there as well." It's really easy to update this. All
you have to do is just remove this and select these three columns, and that's it, right? We
get the total, we get our subtotals, and so on. If at any point in time you want
to turn off the totals altogether, you can put a zero and don't show any totals.
Okay, I'm just going to press Ctrl+Z to go back. Another optional argument that you have
is the ability to decide the sort order. So let's say I want to sort everything by the
fourth column, by my sales value. So I'm just going to put 4 here. When I press Enter,
it sorts my sales values in ascending order, right? So that's the default; it's ascending
order. Obviously, it excludes the totals and subtotals. If I want to sort everything in
descending order, I have to put a minus in front of the column number. Now I get all my
categories here sorted in descending order. Now, the advantage of this over a pivot
table is that everything is dynamic. The moment something changes in
my data set, so for example, let's say Jesse Pinkman's FitTrack is right here
in my report. If it really has explosive sales, everything updates automatically in my report.
So that's one advantage. Let me show you the second advantage of this using GROUPBY over
pivot tables. That advantage is the ability to return text in your Values field, which
is something you can't do with pivot tables. So, for example, let's say boss wants the
names of the sales managers that work for each division. We could quickly use the GROUPBY
function. The Row fields is my Division, the Values fields is going to be my Sales Manager
field, and as for the Function, I have to select something that can work with text. So CONCAT is an
option, but that's just going to stick everything together. ARRAYTOTEXT is better because it's
going to separate everything nicely with a comma. So when I press Enter, I get my sales
managers separated with a comma. But notice, because my data set is detailed and sales
managers are repeated, I get the repetition here too. Optimally, in my example, I want to
get a unique version of this ARRAY TO TEXT. Now, there isn't anything inbuilt
right now that you can use, but you have the ability to write your own
function. So here, I could write a Lambda function. For the parameter that's going to
be a representative of the sales manager, it's going to be any sales manager. I'm going to
go with A for any. Then, the calculation is going to be the unique version of the sales manager
should go into ARRAYTOTEXT. Inside ARRAYTOTEXT, I'm going to go with UNIQUE of A, right? Any
sales manager. So let's just close these brackets, and when I press Enter, I get my report.
That's another advantage over pivot tables. Let me show you the third and my favorite
advantage over pivot tables. Over here, I have the same data set; it's just not formatted
as an Excel table, and my colleague has gone in, and for some reason, they've added calculations
in the middle of this data set. So here, I have Total Quarter 1, down here I have Total Quarter
2 with the sum of the cells above it, then I have Total Half Year, and so on. I can create a pivot
table based on this unless I clean this up, which I could do with Power Query. But you can also
clean up your data using the GROUPBY function. So let's say we want to get the total sales
for each sales manager. My Row fields are going to be the sales manager. Now, this is
going to be a hassle to select, so I'm just pressing Ctrl+Shift+Down Arrow until I get to the
bottom. Now, Ctrl+Backspace to jump back up. Next, I want to get the Values fields, which is
right here, but let's just type it in. Last, let's go with SUM, close bracket, press
Enter. What do we get? Those empty rows; those calculations are all going to add up
and mess up our data. We want to exclude these. We can do that by using the last argument
of the GROUPBY function, which is Filter Array. So I'm just pressing the commas or the
Excel separator just to get to the last option here. Now, we want to exclude
anything that is in the Sales Manager column that is a blank cell. To do that,
you basically treat it similarly to the Include argument in the FILTER function. So
we want to select this column, that's 313, and include anything that doesn't equal to blank,
right? An empty cell. And when I press Enter, I get my proper report, excluding all
these other calculations in the middle. If, for some reason, you couldn't do this,
if you had some random data in here, and you wanted to base your logic, let's say, on another
column, for example, to take a look at Column A, and if anything starts with the word "Total,"
I want to exclude it. Can we do that? Yes, we can. All we have to do is just
find the logic. So in this case, it would be LEFT of the value that we
have in the A column. So we're going to take a look at the entire A column,
and if the left five characters, right, so that's "Total," five characters, if they don't
equal to "Total," then they should be included. If they are equal to "Total," then they should
be excluded. And we get the same thing back; pretty neat way of excluding things you
don't want to show up in your final report. Now we've been taking a look at the SUM a lot.
You can, of course, change this and take a look at the AVERAGE sales for each person or the one
that's new and pretty cool is the PERCENTOF. So when I press Enter, I get the percentage of
each person as compared to the total. So let's just format these as a percentage, Ctrl+Shift,
and Percentage key. If I wanted to sort these so that I can easily tell who sold the most, I'm
going to go to the Sort Order argument here, and let's sort the second column in descending
order. I'm going to put a minus 2, press Enter, and we can see the salesperson that has the
highest sales as compared to everyone else. And remember, all of this is fully dynamic. So
if Christopher, for example, has a crazy sales number, and he's going to jump all the way to the
top. But if we decide to ignore it because this is the total of whatever quarter, then it's
going to be ignored from this calculation. Now it's time to take a look at how you can handle
dates in the GROUPBY function. So let's say you want to take a look at total sales for each year
and month combination or for each year. How can you do that? So I'm back at my original clean
table called TSales. What you're going to do is start off with GROUPBY. Then we are going to
select a date column. But let's say we want to get the year; you're going to put it inside the
YEAR function. That's it. Then, you're going to select your Values. In this case, it's sales and
your aggregation, SUM. Press Enter, and that's it. What about year and month combination? Well,
here you can just use anything that can give you that combination. So one thing that comes to
mind is the TEXT function because a TEXT function allows you to format a date or any number or any
text into the format that you want. So we want to format this date as, in quotation mark, "yyyy-mm", quotation, Enter,
and then I have the year and month combination, and I get the total sales for each of these.
And we can double-check quickly if it really works. So we have 87,400 for February 2024. This
is February 2024; let's just sum these, 87,400. Now, what if I wanted to expand this and add
another category in addition to period? So I want to have the year-month combination,
and then I want to have sales manager and then total sales. How would I handle that?
Well, let's copy and paste this over here, and let's see how we can update this argument.
So until now, we've been highlighting, like selecting a column to include as the Row
fields. For dates, we put it inside the TEXT function so that we could grab the year and
month combination. But now, I also want to add the sales manager column. To be able to do
that, I can use the HSTACK function, which is the horizontal stacking of ranges. So I'm going
to put this inside the HSTACK function. My Array1 is the first column that I want to show; this
is the year-month combination. Then, my Array2 is going to be my sales manager column. Close the
bracket for HSTACK, and that's all I need to do. At the very beginning of the video, I promised
you that I'm going to show you PIVOTBY too. But until now, I've just been talking about GROUPBY. The good thing is that once you understand how GROUPBY works, you already know how PIVOTBY works. You only need to know when will you use PIVOTBY instead of GROUPBY. Whenever you
have a case where you feel that your report, your data analysis is getting too long, you're
scrolling a lot more than you would like, and you'd actually rather prefer to
have one of these categories in the columns instead of the rows, that's
when you're going to use PIVOTBY. So, for example, let's say we wanted to get total
sales for each sales manager and have the year not here beside the sales manager, have the year in
the columns. We would use PIVOTBY. So let's do that right here. You're going to start off with
PIVOTBY. The row fields – that's what you want to have in the rows that's going to be Sales Manager.
This one, the column fields – this is the extra argument that comes with PIVOTBY because you
can decide what is going to be in the column. So, we want to have the date, but not the date
like this because otherwise, we're going to have to scroll a lot horizontally. We want
to grab the year from the date. For Values, we're going to have sales, and we want to
aggregate these using SUM. That's it. We get a nice report with the Sales Managers here,
we have a total on the bottom, we have the Years in the columns, and we have a total on the side
as well. You can expand on this, so for example, let's say you want to have the Sales Manager and
the product. You just include these in the rows, and you press Enter. We have Sales Manager,
Product, and then the years on top. Because we have this additional column argument,
you also get a lot of optional arguments, not just for the rows but also for the column.
So, we have the Column Total Depth. Let's say we don't want to show the totals in the columns,
we can turn them off by putting a 0 here. You can decide on the column sort order and the filter
array. In this case, let's just hide the totals in the columns, and when I press Enter, that
is gone. So, these are pretty cool functions. Let me know your thoughts about them. Add
your comments below this video. As usual, thank you for being here. Thank you for watching,
and I'm going to catch you in the next video.