GROUP BY with Aggregate functions in SQL statement

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello friends welcome back to Sigma with Manoj today I'm going to talk about the group by clause in my previous videos I talked about the selected statement and the where clause you can check the links here ok so let's talk what exactly is a group by clause first of all so group by clause is an optional clause that goes with the Select statement and it can be used to summarize the data it basically summarized the identical rows into single groups and returns a single row with the summary of each group ok and how you can summarize the data you can summarize the data by using some aggregate functions and we'll get functions like count some minimum maximum and average okay so these are the five aggregate functions that sequal gives you to summarize the data in by using these functions you can use the group by clause and you can summarize the data by this way you will get a summary of the data how your data exactly looks like and you and you can do analytics on top of that right so the group by clause works on the rows returned by the Select query right and this is the Select query that will look like select star from the table name ok and this is a where clause that is optional flaws right so if you don't have this where clause your group by clause can directly start here right like select the star from then group by if you don't want to filter the records but if let's say if you want to filter the records of where clause is a necessary right like this and after the where clause your group by clause can resume and with the column list right now let's see how exactly this group by clause works let's see a hands-on example I'll take you to SSMS okay here I have this table employee populated with some data right and that this data I have populated from adventure works database so this is the adventure works database that I've used and this is the employ table that have populated okay and the data looks like this we have employee ID the first name last name middle name date of birth marital status gender job title hire date Department name and the salary okay now let's see how we can use different different aggregate functions and how we can use this group by clause okay so first I'll go with the found aggregate function okay so I just so let's say I want the total number of records I just want to get a count of total number of records okay in my employee table how can I do that so it's very simple it will be select count start from the DB dot employee table okay on executing this statement you will get there are 290 records and you can also get the count of 290 records over here you can see at the corner at the bottom corner 290 rows are their same count you can get with the Select count stuff from table employ right now let's say I'll go again back to this table now let's say I want count of employees according to departments right I want the count of employees in each department so how can I do that right so it's very simple you just have to apply the department name here department name comma count stuff from em count star from employee group by and same column name department name right so it this will what it will do is it will give you select this column department name and the count of each department and how it will get the count it will get the count by this group by so what it will do is it will group all the departments into one row so now here like you are seeing engineering you are seeing five rows so instead of five rows it will give you one row as engineering it will give you a count of five so in this whole record set they may be injuring many times repeated like so it will give you the count of injuring wherever it is repeated right so let me execute this statement and we'll see right so it shows that in my engineering department there are seven employees right and you can see in other departments in production department we have 179 employees okay now let's say I want to get the count of employees with respect to the gender right if I execute this statement select star from employee I have gender male and female working I just want to see how many male employees are working and how many female employees are working here right so it's very simple like I had department here so similar way I will have the gender column here gender count star from employee group by gender it's very simple it will give you the count of it will give you only two rows because there is the gender is only male and female so it will give you two rows and will give you 84 number of females and 206 number of males are working in this particular company right now let's say I just want to extend this I just want to see in Eve in every department how many employees works according to the gender right so I just want to extend this let's say I want I want in engineering department how many I want an engineering department how many female were females work and how many males work so how I will do that I have the department name as it is here what I'll do is I'll add a gender column over here okay and in the group I statement I will repeat the gender so let's say if I don't do that I will get an error right so if I execute this statement I will get the error as column employed or gender gender column is invalid in the Select list because is it not because it is not contained in either an aggregate function or the group by clause so I have to add this gender column over here in the group by clause and it will give me the details the summary of the department and gender wise right so this is coming haywire so let me just give a order by clause order by I will repeat these like this or it will order all of these things and put together the engineering female and male employees right execute see so I have total seven employees in my engineering and if I again you know I expand this into gender so three females and for males work in engineering department right it's very simple to use the group by clause okay now let's go back to the Select statement right here we saw here we have the hire date okay now let's say I want to know how many employees were hired each year every year how many employees were hired right so what I can do is I can fire this query select I can use the EEO function to get the ear from the hi date okay and I can have the count star okay I'll give the column name as hire ear count from DB o dot employ okay group by okay now what will I grow by if I group by hire date let's say what happens if I grow by the hire date right so I just get I just get one one one one one because it will group by exactly the hire date not by the year okay so if I have to group by with the year so and get the stats of you know how many employees were hired every year so I have to do this the same column that I'm having here has to be present in the group by clause if it is a calculated column the same calculated column should be present in the group by clause right so let me execute this and if I get very few rows right and you can see here let me order by order by this okay and let's see sorry order by count it is sending in order by count descending means it will give me the maximum number above okay so the maximum number of employees were hired in 2009 then 2008 and very few employees were hired in these years okay so this is how you know this count aggregate function works and there can be many more examples okay and there can be many more scenarios you can use this count aggregate function okay now let's go ahead and see how the how the men how the max and how the AVG average aggregate function works okay let's say I have the same query okay I'll copy this query over here so that I have to refer it every time okay so I have executed this query okay and let's say I want to know the minimum salary the maximum salary and the average salary across all employees what kind what can I do right so it's very simple right you just have to give minimum of salary maximum of salary and AVG average of salary right so here we don't need to use the group by function because we are not grouping these columns by any other column right so I don't have any column like the the department name or something the job title or gender right so I am using all aggregate functions so when I am using only aggregate functions I don't need to use the group by or group by clause right so let's execute it okay so it gives me the minimum salary okay let me just alias it also because it is showing you has no column name across so this is min salary this is as Max salary and this is as AVG salary okay now executing like this it will Ilyas the column is like this and minimum salary drawn was three thousand five sixty one dollars maximum salary is seven nine eight five dollars and the average salary across the organization is five thousand seven hundred dollars okay so let's see again how what what we can do how we can you know expand this let me see I want to see the salary this kind of distribution across departments right so how can I do that I can use the same query right and let me group the salaries by department name so I'll have to use the department name column department name column in the same column I have to use at the group cause right so you have to remember if you are only using the aggregate functions the group by clause is not needed but if you are using any other column apart from these aggregate columns on which you want to do the grouping that grouping column should be present in the group by clause right so now what it will give me is it will give me the aggregates across departments right for each department it will give me like this right so you can see here Department control has a minimum salary of this maximum salary of this and an average salary of this okay you can even you know add the count over here right so let me add a count right so on the first query you can have a count right so there were 290 number of records right here also you can get the count right so here I want to see like minimum salary maximum salary in the average salaries there so if I want to see the count like like the way count I gave here department name right it gave me the number of departments so a similar way let me execute this and show you it showed me that you know document document control department has five employees right similar way I can add the count aggregate function over here and I'll get the count same Department 105 I will get the same count over here Department control five so you can use as many as aggregate functions over here right and the similar way you will get the grouping and the summary of all these records now let me execute the query again and now apart from Department if I want to see the salary according to the job title I want to see like Department is okay right by checking the salary Department wise it will not give me any more inside but to get the more inside like what kind of job title gets what kind of salary it's more it's more meaningful right so rather than using department name I'll use the job title column again the same job title column will go here right so it will give you according to job title all these things right so here it says that you know account trend accountant gets a salary minimum sell rates of highly paid right so it gets seven thousand forty one dollars minimum salary and the maximum seller is seven nine four nine right and if I want to see you know the maximum and minimum salary so I can do an order by min salary okay and let me do a descending order so the maximum so the maximum of the minimum salary I'll get at the top right so here the maintenance supervisor right so there is only one post for these titles so there is only one person in across the company so so you are saying minimum salary is equal to the maximum salary and the average salary all these columns are same right because there is only one person working here and they are highly paid right then you can see marketing assistant three three people are there working and the minimum salary this maximum salary this right if I go at the bottom the least earner the people who are getting least salary so 22 production technicians are there the code is WC 20 they are getting a salary of three thousand five hundred sixty one dollars and the minimum and the math is the minimum salary and the maximum salary is seven thousand eight hundred sixty nine dollars the average comes out five six eight - right so this is how you know you can generate analytics and you can generate the summary from your table and get the meaningful insight right now people are more interested on at the gender level right at the gender level okay so I'll just copy this and let's see what is the distribution across gender okay so I have this gender and the same thing the same column have to provided here okay group by so this will give me the minimum salary maximum salary average salary and the count of people right male or female right so here so as you see here females get a minimum salary of this and the maximum salary of this males get the minimum salary it's a bit higher right so males get a bit better minimum salary and a bit better maximum salary also and but if you see the record counts record counts are also few only few females work here and more males work here and but if you see the average right average is totally different right here the minimum salary and maximum salary are lesser than the male counterparts but if you see the average salary average salaries higher side so females are getting five thousand eight hundred seventy two dollars and the males are getting five thousand six hundred thirty dollars right so in some cases males are getting more salary but in the most of the cases females are getting more salary right now we can have you know this one more column there's one more column that memorizer the status right so let's see married people and single people what is the salary distribution across those marital status I'll copy this over here also right so let's see okay so here the married people get a good minimum salary compared to single people and here maximum salary is bit lower right $20 less and the average salary is more compared to the single right and the count is also a bit more right a 146 people are married and 144 people are single here right so it's very interesting right by using these minimum maximum average and count you are getting all these insights so if I if I execute just this plain select the star from employee table this won't give you any analytics this won't give you any meaningful information although the whole 290 records are in front of you but you won't be able to make out what is the information what is the knowledge base behind these records right but you know with these kind of functions and gate functions this gives of you are very meaningful information and you know HR person like like this this is the very classical case right in gender there is lot of companies there are gender discrimination and HR can really work by is you knows checking these stats like why females are getting bit lower salary right on the minimum side and maximum side but on the same side you can see right average salaries bit higher from the male counterparts so HR can basically work on these stats so this was all about minimum maximum average and the count this is the count that we saw earlier also okay and now let's move to the last one that is some okay so let me again copy the select star from table right okay now let's say if a company wants to know how much you know the company is spending by giving salaries so how they can analyze it so at the record level the company has the salary of each and every person but if they want to Club the whole salary and they want to get the budget for the whole company so they can use the sum aggregate function with the salary okay and they can easily get the insight right so they want this much of money to distribute the salary for their people okay and let's say they want the aggregates based upon the department right so they want to allocate budget for department how they can do that department name right the same department name will go at the group by clause okay and here you go right so this is the budget that they have to think about right so for injuring department they have to allocate this much budget for production department they have to look at this much budget if you can even order by the budget by you know having this last column sum of salary and des C means descending order at the by default order is ascending so after provide deac for descending right so the highest budget that we'll take is the production and sales and then sales and then the finance and the least budget is for these executives tools designed at the bottom ones right so this is our company will you know get inside by this simple query and by using the sum aggregate function every year they have to have a budget of this particular amount so they have to ask they have to generate this the reven you more than this particular amount so that they can they can give salary to their employees and by getting revenue more than this how they can distribute the salaries to the departments they can run this query and they can you know distribute the money according to the according to the departments and by this kind of budgeting one more interesting thing that I just think of is let us extend this query okay and what we can do is we can advocate this based upon department name and gender name so we can see you know in gender we can see you know in every department what is the distribution of salary looks like right so females take how much salary and males take how much salary so this is how it looks like so in production department right so production department was the most expensive department here right but we have bifurcated this into two male and female right so the production department you can see here male takes this much salary females take this much and in the sales department they're only male ok so I have to order it by department name also right otherwise you can see these two sales are coming here so it will all go here why otherwise so let me just execute this right so like this document control engineering right and executive execute Department only has males right and no tools design so the salaries bifurcated like here like males take this amount of salary and females take this amount of salary right so this is very good insight now what you can also do is like the wave you know we had the min max average salary count what we can do is we can you know even extend these queries Li I'll just what I'll do is I'll just copy these queries okay and you can even have the some right so this will give you the count okay how many gender how many males and females are working and what salary they are taking in the whole company right and if I add it here some salary so it will give you by department name and gender so that we already saw so this is we can skip it because it won't be a useful insight I'll copy these ones right minimum salary maximum salary I did so so you can even Club this minimum salary maximum salary average and the sum of the salary right like let's copy this job title right so I'll show you you like the way you know we added count you can even add the sum of salary right so this will give you insight like this right so like marketing assistant so minimum salary is this maximum is this average is this there are three people and you know for for these three people ma for marketing assistant you need to have you know budget of this so that you can distribute the salaries right and what else we can see even the marital status right for marital status you can also add some of salary if you just want to get the count and the sum so it will give you this particular analysis for married people normally it won't serve any purpose but let's just for our example married pay for married people you have minimum salary maximum salary average count of people and you know this huge sum of money married people takes and dis use sum of money single people take so this is all about the group by clause I think you know you would have liked it very much and the kind of examples we went through they were like really real-time examples that a particular company can think about and use right and especially the HR department and the finance department they can really get really good insight by using these it functions and by using this group by Clause right so this is it for today in my next video I'll be talking about the having clause okay how we can use the having clause on top of this group by clause so please stay tuned please like the video if you really like it and please subscribe by subscribing the videos you'll get to know the latest videos that I put on and thank you very much thanks a lot for watching have a nice day thank you you
Info
Channel: SQL with Manoj
Views: 86,759
Rating: 4.9151106 out of 5
Keywords: SQL Server, SQL, Microsoft SQL Server, GROUP BY, Group by Clause, Aggregate functions, Oracle, Hive SQL, Spark SQL, My SQL, PostgreSQL
Id: fsLjuhQ5Ddw
Channel Id: undefined
Length: 25min 0sec (1500 seconds)
Published: Tue May 17 2016
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.