Aggregate Functions in Access Queries - Tutorial

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
aggregate functions turn heaps of data into meaningful numbers hi I'm Philip from Cole cabinet comm and today I would like to talk about aggregate functions in Microsoft Access well aggregate functions aggregate data and I think what exactly that means becomes clear if we talk about the functions that are built into XS in more detail there are actually seven built in aggregate functions in Microsoft Access and they are pretty much standard not only an axis but in basically any sequel based database engine so let's start with the first one it's pretty easy its count with the count function you can count the number of records in a query or in a table obviously and you can use criteria to reduce the number of records and count only those that match specific criteria and you can either count one distinct fields and you count distinct values or you can count simply all of the records so no matter how many records are in your table or in your query there will be only one line returned by the count function and it will be a number obviously and that applies to whole records it's counting records or unique values so the input can be basically any table or any column then there are the functions min and Max they return the minimum or the maximum of a particular value in a column and that is not limited to numbers you can also calculate the minimum of the text field or a date no matter what what we'll use the sort order of your database to get the first the lowest value of the highest value would if you use the max function the next one is some simple it sums up data and that has only work with numbers obviously you cannot sum up strings in a meaningful way or dates so only numbers here and the same goes for the next function average or AVG in front that calculates the average of a numeric Bell you will call them and once again you can always limit that down by using additional criteria in your query to reduce the number of records that average or sum or whatever is calculated on then there are two functions of statistical calculations it's the standard def function and the dollar function and the these are useful for statistical purposes if you want to determine the standard deviation over a number of records all the variants in a number of records I'm not going to cover these in detail because I do not have the statistical backgrounds in my mind present right now to probably cover them and then there are two more functions yeah I said there are seven there are two more functions that you should not use and that is the deep the first and last functions and you might think they return the first value or the last value in a part a column but but by by which sort order and that is a bit of the problem why you should not use them now Microsoft acknowledged the fact that these names are pretty confusing and will cause problems originally this again you get the first or the last value in a set of Records and that is correct but it is the first value or the last value that the database engine encounters while processing these records and it's almost impossible for you to know which record that will be so do not use first and last aggregate functions they return a random value you can use them if you say you need just any value and you do not care which one it is but in all other cases do not use first and last so and one important thing we need to talk about when talking about aggregate functions is the group by clause you will always or almost always combine an aggregate function with a group level like you calculate the total value of orders that's an example we are looking at soon for one customer then you would group by the customer by its name or ID and then use the aggregate function sum to get the sum of all his orders all the order values so you need to take them always consider when using aggregate functions combined with distinct values from your data you need to apply the group by to all other values that you want to output in your query now there are some helper functions built-in to access the domain aggregate functions these start with a deal like D max D count D sum and you can use these easily in expressions like in forms or in your VBA code or even in queries now show an example for that later and they work basically the same way as the aggregate functions work you put in a column name a table or query name and optionally the criteria and then the function will get the count of records in that query or table matching the criteria or will sum up the the sum of numeric values in a table matching your criteria pretty simple but also a little bit limited compared to the real aggregate functions in theories because you always get one well us output that is desirable if you put that into a control in a form because then you can only display one well you but if you want to calculate these aggregated values for any number of records is not as convenient and they have a bit of performance problem when used in that scenarios so that basically work very similar if you understand one you probably understand the other but you need to know which to use when okay well oh one more remark I did on that the first and the last problem I did a very very extensive text I'll put the link to that in the video description and now made without much further ado we look at the screen and look at a couple of examples and here's a table TBL customer and maybe you have seen that in previous demos it's a very simplified table and several things in there that are not good practices so just as a side note if you see something in my videos and there is not the core topic of the video dealing with that it might have been that I used some sort of simplification to make the sample easier to comprehend and it's not necessarily a best practice to do things that way and that is actually something or a detail of that we are going to put right today here are two columns total order value and last all the time these are two columns that contain additional information to customers but there is a major problem with them obviously they need to be maintained manually and that is always the problem you see it down here this area they are not really maintained in for recent records and the the top records here are probably wrong in this database that is used to record customer orders and process them it would be easy to derive these two columns from the data actually entered into the database and use that instead of these manually maintained columns and that's what we are going to do we are building a query as customer overview to derive these information from the records in the database first let's look at the tables in this database there are only four tables but they already contain a bit of complexity but that is something that you are most likely going to see in a real-world database application if it is properly normalized we just looked at the table customer now let's look at the orders nests not too much in here it's the order ID as a primary key it's the custom ID referencing the customer and order date and the order status not much more so it does not contain the data we need to calculate the total order value for that reason we need to go one level deeper and that is the table order position and that references the order and has a position these two actually are combined into the primary key then there is an item ID and the quantity and even that does not allow me to calculate the order value I need to go one level deeper and include the items and here are a couple of grocery items with a net price in here so I need all four tables to kind of build my customer overview so I'm going to use a divide and conquer approach to get to the final target of the customer overview query so I create a simple helper theory first go to create very design and then I add these three tables to my query and I usually order the queries from left to right in their hierarchical relevance that is the order the order position and finally the items now I need all four columns from the order table to create an order value query and that is what I'm going to do here I just want to calculate the value of an order and kind of eliminate these two lower level tables from the equation so I need all the other fields I got them already and then I'm going to need the quantity and the net price of the items and well we leave it them in here and add another column to do a simple calculation here it is quantity times net price and I just run that so you see we've got the quantity here the net price here and that is the extended price of this order position if I sort this by the order ID you see these four rows of other positions of the order number one and what I actually want to do and I go back to design you is to calculate the sum of the order value for the whole order I go here and activate the totals calculation functions and you'll see there is a new line in the query here that just appeared and that allows me to define aggregate functions in my query for this simple calculations I want to use the sum aggregate function so you see I can choose all the aggregate functions like sum average min Max and so on that I previously mentioned and the group by is simply the grouping level in this case I want to group by the four columns from the order table and use the aggregate functions only on this expression here so I just choose sum here and define a meaningful name that is always very important use meaningful names for your fields for expressions that you do not use that expression 1 than XS defaults to so this is the calculation and I want the sum of the results in the final row and I'm going to make a mistake here just to show that I run that and you see there are still 4 records here for the first order and that is because I still included the quantity and the net price and grouping levels and that is wrong because they are different for each order position so I need to get rid of them from my query just mark them and hit delete and they are gone and now I run this query again and you see there's only one row left for order 1 and that is the total value of this single order that is what I want to achieve but before we close this and move on I'm going to show you how to use a where condition to filter the data if we don't want to see all orders but only the orders that are completed I can change this to a where condition and say we only want to see completed orders now the query runs and it still shows only the data for the order for records and the some of the details but the orders are filtered now by this criteria okay I do not want to filter this query in my customer overviews so I changed that back to a group I and included in the output so just minor detail pay attention to my calculation here that's going to happen something now when I save this close the query yes I want to save it as three ordered value I close it and now I open that again and you see access change this to be an expression now because it is not a real column but a calculation and it put the some aggregate function inside this expression that is valid as well to write it this way and the result will stay exactly the same but you need to be aware that even though this is an expression now it still can contain the aggregate function up here now this was only the first part of our customer overview now we are creating the real customer overview I go to query design again and now I only add the customers table then go here to queries and and the order value theory we just created these two are already linked by the customer ID and now I can add the fields down here I just need the customer name for now and from this table I am going to need the order date and I'm going to add it twice and we should add the order value of course and we can add the order ID and that seems to make not much too much sense but now I'm going to enable the calculations via the totals and now we can do a couple of nice things here I use the minimum of orderdate and that will give me the first time or the first date the customer ordered with us and then I go to the second order date column in here and I say yeah I want the maximum the last date of the order for the order value I once again and the sum here so I get the total value of all the customers orders and I can add one more thing which might be interesting in the context of a customer overview I can count the order IDs to see the number of orders a customer placed with us and there's even one more column that might be interesting to us I drag down the order value once more and I go to the average function down here to see the average value of the customers orders so put that here so that we have some and average close to each other and now I can run this query and you see this nice customer overview and you see these columns get automatically assigned names and they are not too bad actually the min of order date is the first order max is the last order this is the total sum of all the orders and that is the average order value and this obviously the number of orders the customer placed with us so if you want to change the column names you can do that here by doing and defining an alias for these columns so this is the minimum or as I say that is the orderdate and this one would be the last order day to be a little bit more meaningful and if we run that again you see first order date last order date so I'm not doing this for the other columns but you see how easy it is to define your own column names in there I close the query and save this as query customer overview so far this have been the basics of aggregate functions in Access I want to take this one step further because there's one question that frequently arises after you created such sort of aggregate query what about getting data from one particular order let's assume we want not only show the last order date that was easy to retrieve but we want to show the last order status and the last order value in our customer overview and that is going to be a little bit more difficult but I am going to show you how it's going to work so to retrieve any information about the last order we need once again to split that up into two queries we could use a sub query but that is a topic for another video so now I'm going to use two queries which you can easily design with a graphical designer well I go to create query designs and just add the ordered theory up here and I only need the customer ID and the order date because I want to retrieve the maximum order date I just switch to totals once again and yes we need the maximum of the order date and the customer ID and this is just once going to be in this query I close that and save it as very last order date customer and now I'm going to create yet another query and I only add the last order date per customer and the order value query and now I can add the customer ID and these columns from the order table because that's what I am wanted to do and what wanted to know the order date order status and the order value of the last order so what we need to do now is join these two tables or rather queries here because access was not able to do this automatically here so obviously the customer ID will be linked to the customer ID and now comes the clever bit we have the last order date here the maximum of the order date and that will be linked to the order date from our query order value and if I now run this query you see we only see these few records here and sort by the customer ID and this is the last order for each customer so now I can save this last order per customer so you see the difference I named this query last order date per customer because it contained only the date and this one last order because it contains the full order information including status and order value and finally I can go back to my customer overview query open that in design view and add the new query we just created last order per customer and this will be joined by our customer ID and then I can add the order status and order value and now I should definitely add a more meaningful alias here so this is the last order status and this is the last order value so it's not so easy to see you scroll a bit the side and they are both grouped by columns because they are already aggregated on the customer level so I just include them in the group by column list so I run this query and we make this little bit smaller now you see this is our last order value and that is the last order status I did not include the last order date from this query because we got this here already we could have of course created another query based on top of the customer overview because that contained the last order that would have also worked to kind of avoid the additional helper query I created but nevertheless I think this way is more structured so I choose it that way but in the end it would not make much of a difference to wrap things up let me show you a problem with the aggregate functions they are obviously in many cases read-only because if you aggregate a sum of something you cannot change the value of that sum by editing the data in the query because the sum is calculated from any number of detail records so but let's add look at the last order per customer oh and this is an interesting problem a small detour here that is not what I wanted to show but I'm not going to cut this out because this is a problem that frequently happens I am asked by access to enter a parameter value for query one customer ID and you would say this is insane there is no query one nowhere to be seen so this is pure nonsense and it is but I show you how to how this problem came into existence and how to fix it so if I just cancel that you see my query is empty and that is obviously not really what what's it's supposed to be so I go to design view and once again you see there is no query one anywhere to be seen but I show you where it is I go to the property sheet here and now I click into the gray area here to get the properties of the queries not the individual columns and now you see it here this is causing the problem this is the order by I kind of created when sorting the query in design view and while I was designing the query it was not saved and access defaulted to the name query 1 to address this query and when I saved the query it still had that order by expression in here but referring to query 1 and while saving the query I named it query last order per customer but that did not change the order by expression in here so this does not work anymore and so the easiest way to deal with that is just delete it with just save this close the query and run it again you see now it works but actually I wanted to show you something else this table or rather query shows the last order per customer so this is a distinct order record and this should be editable I should be able to go in here and go to that order and write complete it in here but it does not work you see over here this record set is not updatable and that is an annoying thing that X is tough because there is this aggregate query in there access makes the whole query read-only even though the records from query order value should be updatable because they just reflect single records from the order table that is massively annoying part I'm afraid there is no way around that well I said there's no way around that but there is with a small trick and now that is where the domain aggregate functions come into play I mentioned earlier so as a short interlude I would like to show you how to use the domain aggregate functions and the the difference domain is going to become very clear we go to the visual basic environment and now I just use the immediate pain down here you can activate that you environment will probably look like that and if you go to the View menu you can enable the immediate window here or just use the ctrl G keyboard shortcut now I can type in here and cut the D count function here and the first is an expression and for count this can be the star but it is a string argument as you see down here so we need to enclose that in quotes then we need to add the table name here and we can use criteria let's call it this way so we got 33 orders in total and I can easily restrict that to the orders of one particular customer so I write customer ID equals and just choose customer eight and now this customer has eight orders while when I change that to customer ID equals one then the customer one has only four orders and I can use that with all other domain aggregate functions but if I use D max I need to specify a column name here like order date and that is the last the maximum of the order dates for customer one I change that back to custom ID equals eight then I get the last order date for customer eight and now you can just use this expression right inside an access form quickly to that here I design a new form and just enter a text box control in here and I go right to the control source of the control go to the zoom window paste that here and I need to change in in my particular case I need to change the separator here to the semicolon but that is because I've got German language settings on my PC you will most likely leave the comma in there to make it work and I need to put the equal sign here at the beginning to signal to access that this is an expression in the control source that should be evaluated if I now switch to the form view you see this is the last order date for the customer and so you can use the domain aggregate functions in your forms as well so I said there is no way around that but that's not entirely true because there is but you need to know how it works I create yet another query and I just add the table order in there and close that and once again I take all four columns from the table pull that down here and now I need to add a criteria in here we need the last order date and there is another way to get the last order date and that are but main aggregate functions and more specifically the d-max function so in the criteria section down here I entered Emacs and now it gets a bit complicated because we need to query for the order date and that is going to be a string though so I enclose that in quotes and let me go to the zoom window here and increase the font size so that you can see this better so we enter order date because we reference the order date column but that is a string expression as I said then we need to reference the table and it is table order and once again this is a string expression so I enclose it in quotes and finally we add a criteria in yes the third argument to the d-max function and that is the customer ID equals and now it gets even more tricky I close the string here and now I concatenate that with the customer ID from our orders table that we are using as the data source for the main query and now I do not want to reference customer ID as a string but I want to retrieve the value of the custom ID so I enclose this in square brackets to signal to access that this is a reference to a column and I rather want to value instead of that name so I confirm that with okay and we should be good to go to run this and you see we got this query here a new query one then put this data put this over here now this is the original query with the aggregate functions and you see so it contains the same data art but let me reorder the columns here so that you see that and once again you see it's all the same data we don't have the order ID in the other query but it's all the same records and this query I created with the d-max function that is actually updatable so I can write in here completed that is the difference in using the function because this function expression does not make the query read only it is still updatable so I save that as query last order per customer and I say it's editable so the advantage of this type of approach is clear the queries updatable nevertheless I would only use this if you definitely need an updatable query for all sorts of reports and read only forms where only display information I would prefer the approach with aggregate functions the real sequel aggregate functions I showed you earlier because that is usually much faster and there used to be a couple of problems with these domain aggregate functions if they were used on a huge number of Records and this function is called for each record it is not a big deal here because we only got 12 different records that's not much but if you got a large customer base like thousands of customers the function will be invoked for each record and that might the whole operation down quite a bit and it might cause some memory issues in previous versions of excesses might happen that you get an error like no more that cannot be more databases open door or staff and that was actually caused by using such domain aggregate functions in this way on a large number of Records I haven't seen that in a while so I guess Microsoft fixed the hard errors but nevertheless I would still recommend to use the sequel aggregate functions whenever possible ok back where the samples you see it's pretty easy to deal with those aggregate functions in Access and I only use the graphical query designer I was not even I did not even need to resort to sequel you can do that all with the query designer in Access and that is something where you should notice how much more convenient it is to have a database where you can define such calculations compared to an excel sheet where it is much more tedious to constantly update that when you enter new records into your excel sheet and it's all much more of a flow in Access you define it once and that is something that I might have shown a bit more you define all these aggregate queries once then you enter new customers new orders into the database and then you run the aggregate queries I showed you again and they will automatically reflect all the new data and will be updated with with all the new data pretty simple but also pretty powerful and you there is no way around these aggregate functions to really show human meaningful numbers from huge of data it is not limited if you got 100 customers thousands or hundreds of thousands or even millions you can apply the same principles there and calculate meaningful numbers for your customers or even on a higher level for your whole business like you turn over the last month or things like that where you condense huge amounts of data into very few meaningful numbers okay thank you that was everything for today if you liked the video please remember to like the video or subscribe to my channel and thank you for watching bye for now [Music]
Info
Channel: codekabinett.com/en
Views: 5,776
Rating: 4.7402596 out of 5
Keywords: Access, Access Queries, Aggregate Functions, Domain Aggregate Functions, Queries, Sum, Max, Count, Min, Last, First, Avg, Data Aggregation, Tutorial, Calculations
Id: 2WECG3TXxgk
Channel Id: undefined
Length: 44min 27sec (2667 seconds)
Published: Sat Jul 14 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.