SQL Tutorial - Subqueries (Scalar Valued)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello and welcome to my channel i am bearded dev bringing you another sequel tutorial and in this video we're going to be looking at sub queries more specifically we're going to be looking at self-contained scalar valued sub queries and that will become apparent when we go through some examples you are able to follow along with this video there will be some code examples in the description so do have a look there and also if you are new to the channel and you're interested in t sql programming data analysis or data engineering do subscribe and check out my other videos there's lots of great content on there let's jump over to sql server management studio now and we'll go through some examples okay so we're in sql server management studio and for this video i will be using a sample database that is available online i will be using adventure works 2019 which is the latest version if you have got an older version don't worry too much you should still be able to follow along without any issues the only problem that you may come across is you might see some differences in the data the data here is not what is important it's just writing the sub queries so the actual results are not important but if you do want to check them you might need to get the latest version and if you do want me to do a video on how to restore a copy of adventure works please do let me know in the comments and i will get one up as soon as possible so we're going to be starting off with a simple query here from the sales order header table so they have a sales order header and a sales order detail table in adventure works so we're just going to be selecting these columns to start off with we have a sales order id which is a unique identifier an order date an order number purchase order number account number customer id and a total due so we're going to start off with how we can write sub queries within select clause so first of all what i want to do as an example is let's say i want to identify the latest order date i'm going to open parentheses and i'm going to select max order date and then i'm going to select that from our sales sales order header table i'll just give that a name as latest order date so we can see there we've got another query wrapped within parentheses and we're looking to retrieve the maximum order date from this table and we're looking to return those results within our initial select and if i go ahead and execute that we can see we've got our latest order date now as you can see this doesn't have any relevance to the customer or the sale sales order it is simply selecting the maximum order date from that table which might not be relevant but it might be something you need for ad hoc queries i wouldn't recommend anything like that to exist in production unless you need to apply further manipulation to it so for example if i would do this if i'm looking at order date and i can see here i've got some orders from 2011 and i i need some context i want some context of of the orders then i can include a column like that to give me some context to see well i should have orders here now from whenever the minimum date is until the latest date which is the 30th of june now mentioned in the introduction we'll be looking at self-contained scalar valued now what i mean by self-contained is this court this sub query has no relevance to the outer query there is no correlation correlated sub queries are slightly different and we'll have a look at those in another video so if i execute this query it will give me that result that's what's meant to self contains they're a lot easier to debug because we can run those queries independently now scalar valued what that means is it will only return one value and in this case we can see that it's returned that value however we have multiple rows within this table so return one value per row effectively one value per call of that sub query let's move on to a different example here so i'll just hide the results pane let's say a typical example that what we want to achieve is maybe a percentage of the total now the total here within this table is actually massive so what we want to add in is a sum of the total due from again uh the same table our sales order header table and we'll give that an alias as grand total to start with so if i execute that query we've now got this grand total amount which is some ridiculous number of 123 million now what we can now use is that within an expression to compute a percentage of the total so for each order in this case we want to see what that value is versus the grand total so we can simply add within our same same column we're going to say total due and that's just going to be divided by that grand total and we'll now change this to percentage of grand total and then we will multiply the results by a hundred to give us our answer so we can see there our percent of grand total now as i mentioned that grand total is massive so we're going to get very small percentage amounts we can see the first sales order id in this example is seven 0.0187 of the grand total but that's a more practical example of when we would use sub queries now let's have a look at using subqueries within the where clause so within the where clause to filter our data so within the where clause let's say we want to have a look at the latest order date so we want to filter this table by the latest order date now this is a very good example because unless we know what that latest order date is we can't add it as a constant so it has to it has to be dynamic it has to be a dynamic value so we'll change this to where order date and then we'll have equals and then we'll have our sub query here within parentheses and it's going to be the same subquery that we've used within the select from our sales order header table and because we're using it within the where cause we don't add an alias at this stage so if we execute that query now we can see our results have now shortened to 40. but what is it also interesting and we can see the order date actually matches to our latest order date that we've got with within the query so all the order dates on the 30th of june 2014 but what's also interesting and can be sometimes confusing to people who are new to sub queries is that these sub queries do not have any relevance to that where clause these are completely independent so although the the latest order date is returning that date it's still actually going to the table itself i'm looking for that order date it's not in this case we're not applying the where clause and then running these sub queries the sub queries have ran independently so whatever we had add within the where clause does not have any impact on these self-contained sub queries in this case now there are situations in which you can so don't take that as it will never have any impact but in this example that we're looking at the where clause is not having an impact on the sub queries we've got within the within the select and we can see that as another example within the percent of grand total so we can see that there's again there's very small amounts whereas if it was computing the sum total due taking that where clause into consideration it would be a lot smaller amount now we're talking about scalar value sub queries in this video so one of the things i wanted to show is what would happen if a sub query returned more than one value and this is a common error we might see so at the moment we've got a sub query within our where clause that's returning the latest order date now let's take out max from that and let's just run this query independently so we can see we can see here we're getting more than one result but in this case as we're using equals within the where clause we are looking for a scalar value so let's try and execute this query and see what happens and we'll get an error message here to say the subquery is returned more than one value and it was actually only expecting one value so we can't actually run that query we will get an error but this is this is coming into how the self-contained sub queries are slightly easier to debug because to identify this problem we can just run this query independently and we can certainly identify that yes in this case it is returning more than one value so we'll know we need to add something to that sub query to return one value whether that be a top one or something like an aggregate function such as max or min now another example we might typically go through is where the subquery returns a normal value so we know from our previous examples that the the latest date i think was the 30th of june 2014. we can actually add a where clause to our sub query and we'll say where order date we're currently in the year 2020 so let's say order date greater than equal to the first of the first 2020. now again we can run this independently and we can have a look at those results so we can see it actually returns a blank result now we'll wrap that within a max and we'll run that again and we can see now our sub query actually returns no now let's hide the results grid and let's go ahead and execute this query and we can see in this case we're going to get an empty result set so when our scalar valued sub query returns no we will be presented with an empty result set and again if you're looking at a query that includes a sub query within the where clause and you're seeing an empty result set you have the option then this is going to be your starting point is to see what value that sub query is going to return moving on to another example is how we can actually nest sub queries so what we're going to do is full simplicity uh we're just going to remove the subqueries that we've got in our example if you are following along they will be available in the code description so let's start off with a simple one and how i like to build nested sub queries if we need to use them is we start with the inner query and then move out so in this case we're not going to we are going to start looking at this table we're not going to start with the columns we want to return so what i want to go through is an example where we retrieve orders for the minimum order date but relate to the maximum order id on that date so we're going to start off with in fact i will start off with these columns so let's let's do this so if we say we want to retrieve sales details for the minimum order date so we say where order date equals select min order date and again we're just using the same example here same table for all examples another side to this is you don't have to refer to the same table within the query so if you have a column that you want to select from another table you can add that in within the select it has no relevance to the actual table that you are selecting from and again within the where clause it doesn't matter either i could select this order date and i could select a random date or and sorry a meaningful date from another table to see where it matched so if i go ahead and run this query we've switched it around a bit we're now looking at the minimum order date but now let's say i want to select it for the maximum sales id so in this case it would be 43701 but i want the maximum sales order id for the minimum order date so i need to change this this query slightly so i'm going to change this to where sales order id that's going to be equals i'm just going to go on to the next line and again what i want in this case is select max sales order id from sales order header where order date equals our min order date so i'm just going to align this nicely i'll just hide the results grid just close off the parentheses there just correctly name the table you can see it does get slightly complex here but now we've got that spaced out and aligned properly a line that's how i like it anyway so we always like to add an indentation within sub queries to show that that is the bit that's been executed first so again we can run this to say we want this minimum order date and then we have to run this including the where order date equals there to get the maximum sales id and we get the 4370 and then we can run the whole thing which noticed how i changed the sales order id to equal that and we get the correct information for that particular sales order id now it's worth mentioning the downside to sub queries so if i include the actual execution plan and why you don't want to overuse them if you find you are overusing sub queries then it's time to maybe start considering temporary objects where you can store values in there that you're going to use or having another look at a way to do that query reason being in this example i'm going to actually hit the same table three times so it's probably worth me having a look at how to simplify this process and how to prevent me either scanning or seeking against that table that many times if i execute this query and wait for the execution plan to appear it's quite small on screen but we can see we've we've hit a clustered index scan here of the table followed by a nested loops we've hit a clustered index scan here of the table and yet again now we've hit a clustered index seek so we can see we are hitting that same table three times so again when you're finding yourself using a lot of sub queries especially if you're retiring returning the same value um so perhaps within the select you've got a grand total figure and you're applying different expressions about against that figure so you're using that in three different columns then it's probably worth storing that grand total figure in a variable or a temporary object of some kind and then just hitting that temporary object because then at least you're only hitting that table once and then storing that value so it's something you will need to explore the last example i wanted to go through is where we just simply select from from sub queries so we don't have to we're going to use two queries two sub queries within an expression uh i i haven't explained that very well but hopefully it'll become clear so again we're going to be using the the sales dot sales order header table and here's an example i i want to give you is we want to find out the the customer who spends the most amount of money and all we want to do for that customer is compute the percentage of the grand total and we're going to be using sub queries in this case to do it and i'll show you how i would would go about this task so initially what we want to do is compute a total for customers so i'm just going to step this through so we're going to select customer id and then some total due as total that's going to be from uh again using the same table sales order header we're going to group by customer id and then for a start we'll just order by total descending so this is going to return the totals for every customer um but it's going to be in order so the customer at the top has got the greatest total so we can see customer id 29818 is the individual or organization that spent the most money with us so now we need to what i want to do is adapt this query to get this this total total value so what i'm going to simply do is a top one and i don't need customer id customer id is not something i'm going to be passing in i want to use a scalar value sub query here so i don't want to return multiple columns so if i go ahead and execute that i can see i've got the value that i want to return now the next part is we then want to apply that over the grand total so again we're going to write out another query and this query is going to be very simple we're going to do some total due well alias that as grand total and it's going to be from the same table and as we're applying the aggregate function to all of the data it's only going to return one value there is no we don't need to group the results by anything so we can see there's the grand total of 123 million that we saw earlier so what i now want to do is compute the percentage of the first query over the second query and how i'm going to do that is a select and it's going to be open parentheses so there's our first select we're selecting the total value for our highest spending customer i'm just going to indent a forward slash to indicate division uh and then i'm just going to indent my second sub query and close off those parentheses and then i'm just going to give this an alias as percent of grand total so we can see we're not this select is not actually selecting from a table it's selecting the first sub query as a divided by the second sub query so we can go ahead and execute that and we get naught point naught nor 8 so we have customers it looks like within this data that uh are regularly distributed they spend a pretty even amount the amount of spend is is distributed evenly amongst customers but i hope that was a good example of how we can use multiple sub queries within an expression and once again if you would like to follow along all these code examples will be within the description below really hope you have enjoyed the video if you have feel free to check out my other videos on my channel if you'd like me to do a video on anything you're struggling with then please do let me know in the comments below subscribe to the channel if you haven't already and hit that notification button to be made aware of when new videos are uploaded thanks a lot for watching
Info
Channel: BeardedDev
Views: 1,287
Rating: 5 out of 5
Keywords: subquery in sql, sql subquery, sql subquery tutorial, sql subquery example, subquery returned more than 1 value, scalar valued subquery, self contained subquery, sql server subquery, subquery sql server, how to use subquery in sql, how to use subquery in where, sql where subquery, sql subqueries, sql subqueries tutorial, sql subqueries example, sql tutorials with examples, sql tutorials follow along, sql select subquery, subquery error in sql, when to use subqueries sql
Id: 5c2r97tDAjY
Channel Id: undefined
Length: 24min 47sec (1487 seconds)
Published: Tue Oct 06 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.