SQL Server Execution Plan Basics

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey guys Kevin Hill here and today's video is getting started with sequel server execution plans and all my contact information is there pause it so you can see it and copy down and feel free to stalk me on Twitter as always moving on this video today is essentially a continuation of one from August 24th but I did which was index and execution plan basics that one was more focused on the creating and non-clustered index then on actually reading the plans that change after you do that so there's the link or you can just go to my channel and again the date was August 24th last year you can see it there so because I want to keep this fairly short and jam-packed with Internet information we're gonna jump right in this is my sequel 2016 instance that I use all the time I'm using the stack overflow data dump and I have purged this down quite a bit to not have nearly the amount of information and it just dumped a bunch of records so it would fit better and back up some easier anyhow the stack overflow and we're gonna be looking specifically at the user's table if you use Stack Overflow or DBA Stack Exchange anytime before 2016 you're in this in this table nothing private of course because they're smarter than that so this table and by the way this database is case-sensitive from the dump so if you're having issues on your own tests that's made something to look into so we've got the user's table and its various columns and I don't think you have an index on here yes I do oops I'll make that go away real quick so we'll pretend I didn't have an index there because I don't want to start recording again and the only index on this table when you get the public dump is on the ID field and it's called users ID because users and ID makes perfect sense but the goal of today is to show you some very very simple execution plans in the graphical format and if you don't even know what an execution plan is that's fine this is very simply the roadmap that the query optimizer uses in order to figure out how to get you the data that you've requested in your query very simple it's it's not a whole lot different from if I want to drive down to the Microsoft building in the Irving area I know roughly how to get there but if I'm halfway there or there's a bridge out or something like that then I've got to make a different plan on how I'm gonna go how I'm going to get there I want the most optimal route so I don't have to deal with a lot of traffic same thing was with sequel server queries the optimizer is going to find the fastest route to get you the information you've requested or do whatever you've asked it to do an update or an insert and then move on and satisfy the queries so there's a couple of ways to do this this is a very simple query here I'm just looking for my own record which is Kevin 3nf my display name which is what you username is called from Stack Overflow users where it's a very very simple statement now there's two things I can do here I can run the query itself of course which won't really give me all the information I need unless I hit this include actual execution plan but on something this simple I can show you what I want to show you just by doing estimated execution plans or the ctrl L shortcut and that's exactly what I'm going to do there's not any real need for me to run this so if I just do ctrl L I get this portion at the bottom because I had no actual data it just says one row affected and that was in fact that the query generated this execution plan this is an estimated plan here it's also going to be the actual one but in a new production environment the estimated plan is it's a best guess and the actual plan when it's run could be quite a bit different so the way you read these things is you want to go right to left and top to bottom in our case we only have one row so there is no top to bottom so we'll go right to left the very first thing this did was scan the clustered index reading again right to left on this table which is what we looked at over here because what I asked for was the display name and because there's no indexes or anything on this table the query processor had to go and read every single page of data down that display name column until it found the Kevin three and F now didn't actually run but it estimated when you hover over one of these operators and that's what these three items are called as an operator when you hover over one of these you get a ton of information if you look at the roughly the middle of that screen the estimated number of rows 1.33 well it's gonna be one because there's only one of me fortunately the estimated number of rows to be read because it knows it has to read the whole table six-and-a-half million okay so I'm gonna reach six-and-a-half million rows to get one that's because there's no indexes if you hover over these arrows in between you also get some abbreviated information looking at the skinny arrows makes me happy because that small number they actually get bigger as we do other things there's so many records in this table you see the the yellow double arrow icon on these operators that means that this query is going to run across multiple processors that is your parallelism indicator on an operator so it's going to use multiple CPUs to return the one record that it's going to bring back and and just for fun I'm going to go ahead and run it and show you that there's only one record there's also no execution plan here anymore because I didn't tell it to include the actual one like I showed you right here control image shortcut for that by the way those are good ones to reroute if you do any query to you at all okay again the most basic query and a very very basic execution plan right-to-left and if we had more we'd go top to bottom as well this green text here is a very very wild guess with a horrible name of an index that would have helped this thing go faster which is really it's nice but you don't want to just grab that and start running it you want to look at that as going hey there's something I need to look at all right so I am in fact going to create an index on this table Stack Overflow database create a non-clustered index that's a different topic on this field is my include as my key column and I'm going to include reputation column which is over here for something I'm gonna do a little bit later now this will just take a few seconds to run again six and a half million records so while it's doing that we'll scroll down this next one all right now I have an index and this is the exact same query just copied and pasted if I do my ctrl L I am no longer running parallel and I am no longer doing a clustered index skin I'm doing non-clustered index seek using the index that I just created as you can see index seek non-clustered in C underscore display name is what I called it and when you hover over that estimated number of rows 1.33 estimated number of rows to be read which was six and a half million also 1.33 because this is eight is a one column index with one included column there are statistics on that the query optimizer can use and go huh that's a much better way to do it than reading the entire table it really is that simple kids and then it selects that one row back for me all right again still a very simple query where I've said display name equals this and only this it's not a range or anything weird like that so speaking of weird let's get weird we're gonna change the equal instead of display name equals given three and F we're gonna change it to like anything that starts with Kevin surely there's only one Kevin at the database right we do control ill we see our index eke still nothing has changed here 100% of the time and that's an important thing to notice I'm sorry the cost was spent in the index seek on this non-clustered index notice the arrow it's a lot bigger than it was this tells me estimated number of rows is over ten thousand well that kind of makes sense because Kevin's a fairly popular name worldwide especially in American still quite a bit in Ireland and various other places so it makes sense there would be Kevin wildcards some other stuff more than just the one time let's just do this for fun we'll go up and we'll run well estimate both of them at the same time by highlighting in both control l and make this a little bit bigger so you can see it and we've gone from index seek to index equally the same the only difference is the size of that arrow and the numbers in these tips here showing you that there's going to be a lot more records and because I like you and I want you to see it we're gonna run that it's not gonna take any time at all ten thousand one hundred and thirty nine Kevin's in the database and if you scroll down you start seeing all these other Kevin's that are in there and eventually you'll find two Kevin 3nf if you look for it hard enough alright like I said same operators different number of rows because sequel server know what you're asking for let me move this down just to here alright instead of just the display name we're gonna add this last access date column to the query I want this in the results set so you're gonna see something different ctrl L to show the estimated plan we still have our index seek on the NC display name index great but now we have an additional operator and now we have two rows we're still reading right-to-left but we're also reading top to bottom because we have a top and a bottom now the index seek simply then seeked into the index found the display name Kevin three and F but then it had to go back to the data pages to find the last X access date because that is not in my non-clustered index it's only in the data pages or the clustered index if you will so it found the one record that matches Kevin three and F it automatically pulled the ID over here with it because it does that behind the scenes and then it went into the clustered index and found the last access date that matched up with Kevin three and F and if you hover over this key lookup it's gonna tell you what it was looking up towards the bottom where you see output list it says I've wanted the stackoverflow users table X last exits date field because she asked for it in the Select statement that makes perfect sense go find this then go find this join them together put them out you know output them to the root to the application without a big deal but a key lookup simply means I used your index and it was great thank you very much but I didn't have everything I needed so I went and got it on my own that's what's happening when you get a key lookup alright here's where things get a little not tricky but expensive we talk about expensive I'm talking about cost here the cost of the query in terms of memory and time spent and resources allocated for it etc etc so to go on forever on those control ill but we all we've done to add in order by clause we're still with last X Essex access date and we're going to order by that last access date which is hilarious because there's only gonna be one row because it's Kevin 3nf but notice that we still have the index seek that hasn't changed 18 percent of the cost is there 20 percent of the cost is in the key lookup because it still has to go grab the last access date from the clustered index or from the data pages got to join them together and then it's got a sort that one row and this estimate says it's going to six been sixty-two percent of its total time which is really just a couple of milliseconds and it's gonna spend that time sorting your one row that's awesome I love it and then it's gonna spit them back out that's all the Select operator does here it just says you asked for some data I selected it here it is alright so we've gone from a clustered index scan and a select all the way over to index seeks key lookups bunch of arrows in between sorting I'm not gonna get into inner joins and there's a lot of different drawing operators some of them can get really expensive but this is basically putting these two pieces of information together to show it to you alright we're getting close to the end we are going to change last access date to reputation now if you remember my index that I created early on had an include column for reputation a display name is a key column reputation as an include column let's see what that does again ctrl L because these are all going to work nothing changed the numbers are exactly the same still seeking the the Kevin three and F still doing a key lookup even though I'm not asking for last X estate it's still doing that and if you hover over this the output list still has last access date because it needs that to do the order by later on this happens after the index seek and the key lookup so it's got to bring that last access date with that means it's got to go find it wonderful last I think this is the last one I've got let's change reputation here and here instead of last access date and see what changes ctrl L bang because my non-clustered index included the reputation column it had it no key look up to find anything else because it was already included in it and then it sorts that one row and it selects it out to you so some of the stuff that we've looked at today and that's all I have for you far as samples clustered index scan I'm reading your entire table those are generally pretty bad you might get a scan sometimes when it's scanning for a chunk of rows in that table it doesn't necessarily mean it's reading the entire table a lot of times it is those are generally opportunities to at least investigate where an index might help you an index scan can be kind of the same thing I'm looking for a range because you asked for everything from last month or last year so it's going to be more than one record it's going to give you a chunk of records and index seek is one of the most ideal things you can get it's not always but most of the time in index seek is what you want if you've created a NICs you want it to be used indexes that aren't being used or just taking up space so an index seek is going and drilling in to a very small subset of rows to help satisfy your query we looked at key lookups when there's an index that can be used it uses the IDs and primary keys of that table to go quickly into the table itself when that key lookup happens and pull a couple of extra columns or some extra columns back out and then it puts them back together and moves on down the road and then of course we saw some swords and then the Select operator the key things to remember when you look at these and here's what's really fun is if I just ran every single one of these queries at once and if I do it with ctrl M and do it again let's do with ctrl L that'll work I'll get some stuff sorry about that I have a bunch of different queries in here and they have used statements and all that I've got every single one of them listed because it all ran is one big batch if you've got some 5000 flying query or something very complicated that's running slow and the developers don't know why this is how you do it get some sample parameters go to your test box run the query itself and you'll get a breakout for each individual query notice that this was 3% of the total time 0:03 if you scroll up and down very quickly hold there's 44 or something Bad's happening there 6% 16 percent 16 you can kind of drill into the things that are more important let's go look at that 44 oh this was a bad one this is where I basically had to go get 10,000 records so you can kind of play with it within this most expensive piece I have all the time of it spent right here so that's where I would look to see why did it do that in this case it was because I used the the wild card Kevin if I were to put percent Kevin instead of Kevin percent or even just say percent Evan and take out the K it's going to read the whole table because you can't do you can't make an index that works with wild cards on either end of a string there's a whole different way to search for those effectively but work if that's way out of scope what we're talking about I'm way over time on what I expected to do here but I wanted to show you a bunch of different operators and how subtle changes can generate different plans that the optimizer is trying to do what you're asking in the quickest possible way biggest thing you can take back from this read from right to the left top to bottom and you'll have a pretty good idea of what's actually happening under the hood if you can optimize things on the right it affects everything else downstream over to the left that's all I've got for you today feel free to follow me on Twitter asks me questions post them in the comments whatever you want to do thanks guys back
Info
Channel: Kevin Hill
Views: 49,202
Rating: undefined out of 5
Keywords: SQL Server, Execution Plan, Performance, Stack Overflow, crash, dump, SSMS, Query, DBA, Database Administrator
Id: 8BmZ9TaRC9k
Channel Id: undefined
Length: 17min 25sec (1045 seconds)
Published: Thu Jan 17 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.