Advanced Query Tuning Techniques by Guy Glantser

Captions
Well, in this session of GroupBy, Guy Glantser is going to be talking about advanced query tuning techniques. So take it away, Guy.

All right, thank you. So this is me. My name is Guy, I'm a Data Platform MVP, and I've been doing all kinds of crazy stuff with SQL Server for the past 20 years or so. Of course I made a lot of mistakes along the way — this is how we learn — and I'm going to share with you some of the things that I love to do with SQL Server, which is query tuning. I run Madeira Data Solutions, a company based in Israel. We provide solutions and services, everything around data, mainly SQL Server but not just SQL Server. And as was mentioned, together with Matan Yungman I host the SQL Server Radio podcast, so visit sqlserverradio.com. We talk about SQL Server, about professional stuff, about our experience, we talk about community events, we interview guests once in a while — really fun stuff. If you like SQL Server, this might interest you. And this is me, more or less.

The agenda for today is: demo. That's it for the normal presentation — I'm going straight to the demo, it's not going to be PowerPoint slides and so on. Now, before I start: this is called advanced query tuning, and it's a huge topic. I will not be able to cover everything, of course — there is no such thing as covering everything in query tuning — but I have lots of stuff. I chose some of the topics that I find are very common, where people make the same mistakes again and again, so I think it's important for people to know about them, plus some interesting stuff as well. I will try to put as much as I can into this one hour, or even a bit more. Let's start.

So the first thing — oh, we have a phone call, that's not me. "Now I've got to see if anybody else is unmuted. Can you hear me? All right, go ahead."

All right. So the first thing is, let's say users complain. They say there is a specific report that used to run fine, and now it's not as good anymore, it's slow, please do something about it. So the first thing most people do is use either Profiler or maybe Extended Events to try and track the specific query or the stored procedures running behind the scenes, to see what it's doing and what we can do about it. For the sake of this demo I'm going to use Extended Events, because it's the new thing — I do still use Profiler, really, but you know, I use both. For this demo we're going to use Extended Events. So here it is: CREATE EVENT SESSION. I call it ProblematicApplication, because we have an application that causes problems, and I want to track the rpc_completed event. RPC stands for remote procedure call: every time the application, the .NET application, executes a stored procedure in SQL Server, we get an rpc_completed event when it finishes. And I'm going to put a filter on the client application name, which is my specific application.

Now, I have an application here, and I'm going to run it. It's running — you won't see anything, it's just simulating all kinds of activity behind the scenes, doing stuff like a regular application against SQL Server. So it's running. Now let's create the event session, and now let's start it. And when you use Extended Events, if you're not experienced with it, this is going to look ugly to you. Also if you are experienced with it, it's going to look ugly to you. Let me put this down. All right.
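For reference, here is a minimal sketch of the kind of event session described here; the session name, the application name, and the target are placeholders rather than the exact ones from the demo:

    CREATE EVENT SESSION ProblematicApplication ON SERVER
    ADD EVENT sqlserver.rpc_completed
    (
        ACTION (sqlserver.client_app_name, sqlserver.sql_text)
        WHERE (sqlserver.client_app_name = N'MyProblematicApp')
    )
    ADD TARGET package0.ring_buffer;
    GO

    ALTER EVENT SESSION ProblematicApplication ON SERVER STATE = START;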
This is usually how you query the data in event sessions: you need to use XQuery, because everything is stored as XML. Without getting into all the details, you just get the data. Okay, so we have the events. Each event here is an execution of a stored procedure, and we get the text itself. It's important to get the text itself, because if the stored procedure has parameters, like in this case, then we have an execution example. I'm not familiar with the business, I don't know which parameter values to use. This is of course a simple case, just one parameter; sometimes we might have ten parameters or whatever. So I need an execution example that I can execute myself, and then I can troubleshoot, look at the execution plan and so on. So now I have this, and we can also see — let's run it again, we have some more — that the execution time is reported in microseconds, so it's around eight seconds on average. That's a lot of time for the users, and we need to do something about it.

So we take the execution example — I have it ready here — we turn on the actual execution plan, and I'm going to execute it. Okay, it takes around four seconds. This is still a lot, but it's not eight seconds. Let me do it again: one second. Again. So the first run was a bit slower, but we get around one to two seconds for this. If we look at the plan, we see a huge scan. This is a scan of the entire Invitations table; we have this thick arrow here showing the many rows, and the number of rows read is five million. So maybe this is why it's taking long — but again, if we run it here it takes one second, and if we go back to what we get in the event session, it's around eight seconds, maybe even more.

This is quite common. I'm sure some people know about it, but I know a lot of people don't know why it happens, and it's important, so I want to show you and also talk about some of the things behind it. Why do we have eight seconds coming from the application, when you track it with Extended Events or Profiler, and only one or two seconds when I run it from Management Studio? It's the same stored procedure with the same parameter values, exactly the same.

The first place I'm going to look is the plan cache. I'm going to look at sys.dm_exec_procedure_stats, which gives me statistics about stored procedures in the plan cache, and I also want to see the execution plan itself, so I CROSS APPLY to the function that returns the plan. If we run this, we see that for the same stored procedure we actually have two execution plans in the plan cache. One of them has run 23 times so far, with an average duration of eight seconds; the other only four times, with an average of two seconds. You can guess which one is which, right? This one, with this duration, is coming from Management Studio, and this one is from the application. So we actually have two different execution plans. If we look at the execution plan from Management Studio, it's the one I just showed you, and if we look at the execution plan of the application, it's a different one. But it's weird, because this one looks much better, right? The application takes eight seconds, but look at this: we have an index seek, we have very thin arrows here — almost nothing, no data — and the estimated number of rows is 10. So this plan looks much better, but it takes a lot more time to execute.
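A sketch of the kind of plan cache query used here, assuming the stored procedure name from the demo; the DMV and function names are the real ones, and the duration math is just for readability:

    SELECT
        OBJECT_NAME(ps.object_id, ps.database_id)              AS procedure_name,
        ps.execution_count,
        ps.total_elapsed_time / ps.execution_count / 1000000.0 AS avg_duration_seconds,
        qp.query_plan
    FROM sys.dm_exec_procedure_stats AS ps
    CROSS APPLY sys.dm_exec_query_plan (ps.plan_handle) AS qp
    WHERE OBJECT_NAME(ps.object_id, ps.database_id) = N'GetInvitationsByStatus';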
So we need to understand what's going on. Let's answer one question at a time. The first question is: why do we have two different execution plans? Why not one? It's the same procedure with the same parameter values. So what I'm going to do is use another function, sys.dm_exec_plan_attributes. Each execution plan is stored in the plan cache alongside some attributes describing the plan, and one of those attributes is the set options. Let me show you what the set options are. If we look at a plan and go to the select operator, right-click, Properties, then we have the set options right here. You can see this section; there are seven set options. These set options affect the behavior of your queries within the session — you can change them per session — and each one has a specific meaning. For example, ANSI_NULLS describes what happens when you compare to NULL. If you have SELECT ... FROM Customers WHERE PhoneNumber = NULL, and ANSI_NULLS is ON, you'll always get an empty set, because according to the ANSI definition every comparison with NULL is false — you actually need to write WHERE PhoneNumber IS NULL. But if ANSI_NULLS in your session is OFF, then PhoneNumber = NULL will of course retrieve all the NULL phone numbers, all the NULL values. So that's one example. We have seven different set options, and each one has some effect on how the session behaves. And the thing is, because they might affect the results of your queries, they also might — should — affect the execution plan for such queries. So whenever you have a different combination of set options for two different sessions, each one is going to get a different execution plan. That's a hint.

If we go back to our query and look at set_options: the set options are actually a set of bits, and we get an integer value describing the total value of those bits — for each set option, if it's ON you get a one, otherwise a zero, and then you get the integer representation of that. The first thing we can see here doesn't tell us much, but we do see there is a difference: we have two different numbers, which means we have two different combinations of set options for these sessions, and that answers the first question — that's why we have two different execution plans. Now we want to see what the difference is. I prepared a query here for you — of course I'm going to share all of this later, and there's a sketch of it right after this passage — that uses some bitwise operators to extract the individual set options into separate columns, so we can look at each one. And here it is: everything is the same except for ARITHABORT. It's ON in Management Studio and it's OFF in the .NET application. So this is the difference.

Why is there a difference? First of all, let's answer another question: what does ARITHABORT do? ARITHABORT affects the behavior of queries when you either divide by zero or have an arithmetic overflow — say you're trying to do a calculation and put the result into a tinyint, but it's larger than 255, something like that. For those cases, ARITHABORT changes the behavior: if it's ON, you get an error and the query terminates; if it's OFF, you get a warning and you actually get a NULL value as the result of the calculation. So this might, of course, affect the results of the queries.
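Here is that sketch — pulling the set_options attribute for each cached plan and checking a single bit with a bitwise AND. Per the documentation, the ARITHABORT bit in this bitmask is 4096, and the procedure name is assumed from the demo:

    SELECT
        ps.plan_handle,
        CAST(pa.value AS INT) AS set_options,
        CASE WHEN CAST(pa.value AS INT) & 4096 = 4096
             THEN 'ON' ELSE 'OFF' END AS arithabort
    FROM sys.dm_exec_procedure_stats AS ps
    CROSS APPLY sys.dm_exec_plan_attributes (ps.plan_handle) AS pa
    WHERE pa.attribute = 'set_options'
      AND OBJECT_NAME(ps.object_id, ps.database_id) = N'GetInvitationsByStatus';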
Why do we have these different values? Let me show you. What I'm going to do is monitor login activity: what happens when we open a new connection. So let me open this other event session — I call it MonitorLogins. We have an event called login, and it has something called collect_options_text, and I want to enable that — I'll show you what it does. I also want to add what's called an action, which is like a global field, the client application name, and then I want to filter by the application name, so I only get new connections coming from either my problematic application or from Management Studio when I click New Query. And then I also want the sql_batch_completed event with the same filter. So I'm looking at the login event and the sql_batch_completed event, and I'm going to create this event session. All right — let me stop the application, start the event session, and then start the application again. When it started, of course, it opened a connection, so let's come back here and see what we get.

Here it is. We have the login event coming from our application, we have the network protocol, and we have this batch running all kinds of statements. Instead of trying to read what's going on here, I prepared one in advance for you — here it is, this is how it looks. This piece of code runs whenever we open a new connection from ADO.NET, using the .NET driver, from an application. It's a set of initializations, setting all kinds of set options, including this one: the .NET application initializes the ARITHABORT setting to OFF by default. Now, this is part of the .NET driver. It's been like this for years — forever, actually. It's hard-coded in the driver; it doesn't matter from where you use this driver or how you open a connection, it's always going to run this piece of code. So this is one thing — this is why it's OFF in the application.

So we get this for the application. I'm now going to click New Query, to open a new connection from Management Studio, and let's look again. We get some more events, and this is the second login, coming from Management Studio. Actually, when you click New Query, Management Studio opens several connections behind the scenes and does all kinds of checks, which is why we have several logins. Let's look at the first one, which is my main connection, my session. Management Studio does the same thing, because it's the same .NET driver behind the scenes, but then it also runs several batches after that. One of them is this one, and it looks like this. So after running the .NET driver code, Management Studio runs this to do its own initialization, and it overrides the default ARITHABORT setting of OFF by setting it to ON. It's like saying: instead of changing what .NET is doing, I'm not trusting the .NET driver, I'm going to override everything and do it how I believe it should be done. And this is practically what's going on inside Microsoft, more or less. The .NET team developed the .NET driver many, many years ago and put this piece of code in it. At some point, in 2005, SQL Server Management Studio was released for the first time, and the SQL Server team said the best practice is to use ARITHABORT ON. So instead of going to the .NET team and telling them, please change it, they just added their own initialization on top.
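If you capture that login event with collect_options_text enabled, the initialization batch sent by the ADO.NET driver looks roughly like this — the exact list of options can vary by driver version, so treat it purely as an illustration; the line that matters for this story is the ARITHABORT one:

    SET QUOTED_IDENTIFIER ON
    SET ARITHABORT OFF
    SET NUMERIC_ROUNDABORT OFF
    SET ANSI_WARNINGS ON
    SET ANSI_PADDING ON
    SET ANSI_NULLS ON
    SET CONCAT_NULL_YIELDS_NULL ON

    -- And the batch Management Studio runs afterwards for its own initialization
    -- overrides the one setting that matters here:
    SET ARITHABORT ON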
And from that point until today, we have different settings in the .NET application and in Management Studio. So that's the reason why. By the way, if we go — here it is — to MSDN, to the page describing ARITHABORT, they actually put a warning there: the default ARITHABORT setting for SQL Server Management Studio is ON; client applications setting ARITHABORT to OFF can receive different query plans, making it difficult to troubleshoot poorly performing queries. So instead of fixing the problem, they put a warning there — now it's your responsibility. And this is how it has been for many, many years, and I see a lot of people fall into this trap, because it can happen to you: you try to troubleshoot something in Management Studio and you're actually looking at the wrong plan, and you might spend a lot of time and effort trying to improve something that isn't really happening in the application — it's not the real problem — without even knowing about it. So it happens once in a while.

Okay, so we've answered the question of why it happens. Let me close some things here, stop and drop this session, and this one too. So what should we do about it? First of all, one thing that is important to say: we're going to have two plans anyway, because we have two different combinations of set options. But the reason the two plans are actually different from each other, different shapes, is not ARITHABORT itself. ARITHABORT doesn't change anything in my specific query in this case; the reason is something else. For some reason, when there was a new compilation from Management Studio, we got a different plan. The ARITHABORT setting only created the situation where we needed to compile again. It's important to understand this — it has nothing to do with ARITHABORT itself, with its meaning. In this case it was parameter sniffing, which is very common: I used a parameter value in Management Studio which was different from the parameter value the application used the first time, and for each parameter we got a different plan.

Now, what can we do about it? The best practice, again according to Microsoft, is to always SET ARITHABORT ON. So what we should actually do is complain to Microsoft about the driver code — that won't help you. What we can do is have some kind of mechanism in your application, in your data access layer or wherever, to always SET ARITHABORT ON on your connections. But if you don't do that — and almost nobody does — if you can't do it or don't want to do it, then at least when you try to troubleshoot something that's run by the application, change ARITHABORT to OFF in Management Studio. By doing that, if we run it, we're now actually reusing the plan the application uses, and now I can really investigate what's going on in the application, why it takes eight seconds or even more in this specific case.

Yes, we have a question. Steve asks: do you happen to know what the default ARITHABORT setting is in PHP? I have no idea — great question, I really don't know. I can try and check that, but I don't know. Okay, so now I get the plan the application uses, and I can troubleshoot and investigate and try to understand what's going on, why it takes so long. In my case it was even 15 or 16 seconds.
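Just to spell out that troubleshooting tip as a sketch — the procedure name is the one from this demo, and the parameter name and value are assumptions:

    -- Reproduce the application's plan in Management Studio:
    SET ARITHABORT OFF;
    GO
    EXEC dbo.GetInvitationsByStatus @StatusId = 1;
    GO
    -- Remember to switch it back when you are done:
    SET ARITHABORT ON;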
So let's look at it. Maybe the second question is: how come this plan uses an index seek and takes so long, while the other plan uses a clustered index scan over the entire table, five million rows, and takes just one or two seconds? It looks like it should be the other way around. Maybe the hint to the answer is the thickness of these arrows. If you remember when I showed you this plan before, the arrows were very thin, it looked very fast and efficient, but now they're very thick. And of course the reason is that before, I looked at the plan cache and only showed you the estimated plan, and in the estimated plan you see the estimated number of rows — the thickness of the arrows follows the estimated number of rows. Now I'm looking at the actual execution plan, and the thickness represents the actual values. If we look here, we can see there is a huge gap between the estimated number of rows, which is just 10, and the actual number of rows, which is a million and a half, more or less. And this is the reason for the poor performance in this case. The optimizer estimated 10 rows coming from this index seek, and for 10 rows it's very beneficial to use nested loops, because we only do 10 key lookups — that's great. In reality we're doing a million and a half key lookups, which is not very good, and we get this number of logical reads, which is very high — I didn't show you before, but it's very high compared to the other plan. So that's the problem in this case: while the other plan is doing a scan, the scan is much more efficient here than doing one and a half million key lookups. That's why we have such different performance metrics.

Now, what can we do about it? We can do all kinds of things, now that we have a poor plan in cache and the application is suffering from it. The first thing is very easy, but it won't always work — you depend on luck in this case — and that is to recompile this stored procedure. By using sp_recompile, I'm instructing SQL Server to mark this plan as invalid — actually to mark both plans, we have two plans now — so all the cached plans are invalid, and the next time a user executes the procedure from the application, it's going to compile again, and maybe, it's a matter of luck, we'll get the good plan next time. It's just gambling in this case. Let's do it. Now if we look at the plan cache — the application is running — we have just one plan, the new one. It has already been created, and it's the good plan in this case. So it worked this time. It will not always work, because sometimes you might again get a new compilation with the wrong parameter value, and then of course you need to troubleshoot more. I'm not going to go into too much detail here because I have other things to show you.

If we go back and run it again in Management Studio, now I get the same plan. If I look at the plan cache now, we still get two plans, because of the same thing — the different set options — but now this plan and this plan are actually the same shape; we just have two instances of the same plan because of the different set options. Right. Now, if we look at the stored procedure itself, GetInvitationsByStatus, let's look at the code. It's very simple: SELECT TOP 10 from Invitations WHERE StatusId equals something, ordered by the creation date and time. We have different statuses, and I want to look at the invitations with a specific status — the top ten, ordered by CreatedDateTime.
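Roughly, the procedure looks like this — a sketch reconstructed from what's shown on screen, so the exact column names and the sort direction are assumptions:

    CREATE OR ALTER PROCEDURE dbo.GetInvitationsByStatus
        @StatusId INT
    AS
    BEGIN
        SET NOCOUNT ON;

        SELECT TOP (10) *
        FROM dbo.Invitations
        WHERE StatusId = @StatusId
        ORDER BY CreatedDateTime DESC;   -- assumed: the latest ten invitations first
    END;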
So if we look back at the plan — here it is — we're scanning the entire table, five million rows, and applying a predicate, because there's apparently no good enough index in this case. Note that it's a plain predicate, not a seek predicate, on the status. So we're scanning the entire table. Now, we know we have an index — we saw an index seek before, we have an index on the status — but because the optimizer now estimates the correct number of rows (compare this one to this one), it knows it's not a good idea to use that index, so it uses a clustered index scan. Of course we could make that index a covering index by adding included columns; that would make things better, but not good enough. What I want to show you is something else. The query orders by CreatedDateTime and takes the top ten, so we only need ten rows out of five million. And we know the distribution — in this case you really do need to know the distribution — there are only three different statuses, each with around a million and a half rows. So if we had an index on CreatedDateTime, the optimizer could use that index to avoid sorting by CreatedDateTime: it can scan the index in sorted order and just take the first ten rows after applying the filter. Let me show you. We go back here and create this index on CreatedDateTime (a sketch of it appears right after this part). Oh — that took a while. Really? It could be the overhead of using Extended Events, who knows; the session — right, we should have stopped it, but I closed it.

Okay, so we have the index. Let's try the query again and look at the plan. We have a different plan: an index scan — it's not a seek, it's an index scan — on the CreatedDateTime index, and we're actually reading 30 rows. We need 10; we end up reading 30 rows from this index, but we do it in an ordered fashion, Ordered = True, so we take advantage of the sort order of the index. We read the first 30 rows, and then, after the key lookup, we apply the predicate on the status. So after applying this filter, out of 30 rows we get the top 10 we need, and we're done. We get down to this number of logical reads, which is really, really low. Even if we added an included column to the index we had before on the status, we wouldn't get this great result.

So the thing I want to show you here is that usually, when people look at execution plans and analyze queries, we tend to look at the WHERE clause first. We have a filter here, so we need an index on StatusId — let's add an index on StatusId and see how it affects performance — and sometimes people just stop there. You should always also look at the ORDER BY, and sometimes the GROUP BY, in the query, because as you can see in this case, adding an index on CreatedDateTime is much, much more beneficial. It really depends on what the query is doing; don't just look at the WHERE clause predicates.

All right, are there any questions? No, you're good, you're good. A couple of people asked about the code map on the right-hand side in SSMS, and I sent them over to the web post: right-click the scroll bar, Scroll Bar Options, and choose map mode instead of bar mode. And now everybody can concentrate on what I'm showing — don't look over there anymore, look at my code.
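Here is that sketch — the index created on the creation date and time; the index and column names are assumptions based on the demo:

    CREATE NONCLUSTERED INDEX IX_Invitations_CreatedDateTime
        ON dbo.Invitations (CreatedDateTime);

    -- For comparison, the "obvious" index on the WHERE clause column alone
    -- is not enough here, even with an included column:
    -- CREATE NONCLUSTERED INDEX IX_Invitations_StatusId
    --     ON dbo.Invitations (StatusId) INCLUDE (CreatedDateTime);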
All right, so the next thing. The lesson to take from this first part: next time you troubleshoot an execution plan in Management Studio, remember that you need to SET ARITHABORT OFF, or at least make sure you use the same set options as the application; otherwise you might be looking at a different plan. Remember that — I showed you what happens.

Let's move on to the cost. I'm going to run this stored procedure; it creates synthetic invitations. I won't tell you what it's doing, because it's criminal. So it's running, and I'm going to show you the plan while it runs. Let me open the procedure so you can see what it does. We're declaring some table variables here — eligible members, potential couples. It's a dating website, so we try to correlate members with each other and create potential couples. We take 200 random members and another 200 random members and try to match them based on country, sexual preference, birth date and things like that; we delete all the non-eligible members, and eventually we insert the results into our Invitations table. Those are fake invitations — we're faking invitations on our website to make it look like we have some traffic. Yes, that's what we're doing. It's a legit business. Or not.

So let's look at the plan. It took 41 seconds. We have a bunch of statements here: this is the insert into the first table variable, this is the delete, and so on. As you can see, looking at the cost, the first statement is 61%, then 0%, 39%, and that's it — another 0 here and a 0 here. So we get 61% for the first statement, meaning this one — the clustered index scan, or the hash match aggregate, in this plan — takes most of the time in this stored procedure. Right? Wrong. This is what I want to show you: this is wrong. A lot of people look at the cost and, based on the cost, decide where to start. We have lots of statements — where should I start? I'll start with the highest cost, it makes sense; let's put our effort there and see how we can improve that part, that statement, this insert into eligible members, and so on. And of course the same goes within a specific plan: inside the plan we have the relative costs, 37% here, 59% here, so again, where should I start? If we have a more complicated plan, like this one here, we look at the cost and maybe this 42% — this sort — is the problem, and we need to look at why it's doing a sort and how we can maybe eliminate it, and so on.

I want to show you something else. Let's track the execution with another Extended Events session. I need the object ID and the database ID — the database ID is 16, and this is the object ID. I'm going to create an event session and track the sp_statement_completed event (there's a sketch of this session right after this passage). Every time a statement within the stored procedure finishes, I get an event, and I only want to focus on this specific stored procedure, the one we're running to create the synthetic invitations — I don't want to get all kinds of garbage in there. I'll start the session and run the procedure again, the same thing. It takes about 40 seconds, but now, while it's running, I can see what's going on: every time a statement finishes, I get an event. So we have three out of five already — which is weird already.
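Here is a sketch of that session; the database ID and object ID are placeholders — you would plug in your own values, for example from DB_ID() and OBJECT_ID('dbo.YourProcedure'):

    CREATE EVENT SESSION TrackProcedureStatements ON SERVER
    ADD EVENT sqlserver.sp_statement_completed
    (
        WHERE (source_database_id = 16 AND object_id = 123456789)   -- placeholders
    )
    ADD TARGET package0.ring_buffer;
    GO

    ALTER EVENT SESSION TrackProcedureStatements ON SERVER STATE = START;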
I showed you earlier that 61% of the 40 seconds was supposed to be the first statement, right? That doesn't make sense — and it's still executing. When it's done, I want to show you the numbers, and we'll more or less be able to see which statement out of the five is really the bad guy, and of course we'll talk about why. Let it finish... almost done... all right. So if you look here, this one is 39 seconds, out of 41 in total; all the rest are nothing compared to it. And it's actually the fourth statement: according to the actual duration, the fourth statement takes 39 seconds. But if we go back here: 61%, 39%, and the fourth statement, which is right here, is 0%. So there's a huge difference between what the cost is telling us and the actual duration, at least based on Extended Events.

So what's going on? The problem is that when you look at the cost, the cost is estimated, and it's always estimated. This is confusing, because we have the estimated plan and the actual plan, and in this case I'm looking at the actual plan. If I look here, for example, we can see actual values — this is the actual plan — but even in the actual plan, the cost is estimated. Those are the estimated values the optimizer calculated during compilation, during the generation of the execution plan. There is no such thing as actual cost. The optimizer never calculates an actual cost; it calculates the cost as part of the optimization process, to evaluate different alternatives and choose the best one, the one with the lowest cost. So the cost is something that exists only during optimization, and it's always estimated.

And there's a paradox here: when we have performance problems, in most cases — from my experience — it's because the optimizer is wrong somewhere with its estimations. And if it's wrong, then the cost numbers you're looking at are wrong; they're lying to you. They are not the real numbers, because they're based on those estimations, and you're trying to understand why there's a gap between estimation and reality — it's a loop. So don't trust the cost. You can maybe use it sometimes — I'll talk about when — but don't trust the cost, and don't start with the cost. I would start by looking for the operators where there is a big difference between the estimated and the actual number of rows. It's not always easy to look at everything and see where the problem is; you need to understand what the stored procedure is doing, of course, and where to start — it's not that simple — but usually I just look around for the differences, and in this case I'll show you, because I know where it is.

If we look here, for example — we talked about the fourth statement, this is the fourth one, the one with the 0% that actually runs for 39 seconds. We can see that the estimated number of rows is just one, and the actual is this — and of course that's a big difference. And the problem is that because the optimizer thinks it's one row, it thinks it's a good idea to do nested loops, with only one iteration of this table scan — and this table scan reads many, many rows. But this is the number of executions: it's actually going to execute that scan this many times, ending up with this number of rows. So that's the problem, and that's why it's taking so long.
All right, so again: don't trust the cost. Look for differences between the estimated and actual number of rows, and when you find a difference like in this case, you're halfway to the solution, because now you know what the problem is. Usually — not always; it might be some hardware problem, something else — but from my experience, in most cases this will be the problem. You need to find where the gap is, and then understand why the optimizer was wrong with its estimation. There are all kinds of reasons. We talked about parameter sniffing before; here is another reason, and I'm sure a lot of people know about this one: we're using table variables. With table variables, the optimizer doesn't have statistics to rely on, so it will always estimate one row, a single row, even if there are 2,000 rows like in this case, or two million rows, or whatever. One row, because it's effectively blind — it has no clue about the number of rows in the table. So that's the problem in this case.

What I'm going to do is very simple: I'm just going to replace the table variables in this stored procedure with temporary tables — we have two of them — and that's the only thing I'm changing, nothing else (a minimal sketch of the swap appears at the end of this part). Let's run the original first — sorry, it's quite long — and then let's run it again with the change; now it should be faster. That's it: instead of 41 seconds, just seven seconds. Much, much better. And now — of course we should look at what's going on inside — I can tell you that there is no longer a big difference, in most cases, between the estimated and actual number of rows. And if there is no such difference, it means the cost values are now more or less accurate. So now the first statement really is taking 57%, but it runs in a different way. And if we look at the fourth statement now, it's 2%, and it really is around 2%, because now the optimizer knows the number of rows, so it doesn't choose nested loops — it chooses the hash match algorithm, which is much better in this case: we just scan each one of those tables once. And if we look here again — forget about the CREATE TABLE statements, these are the five statements from the second execution — now you can really see that the first statement takes most of the time, around six seconds out of seven. So the numbers are now much more accurate than what we had before. Don't trust the cost; look for those differences.

Of course, there are other reasons why the optimizer might be wrong. It can be that statistics are not updated, or that statistics are not accurate enough because we're using a sampling rate which is too low — you might want to try FULLSCAN when updating statistics — or maybe we have statistics with FULLSCAN and they're up to date, but they're still not enough for SQL Server to estimate correctly because there is skew in the distribution of the data. That can also happen. And there are all kinds of other reasons: maybe you're using local variables, for example, or expressions within your queries like ISNULL — at optimization time that's an unknown value, just like a local variable. With all of these things, SQL Server will have a hard time estimating the correct number of rows. All right, let's close this one. I'll skip this part; I want to move on to something more interesting.
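Before moving on, here is that minimal sketch of the swap; the table and column names are hypothetical, based on what the demo procedure declares:

    -- Before: a table variable. The optimizer has no statistics on it and assumes one row.
    DECLARE @EligibleMembers TABLE
    (
        MemberId INT NOT NULL PRIMARY KEY
    );

    -- After: a temporary table. Statistics are created on it, so the optimizer can see
    -- roughly how many rows it holds and choose joins accordingly.
    CREATE TABLE #EligibleMembers
    (
        MemberId INT NOT NULL PRIMARY KEY
    );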
The next thing is something I encountered not so long ago. I helped a client with a performance problem, and this is what I got — this query here. It selects from Invitations inner join MemberSessions, and then we have two predicates, two filters, in the WHERE clause: one of them is on the receiving member ID, this one, and the other is on the status. Let's run it first, with the actual execution plan, and then we'll analyze it. Now, in this demo environment it runs very fast, but in my client's environment it was a much, much bigger table, and this query was used in several places, so the impact was very dramatic.

What we can see here — I want to look at this specific index seek. We have an index seek, but we have a seek predicate and also a predicate. First of all, let's talk about the difference. The seek predicate is how SQL Server uses the index seek: it seeks through the index to filter for the specific rows using the index keys. In this case we have an index — based on the name I can tell you — on StatusId and then ReceivingMemberId. It's a composite index composed of two index keys, first the status and then the receiving member ID, and we have an index seek operation. It's not so simple to read, but it's actually filtering by status and receiving member ID, because those are the two index keys — and it's a range, it's actually a range scan. It does a seek to locate the first row in the leaf level of the index, and then it does a range scan, a partial scan, from this value — member 1234 and status 1 — ending at this value — status 3 and member 1234. So this is how it translates our predicates: a range scan between 1 and 3, which is the first index key, and 1234 to 1234, which is the second one.

If we go back, what I want to show you is that the number of rows read is around three million, and we only need 46 rows. Not very efficient — we had to read a lot of rows. And we have a great index, right? We have a composite index on the exact two keys that we use in the query, so we'd expect it to be very efficient, because the query should return only 46 rows and it uses both index keys — but still we have to scan three million rows. So the question is why, and how we can improve it.

The reason it happens is because of this range. The status is the first column in the index, and status is a very non-selective column, meaning there are only three or four different values for the status, and out of five million rows there are around a million, or even a million and a half, rows for each value. So it's very non-selective. ReceivingMemberId, on the other hand, is a very selective column: there are many, many unique values, and for each unique value there are maybe ten rows, something like that. Because the status is the first key in the index and we have a BETWEEN operator, SQL Server translates it into a range scan, and it actually has to scan all the rows in the index between StatusId 1 and 3, and only then apply the second filter — which is part of the index seek, but because it's on the second key it doesn't help us much. So the first thing we could do — the best thing to do — is to reverse the order of the columns.
This is important: you need to choose the order of the keys, the columns, in a composite index very carefully. There are all kinds of considerations, of course, but if you can, choose the more selective columns to be first. So here I would create an index on the member ID and then on the status; that would change things. But sometimes I can't — like in this case. Here I can't do it, because it's a client, it's a production environment, it's not something you can do just like that, reverse the index. Maybe they need the existing index for other reasons and don't want to create a new one, for all kinds of considerations. So this is what we have.

One thing we can do: instead of BETWEEN, we can write this — IN — which is actually the same thing, right? What's the difference? BETWEEN 1 AND 3 on integer values — the optimizer knows it's 1, 2, 3 — it's the same thing. But if we do it, then boom: this is what we want, this is what we expected. I have a composite index on exactly the things I filter on, I want to get only the rows I need from the index — 46 rows — and that's what I read. It's much, much more efficient now. And you can see here — again, it's hard to read — that we actually have three different seek operations. Seek keys: here's one, here's two, here's three. So we have three index seek operations, one after the other, and each one is filtering not a range but a specific equality predicate: status equals 1 and member equals 1234, then in the next seek operation it's 2 and 1234, and so on. Now we're doing specific seek operations, each time retrieving only the ten or so rows we need — altogether it's 46 — and we don't have to scan the entire range of statuses between 1 and 3, which is practically the entire table. The optimizer, in this case, is not smart enough to understand that both queries are actually the same and to treat BETWEEN the same way, so we need to do it for the optimizer. This is where you might want to prefer IN over BETWEEN. Usually people say BETWEEN is preferred over IN, and in most cases it doesn't really matter, it's not a big difference, but in a case like this, where you have an index which is not optimal for the query because of the order of the keys, it's a big deal, and you should prefer IN.

Of course — just to show you — if I create a nonclustered index on ReceivingMemberId and then StatusId, the opposite order, and then run the query again, still using BETWEEN, now SQL Server can use the new index: first ReceivingMemberId and then StatusId. It's a single index seek, but we first apply an equality predicate on member 1234, and only then a range scan on status between 1 and 3, and this is much, much more efficient — we get the same result, actually reading only 46 rows. So if you can create the right index, of course that's better, but if you can't, at least in this specific case, prefer IN over BETWEEN.

Steve asks: how about using OR, like status equals 1 OR equals 2 OR equals 3? Great question. OR in this case is just like IN: if you write OR, the optimizer translates it to IN (1, 2, 3), and it will actually do three seek operations, just like before. So it's the same thing. Great question, thanks. All right, I think we're done with this one.
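For reference, a sketch of the two forms side by side, plus the reversed index; the table, column, and index names are assumptions based on the demo:

    -- With a composite index on (StatusId, ReceivingMemberId), the BETWEEN form becomes
    -- one range seek that reads every row with StatusId between 1 and 3:
    SELECT *
    FROM dbo.Invitations
    WHERE StatusId BETWEEN 1 AND 3
      AND ReceivingMemberId = 1234;

    -- Spelling out the values gives three narrow equality seeks on the same index:
    SELECT *
    FROM dbo.Invitations
    WHERE StatusId IN (1, 2, 3)
      AND ReceivingMemberId = 1234;

    -- Best option, when you are allowed to change indexes: selective column first.
    CREATE NONCLUSTERED INDEX IX_Invitations_ReceivingMemberId_StatusId
        ON dbo.Invitations (ReceivingMemberId, StatusId);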
Any more questions? Nope, you're good. Okay, excellent. So let's talk about implicit conversions. In this case we have Members and Payments, and I want to look at all the payments made by members on the website — but only members who registered on a specific date, in this case January 1st, 2012 — and only payments below $10, let's say. And I want to order by member ID and payment date and time. Let's run this. We get nothing back, but we don't care about the results, we care about the plan. We don't have a plan, so we run it again; okay, now we have a plan.

We have, in this case, a very simple plan, and not a very optimized one. We have a scan on the Members table, a scan on the Payments table, and a hash match join between them, so we actually scan everything. If we look at the predicate, we scan the entire Members table — in this case the table is 100,000 rows — and we have a predicate here to apply the filters, in this case on the registration date and time. And we have another scan below, on the Payments table; again we scan the entire table and apply a predicate, in this case on the amount, but essentially we read everything.

First, I want to show you something. If you look here, we read 100,000 rows but we actually return only 10. So the number of members who registered on January 1st, 2012 is ten — ten members. And if there are only ten members, then if I had an index on member ID — because that's how I join between the tables — if I had an index on member ID on the Payments table, the optimizer would be able to use nested loops and use that index seek only ten times, which is very efficient. But because there is no such index, it has to do a scan, and doing that scan ten times inside nested loops is not such a good idea, so it's a hash match. So I'm going to create this index on MemberId on the Payments table, and then let's try the same query again. Oops, sorry — same plan. It's not using the index. I don't know why; we have ten rows, a very efficient index seek would be possible here, and I would expect to see nested loops and an index seek.

One thing we can do in these cases — and it can be very powerful when you're trying to debug things, to learn and troubleshoot — is to use an index hint. Now, don't do this in production, but in your test environment, when you're trying to troubleshoot, to learn, to see what's going on, this is a great way to understand why the optimizer didn't choose to use the index. So let's instruct the optimizer to use this index. And we get worse performance than before — it takes some time — so there is a reason why the optimizer didn't use it. Now we don't have an index seek, we have an index scan. We forced the optimizer, and it essentially said: okay, I don't have a choice; I cannot do an index seek for some reason — we still don't know why — but I can do a scan. And then we have to scan the entire index and do a key lookup to get the other columns. This is very, very inefficient, because we get two million rows from here, the entire table, and we have two million key lookups to get the extra columns. So yes, in this case, don't use the index. But we want an index seek, so we can also try to add FORCESEEK — we can force the optimizer to use a seek on this specific index — and then we get an error: "Query processor could not produce a query plan because of the hints defined in this query."
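For reference, a sketch of the hint syntax used here — again, for troubleshooting only, not for production code; the table, column, and index names are assumptions based on the demo:

    -- Force the new index; the optimizer obeys, but falls back to a scan plus key lookups.
    SELECT p.MemberId, p.Amount, p.PaymentDateTime
    FROM dbo.Members AS m
    INNER JOIN dbo.Payments AS p WITH (INDEX (IX_Payments_MemberId))
        ON m.MemberId = p.MemberId
    WHERE m.RegistrationDateTime >= '20120101'
      AND m.RegistrationDateTime <  '20120102'
      AND p.Amount < 10
    ORDER BY p.MemberId, p.PaymentDateTime;

    -- Demand a seek as well, and the optimizer gives up with
    -- "Query processor could not produce a query plan because of the hints defined in this query":
    -- ... INNER JOIN dbo.Payments AS p WITH (FORCESEEK, INDEX (IX_Payments_MemberId)) ...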
So in this case the optimizer says: okay, this is too much. I can use this index if you really want me to, but I cannot use it with an index seek. Now, it doesn't tell us why, and this is where we need to start analyzing the query better. Let's go back, without all the hints, and try to understand for ourselves what's going on — why the optimizer cannot use an index seek on member ID in the Payments table. And we do have a hint: we have this warning here. We should look at warnings; if you see warnings, please read them. In this case the warning is a type conversion, CONVERT_IMPLICIT: we take the member ID from the Payments table and SQL Server has to implicitly convert it to integer, to int. And the warning says this implicit conversion may affect two things. One of them is the cardinality estimate — the optimizer is saying: I might be wrong about the number of rows because of this implicit conversion, because I cannot use the statistics efficiently; I would need to convert each value to look at the statistics, so it doesn't really help me. And it may also affect the seek plan — which is exactly our problem: I might not be able to use an index seek operator in my query plan because of this implicit conversion.

Unfortunately, again, it doesn't tell us where in the query we have this conversion, so we can either look for it manually or look at the XML and do a find on this ugly text here, but I can tell you where it is. If we look here, we do a scan again on the Payments table and then we have a Compute Scalar, so this might be a good place to look. The Compute Scalar — if you just look at it, it tells you nothing; you need to look at the properties, and there is Defined Values. In the Defined Values we can see that yes, we take the member ID and we do a CONVERT_IMPLICIT to integer. This means that for every row coming from this Payments table, we need to implicitly convert the member ID to integer so that we can do the join — because of course this one is an integer, but what is this one? It's an nvarchar. Of course we go to the table, we look at Payments, and unfortunately the member ID there is not an integer. This is a design problem, because if the member ID is an integer in the Members table, it should be an integer here too — if we want to join between the tables, it doesn't make sense to make it something else — but I see this on all kinds of occasions. So it's a design problem in this case, and the right way to deal with it, the best way, is of course not to make this mistake in the first place, or, if we can afford it, to correct it now: change this data type to int. Of course this can be a very heavy operation in a production environment if the table is very large — there are all kinds of ways to do it in chunks, or using partitions, or things like that — but that would be the best thing to do.

If we can't do it, we have to live with this mistake and with the nvarchar(20), and then at least what we can do is change the order of things. Because think about what SQL Server does: we have ten members coming from here, with integer values, and then we have lots of rows coming from here — 20,000 rows in this case — and SQL Server implicitly converts each one of those 20,000 rows in order to do the join. If SQL Server would instead convert these ten rows into nvarchar, into strings, then we could actually use nested loops and do the index seek only ten times, just like we wanted before, and that would be much better. But the problem is that SQL Server does not have a choice in this case.
When SQL Server has to convert between two different types, there is a specific order — there is a data type precedence table, you can search for it on MSDN — and SQL Server must always convert from the lower data type to the higher data type. There is a hierarchy of data types in SQL Server, so in this case it will always convert from nvarchar to integer, not the other way around, and that is not good for us. But we are smarter; we know we can do better, so we can actually instruct SQL Server to do it the right way, by doing this: instead of letting SQL Server do it implicitly, we explicitly convert the member ID — in this case from the Members table — to nvarchar(20). Let's see what happens now. We get the same scan as before on the Members table, we get the ten rows, but look at this: now we finally get the nested loops and the nice index seek on the new index we created — which is on the nvarchar column, of course. We have another key lookup, but just for the ten rows, so altogether, after applying the filters, it's like nothing. It's much, much better. We still get a warning, but in this case the warning is not about an implicit conversion, it's about a conversion — we still have a conversion, of course, but this time we control the conversion, we control the order, because it's explicit. So it's much better in this case.

Just another example, a very simple one, against the same Payments table. I run something like this: this is an nvarchar column, and this is an integer literal. I don't tell SQL Server it's an integer, but it interprets it that way — it actually interprets it as a smallint in this case, based on the value; it can be tinyint, smallint, int or bigint. So this is a smallint, and again we have two incompatible data types, nvarchar and smallint, and SQL Server has to make a decision. It does not have a choice: it always has to convert this one, the column, to the integer type. Think about it — this is so wasteful: instead of just converting this literal to a string, SQL Server has to go through the entire table, do a table scan, and convert each value to an integer, just to see whether it equals the literal. So the result is a scan, and then a CONVERT_IMPLICIT of the column itself, to see if it equals 54321 — very, very inefficient. By just doing this — adding the N prefix — I'm telling SQL Server this is actually an nvarchar, so now SQL Server doesn't have to do any conversion at all, because it's the same data type.

And why aren't we using the index now? The reason is that for this specific value there are many, many rows coming out. And actually, this is what I wanted to show you before and forgot: when we do this without the proper conversion, not only can we not use an index, we also can't use the statistics. So that's another problem: SQL Server estimates 21 rows, because that is the average in the table, when in fact for this specific value there are many, many rows. So it's not just the index, it's also the statistics — and the statistics might affect other decisions made by the optimizer and give you an even worse plan. So the things to do here: during the design process, use compatible data types, and when you write your code, always make sure that you compare between compatible types.
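A sketch of both fixes; the names and types are assumptions based on the demo (Members.MemberId is INT, Payments.MemberId is NVARCHAR(20)):

    -- Convert the ten Members rows explicitly, instead of letting SQL Server implicitly
    -- convert every Payments row; now the seek on the Payments index becomes possible.
    SELECT p.MemberId, p.Amount, p.PaymentDateTime
    FROM dbo.Members AS m
    INNER JOIN dbo.Payments AS p
        ON CAST(m.MemberId AS NVARCHAR(20)) = p.MemberId
    WHERE m.RegistrationDateTime >= '20120101'
      AND m.RegistrationDateTime <  '20120102'
      AND p.Amount < 10
    ORDER BY p.MemberId, p.PaymentDateTime;

    -- Same idea with a literal: match the column's type and no conversion is needed.
    SELECT * FROM dbo.Payments WHERE MemberId = N'54321';  -- same type: a seek is possible
    SELECT * FROM dbo.Payments WHERE MemberId = 54321;     -- scan + CONVERT_IMPLICIT on the column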
In your predicates, when you define parameter types or local variables — wherever you use data types — use the same types when you need to compare between them. And if you have a problem like this one and a redesign is out of the question, then at least use an explicit conversion to make sure the conversion happens in the right order. That would be better.

Any questions? No. Wendy says she really likes that trick. Very cool — which one, sorry, which trick? The one you just did with CAST? Not just now, the one before — casting one side of the join. Okay, right.

So I just want to summarize. I lied a bit before about no more slides; I have some summary slides, just to summarize the takeaways from this session. Again, there are so many things to talk about, but within this one hour I tried to cover some of the more common things that I see. So, one of them: remember the set options, the session settings. Make sure you have the same session settings between different sessions, or at least be aware of the differences, so that you know what you're looking at. Whenever you have different set options, you get different plans — they might end up being the same plan shape or different ones, but make sure you're looking at the right one. If you can, set ARITHABORT ON in your .NET applications; if you can't, then at least when you troubleshoot something in Management Studio, change ARITHABORT to OFF.

Remember that the cost is always estimated. Even when you look at the actual plan, the cost numbers are always estimated numbers. And again, from my experience, usually when there are performance problems you will see differences between the estimated and actual number of rows. Look for those differences; they will tell you what the problem is. Don't trust the cost values — they can be very misleading. I talked about this already; I think this is the most important tip, at least for me. When you do query tuning and you need to analyze plans, the first thing I do is look for those differences. If there is a performance problem, most likely I will find some place where there is a big difference, and then I just need to find out why. Again: statistics not updated, parameter sniffing, table variables and so on — those are the reasons why you might have those differences.

Now, nested loops — I forgot to show this. I just told you to look for the differences between the estimated and actual number of rows, but there is a catch with nested loops. Let me try to find an example... I think this one... no, maybe this one... sorry... in this case, yes — if we look at the plan here — not such a good example; maybe this one would be better — no, same thing — good enough, I guess, to show what I mean; I don't have a better demo for this. Anyway, I just told you to look at the differences. So if we look here, the estimated number of rows is 1 and the actual number of rows is 46 — for the sake of argument, let's say that this is a big difference. But what I want to tell you is that this is not really a big difference; it's something Microsoft did by design. When you look at the lower branch of the nested loops operator — just for the nested loops operator, we have the upper input and the lower input — for the lower branch, the estimated number of rows corresponds to the optimizer's estimate per iteration, what the optimizer estimates will be returned in each iteration, while the actual number of rows is the total for all the iterations together.
OK, sorry, let's go back to the slides.

Never use table variables. Well, I have to qualify that: I'm not exactly kidding, but "avoid table variables as much as you can" would be a better way to put it. There are some rare cases where you might need a table variable, or where a table variable will perform better, but those are really rare cases. There are so many cases where a temporary table will perform much better than a table variable. I can tell you from my experience: I have worked with clients for many years, I get lots of calls about performance problems, I look at the client environment, I look at the stored procedure, for example, and even before trying to understand the business logic and what we can do about it, the first thing I see is table variables. And I say, OK, hold on a minute, let me just replace all the table variables with temporary tables — and boom, everything is so much faster than before, just from doing that. It has happened so many times, and not even once have I personally seen the opposite, where replacing a temporary table with a table variable improved performance. Actually, I did get feedback from two people who told me they had a case where the table variable won, and we analyzed it and realized that yes, indeed, that was correct — but it is so, so rare.

The reason is that table variables don't have statistics and temporary tables do. There are other differences between them, but this is the most important one. SQL Server will never create statistics for a table variable, which means the optimizer, like I said before, is blind: it always estimates a single row. If there are actually many rows, it will typically choose a nested loops join, and if there is a join to another table it might do something like a big table scan many, many times instead of just once, which can be catastrophic for your performance. So this is very dangerous. With temporary tables we have statistics, so we don't have this problem, because the optimizer knows how many rows are in the table. Of course, this also means some overhead: writes to the transaction log, recompilations, statistics updates, and other associated operations we have to pay for. But that cost is so much lower, and the risk is so much lower, than the risk of using table variables — from my experience and from others' as well. So unless you really know what you're doing, and you have a genuine edge case and you know you need it, prefer temporary tables. You will be surprised how much performance gain you can get just by making this replacement.
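Here is a minimal sketch of that replacement, with hypothetical tables. With the table variable, the optimizer has no statistics and estimates one row, which tends to drive a nested loops plan even when thousands of rows come back; the temporary table gets statistics and a row count the optimizer can actually use.

-- Table variable: no statistics, the optimizer assumes a single row
DECLARE @RecentOrders TABLE (OrderId INT PRIMARY KEY, CustomerId INT);
INSERT INTO @RecentOrders (OrderId, CustomerId)
SELECT OrderId, CustomerId FROM dbo.Orders WHERE OrderDate >= DATEADD(DAY, -30, GETDATE());

SELECT c.CustomerName, r.OrderId
FROM @RecentOrders AS r
JOIN dbo.Customers AS c ON c.CustomerId = r.CustomerId;   -- estimated rows from @RecentOrders: 1

-- Same logic with a temporary table: statistics exist, estimates reflect the real row count
CREATE TABLE #RecentOrders (OrderId INT PRIMARY KEY, CustomerId INT);
INSERT INTO #RecentOrders (OrderId, CustomerId)
SELECT OrderId, CustomerId FROM dbo.Orders WHERE OrderDate >= DATEADD(DAY, -30, GETDATE());

SELECT c.CustomerName, r.OrderId
FROM #RecentOrders AS r
JOIN dbo.Customers AS c ON c.CustomerId = r.CustomerId;

DROP TABLE #RecentOrders;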
Next tip, something I didn't demo, but it's something you need to know how to apply. I used to look at plans only from right to left, and I think most people still do. When we analyze a plan we read it from right to left, starting from the rightmost operator: first we do a clustered index scan, for example, then we do a join, then maybe a sort, then maybe compute something, whatever, and then we get the result. That is the data flow. But you should also read execution plans from left to right, which gives you the execution flow, because at runtime execution actually starts from the leftmost operator, and each operator calls a method on the operator to its right saying: get me the next row. I didn't show you this because the demo doesn't fit in the time I have, but sometimes you will understand a plan much better if you read it not just from right to left but also from left to right, and then you realize why it did what it did.

One very simple example: suppose you have a SELECT TOP 10 with an ORDER BY, and SQL Server uses a clustered index scan that is already ordered by the ORDER BY key. Reading only from right to left, people look at the clustered index scan and say, OK, we scan the entire table. But if you look at the actual number of rows, or the number of rows read, you will see that SQL Server actually read only ten rows and that's it. Reading from left to right, it makes sense: you get the first row, then the second, the third, and so on, and after ten rows execution can stop, because it already has the top ten it needs. (There's a small sketch of this after this section.) So in some situations this really matters.

Tooltips are just tooltips. I see a lot of people who look at execution plans, they know they can hover with the mouse and see the tooltips, they know what to look for in the tooltips, but that's where they stop. Don't stop at the tooltips. For each operator, right-click and choose Properties: there are many things in the Properties window that will not show up in the tooltips, and they are important. So remember the Properties window. In some rare cases you might also want to look at the XML: right-click and choose Show Execution Plan XML, and there are some things that appear in the XML but not in the Properties window or anywhere else in the graphical execution plan. Again, this is rare, but remember that you can dig into the XML itself.

There are also many trace flags, which I didn't show in this session, that you can use to extract more information about what the optimizer is doing — for example, which statistics were evaluated or chosen by the optimizer during the optimization of an execution plan, or debugging output you can use to troubleshoot and analyze. There are many such trace flags. You can also use the debugger, but only if you are Paul White. If you're not familiar with Paul White, I really recommend you follow him; he is my guru for anything related to execution plans. This guy just sits with the debugger and SQL Server and knows exactly what's going on behind the scenes for each and every step of the optimization process. But don't do that yourself — it's too much.

Alright. Again, this is something I didn't show you, because I have a huge demo, maybe five days' worth, and I only have one hour, so I removed some of it, but I left the tips in the summary because they are important.
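Here is a minimal sketch of the TOP example, assuming a hypothetical dbo.Orders table with a clustered index on OrderId. Read right to left this looks like a full clustered index scan, but read left to right the Top operator pulls one row at a time from the scan and stops after ten, so the "Actual Number of Rows" / "Number of Rows Read" on the scan should show 10 rather than the size of the table.

SELECT TOP (10) OrderId, CustomerId, OrderDate
FROM dbo.Orders
ORDER BY OrderId;   -- matches the clustered index key, so no sort and only 10 rows are read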
Finally, I can't skip scalar functions. Scalar functions are also bad for performance, because a scalar function is a separate batch, and SQL Server cannot optimize it as part of the query, so it has to run the function for every row in the query. It's essentially like a cursor: instead of a set-based operation, for each row in your query you call the function, push it onto the stack, execute the code, and return the result, over and over again. So the performance is really poor. Either avoid these kinds of functions and embed the code inside the query where you can, or, in some cases, you might be able to convert the scalar function into an inline table-valued function, which will perform much better while still retaining the same business logic, so you can reuse the function and also get good performance. (There's a small sketch of such a conversion after the closing remarks.) So that's it about scalar functions, and I think that's it. Any questions?

Awesome. There are a bunch of people joking in Slack that scalar functions are so CPU-intensive because they're mining Query Bucks, continuously mining Query Bucks like a cryptocurrency. A couple of people say great presentation, nice job. JD asks a question: what do you do if you absolutely must have a scalar function?

If I must have a scalar function... first, it's a good question to ask why, whether I really must have one, because in many cases I don't. There is something quite annoying here: we have to choose between good programming practices — using functions is really a good thing, you reuse your code and don't have to maintain it in different places — and performance, because the two don't work together. So you need to make a choice: do I want good maintainability or good performance? In most cases it's one or the other; sometimes it isn't. But if you really must have a function, then, like I said, maybe you can convert it into an inline function. That won't always work, but you can try, and then you get great performance. If it's a batch doing some business logic, maybe you don't really need it as a function at all; maybe you can convert it into a stored procedure. Again, it depends on how you run your code and your workload; we would have to analyze it a bit more deeply.

There is also something you can do starting with SQL Server 2016, I think — yes, 2016: you can create a natively compiled scalar function, the kind used with In-Memory OLTP and memory-optimized tables, even if you're not accessing memory-optimized tables at all. So if the function is just doing all kinds of calculations, you can natively compile it and it will run faster just from that. Maybe it's still not fast enough, but it can be a really big improvement, so you might want to try it. That's not the original goal of natively compiled functions — they are meant for In-Memory OLTP and memory-optimized tables — but you can leverage them just to get better performance from your functions.

Perfect. Well, thank you very much for teaching today, Guy, always a pleasure to hear you speak. I always look forward to the new SQL Server Radio episodes as well. Thanks, everybody. Thank you very much.
[Music]
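As a small addendum to the scalar-function discussion above, here is a minimal sketch of converting a scalar function to an inline table-valued function. All object names are hypothetical.

-- The scalar version is executed once per row of the calling query:
CREATE FUNCTION dbo.fn_OrderTotal (@OrderId INT)
RETURNS MONEY
AS
BEGIN
    RETURN (SELECT SUM(Quantity * UnitPrice) FROM dbo.OrderLines WHERE OrderId = @OrderId);
END;
GO
-- The inline table-valued version returns the same value, but can be expanded into the
-- calling query and optimized as part of it:
CREATE FUNCTION dbo.fn_OrderTotal_Inline (@OrderId INT)
RETURNS TABLE
AS
RETURN (SELECT SUM(Quantity * UnitPrice) AS OrderTotal FROM dbo.OrderLines WHERE OrderId = @OrderId);
GO
-- Usage changes from SELECT dbo.fn_OrderTotal(o.OrderId) ... to a CROSS APPLY:
SELECT o.OrderId, t.OrderTotal
FROM dbo.Orders AS o
CROSS APPLY dbo.fn_OrderTotal_Inline(o.OrderId) AS t;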
Info
Channel: GroupBy
Views: 4,014
Rating: 4.9215684 out of 5
Keywords: sqlserver, groupby, SQL Server
Id: pJIWgOSE6Mw
Length: 75min 8sec (4508 seconds)
Published: Fri Jun 01 2018