SQL performance tuning and query optimization using execution plan

Captions
hello everyone, my name is Yogesh. In today's video I will explain how to do query optimization in SQL Server. This topic really splits into two parts, and this session covers only one of them. The other part is finding which queries need optimization in the first place; that will be a separate video in the future, because there are a lot of mechanisms to figure out which queries need performance tuning so that tuning them actually helps your server. And understand one main thing up front: it is not only the long-running query that damages your server; in a lot of cases short-running queries do the damage, and we will cover that scenario and see how it makes a difference. So, to be very clear, in this video I am telling you how to optimize a query, not how to find the bad queries; that is a separate topic.

Before starting any tuning, the first thing is knowing how to read an execution plan, and most of the time I have seen that people don't know how to read one. They know the basic rules, like data flows from right to left and the plan is read from left to right, but that is really basic, and most of the time I don't even look at that part; it's not necessary. Suppose you have a plan for a 1,000-line query with 100 joins and 10 or 20 subqueries: you won't be able to figure out where the problem is by reading it operator by operator. So this session focuses on exactly that: just by looking at the plan, you should be able to understand which part has the problem, and then how to fix it.

How to fix it will differ from query to query, because SQL is an area where general programming rules don't apply. To give an example: in programming, say .NET, if I ask you about OOP concepts or the SOLID principles, you can answer, and there is a standard way of doing certain things. In SQL Server there may be standard ways of writing a query, and naming conventions may be standard, but your database is different from my database, so your join conditions will for sure be different from mine. That's why we cannot assume one solution applies everywhere. But the things I will explain here are common across execution plans, so it doesn't matter which database you are using: they will still tell you where the problem is and how to fix it. I will cover a few scenarios.

So let's start with the first thing: how to read an execution plan, and STATISTICS TIME and IO. A lot of people think long-running queries are the problem, but as I told you at the start, a lot of the time a short-running query is more problematic. Suppose you have a long-running query that you run once or twice a month; you can still afford a two- or three-hour query there. But if a short query takes three seconds and is executed a million times, your server has a problem if that query is not tuned. It all comes down to how many times the query runs and how much load it puts on the server. So let's start with a basic execution plan.
And understand one thing: whether you are a DBA or a SQL developer, you need to follow the same steps. First: you should always enable STATISTICS IO and TIME. If you don't need the time (you usually know the time already), you can ignore it, but IO is the most important part. Why is it the most important? If you are not familiar with a few terms, let me introduce them; I haven't put them in the slides, but let me explain. There are two terms: physical read and logical read. A physical read is the actual reading of data from the disk, and a logical read is reading the data from memory. SQL Server does a lot of I/O operations, and I/O can be logical or physical. I/O means input/output: the server has to read data into memory, put intermediate data into memory, and return data to you from memory. For example, it could be that you are querying two tables which are both already in memory: it reads one table, filters out certain records, reads the other table, filters out certain records, makes the join and gives you the output. In all those operations there is input/output involved, and there is always time too, but time is the secondary unit; the first thing is input/output. Suppose I have a query which runs in one second and does 1 million logical reads, and I tune the query so that it still runs in less than one second, but instead of 1 million logical reads it does ten thousand. Believe me, even though you don't see a change in time, it is far better for your server, because now you can run a hundred of these side by side and they still won't add up to 1 million reads.
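As a minimal sketch, this is how the statistics get enabled for a session; the exact demo script isn't shown in the captions, so the query under test here is just an assumed stand-in:

-- Enable runtime statistics before running the query under test
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Stand-in for the query being tuned (table name assumed)
SELECT COUNT(*) FROM dbo.Student;

-- Turn them off again when done
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;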
So let's start with a practical example. First I turn on SET STATISTICS IO ON and SET STATISTICS TIME ON and run this query. It will take a bit to complete; let it run, and you don't have to study what the query is doing yet; first let's focus on what is wrong with it. This is the second query, which is the second version of the same logic; let me put it into another tab, run SET STATISTICS TIME and IO ON here as well, and run it. This query ran in a fraction of a second; the total number of records it gave me is 384, this is a very simple execution plan, and these are the statistics I got. Let me drop one index (indexes I will be covering in the next session I am making) so that we can see the performance without it. Okay, the first query is still running, so let it run, and while it executes let me show you my system performance. Here you see the CPU is going high; now the query is returning the data; let it complete, and you see a sudden drop in my CPU usage as well. This is one single query; my system has 8 GB of RAM with an i3 processor, so just one query can make the CPU go high. Let it complete; it will take a minute, just bear with me, because I want to show you this. This is the only long-running query in the video, and after it you will understand; I'll even give you the hint right here: this is the bad pattern we are getting at. Now the query is complete, and you can see the SQL Server CPU usage: it was between about 47 and 60 percent, and now it is just back to 20. You can see a sudden drop in CPU, just from one query.

Okay, now let's see what is happening in this query. Let me drop this index again, since that query was using the index; it won't take that much time. Rerun the query. Here, see the output: 384 records here and 384 records there. How much time taken? One minute 53 seconds here, and less than a second there. If I look at the execution plan of the fast query, it's a simple execution plan; the plan of the slow one is a little bit bigger.

Now we will break the execution plan down into multiple things, but first let us see how to analyze the statistics output, because here I am running one query, but suppose a thousand tables were involved: the statistics output would not be small, it would be very big, so how would you analyze it? You go to a website called Statistics Parser (statisticsparser.com); it's a very good website and free to use. You just copy your statistics output, paste it there and click Parse, and it gives you a much more readable format: scan counts, logical reads and so on; for example, the student table had one scan and a hundred-odd logical reads. Now let me copy the statistics from the other query, which was taking more time, paste them, and see what they say. Here you see how many logical reads I am doing: the scan count is around sixty-nine hundred and the logical reads are a seven-digit number, while for the fast query the logical reads are four digits and the scan count is one or two digits. And the irony is that both return the same data, 384 records and 384 records, yet one finishes in under a second and the other takes almost two minutes. So here is your first lesson: use STATISTICS TIME and IO to determine which query is expensive and which is not, based on the logical and physical reads. Physical reads usually don't matter a lot, because only the first time you read the data will it come from disk; logical reads are what matter, because if they are high it means every time your query runs it has to do a lot of input/output in memory.

Now, what do we look for in the execution plan? This first example is about spooling. There is an operator called a lazy spool; there are multiple kinds of table spool, and the lazy spool is one of them. Whenever you see an execution plan containing an operator called Lazy Spool, believe me, that is the problem. Let me redo one thing, because an index was being used and that changes the plan; let me get it here. Here you see there is a Lazy Spool which claims one percent of the total cost of this execution plan. So why is this spool bad? It says one percent, but if you look at what comes after it, 98-99 percent of the total cost is spent in that part of the execution: 99 percent of your query's work is happening in just this region of the plan. The lazy spool is one of the most expensive operators in SQL Server. So whenever you see a two-hundred-line or thousand-line plan, find the part where it is doing the lazy spool, and look at everything hanging off that innocent-looking one percent.
Then try to figure out how to fix it, and how to fix it comes from why it appears. A lazy spool is not bad in itself: it is meant for solving a problem. The trouble is when there is no real problem and we are telling SQL Server to solve one anyway. A lazy spool appears when there is duplicate aggregation; aggregation means GROUP BY with MIN, MAX or operators of that kind. Whenever the same aggregation is being computed repeatedly, a lazy spool will appear. So the question becomes: why is my query doing duplicate aggregation, and can I rewrite it so the aggregation happens only once?

Look at the query: I am selecting my students, and for each student I take the year they were born; within that year I want the youngest student, and I return those records. So the 384 rows are the youngest students of each year. (One year can have multiple people with the same date of birth; say several people born on 31st December are all jointly the youngest of that year, so those all come back.) Now suppose I have one lakh students. Most students were born in the 90s, so let's say there are only about a hundred distinct years for which I have to get the data. Even though there are one lakh students, there are only a hundred years, so the maximum date of birth of a year does not change from student to student: if I am born in 1991 and the youngest child of 1991 was born on 31st December, then for every other 1991 student the MAX(DOB) is exactly the same, because it is just the maximum date of birth of that year. But the way the query is written, SQL Server recalculates this MAX(DOB) over and over, so it tries to help itself by maintaining a spool. I will not go deep into what a spool is and how it works; the concept is that a lazy spool here means duplicate aggregation.

So how can you fix duplicate aggregation? In this example, I calculated the year and its max DOB first, before even joining with the student table; once I got that, I just joined it with the students and got the output. In other words, to remove the duplicate aggregation I compute the summary before my main query; I can use a temp table, I can use a CTE, anything. And what did it achieve? My logical reads were sky-high, seven digits; the moment I computed the aggregation before the query, so that there is no duplicate aggregation, they came down to four digits, and for sure there was a difference in performance: it was two minutes, and now it is less than a second. So you understood: whenever you are looking at an execution plan, first get the expensive queries, then see which query, or which part of the query, has lazy spools, and try to fix that first, because it is almost always easy to solve; most of the time it is duplicate aggregation, which you can fix using whatever construct you want. That's the one thing: spooling of this kind means duplicate aggregation. You don't have to memorize everything I say; at the end of the session I will give you this list as a checklist. Just remember: if there is spooling in any part of the query, try to fix that first, because that gives you the biggest win.
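As a hedged sketch of that rewrite (the real demo tables aren't shown in the captions, so the Student schema and column names here are assumptions):

-- Slow shape: the correlated MAX is re-evaluated per row, which tends
-- to produce the lazy spool
SELECT s.StudentID, s.FirstName, s.DOB
FROM dbo.Student AS s
WHERE s.DOB = (SELECT MAX(s2.DOB)
               FROM dbo.Student AS s2
               WHERE YEAR(s2.DOB) = YEAR(s.DOB));

-- Faster shape: compute each year's max once, then join
WITH MaxPerYear AS
(
    SELECT YEAR(DOB) AS BirthYear, MAX(DOB) AS MaxDOB
    FROM dbo.Student
    GROUP BY YEAR(DOB)
)
SELECT s.StudentID, s.FirstName, s.DOB
FROM dbo.Student AS s
JOIN MaxPerYear AS m
  ON s.DOB = m.MaxDOB;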
And here you can see it went from two minutes to one second, so almost a 100-200x change. That is one thing. After this you will look for the hash match; a hash match is essentially unsorted data, and we will get to it: first you check for spooling, second you check for hash match. But let's discuss spooling a little more, with a few more examples, so that you understand: is all spooling wrong? No.

Here is what I did: I created an UPDATE statement, and I have an index on the filtered column, which the plan is using. It is just a simple update: there is no join happening, there is no grouping, so why is the plan spooling? As I told you, a spool is nothing but temporarily saving data to support a computation. The reason there is a spool in a simple update is this: the table has a lot of records and the filter column is indexed, so when I run it, SQL Server first filters out all the records where the condition is true, puts them into a temporary location, and then takes those records, goes into the physical table and updates them. That is how it works, and this kind of spooling is not bad; this is an eager spool. Even if I force it to use the primary key index, the plan still shows the spool at around fifty percent of the cost, but in a lot of cases, when you have a huge table, the eager spool is helpful. So spooling as such is not always bad; the lazy spool is the bad one.

One more example: a case where you cannot avoid a spool. I created a query; I am not going into its depth. This query is getting hierarchical data. Hierarchical data is like this: you are an employee, you have a manager, that manager has certain sub-employees, those sub-employees can be managers of other people, and so on; a multi-level structure, like multi-level marketing. Suppose you want to walk the whole hierarchy. If you are familiar with recursive CTEs this will be clear; otherwise you can watch my CTE video, which is already there. Coming back to this: I am writing a recursive CTE, and if I run my query you will see a lazy spool happening here; you see it, the spool is happening. In this case you cannot avoid it. Why? There is no aggregation happening, but there is recursion, and with recursion the lazy spool will come and you cannot avoid it. So: first, find the queries which are expensive; second, check whether there is any spooling; if there is spooling, check whether it is because of a recursive CTE. If it is because of recursion, you can just ignore it; if it is not because of a recursive query, it is because of duplicate aggregation, so fix it. Once you do, your query is half resolved.
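A minimal sketch of such a hierarchy query, assuming an Employee table with EmployeeID and ManagerID columns (not the exact demo schema):

WITH Hierarchy AS
(
    SELECT EmployeeID, ManagerID, 0 AS Lvl
    FROM dbo.Employee
    WHERE ManagerID IS NULL          -- anchor: top-level managers

    UNION ALL

    SELECT e.EmployeeID, e.ManagerID, h.Lvl + 1
    FROM dbo.Employee AS e
    JOIN Hierarchy AS h
      ON e.ManagerID = h.EmployeeID  -- recursive step walks down one level
)
SELECT * FROM Hierarchy;
-- The plan for a recursive CTE contains a lazy spool (the worktable that
-- feeds each recursion step); this one is unavoidable.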
Then coming to the hash match. A hash match is nothing but unsorted data. In this query, when I run it and look at the execution plan, you see a hash match, am I right? What is happening is this: when I am calculating the year and the maximum DOB, the GROUP BY itself is considered an expensive operator, the reason being that grouping always involves sorting the data. People say don't ORDER BY the data if you don't need it because it's expensive; well, a GROUP BY effectively orders first and then groups, so it pays that sorting cost too. So first the rows are grouped and the maximum is taken, and the data on that side comes out sorted in a certain way, because that's how the grouping works. That side is sorted, but when I join with the student table on the date of birth, there is no index there: the student table has no index on date of birth. So what does SQL Server have to do? It would have to sort the student data on DOB before it could do an ordered join, so instead it computes hashes and makes a hash join.

How can you resolve it? The first case is simply a missing index. One more thing to note first: my logical reads were four digits. Now I create the index and run my query again, and let's see whether that brings the logical reads down. Here they are after creating the index: the logical reads are reduced from four digits to three digits, almost halved, from around twelve hundred to six hundred. So removing the hash join roughly halved the reads in our scenario. The scan count went up, but you can ignore the scan count as long as the logical reads are going down: the same data simply gets read across more, smaller scans, and even with more scans the total reads are lower, so it does no harm. And if I look at the execution plan now: there is still a hash match, but it is a Hash Match (Aggregate), meaning it hashes the data to do the grouping; the join, however, changed to a nested loops join, because now the data is ordered and SQL Server knows how to find the matching rows using the index.

So after spooling, the hash match is your next target, and for a hash match the first thing to check is whether an index is missing. The second case: sometimes the index is already there but your query won't utilize it. Why does that happen? Let me give you a simple example using DATEADD: I add zero days to the date. This is logically a no-op; nothing should be written like this in real code, but it illustrates a function working on your data. First I put the conversion on the literal side: I cast the value, and do the same for the other side, then run my query and look at the execution plan; the plan got a bit bigger, but still no hash match occurred. But now I apply DATEADD, adding zero days to the column itself, nothing else, and run the same query. The output is still the same, but you see the hash match came back. Why did it come back? When I added a function on my column, SQL Server can no longer determine whether the data will still be sorted: there is output coming out of a function, and SQL Server cannot determine in advance whether that output is sorted output or not, so it goes for the hash match. So one reason is a missing index; the second reason is that the index is there but somebody has used a function on the column in the WHERE clause or in the JOIN (especially the join; we are talking about joins here). Then the first fix is to remove that function and find the right way to write the predicate, because if there is a function, then even though there is an index, the query still goes for the hash match and the index gives no benefit.
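A hedged sketch of that effect; the pre-aggregated side is written as a CTE here, and all names are assumptions:

-- Ordered join: an index on Student(DOB) supports a nested loops plan
WITH MaxPerYear AS
(
    SELECT YEAR(DOB) AS BirthYear, MAX(DOB) AS MaxDOB
    FROM dbo.Student
    GROUP BY YEAR(DOB)
)
SELECT s.StudentID, s.DOB
FROM dbo.Student AS s
JOIN MaxPerYear AS m
  ON s.DOB = m.MaxDOB;

-- Same logic, but the no-op function hides the column's order from the
-- optimizer, so the join falls back to a hash match
WITH MaxPerYear AS
(
    SELECT YEAR(DOB) AS BirthYear, MAX(DOB) AS MaxDOB
    FROM dbo.Student
    GROUP BY YEAR(DOB)
)
SELECT s.StudentID, s.DOB
FROM dbo.Student AS s
JOIN MaxPerYear AS m
  ON DATEADD(DAY, 0, s.DOB) = m.MaxDOB;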
Even with the function it still runs better than having no index at all, but it is worse than when the index is properly used without the function. So this is the second lesson: first find the expensive queries and see the statistics; if spooling is there, remove it (if it is because of a recursive CTE, you can ignore it); if there is a hash match, check whether an index is there or not; if the index is not there, create the index; if the index is there, figure out why it is not getting used.

Key lookup: I won't be covering it at length, it's very simple. Right now the plan is not suggesting a missing index, so let me just create one and see whether the key lookup comes or not. A key lookup is nothing but a missing column in the index. I create the index on Student and deliberately leave the last name out of it; here you see the index key is DOB and the INCLUDE is first name. Let's run the same query again. Yes, now see what happened: it still went to the index, because it has the DOB for the join, and the join is nested loops, but my output includes *, meaning the output contains first name, last name, DOB and student ID. It cannot find last name in my index, so for each row it has to go back to the table to fetch just that one column; that extra call per row is the key lookup: your index is missing some column.

The solution to a key lookup is to add the column to the index. Here I can see which column is the extra one, so I go into my index, add that column, and check whether my key lookup occurs or not. Run it, and here you see there is no key lookup happening, and the reason is that the index now contains the column. So the solution to a key lookup, when there is missing data in the index, is to add the data. But sometimes it's not possible, because a lot of the time the suggestion will come as "add all the columns", and you cannot add all the columns to the index: your updates and inserts would become miserable. The rule of thumb: if you have a huge table, a heavily used index, and just one column is missing from the index, take the chance and add that column to make the query far faster.

And do you want to see whether the key lookup actually increases the logical reads? Let's check: let me just remove the column again and duplicate the run. Here you see: because there is a key lookup, the query ends up doing more logical reads than the same query did without the index at all. See the irony: it still uses the index, because the filtration is faster, but just to get the data it has to go back to the table again and read more pages to fetch that one column; that's why the key lookup version does more logical reads than the plain query. If I fix it by adding that one column back and run it again, it's back to three digits, so all good.
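A minimal sketch of the covering-index fix; the index and column names are assumptions:

-- Index missing LastName: the plan does a key lookup per row to fetch it
CREATE NONCLUSTERED INDEX IX_Student_DOB
    ON dbo.Student (DOB)
    INCLUDE (FirstName);
GO

-- Covering version: adding the one missing column removes the key lookup
CREATE NONCLUSTERED INDEX IX_Student_DOB
    ON dbo.Student (DOB)
    INCLUDE (FirstName, LastName)
    WITH (DROP_EXISTING = ON);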
So far this is the main part, because whenever you are looking at a query you need to know: is there spooling, and how to fix it if it is duplicate aggregation; is there unsorted data, a missing index, or an index not being utilized properly; because I have seen many times that the indexes are there and people do crazy functions, conversions of DATE and DATETIME, just to make their life easier by writing one small expression, and it ends up doing damage. Once you have done all that, check whether there are bad views. Views are often used to simplify queries, and they are not bad in themselves, but a lot of the time they become a performance problem. We know the definition of a view: it's just a stored query. So how can it hurt performance? Let me show you a simple example.

I created a simple view which joins three tables (Orders, Order Details and Products) and returns around six columns. Then here is one more query whose text is exactly the view's definition, nothing more. Let me run the view, see its execution plan, then run the standalone query and see its plan, and put them side by side: the execution plans are exactly the same, and the cost split in the batch is 50/50. So there is no performance problem in the view itself: running the view and running the same query outside it are identical.

But now there is a difference in the data I need: from the view I am getting six columns, but here I need one more column. A lot of the time there will be a situation where someone has to write a new stored procedure, function or query using the existing views, and the view is missing one column from one of its tables. What people frequently do is add one more join from the view back to the same table, just to get that one column out. What will happen then? Let's compare: I use the view and join back to the table just to get one more column. Now if I run both queries and look at the execution plan: even though I am only fetching that one extra column, the plan has to go back to the table again to get it; you can see it joins with Order Details two times, just because I had to get one more column. And the logical reads: for the original version they were, say, 2, 22 and 11; for the joined-back version the 11 becomes 22, double, because Order Details now has to be read two times. So "bad views" is not a technical term; the bad pattern is this: there is a view which people are using in all their queries, it is missing a column, and instead of altering the view or writing a new one, they join back to the same tables which are already part of the view, and the work doubles. The solution is simple: either create a new view or alter the existing one.
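A hedged sketch of the anti-pattern, using Northwind-style names as assumptions:

-- The view already reads Order Details internally
CREATE VIEW dbo.vOrderSummary
AS
SELECT o.OrderID, o.OrderDate, od.ProductID, od.Quantity, p.ProductName
FROM dbo.Orders AS o
JOIN dbo.[Order Details] AS od ON od.OrderID = o.OrderID
JOIN dbo.Products AS p ON p.ProductID = od.ProductID;
GO

-- Anti-pattern: joining back to [Order Details] just for one extra column
-- makes that table be read twice
SELECT v.*, od.UnitPrice
FROM dbo.vOrderSummary AS v
JOIN dbo.[Order Details] AS od
  ON od.OrderID = v.OrderID AND od.ProductID = v.ProductID;
GO

-- Better: alter the view (or create a new one) so the column is read once
ALTER VIEW dbo.vOrderSummary
AS
SELECT o.OrderID, o.OrderDate, od.ProductID, od.Quantity, od.UnitPrice,
       p.ProductName
FROM dbo.Orders AS o
JOIN dbo.[Order Details] AS od ON od.OrderID = o.OrderID
JOIN dbo.Products AS p ON p.ProductID = od.ProductID;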
Okay, now unnecessary subqueries. This is one of the best examples I have. I will state a simple problem and give you time to think what your query would be, and then you tell me. I have to get all my customers and their most recent (last) order: what was the order ID, what was the freight amount of that order, and what was the ship name with which it was shipped to the customer. I am getting only the last, most recent order, not all the orders. So I cannot simply join with Orders, because then I would get all the rows. I also cannot just take MAX(OrderID) in the main query, because if I take the maximum order ID alone, how will I get its freight? The freight has to come from that same maximum order, so MAX(OrderID) by itself won't get me the other columns. So how would you fetch just that first row? Think about it.

Usually what people do is write one correlated subquery per column: for this column, get the TOP 1 OrderID from the Orders table, correlated on the customer ID and ordered by OrderID descending (getting the maximum order ID as a separate expression); then get the freight with another TOP 1 subquery on Orders, correlated on the customer ID and ordered by OrderID descending, which is guaranteed to come from the same order because of the ORDER BY OrderID DESC; and then the ship name, written exactly the same way. Now suppose I also want the order date: I copy the subquery once more for order date. So with this approach, for n columns I have to write the subquery n times. This becomes very problematic when I want many columns of the most recent order.

How do I resolve it? With CROSS APPLY. If you haven't used APPLY, it's a separate topic worth studying; it's a small topic, but a very good and interesting one, and with APPLY you can do a lot of things. What CROSS APPLY does is: for each record coming from the left side, it runs the subquery on the right. So here, instead of running four correlated subqueries, I say: get me the TOP 1 record from Orders where the customer ID matches, ordered by OrderID descending, and from that one record I take all the columns I need, including the order date. If I run this, I get the same exact output.

Now if I run both queries together, let's see the performance. The above query costs 77% of the batch and the below query 23%. The logical reads of the above query are around 1,716, and the logical reads of the below query are around 502, so roughly three times smaller. And the problem with the above query is that if I have to get more columns, I have to add more subqueries, and it becomes more expensive every time: if I add a column and run again, the logical reads increase again, while the below query stays constant at around 502, because it fetches all the columns I need in one go.

One catch: with CROSS APPLY you will get 89 records, while the above query gives you around 90 or 91 records, because it returns all the customers; CROSS APPLY only returns rows where the right side produced output, like an inner join. How to fix it? There is something called OUTER APPLY: it's similar to CROSS APPLY but behaves like a left join, so it returns all the customers even if a customer hasn't placed any order. I hope this video is helping you out. So the concept is: remove unnecessary subqueries, using CROSS APPLY or any other logic, whatever you want; usually CROSS APPLY is used for this, and sometimes I use a temp table as well. If you remove unnecessary subqueries it will for sure reduce the logical reads, and if the logical reads are going down, everything is good. (I don't know why my slides are duplicated; let me just remove that.)
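A hedged sketch of the APPLY rewrite against a Northwind-style schema; the table and column names are assumptions:

-- One correlated subquery per column: n columns = n extra passes over Orders
SELECT c.CustomerID,
       (SELECT TOP (1) o.OrderID  FROM dbo.Orders AS o
        WHERE o.CustomerID = c.CustomerID ORDER BY o.OrderID DESC) AS LastOrderID,
       (SELECT TOP (1) o.Freight  FROM dbo.Orders AS o
        WHERE o.CustomerID = c.CustomerID ORDER BY o.OrderID DESC) AS LastFreight,
       (SELECT TOP (1) o.ShipName FROM dbo.Orders AS o
        WHERE o.CustomerID = c.CustomerID ORDER BY o.OrderID DESC) AS LastShipName
FROM dbo.Customers AS c;

-- One OUTER APPLY: the most recent order is fetched once per customer,
-- and customers with no orders are still returned (like a left join)
SELECT c.CustomerID, lasto.OrderID, lasto.Freight, lasto.ShipName, lasto.OrderDate
FROM dbo.Customers AS c
OUTER APPLY (SELECT TOP (1) o.*
             FROM dbo.Orders AS o
             WHERE o.CustomerID = c.CustomerID
             ORDER BY o.OrderID DESC) AS lasto;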
Okay, now this is the most important topic, so understand it well, because it will make a huge impact on your performance: partition elimination. If your database is partitioned in your company, but, believe me, your queries are not using the partitions properly, and you make them use the partitions, there will be a huge change in performance, and I can show you that. Partitioning is nothing but physically subdividing a table into partitions: you say, I want to create partitions yearly; records before 2003 go into partition A, 2003 to 2004 into B, 2004 to 2005 into C, and anything later goes into the default partition. Now what happens is that your data is partitioned but your queries are not using those partitions.

How do you detect and fix it? I will give you a simple example. I have a table, Order Details, which is partitioned. If I run my query, click the operator and go into its properties, you see a property, Actual Partition Count, which you won't find on queries against tables that don't have partitioning; for partitioned tables two things appear: Actual Partition Count and Actual Partitions Accessed. Here it shows it accessed partitions 1 to 25, all 25 of the total, because I am selecting everything.

Now I have two parameters, a start date and an end date, and I am getting the records between these two dates; the query below is exactly the same, except that it casts, converting the parameters to match the column. Let me run both queries together so that you can see the difference. The first query's plan looks a little more complex, and if I click on it: the total number of partitions in the table is 25, and it still used all 25 partitions, so no partition elimination happened; no partition got avoided. Now see the second query: if I click on it and check the properties, the Actual Partition Count is 2; it only had to go into two partitions to get you the data. It eliminated 23 partitions: the total was 25, it avoided 23, and it read only two partitions to get you that little chunk, faster. And see the performance: the above query is 89% of the batch and this query 11%. What about the logical reads? The first one used 46 logical reads and the second one only 4.

So what is the difference between the two queries? The column's data type was DATETIME, not DATETIME2; just that basic difference: the parameter's data type doesn't match the column's data type. And this doesn't stop at partition elimination; the same mismatch hurts in other places too, even without partitioning. For you, DATETIME2 and DATETIME may look almost the same if you are not considering the fractional-seconds part, but for SQL Server it's a huge difference, and what SQL Server does when the data type doesn't match is go for all the partitions.

Believe me, this thing has made a huge impact in my experience. There was a query we optimized from 90 minutes to 10 seconds. How it happened: I was able to get that query from 90 minutes to two minutes by fixing spooling and other things, but from those 2 minutes to 10 seconds came from partition elimination alone. Why? Our table was around 200 GB of data, split into something like 100-200 partitions, and the moment elimination kicked in, the query had to scan just one or two partitions instead of all of them; it was a huge difference. So why does partition elimination not happen? One: using a different data type. Two: there can be other reasons as well.
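A hedged sketch of the data-type mismatch; the table, column and partitioning are assumptions, and the point is only that the parameter type must match the column type:

DECLARE @StartDate DATETIME2 = '2003-01-01',
        @EndDate   DATETIME2 = '2003-12-31';

-- Parameter type (DATETIME2) differs from the column type (DATETIME):
-- the plan touches every partition
SELECT COUNT(*)
FROM dbo.OrderDetails
WHERE OrderDate BETWEEN @StartDate AND @EndDate;

-- Casting the parameters to the column's type enables partition elimination
SELECT COUNT(*)
FROM dbo.OrderDetails
WHERE OrderDate BETWEEN CAST(@StartDate AS DATETIME)
                    AND CAST(@EndDate   AS DATETIME);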
Let's say I add a function, DATEADD (you remember, this is my favorite function for these examples): I add zero days, this time on the column, on one side of the predicate, and run the same query again, and let's see what happened to the partitioning. The total is still 25 partitions, but now it scanned 23 partitions; it was able to exclude only two. The reason: it was able to determine the start boundary, because the start date was used plainly, but it was unable to determine the end boundary, because there is a function used on the column. You have to be very careful never to use functions on columns in WHERE clauses and in joins as long as you can avoid them. If there is genuinely no other way, fine, but if there is a way, always avoid functions there; this thing messes up your whole query. What I just did is use a function; the whole query remains the same, and yet it has a problem. Now I transfer the function to the start-date side instead, meaning the end date is deterministic again: if I run it and check, it now accesses partitions 1 to 4. Remember, before it was using partitions 3 and 4, but now it uses four partitions, because now it is unable to determine the starting partition; it can determine the last partition but not the starting one. And if I remove the function and make it normal again, it goes back to using only two partitions. So it matters a lot. If your table is partitioned, the whole reason for partitioning the table is to make your queries run fast; if that is not happening, then what is the use of partitioning? You are doing the maintenance work, and if your queries still don't use the partitions, you have to make sure they do.

Let's go back to the slide and recap what we have done. First, we saw what an execution plan is and how to use STATISTICS IO and TIME. Then, once you have the badly running queries: check whether there is spooling or not; spooling points to duplicate aggregation, and if there is duplicate aggregation, remove it, in whatever way: a CTE, a temp table, a table variable. Then check for the hash match: a hash match means a missing index (because the data is unsorted), or an index that is not getting used, and one common reason it isn't used is a function on a column in a WHERE clause or a join; that rule applies to partition elimination, it applies to the hash match, and it will apply to the next slide as well. Then comes the key lookup: a key lookup is nothing but missing data in your index; if it is one column and it makes no trouble, just add it to the index and it becomes far easier for SQL Server. After that come bad views: bad views are when people don't alter the view and, just to get one or two extra columns, join it back to the same tables; that becomes miserable sometimes; the fix is to alter the view or create a new view and use it. Then subquery minimization: we used CROSS APPLY; we saw the classic, most common example of getting the most recent row, where people start stacking subqueries and it becomes miserable; the fix is to use CROSS APPLY and get the data once. Last was partition elimination, which is most important: if your tables are partitioned and your queries are not using the partitions properly, there is huge scope for improvement; just tweak the queries a little and make them use your partitions.
One more case I will mention, which is not covered in the examples. Let me give an example from the casino industry, because I worked there. There is a table called Players, there is a player-game table recording which player played which game, and then there is a table called Jackpot. When a player plays a game, there is a record with a game date in the player-game table, and that table is partitioned on the game date; that column is the partition key. The Jackpot table also has a game date, the same kind of column, and it is also partitioned on it. Now, what happened: I used the game date in a WHERE clause on the player-game table, but I didn't use it on Jackpot. But a jackpot can only happen when there is a game, when the person played, so I can put the same WHERE condition on the game date for Jackpot as well. What happens then? Both tables are partitioned, so partition elimination happens in both tables, not just one. So you can induce partition elimination: if a column is part of a WHERE condition, and the same column exists in another joined table and is that table's partition column, why not use it there too? Even if this case isn't fully spelled out here, the main point is: make sure your partitions are getting utilized properly, or else there is no use in creating them.

Now, last but not least: sargable queries. As we have already seen in the examples: remember where I had the index and I added that function? That makes the predicate non-sargable. SARG means "search argument". The moment I use a function on any column, the predicate becomes non-sargable, because SQL Server cannot determine the output while planning the query, so it goes for expensive operations, as we saw: it skipped partition elimination, it went for the hash match in the first example, and so on. A non-sargable query is one whose predicate output the optimizer cannot pre-determine. LIKE is another example: when you write LIKE 'ABC%', it is sargable; it knows the value starts with ABC. But the moment you write LIKE '%ABC', it has to scan through all the data. Non-sargable queries are very common. The basic rule: never let anyone use functions on columns in WHERE clauses or joins as long as they can be avoided, because what ends up happening is a hash match. Remember, we already had the index: watch, I just add the DATEADD function (add zero days) on the column, and the join that was a nested loops join goes back to a hash match. A hash match is again unsorted data, and now it doesn't even give an index suggestion, because it cannot determine anything; the function forces your query into the hash match. So always keep your queries as sargable as possible: avoid as many functions as possible in WHERE clauses and joins.
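A quick hedged sketch of sargable versus non-sargable predicates; the table and column names are assumptions:

-- Sargable: the optimizer can seek on an index over CompanyName
SELECT CustomerID FROM dbo.Customers WHERE CompanyName LIKE 'ABC%';

-- Non-sargable: the leading wildcard forces a scan of every row
SELECT CustomerID FROM dbo.Customers WHERE CompanyName LIKE '%ABC';

-- Non-sargable: the function on the column hides it from the index
SELECT CustomerID FROM dbo.Orders WHERE DATEADD(DAY, 0, OrderDate) = '2003-05-01';

-- Sargable rewrite: leave the column bare and compare against a plain value
SELECT CustomerID FROM dbo.Orders WHERE OrderDate = '2003-05-01';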
Some miscellaneous tips to finish. Avoid repeated grouping: a lot of the time I've seen people doing the same grouping again and again; say you have Orders and Order Details and you also want the count of orders each customer placed: instead of grouping repeatedly at the bottom, compute it once at the top. There are multiple tricks: use CROSS APPLY and PIVOT to shape your data; use ROW_NUMBER or OFFSET-FETCH: if you don't need the serial number of the records and just have to get the top ten or top twenty, or do pagination, you can do it with OFFSET and FETCH, which is better for that than using ROW_NUMBER; these are the kinds of tricks you can use.

I hope you now know how to solve these problems and how to figure out what is wrong with your query. If you have any questions you can reach me by email at yogesh.mail@gmail.com; I'm also on Skype, and you can reach me on WhatsApp as well. If you have any questions, put them in the comments. If you need the scripts, just ask in the comments and I will share the scripts and the database backup I am using. If you are a company and you want your developers to do performance optimization, I provide training for that; or, if you are in a very bad situation, just contact me directly, or any DBA who can help you out, because you won't be able to fix all these things in a hurry. So if you have any question, put it in the comments or just mail me; I will be very happy to help. Keep learning. Thank you.
Info
Channel: techsapphire
Views: 61,707
Rating: 4.8956199 out of 5
Keywords: sql, server, optimizing, optimization, performance, tuning, execution, plan, spooling, hashmatch, keylookup, partition, elimination, statistics, analyzing, sql server 2008, sql server 2012, sql server 2016, sql server 2017, sql server 2019, query optimization, performance tuning
Id: t2R0-xcKw44
Length: 49min 23sec (2963 seconds)
Published: Mon Nov 18 2019