Django Query Optimization / select_related & prefetch_related / django-debug-toolbar / N+1 Problem

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
sometimes when you're using the Django orm you can encounter performance issues due to the way that Django runs queries and the way that you write your code and a common issue is What's called the N plus one problem and we're going to dive into that in this video and we're also going to see some tools for resolving that problem we'll look at Django debug toolbar which you can use to inspect the SQL queries that are being run on a given page and we'll also look at the prefetch related and select related functions in Django that can be used to solve the N plus one problem and optimize your queries to the database so let's get started we're going to start by installing Django debug toolbar into our environment so let's copy the PIP install command and we're going to go to vs code where I've got the application that we've been working on open and we're going to paste in the PIP install command to the terminal now Django debug toolbar helps you optimize your database queries and it provides some other useful utilities for Django let's go back to the documentation and we're going to follow the installation instructions that are on this page which I will link to below the video so let's copy the debug toolbar up and we're going to bring that into the djanglesettings.pi file and add it to the installed app so let's add that just below the core application and once we've done that we can add the URLs for Django debug toolbar so we're going to copy this path function here and we're going to paste it into the project URL configuration which is this file here so let's paste that just at the bottom of the URL patterns list and if you're following along with this code by the way there's a GitHub repository that's Linked In the description you can clone that and then you can follow along if you want to once we've added the URLs what we're going to do is go back and add the debug toolbar middleware so let's copy this statement and again we're going to go back to settings.pi and it's the middleware list that we're going to add this to and debug toolbar recommends adding it as early in this list as possible so let's paste it in here once we've got that we can go back to the documentation and we're going to add this internal IPS setting to settings.pi so we'll just do that at the bottom of this file and the reason that we're adding this is that debug toolbar will only be shown to users whose address matches what's in this internal IPS list so what we're adding here is just the localhost address and that means that the debug toolbar that you're about to see is only going to be visible to users that are accessing the application on localhost once we've got that we can save the settings.pi file now that's all the settings that we need for debug toolbar but what we also need to do is configure Django to serve static files during development so I'm going to link this page here below the video basically we're going to copy the code from this block here and again we're going to go back to the project URL configuration and we're going to paste this in to that view and the last thing we need to do is append to the URL patterns this statement here where we're using the static helper function in Django and to that we're passing the static URL and the static root for our project so let's add that as a statement after the URL patterns and then we can save urls.pi and I'm going to try and run this application with python manage.pai run server and you can see that that has started the development server if we go to localhost 8000 you can see now that we have this page and on the right hand side we have the Django debug toolbar that's appearing and one of the tabs the one we are going to be focusing on in this video is the SQL Tab and this will display any SQL queries that are executed when you go to a given page in your application and right now when we load this page you can see there are no SQL queries being executed but let's change that now if we go to the views.pi file so let's go to views.pi and we have this index view here and I'm actually going to remove most of the code from this view because we're no longer going to be using the forms but what I'm going to do is fetch all of the restaurants from the database and that's going to be with the restaurant.objects.all query and we need to import the restaurant model at the top here so we'll do that and we'll also import the sale and the rating models and once we've got all these restaurants I'm going to add them to the context dictionary and that's with a key of restaurants I'm going to pass that to the render function that's going to load up this index.html template and this index.html mail fail is the one that you see on this page what we're now going to do is change up the HTML on the page so let's get rid of this form that we don't need anymore and let's just render out all of the restaurants to that template and we can save that go back to the page and we will refresh this page you can see the restaurants are appearing on the page and on the right hand side we have this SQL tab on Django debug toolbar and this is showing the SQL query that was executed in order to get those restaurants onto the page and you can see we have a single query being executed and that's the select query that's getting all of the restaurants from the restaurant table in the database now what we're going to do now is outline a problem that can happen when you're using the orm to make these queries and also to fetch related objects within the same view or template we're going to outline this problem it's called The N plus one problem and this is actually a very common performance issue in Django and in other back-end Frameworks for example laravel and others and you need to know a little bit about the tools provided by the orm and all the to avoid this potential issue from happening so let's see what the problem is if we go back to our template here instead of just rendering the restaurants we're going to look over them and we're going to say for each restaurant in that query set of restaurants and for now let's render a paragraph tag and we're just going to look at the restaurant's name here and we'll just close that tag and end the for Loop so this is very simple we're just rendering out the name of the restaurant and we're doing that for each restaurant in the query set if we go back to the page and again we're refreshing the page we get a different view here but on the right hand side we're still only executing one SQL query now what if we wanted to actually get all of the ratings that are associated with each restaurant in the data now remember if we go to models.pi we have a rating model and that has a foreign key to the restaurant and this basically means that when you have a rating that's created it's tied to a particular restaurant so a user can then give a rating to a restaurant and then they can move on and give ratings to other restaurants the rating is the child mode them and the restaurant is the parent model so when you have a rating you can go to the foreign key and you can grab that restaurant from the database as well what we're going to show is a problem that can occur when you have a restaurant and you want to fetch all of the associated ratings for that restaurant now A restaurant can have many ratings and that's because many users can come in and they can each give the restaurant the rating of their choice so what we're going to do in order to demonstrate this problem is go back to index.html and I'm going to define a UL tag here and we're going to create another inner for Loop here and we're going to say for racing in restaurant.ratings.all and for each rating we're going to render out the ratings numerical value just by accessing the rating field on that rating so let's tab the list over here and again this is a bit confusing the rating is a model but it has a field called rating which refers to the numerical value between 1 and 5 of that rating so for each restaurant in the query set of restaurants we are then issuing another query to the database so to get all of that restaurant's ratings as well so if we go back to the page here before we refresh this page you can see we are only executing one query before now that we've added this nr4 Loop when we re-execute this you can see that we're now getting 15 queries that are executed and this is just for one user going into the page and imagine that that user had given a rating to a thousand restaurants we would be executing a thousand and one queries here that's why it's called the N plus one problem we issue one query in the view to get the query set of restaurants but then for each one of those restaurants we are issuing another query to get all of the ratings associated with that restaurant so the initial query gets the restaurants and that gets in restaurants from the database and then we need to perform n additional queries to get the ratings for each restaurant now the problem with this is that it can be very slow you're issuing a request to the database for every single entity that you're getting back from the initial query and imagine this happening for multiple users are on the same page at the same time this is going to affect the performance of your application if you have enough users and enough objects in the database and if you go to Django debug toolbar you can see that it gives you some feedback on this as well as the initial query that we've seen before that's executed at the top we are now executing 14 similar queries below that as you can see why these are similar it's the exact same query all that's changing is the restaurant ID that's being fetched in the where Clause so this is giving back all of the ratings for example here where the restaurant ID is for and then the next query gets back all of the ratings for the restaurant ID of seven so how can we cut down these similar queries and get rid of them and optimize our querying of the database let's see how to do that now and what we're going to do is use a function called prefetch related so let's just quickly see how that works and then we'll dive into why that works let's go to views.pi and instead of just fetching all of the restaurants and adding them to the context we can actually use another function here and we'll call that prefetch related and then we can pass to that a set of related objects based on the name of the key in the model and in our case the restaurant has a related name of ratings which you can use to fetch all of the ratings associated with that restaurant and we've used that here in the template but what we're now going to do is actually use the prefetch related function and pass that related field to that and then we can see what effect that has on the queries that are being executed so let's save that file and go back to our application and when we refresh this page you can see on the right hand side we are now only issuing two queries to the database now what's happening in these queries the first one is the same as it was before it just gets all of the restaurants from the database but if we look at the second query what the prefetch related function is doing is that it's getting all of these fields from the rating table and it's filtering these down where the ratings restaurant ID is in one of these values now these are the IDS that are pulled out in initial query so when you use the prefetch related function Django takes those IDs from the initial query and when it's doing the second query it's going to use those IDs in the where clause and it's going to pull out the correct ratings in this case for the restaurants and then once it's done that the orm codes behind the scenes when we use this for Loop here is going to get the correct ratings for the given restaurant and this is greatly optimizing the queries that we're sending to the database instead of 15 in this case it's cut that down to two now let's go to the documentation for the prefetch related function and again I'll link this below the video and as it says here it has a similar purpose to the select related function both are designed to stop lots of database queries that is caused by accessing related objects but the strategy is different for these functions now the prefect related function will do a separate lookup for each relationship and it will enter the joining in the python code and this allows it to prefetch many-to-many and many to one objects and that cannot be done with select related and that's what we're doing if we go back to viewers.pi we are prefetching the ratings now A restaurant can have many ratings so this is a many-to-one relationship so when we have a restaurant and we want to get its many ratings we can use this method using the prefetch related function the select related function which we're about to see is designed for one-to-one relationships or one-to-many relationships in other words foreign Keys now to demonstrate that prefetch related will do a separate query for each related set of objects what we're going to do is pass a second argument to prefetch related and that's going to be the sales now this comes from models.pi again so if we go back to that file and we go down to the sale model here you can see we have a foreign key on this model to the restaurant and that has a related name of sales and this related name is what we are referencing in the views file here in the prefetch related function just like the ratings we're telling the prefetch related function that we also want to prefetch all of the sales associated with each restaurant in this quiz and that is then going to have to perform another query to the database to get all of the sales so let's see that in action now if we go back to the browser and go to our application here and refresh this you can see in Django debug toolbar that we are now performing three queries and the one at the bottom is the new one where we're fetching from the sale table all sales with the restaurant ID matches one of the primary keys that we fetched in the initial query now if we go back to a vs code to our index.html template and we paste in another UL tag and this time we're iterating over all sales and we're listing out the sales income here if we go back to the page you can see that on the right hand side we're still only getting three queries but we're going to see that if we go back to viewers.pi and just call restaurant.objects.org and don't do any prefetching when we go back to the page and refresh this page you can see now we have 29 queries being executed so this prefetch related method is very important if you want to do some prefetching from the database which can greatly optimize your app application when you're accessing related objects and this technique can greatly reduce the load on your database and increase the performance of your application which can be very important as you scale up and of course this statement will also work with filtered objects so if we bring back prefetch related we don't need to apply that just to all objects in the query set we can also use the dot filter function and let's say we only want to get restaurants whose name starts with the letter c so we're going to use the I starts with lookup here and set that equal to the letter c if we go back to the page and refresh this page this time we have three queries being executed but in the initial query we are only fetching restaurants whose name begins with the letter c and you can see that reduces the number of IDs that are used in the subsequent prefetch queries and that will therefore return less ratings and less sales which will then be joined up using the python code behind the scenes and the Django orm so that is the prefetch related function in Django and how you can use that to prefetch objects and optimize your queries we're now going to see a very similar function and that's the select related function that can be used with foreign Keys now as an example of this if we go to a models.pi you can see we have a rating model that has this foreign key to the restaurant now to demonstrate the benefit of Select related let's go back to views.pi and I'm going to remove the code that we wrote earlier here and this time let's fetch all of the ratings from the database we can use the rating.objects.all function in order to do that and just below that we can add the ratings to the context and then let's go to index.html and I'm going to remove this code here we no longer have restaurants in the context and this time we're going to iterate over each rating and those ratings that are in the context now for each rating what we're going to do is we're going to render out a paragraph tag and each rating has remember that foreign key to the restaurant so when we access that it's going to give us back a restaurant model and then those restaurant models have a field called name that's going to print out the name of that restaurant let's save this and go back to our page and refresh this page and we're now getting an error that rating has no attribute object and that's my mistake if we go back to viewers.pi here it should be rating.objects.org let's go back and refresh again and this time we're getting back each restaurant's name along with the rating that's applied to that restaurant now on the right hand side you can see we are executing 31 queries here the first query is getting all of the ratings from the database it's a very simple select statement that gets every single rating from that table but then for each rating that's fetched we are performing an additional query to get the associated restaurant with that rating and you can very easily imagine this blowing up in a real life situation where you might have millions of ratings in an application and if each user is issuing millions of ratings when they land on the page that will very quickly become a big problem for scalability and performance so of course we need to find a better way of doing this now because this is a different type of relation it's a foreign key which which means that our rating can be associated with one restaurant we can actually use the select related function and we can pass the restaurant foreign key into that and that's going to tell the database that along with each rating we want to fetch the associated restaurant and that's going to happen using a join statement and in other words because we're using our join it's actually going to do it in a single query so let's see that in action and go back to the page we have 31 queries here if we refresh the page you can see that's reduced down to a single query and this time we're using the inner join and if we expand the query you can see the inner join is happening where the ratings restaurant ID matches the restaurant table's ID column so this is giving us back all of the data we need for the rating as well as for the restaurant to which it is joined and this is much more manageable for the database while issuing a single query instead of 31 queries now you might notice in the query that we're fetching every single column from the rating and the restaurant tables but if we go to the Django template here in the index.html file you can see we're only accessing the name of the restaurant and we're also only accessing the rating too and actually this should be rating dot rating because it refers to that numerical value we talked about earlier so we're only accessing these two pieces of information and yet if we go back to the query here you can see that we're fetching every single column from both of these tables as part of that join now we can use another function in the orm and that's the dot only function in order to limit the amount of films that we're fetching in a query and this is another optimization that might cut down the amount of data that you're retrieving from the database and if you're retrieving a lot of data this can really speed up your application as well and the way this works is that we pass the name of the fields that we want to fetch into the only function and it's only going to fetch those fields or those columns from the table so let's use that in our view if we go to viewers.pi we're calling select related here but before we do that I'm going to use the dot only function and this time we're going to pass the name of the fields that we want to fetch and remember in the template for each rating we're fetching the rating numerical value and we're also fetching the restaurant's name so we're going to use the restaurant and then we're going to follow the foreign key to that model and access the name of the restaurant let's go back to the page and refresh now and if we look at the SQL query that's being executed this time instead of fetching every single field we are fetching the ID and the restaurant ID from the rating table as well as the rating itself and from the restaurant table we're fetching the ID and the name now we need to get the IDS in order for the join to work so that's why they are included and you cannot exclude IDs with the only order fare functions now you need to be careful when using only if you try and access a field that you've not specified as an argument to the dot only function Django will then have to issue another query to the database to get that field and that's definitely not what you would want and Django itself will recommend caution when you're using the fair and only use it cautiously and only after exhausting all your other options but I do think it's quite a handy function to know about if you know for a fact that you only need a certain subset of the columns in a table for a particular View and this can give you a performance boost but don't pre-optimize you might not need to use the only and defer functions but they are there if you need to use them I'll leave a link to this function in the description of the video Let's now move on to another topic we're going to see a prefetch object in the Django orm and we're going to see how we can use that in our applications now let's say we want to find all restaurants that have a rating of 5 stars in the database and we want to sum up all of the incomes from the sale table for those restaurants so to summarize this problem I'm going to go to views.pi and write a comment here we're going to get all five star ratings from the database and then we're going to get all of the sales for those restaurants that have five star ratings so let's remove this line of code here and we're going to see a few different ways that we can construct this query in our Django code so let's see one example of a query we can use we can use the restaurant.objects.prefetch related function and we know that we're going to need all of the ratings and the sales in order to do this query and once we have these pulled out as part of the prefetch related function we can do our DOT filter at the end here and we're going to say we want to only limit this to restaurants whose rating is equal to 5 stars and let's assign the result of this to a variable called restaurants and let's just print that out to the terminal so that the query is executed by Django so we'll print that to the terminal and I'm going to remove the context that we're applying to index.html and we can remove the code in that index.html as well so there's nothing on the page but what I'm going to do is go back to the page and if we refresh you can see that it's going to execute the SQL query on the right hand side here we have three being executed and in the first query we're pulling out all restaurants from the database who have a rating of 5 stars and to find this out we're doing an inner join on the rating table and we're joining based on the restaurant on ID being equal to the ratings restaurant ID and then we're doing that filtering down to only those with five stars so that gives us all restaurants that have ever had a rating of 5 Stars once we have that we get back the primary keys and the prefetch related function will then use them to get the related ratings and sales for those restaurants now if we go back to vs code and go to viewers.pi that gives us back all of the restaurants with five star ratings but what we also want to do is sum up the sales for each of these restaurants so let's say we have five restaurants in the query set we want to then look at the sale table for those restaurants and get a sum of the total sales for each one of the restaurants now I'm going to bring this code to a new line here using the syntax in Python we just use the backslash and bring the filter function onto a new line and then we're going to chain another function onto that and that's the annotate function now I'm going to do a video on aggregation and annotation very soon in this series now when we're performing this annotation which is basically a gripping in SQL what we're going to call the fuel that we get back with the value is total and then we're going to use the sum function from the Django models module and we need to pass a particular field to that that we're going to sum up now we have a set of restaurants here and each restaurant is linked to a set of sales by the related name of sales so we're going to use sales and we can use the double underscore to follow that relation to a particular field on the sale model and that field is going to be the income field and before we run this code we need to import some from the Django models module and we'll go back to the page and refresh this page and we're going to see what kind of queries we're getting executed here now nothing is showing because we've removed all the contacts from the page but if we look here you can see the SQL query being executed is now quite complicated we now have a group by nssql and again we're going to explore this in a later video but we're gripping by the fields in the restaurant table and you can see one of the items in the select statement is this here and this field is taking the sum of the sale income fields and casting that to a numeric data type and SQL Lite and then we're aliasing the result of that as total and this total is coming from the keyword argument that we're passing in to the annotate function so that's going to get the sum of all of the sales and this is going to be applied to each restaurant individually because we're using this group by statement here so that's one way of doing this we're going to show one other way and that's to use the prefetch object that's given to us by Django so let's go to the documentation and you can see that the prefetch object can be used to control the operation of the prefetch related function now we've already seen prefetch related but let's now go a little step further here and imagine that we want to only prefetch sales from the last month if we go through the query that we're executing here we're getting back every single sale for the restaurants that are fetched from the database so if we have three restaurants we're then going to get every single sale that's attached to those restaurants but what if we wanted to only get the sales that are made in the last month let's see see how we can do that with these prefetch objects now I'm going to go back to the documentation and let's look at the arguments that you can pass to the prefetch object the first argument is a lookup and that is the related field the relation that you want to follow in order to get back a set of related objects and the second argument is the query set and this supplies a base query set for the lookup so for example in our case we want to get all of the sales that were made in the past month so rather than sale.objects.org we are going to use the filter function and we're going to pass this query set as a second argument to the prefetch object so let's go back to vs code now and we're going to see how we can do this within the view so let's remove The annotation that we added and I'm going to bring the dot filter up to the same line it was on before so we're prefetching the ratings and the sales for all restaurants who have a 5 star rating in the database now what we're going to do is rather than passing directly to a prefetch related to the sales we're going to define a prefetch object and we're we're going to pass that second argument for the query set in order to only select items or rather sales that were made in the last month so at the top here I'm going to bring an import in we're going to import the time zone module from django.utils and I'm going to Define below this comment a variable for roughly one month ago where we take the current time and we subtract a Time Delta of 30 days so this is not exactly one month ago it's 30 days ago but we're going to use that as a reference value in the prefetch object now what we need to do now is import the prefetch object from Django you can see the import statement here and I'll link this page below the video from django.db.models we import prefetch what I'm going to do is bring that into vs code and we're already importing this sum function so let's also bring prefetch in and we're now going to define a prefetch object for monthly sales so it's going to be an instantiation of the prefetch object the first argument as it says in vs code is the lookup we're going to look up the sales related field on the restaurant model and we're going to pass this second argument to that as the query set so let's do this on a new line it's going to be a keyword argument and this allows us to supply a base query set so it's going to be sale.objects.filter and we're going to filter the sales down using the date time field on the model and we're going to pass the greater than or equal to lookup and we'll pass one month ago into that lookup so this base query set that we're passing to the prefetch object will constrain the sales to only those that were made in the last month and all we need to do now is take this prefetch object that we are storing in a variable called monthly sales and instead of passing sales directly we can reference that object and pass that to the prefetch related function so this time instead of getting every sale associated with the restaurants that we're fetching we're going to limit that now to only those that are part of the base query set so I hope that makes sense let's now go back to our application and refresh the page if we go to the SQL tab of Django debug toolbar we're going to look at the prefetch for the sale table so we're prefetching these sail objects that are related to the restaurants we found in the base query so if we expanded this query you can see this time in the where Clause we have this date time greater than or equal to this particular date time that we're passing in here and that corresponds to 30 days ago and we are now filtering down the sales that we're getting not only by the restaurant ID matching one of the original queries IDs but we're also applying an additional filter here in the where clause on the date time field of the sale object now what if we want to get the sum of the sales for each restaurant let's go back to vs code and I'm going to annotate the results here in the restaurants by calling the door annotate function and again we're going to Define that total field and we're going to sum up the income of each sale for the restaurants in this base query set and when we call the annotate function we need to assign the result back to the restaurants variable here and what we can do then is instead of just printing out the restaurant objects we're going to use a list company engine and Python and for each restaurant R I'm going to print out that total and that's for RN restaurants so we can do that in a list comprehension if we go back to our page and refresh and go back to the terminal below here you can see the annotated values coming back with the total sales for each restaurant and the data so hope that makes sense let's go over this one more time we're setting up a variable called month ago which is set to basically 30 days ago and then we are creating a prefetch object in Django we're specifying the lookup as a first argument and the lookup that we're using here from the restaurant model is the sales and then to the prefetch object we also pass a query set keyword argument which defines the base query set to which we're going to fetch these Associated related objects so that creates a prefetch object called monthly sales we can then use that as an argument to the prefetch related function and we do that here as a second argument and this can take any number of related objects as arguments you can see we've got the directory reference to the restaurant's ratings as the first argument and the prefetch object is the second argument and we are also filtering that query set of restaurants to only those that have a five star rating in the data once we've got that what we're doing is annotating each restaurant in that initial query set with a total and we can use Django's sum function in order to do that and finally we're just printing those totals out to the terminal in order to see the values and you can see these decimal values below so that's all for this video thank you for watching we've seen how to use Django debug toolbar and how we can use its SQL tab in order to analyze and optimize the SQL queries that we are performing in a Django application we've also seen the N plus one problem and how it arises when you try and access the fields on related objects and your Django models we've seen how to solve that problem with prefetch related and select related and how these can greatly reduce the number of queries that you're sending to the database we also quickly touched on the dot only function which can be used to only get a certain set of columns from a particular table and finally we've looked at the end here into the prefetch object in Django and how that can be used to control the behavior when you're prefetching related objects and we also saw a first look at the annotate function in Django when you're performing aggregations or grippings over data we're going to see much more of that in a later video for now thank you for watching if you've enjoyed the video please like And subscribe to the channel and we'll see you in the next video
Info
Channel: BugBytes
Views: 10,520
Rating: undefined out of 5
Keywords:
Id: a3dTy8RO5Ho
Channel Id: undefined
Length: 30min 32sec (1832 seconds)
Published: Mon Jul 24 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.