DjangoCon 2019 - Pushing the ORM to its limits

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
you and everyone here I have slides yeah okay hello I am Seeger and I am a developer at colonial thought I know where we eat to a lot of stuff in Django we built an online grocery store with logistics and we've used deorum for quite a lot so I'm going to tell you a bit about that and you will find the models then I'm going to use in my examples and Django project on my github account which you can see on that URL okay so let's get started quick agenda I'm first going to show you some quick tips and tricks to optimize your code to reduce your repetition and yeah various others minor things then I'm going to deep dive into sub queries which is something you can do in the RM now I will show you a bit about custom constraints and indexes a quick example of window functions using the ORM and finally a bit about how you can actually add your own custom stuff to the ORM and a quick disclaimer unfortunately my talk is quite code heavy so I'm sorry to those who are not developers but I hope you will be allowed to follow along okay so some quick tips and tricks the first one is you can add your own custom query sets and managers to your models which is really useful if you have a lot of views or serializers that access common objects for example we use it a lot for confirmed orders is a single thing that we do time so what you can do is have subclass of model store manager and model toad query set and you can implement any methods you like to do on this so in this example I have a method to create an order which will also for example create the lines for each product or filter out any orders that are delivered in case I'm looking at undelivered orders and this can be instantiated on the order model by setting the objects attribute like this and then you can access this through the objects reference on your model and the query set method can be chained like any other method on default query sets so this is useful if like I said you do a lot of things repetitively ok another quick thing if you are ever wondering why is something doing something weird or just wondering how is this working under the hood you can inspect your queries so if we have a query set and we convert it to a string then it will actually print out the actual SQL being run in the database and another new thing in Django 2.1 is that django has support for explain which will run an explained query in the database show you the output and this is really useful if you're wondering why is my query slow what is it's actually doing yeah and it has a lot of options you can specify in this example I set verbose true which is giving me extra output you can specify what formats you want if you want it to be to do analyze which is a deeper check and yep just a quick note this output is really database dependent so this is example from Postgres in other databases you most likely get different results yeah another quick example if you do something like this where we iterate over each order and then print out the customers name and then we have a set of lines on an order and we print out each line this will generate a lot of queries against the database which can be slow so what we can do here is use select related and prefetch related the first one will in this example fetch all the customers so the first example on adapter will not generate an extract query in the loop and the second one is fetching all the lines so the second example there will not generate extra queries and select related is for references where you have one-to-one or one-to-many so for example an order only has one customer but an order has multiple lines so then we have to use prefetch related which will generate join in Titan instead of the database so it's separate query but it will be run once for an entire query set ok and just a quick note on this don't optimize prematurely if you do this point Lee you might actually end up with slower queries so just be aware of this and know that you actually have a problem before you optimize too much and another quick tip you can also use the database to avoid race conditions for it select for update will for any row in the database returned by that query it will keep a lock for the duration of transaction so in this case when I say product of inventory minus one I can be sure that no other object or no other query has modified the same object in the database and then I can safely save and when the transaction ends in this case 1d with block exits it will release in the database and this is really useful in some cases it's also a bit heavy maybe so if you can model something in a different way maybe consider it but yeah it's worth knowing that it is an option available to you okay so let's have a deep dive into sub queries which is something that the Django Orem got in i think it was one point eleven so it's been around for quite a while but it's really powerful and allows you to do a lot so I will now show you first an example where for each customer I want the timestamp when they place their latest order so I have this query and yeah it's a lot of code there's a lot going on but we use the analyse method and annotate method sorry on the query set which allows us to add new fields basically to the resulting objects and we use the sub query class which wraps separate query set where we can basically do anything we want and we can also use the special class called outer ref which will reference the wrapping query outside of this so in this case I am referencing the primary key of the customer and then I'm so this basically returns a corset with all the orders of that customer and then ordering it by timestamp selecting only the timestamp because that's the one value I'm interested in and so queries only allow you to write return a single value in these cases and then I'm limiting it to the first one because it's going to be added as a column so I can only have one value and the result of this is that I can say latest order time on a customer and my example is wrong it should have been a timestamp sorry yeah and it's also possible to do exists queries where you basically replace sub query with exists and then you can check basically does something exist in the database without actually fetching the data but we can also use sub queries to do aggregation so say that we have a sales target where we each month want to sell for a certain amount and then we try to check if we've reached that target then we want every single order that matches the year and month specified on the sales target we use values list to group by those two fields where we extract the year from the date and the month from the 8th this these two values will be unique for everything that matches the first filter so the group by is basically an old but it's something we have to do and the current Django arm but yeah this is or at least this is how it's documented then we annotate the sum this basically is a group by that does it we play by the two top levels calculates some of the total of the lines and then we return like in the other example a values list and because we've grouped by something this will only return a single values we don't have to limit this to a single row and the result of this is that we get a gross total the object and this is currently the only way this is documented but say that we don't have anything unique to group by what do we do then so let's say we have a database table where we have three values and we want to select only or just placed on weekend date which is seven and one if you use the Django or M so the last one here should not be included okay so yeah we do the same filtering where we say weekday is either 7 or 1 but it can be either 7 or 1 so we don't have anything unique to group by what do we do then well we could try to just select the sum directly but because of how the way sum is implemented in Django this will add group by on the ID of the order objects so basically what you see is that we get the first value which is not really what we wanted well there is a trick we can do here which is sorry yeah so we have the same table same values and by replacing some with just func and specifying that we want to use the SQL function named sum we get the result we want and the trick here is that sum is a subclass of aggregate and aggregate does add group by clause where if we don't already have specified what to group by it will group by the primary key of the current query set which in this case is the order ID but funk is not a subclass of aggregate so this will actually just calculate the sum across all rows without grouping which is what we wanted in the first place so this it's not the prettiest it does work though so it's worth knowing and yeah and a note on this is sometimes it's really useful to inspect the query sets query attribute that's basically how I discovered this I discovered that okay some adds a group by clause but this doesn't so yeah hopefully we can implement this in Django at some point but right now this is a solution that works done let's have a quick look on custom constraints and indexes which Django has had support for unique constraints for quite a while but custom constraints and indexes new starting in Django 2.0 I think you can specify custom indexes and starting in Django 2.2 you can specify either conditional constraints or conditional indexes so let's have a look at that first unique constraint so say we have a sales target where we specify a month and a year here we don't want to have duplicate values we can only have one target for a single month so we use unique together and specify that year and month should be dealt out those two in combination should be unique so yeah this has been around for a long time new now is that we can have partial unique constraint say that on an order we don't want to allow customers to actually have multiple unshipped orders now we can set my did my slight disappear okay sorry okay so we can set instead of setting unique together we actually set constraints to the list of constraints so in this case we say we want to field customer and it's shipped to be unique together but only in the case where is shipped is set to false this will do basically the same as the other example on the previous slide but it will do it only on rows for that condition is true so this is useful for a lot of things if you have any rows with null values one row with null is considered different from another row with null so if you want to have unique constraint saying only one row should be allowed to be null this is basically the only way to do it you say is you filter by the row that is null and say it has to be unique on some other way and you can have a separate constraint constraining other values where it's not now yeah and another new type of constraint in Django 2.2 is checking constraints this is similar to what we've had in Django for quite a while where we have validators on fields that's basically when you set a value it will check run some validators to check if the value you have entered is within some constraint but new here is that you can make the database do this so you can say for example on our monthly sales target we can say the month must be in the range 1 to 12 inclusive which basically it's indeed Julian and Gregorian calendars are the valid months and we don't want any rows with values outside of that this is unlike the normal validators in Django this is well dated by the database so this will protect you if you have a bulk inserts bulk updates if you access the database outside of Django it will still be checked so it can be really useful okay and then we have partial indexes this is it's similar in Postgres it's actually the same basically as partial constraints except it doesn't limit you but it will help you maybe speed up some indexes some queries or if you have like if you have a table with a lot of null values that you don't want to have in your index you can specify a condition on the index that just index the values that don't have no or in this case just index D and shipped orders so those are the ones that we most often access so we want the queries on that to be fast ok window functions is another new thing in Django 2.0 a quick example this is for each order I want to fetch the ID of the previous order from the same customer so yeah our alpha terribly one to use the window class from the or and we use a function called lag which is referencing a value in a different row in the query set so in this case it's using specified one it's the previous row I could specify too and it would go two steps back and similar and then we say partition by the customer ID because wanted an order from the same customer and then I specify that you order it by the creative time to get the previous one and this results in a query set where we have annotated an ID on it and this this is a really basic example but it can be really powerful if for example we want to calculate a cumulative sum of gross amounts for each order where it goes or we want to compare the total sales amount on an order to the previous order for example so this has a lot of uses if we want to extend that erm with our own functionality in cases where we don't really find what we need it's also possible so one example is okay the database has a function that I want to use but the or arm does currently not expose it well we can create our own subclass of funk which you've seen earlier and specify that the function used should be in this example round which is doing our rounding of a number in the database instead of having to do it in Python code so this is everything that's required to use that function use it like any of the other annotations that we have from Django you can also do more complex examples so in this case I'm implementing a function that takes a separate date and time column and combines them together to get a date time if for example I want to compare this to another table where I have the date time as a proper date time value so we take in two inputs arity to the two columns and we output a date time and this one is implemented separative or a different database packet so in this example I'm going to show you the process code so we implement this function it takes in arguments and we annotate the given values with some extra information so in this example we want to add the time zone to be sure that we get the value in the in intended time zone we specify a template which django will use to generate the actual SQL in this case we have the two values we get in and we append the time zone and then we do add SQL cast to a timestamp TC so we will and then we will allow this to be done by Django and we can use it like this at the bottom here so basically we specify the two fields that have the date and time and the output of this will be at date time last example I want to write some custom ORM custom SQL sorry okay first example I want to use the age function in Postgres so i annotate my query set with raw SQL class or this same thing can be done using extra this has a lot more options so yeah check out the documentation if you are interested or if I actually want to write the entire SQL myself I can do that using raw on the Karissa or I can actually build the entire thing myself using just database cursor where I can run my own SQL and get any output so we execute a SQL and use fetch one which will return one row in this case will be a toefl where the value is just okay and there's so much more to the jungle or am that I would what I've been able to fit into 30 minutes so please have a look at documentation specifically the ORM query set API documentation is really useful yeah okay so thank you if you want to contact me you can find me on Twitter I will publish blog posts after the conference with some more details and yeah we also have a stamp so you can find me there if you have any questions okay thank you all right thank you very much Tigger that was very informative I just realized I need to go back to the office tomorrow and rewrite all my query
Info
Channel: DjangoCon Europe
Views: 4,283
Rating: undefined out of 5
Keywords: django orm
Id: MPpPu6c8wsM
Channel Id: undefined
Length: 25min 14sec (1514 seconds)
Published: Sat Apr 20 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.