DjangoCon 2019 - The Ins and Outs of Model Inheritance by Blythe J Dunham

Video Statistics and Information

Captions
[Music] Welcome, friends, welcome. Sorry about the technical difficulties; I needed my speaker notes. I'm Blythe Dunham, and I'm just venturing out on my own to freelance as Snow Giraffe. I've been doing Django for about three years at Rover.com, the largest network of trusted dog sitters and walkers, and now we support cats and hire humans. Prior to that I worked on Ruby on Rails for almost a decade, and since I've been in the tech industry for over 20 years, you could probably say that I've spent a lot of time hanging out with models. Yeah, the jokes get better. As you can tell from the name Snow Giraffe, I really love snow sports and giraffes, so I thought I'd include them in today's adventure.

We're going to be talking about composition and inheritance, the three types of model inheritance that Django supports, two alternatives to model inheritance, and then avoiding inheritance altogether.

So who's seen this before? OK, I think I put it in the abstract. In 1994 a book called Design Patterns came out, in which the authors advised folks doing object-oriented design to prefer composition over inheritance because it's more flexible. So what does this mean? Composition is a mechanism to combine objects or data into more complex ones. You can think of it as a "has a" relationship: for example, a giraffe has a blue tongue. Inheritance is a way of deriving a subclass from a parent or base class to create a hierarchy of shared attributes and methods. You can think of inheritance as the "is a" relationship: our giraffe is a glorious animal. So while inheritance provides a way to avoid repeating yourself, it's more obvious and natural for us to build associations between objects than it is to try to find commonalities and organize them into a hierarchy.

The plot thickens when we think about how objects map to the database via the ORM, the object-relational mapping. We find that composition is really intuitive and has a natural mapping. However, inheritance isn't even supported by relational databases.
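As a rough illustration of the "has a" and "is a" relationships just described, here is a minimal plain-Python sketch. It is not code from the talk or its slides; the class names, the `describe` method, and the giraffe's name are hypothetical and chosen only to mirror the giraffe example.

```python
from dataclasses import dataclass


# Composition ("has a"): a Giraffe holds a reference to a separate Tongue object.
@dataclass
class Tongue:
    color: str


@dataclass
class Giraffe:
    name: str
    tongue: Tongue  # the giraffe HAS a tongue


# Inheritance ("is a"): a subclass derives attributes and methods from a parent.
class Animal:
    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> str:
        return f"{self.name} is a glorious animal"


class InheritedGiraffe(Animal):  # the giraffe IS an animal
    pass


# "Gerald" is a made-up name used only for this illustration.
composed = Giraffe(name="Gerald", tongue=Tongue(color="blue"))
inherited = InheritedGiraffe(name="Gerald")
```

Of these two shapes, the first maps directly onto two related tables, while the second has no direct counterpart in a relational schema.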
Therefore, we have several different approaches to choose from. Since each has its own ins and outs, it's important to choose wisely or, better yet, rethink the problem using composition.

So let's look at composition in Django. We have a Giraffe class; it has a name field. We have a Tongue class, and it has a one-to-one field back to Giraffe. Now, if giraffes had multiple tongues, well, that would be scary, and I'd use a foreign key here instead for the one-to-many relationship. If we look at the object-oriented UML diagram, we represent our objects with a one-to-one relationship, and this looks super similar to the entity-relationship diagram, or ERD, that represents our database schema. The objects are represented as tables, and the foreign keys are used to show the association.

Unfortunately, model inheritance is a little bit more awkward; it doesn't have that natural mapping. Up first is abstract models. The point of abstract models is to reuse the parent class's fields and field-related functionality. As the name suggests, the parent is abstract and not backed by a table in the database. Therefore each derived class will have all of the fields from the parent and itself on its own table. In Django, we have an Animal parent class that subclasses Model, and we have a name field. We have a speak method that returns gibberish, and we have overridden the Meta class definition to set abstract to True. Giraffe subclasses Animal. We add a field for the number of spots that the giraffe has, and we override speak to return "hum", because that's what giraffes do; you just can't hear it, it's infrasonic.

The ERD looks like this, assuming I've named my Django app "abstract". We have an abstract_giraffe table. We have an auto-incremented integer field for the id, we have the name field from the parent Animal, and the spots count from the Giraffe class. If we were to subclass Animal again with Zebra, it would have that id and name field, and then
anything that it adds, like a stripe count.

So when we query for our giraffes, this is going to query against the abstract_giraffe table. We can't query with Animal.objects.all() because that animal table doesn't exist. And when we call speak on the giraffe, it returns "hum" because we've overridden that method.

So, use cases. Abstract models work best when there are a lot of duplicated fields. If there are only a few fields, it's better to be explicit and just define them on each model. Great examples include any sort of base or core model functionality that all or many of your models inherit. For example, Two Scoops of Django walks you through the timestamped model, which is also implemented in django-extensions. What it does is add added and modified date/time fields that are updated when the record is saved. You could use that with any of your models: if it's a giraffe, if it's a location, if it's a customized user, anything like that.

The advantages of abstract models are that you can easily reuse the parent class's fields and field-related logic. However, the parent class can't be used in isolation, so if you have any related models, you can't have an animal ID; you'll need to have a zebra ID and a giraffe ID.

OK, this is my favorite slide in the whole deck, and the photographer, Atif Saeed, granted me permission to use it. I could warn you about using multiple table inheritance: don't get eaten by the lion. Multi-table inheritance is defined like this in Django. We have a BigCat parent class with a name field, and we have a subclass Lion that adds giraffes_hunted and a method called speak. Now notice that I haven't overridden the Meta class definition; this is vanilla, out-of-the-box model inheritance in Django. This is called concrete inheritance because the parent class is concrete. We have a big cat table in the database with that id and a name field. The lion table has a pointer, a big cat pointer, which is a foreign key back to big cat, and then it adds any
fields of its own, like giraffes_hunted. Now notice we don't have an id field here, which is not usually the default in Django. The primary key of the lion table is this big cat pointer ID. If we subclass BigCat with Cheetah, and Cheetah has none of its own fields, then we still have that big cat pointer ID. Notice here that you could implement this explicitly with one-to-one relationships if you wanted.

What happens when we query? Let's try to get all of the lions. This executes a query on the lion table joined to the big cat table, and what this does is allow you to access the big cat instance, the big cat pointer, without executing an additional query. You can also call any of the fields or methods on the parent directly on lions, so you can say lion.name.

The problem starts when we try to get all of the species of cats, regardless of lion, cheetah, whatever. It starts out simple: give me all the big cats. We'll run a query on the big cat table. Then I want to access the speak method on the child, but I don't know if this is a cheetah or a lion, because that foreign key is on the cheetah and lion tables. So I run a query on the cheetah table, and it's not a cheetah, so you get an exception. Then we can try again with lion. This time it works; it returns "roar", but we've executed another query on the lion table. What this means is that for each record you have, you'll execute up to n queries, where n is the number of subclasses. So if you add another subclass, then your performance might be degraded.

"But wait," you say, "I love to eager load and optimize everything." OK, that's great, good for you, and second of all, you're still going to have to do a prefetch query or a select_related, which causes an evil left join per subclass.

A good use case for multiple table inheritance is the classic shopping cart. In a travel store, we sell trips with start and end dates, and we sell t-shirts with sizing information. The cart, or the order, has a many-to-many relationship with product, which
means I have a join table here. If I just need that name and pricing information, I don't have to follow the pointer to the trip and clothing classes, so it's not really a performance problem. In addition, if I only have a few products in a cart at a time, then you can follow that foreign key and it shouldn't be too terrible.

In conclusion, the advantages of multiple table inheritance are that all the common parent attributes can be queried easily together. However, when you start accessing those subclasses, it could lead to inefficient queries that hurt performance and make scaling difficult. A lot of this is lack of understanding of what's happening under the covers, so sometimes it is better to be more explicit, because if your coworker adds a subclass two years down the road, you might find yourself having performance problems.

OK, last but not least, we have proxy models. The purpose of proxy models is to override the behavior and functionality of the parent class. We have exactly one table to rule them all. For lack of a better word, everyone in Middle-earth is a person. This is our parent class. They have a name and a person type that I'll talk about in a minute, and there's a method called characteristic that returns "Middle-earth dweller". Hobbit subclasses Person; it sets proxy to True on the Meta class definition, and it defines a characteristic method that returns "hairy feet". Again, there's one precious table no matter how many subclasses you add. When we access this via Hobbit, we'll get back a Hobbit instance, and calling characteristic will return "hairy feet". If we access it through Person, we'll use the same row in the database, the same record, but when we call characteristic we'll get back "Middle-earth dweller". So we've basically just changed the behavior of the subclass.

One cool thing you can do with this is add a custom manager. This Elf has an ElfManager. The ElfManager overrides create to set the person type to "E" when the record is inserted. We also
override the queryset to filter on person type equals "E". This means that if we query Person.objects.all(), then we'll get back instances of Person for Frodo and Legolas. However, if we query with Elf, we get back only Legolas, as an Elf instance, because Frodo is a hobbit. So again, one table; we're just changing the WHERE clause, the sort order, or the columns that we select. It's against proxy_person, which is our one and only table.

The advantages of proxy models are that it's really easy to modify the class's behavior. The disadvantages are that fields used by any subclass must be defined for everyone on that one table. Use cases are things like an ordered model, where you change the sorting to sort on, say, an added field, or an active model, where you filter out deactivated records if you're doing soft deletes. You can also create a custom user model, but it might be better to think of that as a one-to-one relationship between user and user profile.

One thing you can do with proxy models is downcasting and single table inheritance. Downcasting is a way to cast instances into the subclass. Normally, when we query with Person, it returns a Person instance, and when you call characteristic, it returns "Middle-earth dweller". With downcasting, it will return a Hobbit and an Elf instance. The way that this works is we have a type field, and we set it with the class name. We do one query, we get that class name, and then we instantiate the correct subclass. And so when we get Frodo and call characteristic, we've never wanted hairy feet so much.

Downcasting is not supported out of the box. You can use django-typed-models, or you could do it pretty cheaply and quickly on your own; there's an article called "Django STI on the Cheap". There are also downcasting packages for multiple table inheritance, but you have to be careful, because you will still incur that extra query or select_related, depending on the implementation.

So the advantages of single table inheritance
is performance, performance, performance. One table means one query. The disadvantage is that, since the fields of every subclass have to be represented on that one table, it can lead to clutter and bloat. Some people call this the denormalization of all the data onto one table for performance.

The use cases for single table and multiple table inheritance are really similar; the shopping cart scenario would work. The way that you can choose between them is to ask yourself: do most of the subclasses share fields and functionality? If so, and you want a performance boost, single table inheritance might be appropriate. If they're vastly different, then multiple table inheritance is preferable.

So now that we've gone through all this, I'm going to tell you that sometimes the best type of model inheritance is not to use inheritance at all. We have a couple of alternative approaches, and then we have some ways to rethink the problem.

So, this guy: it's not really good to play with generic explosives all the time. We have something called generic foreign keys in Django to implement polymorphism. Polymorphism is the ability of an object to take on many forms, as you saw with inheritance. Your homework is to look up how to define this in Django, but in short, a generic foreign key fakes a real foreign key with two fields. The first field is the content type ID, and that is a foreign key to the Django content types table, which holds the model name and the app label for all of your concrete models across all of your apps. The other field is an object ID, which is just an integer, so we put the ID of the related model here. You could put 0 or nonsense data; it's just an integer field. For that reason, we have a very weak relationship to anything you want it to be. It could be a blog, it could be a giraffe, it could be a location; it doesn't matter.

The advantages are that you can use any model and you don't have to do another migration. The use cases for generic foreign keys are things like tags and comments
where the related object can be anything, like a blog post or a giraffe or a location, and the comment is not usually accessed outside of the blog post or giraffe, the related object. What I mean is that your typical use case would be: I have a blog post, give me all of the comments for this particular blog post. If you're asking a question like "give me all the comments in the world, I don't care what the related object is," and then you need to go look into that object and do another query to see what it is, then you might consider using single table or multiple table inheritance.

The disadvantages are that code can become hard to maintain. If you're using dynamic type checking, then two years down the road you might not remember what this object is that you're passing around in all your methods. Another disadvantage is that, in order to access those objects in the scenario where we have all the comments in the world, you can't use select_related, so you're going to have to write custom SQL if you want to optimize the performance. The other major disadvantage is that there's no referential integrity. I think of referential integrity as a seat belt: you can drive your car down the road at 200 miles an hour, but you probably want to put on a seat belt. This means that you have nothing to prevent you from putting dirty data in the database. If you delete a record, the object ID field is just an integer, so it might not cascade through, and you could end up with a little bit of unclean data.

The second alternative to model inheritance is unstructured data. We have JSONField for Postgres, and there's a JSON field package that you can use with other databases. What we're doing here is taking a bunch of fields, serializing them, and then just shoving them into the database. This can avoid the clutter that you see with single table inheritance, because each subclass just uses that one database field to jam in whatever
it wants, and similarly it avoids the need for related objects, as with multiple table inheritance. The disadvantages are that it's pretty tough to query against unstructured fields. Postgres will let you do it, but for the most part you want to just put data into the database and not index or query against it. The other disadvantage is that you lose data integrity. Again, it's not enforced by the database, so all the validation has to be at the application level. This can lead to dirty data, since we're just putting in exactly what each subclass wants: you make a change, so you have another blob, and some of your data might be a little bit dirty.

OK, so we've gotten through the two alternatives, but maybe we could rethink this a little bit. This lady is going into Corbet's Couloir in Jackson Hole, and she probably could have rethought her approach a little bit. This is the easy entrance. She made it, and I went in after her, so it's all good. But my point is that just because objects share attributes, it doesn't mean we should represent them together in a hierarchy. A human and a beetle both have legs, but they're not inherently similar, or most of them aren't. The good news is that most "is a" relationships can be expressed as a "has a" relationship. So "a user is a seller" becomes "a user has a seller profile". This is that one-to-one relationship I was talking about with user: you can create a profile instead of subclassing it. Another example is "a manager is an employee", or "an employee has a managerial job". When you rethink the problem with composition, you can take advantage of that natural mapping.

The last thing I wanted to say was that sometimes it's good to be explicit. Sometimes you can use multiple foreign keys instead of inheritance like proxy models, or maybe if you only have one field to repeat, you can just add it to multiple models. And finally, with multiple table inheritance, you won't get the bells and whistles that Django provides, but it's a lot more explicit to
implement it as a one-to-one with one-to-one fields, so everybody in your organization knows what's going on, and you will recognize the fact that when you add a subclass, your performance will be degraded. Your future self will thank you, and I thank you too. I really appreciate it. A big shout-out to all the organizers and volunteers who make DjangoCon so special. Please feel free to hit me up through all of the normal means. I've put the slides and the code that I used under Blythe Dunham, DMI (for Django Model Inheritance), on GitHub. Thank you very much. [Applause] [Music]
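The multi-table layout described in the talk, and the per-subclass lookups it warns about, can be sketched without Django using the standard-library sqlite3 module. This is only an approximation under stated assumptions: the table names are simplified (Django would prefix them with the app label), the cat names and counts are made up, and the probing loop imitates rather than reproduces what the ORM does when child attributes are accessed from a parent instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bigcat (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Each child table's primary key doubles as a one-to-one pointer to the parent.
    CREATE TABLE lion (
        bigcat_ptr_id INTEGER PRIMARY KEY REFERENCES bigcat(id),
        giraffes_hunted INTEGER NOT NULL
    );
    CREATE TABLE cheetah (
        bigcat_ptr_id INTEGER PRIMARY KEY REFERENCES bigcat(id)
    );
    -- Hypothetical sample rows.
    INSERT INTO bigcat (id, name) VALUES (1, 'Simba'), (2, 'Chester');
    INSERT INTO lion (bigcat_ptr_id, giraffes_hunted) VALUES (1, 3);
    INSERT INTO cheetah (bigcat_ptr_id) VALUES (2);
    """
)

CHILD_TABLES = ["cheetah", "lion"]  # probe order; one entry per subclass


def species_by_probing(conn):
    """Resolve each parent row by probing the child tables one at a time."""
    queries = 1  # the initial SELECT on the parent table
    result = {}
    parents = conn.execute("SELECT id, name FROM bigcat ORDER BY id").fetchall()
    for cat_id, name in parents:
        for table in CHILD_TABLES:
            queries += 1
            row = conn.execute(
                f"SELECT 1 FROM {table} WHERE bigcat_ptr_id = ?", (cat_id,)
            ).fetchone()
            if row is not None:
                result[name] = table
                break
    return result, queries


def species_by_left_joins(conn):
    """Resolve every row in one query, at the cost of a LEFT JOIN per subclass."""
    rows = conn.execute(
        """
        SELECT b.name,
               CASE WHEN l.bigcat_ptr_id IS NOT NULL THEN 'lion'
                    WHEN c.bigcat_ptr_id IS NOT NULL THEN 'cheetah' END
        FROM bigcat AS b
        LEFT JOIN lion AS l ON l.bigcat_ptr_id = b.id
        LEFT JOIN cheetah AS c ON c.bigcat_ptr_id = b.id
        ORDER BY b.id
        """
    ).fetchall()
    return dict(rows)


probed, query_count = species_by_probing(conn)
joined = species_by_left_joins(conn)
```

Both functions return the same mapping. The first grows by up to one query per subclass for every parent row, and the second grows by one LEFT JOIN per subclass, which is the trade-off the talk attributes to adding subclasses later.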
Info
Channel: DjangoCon US
Views: 1,600
Keywords: djangocon, djangocon us, django, python, Model Inheritance, Model, Inheritance, 2019, Blythe J Dunham
Id: BEHM210eR50
Length: 23min 37sec (1417 seconds)
Published: Fri Oct 18 2019