PYCON UK 2017: Scaling Django Codebases

Captions
Hi everyone, I'm Dan and I work at Thread; more about that in a second. Just a quick intro: this talk is mostly for people who are new to Django, or people who've only worked on smaller Django codebases. This isn't about scaling to large traffic volumes. It's about scaling codebases: how you write code on larger teams, things like that. If you want to look at the volume stuff, go and look at the blog posts from Instagram and Disqus; they've got some great stuff.

So for beginners, people who are used to smaller codebases, I'm hoping you'll take away a rough idea of how to structure your Django apps and your Django site to grow. For intermediate developers, or medium-sized codebases, I'm hoping for some practical ideas you can apply to your codebases. And experts, you probably know most of this already, so maybe you'll just see it put another way.

Right, so really quickly: who works on a Django site running in production? Awesome. Who has a code style guide, a style guide of any kind? Great. Who has a style guide specific to Django and how they use Django? Certainly fewer, okay. Who has more than 10 Django apps in their site? Who has more than 100? OK.

So, a little bit about Thread. We're an online personal styling service, essentially a curated online clothes shop. Our Django site includes e-commerce with a lot of personalisation, payment and delivery integrations, a CRM-like system for our stylists, warehouse management, order and supplier management, loads of reporting (we use A/B tests heavily, so lots of reporting around that), and loads of web scraping and processes for cleaning up data.

Our codebase is mostly one Django site: 364 first-party applications when I counted about a month ago, and we've added more since then. Only 368 first-party models though, which may give you a bit of a hint of the direction I'm going with this. As I said, this doesn't include any of the third-party Django apps we've installed or the like. I think we've got 147 management commands, over a thousand URLconfs, and only 65 lines of code per file on average, and that's excluding our migrations, tests I think, and empty __init__.py files. In fact it's about 135,000 lines of Python, and then about another 60,000 lines of Django templates.

So the first thing I want to talk about is apps in Django. This is as distinct from a site; a site is built from multiple apps. The Django documentation describes an app as a web application that does something, and there are things like an advanced tutorial on how to write reusable apps; they say reusability matters. I think from this it's really easy to take away the idea that maybe "I'm writing a web application, so I'm writing one app, right?", or maybe that apps are only for people who are publishing open-source libraries for reuse. I'd say you're making one Django site, but you should definitely have multiple apps.

Why? Well, I think apps are great for separation of concerns. When you have an app doing just one thing, and you nest your apps according to the domain they're working on, it makes it very easy to navigate a large project. As I said, we've got 364 apps, I think, and they're nested under about twenty at the top level. That means that as you go down you can find exactly the little bit of code you need to work on, and while you're working on it you only really need to keep that one app in your head, and it might only be a couple of hundred lines of code.

I think they're great for rapid development. It's easy to find where new code should live, especially if you're writing an A/B test that you might throw away in a few weeks or a few months. You can contain that, and dropping the code is easy: it's just deleting the folder and removing a line from your INSTALLED_APPS list.

And really, I think this is the best part of Django. I think it sets Django apart from things like Rails. In Rails, code is grouped by the type of code it is, a controller or a model or things like that, whereas in Django it's grouped by domain: an app contains all the code to do with one particular domain of your project.

There are some potential downsides. The first one I hear from people a lot is: what about circular imports? Surely with all these apps importing each other you've got circular imports all over the place? In practice this is actually quite rare in our codebase. We tend to reach up the hierarchy in the app tree, or out, but only to other top-level apps that contain the really core models of our site. It needs a little discipline to manage this, but it is completely possible to get around circular imports.

The other thing, I guess, is: what about the global namespace of models? As you may know, you reference models by the app name and model name, for example auth.User, and that is a requirement in Django. There are a couple of ways to get around this. You can just prefix some of your app names with the parent app name as well. It doesn't look very pretty and it's not ideal, but it has worked for a very long time, and our codebase dates back to about five years ago, so this is what we've gone with. The other way to do this is by using AppConfigs, where I believe you can specify the app label you want to use, so that the Python package name doesn't have to encode that tree.

So, a quick example. Our top-level package name actually predates the name Thread, which is handy because "thread" is a bit of an overloaded term. Under this we've got top-level apps, and here are some examples. These contain the really core models, like Order and Item, and they're unlikely to change ever. But these are huge, or would be huge, so we start to nest things underneath them, things like stock levels. This is an app that just tracks the stock levels of items. It only reaches up into inventory to get the reference to the Item model, and it just has a simple view, like one URL I think. It might have a form for filtering that view to a date range, but it's really straightforward and very easy to hold the whole thing in your head.

Then we've got a lot of subsystems. Again within inventory, there's a review process where we clean up the data from our web scraping, and so we start nesting even further, with an app for each of the stages of our review process. Under there, as you can see, these are prefixed with "inventory review", because some of these might be names that we may want to reuse elsewhere. Again, they only import up to inventory review, or maybe up to inventory; inventory review holds the models required for our review process. Because we can isolate these stages of review, it means we can create new stages, or drop old stages, or just work on one stage and keep it all in our heads at one time. In practice we don't tend to nest much further than this. I think we've got four fourth-level apps, and they're very, very specific. So this tends to get us most of the benefit.

So the rule of thumb is: divide now. When you're making new features, make a new app from the beginning, because it's easier to merge apps together than it is to separate them later on down the line.

Next, let's talk about the internals of apps. Django has this pretty well figured out with the basics: we've got all the basic files you would expect, and these are pretty well defined. We've then got a couple of others, things like tasks if you're using something like Celery or another task queue, and then there are some more uncommon ones that Django has.

It's very easy, when you're writing code in an app and you want to extract a utility function or something like that, to say "I'll just stick it in a little utils.py next to my models, that's fine, right?" And it is: when you've got apps with half a dozen models, that's not a problem. But as you get bigger it becomes more difficult. When you come to an app that maybe you've never seen before, or haven't touched in six months, it's going to be more difficult to find where that code is. So I really encourage you to start specialising: start finding the files that make sense in your project.

We use an api.py to define the public API that other apps are allowed to call, so it lives in one place. We have some systems that do auto-discovery, for metrics, reports and sitemaps, so that apps can easily define, say, a sitemaps.py, and an auto-discovery mechanism will pick that up and register it with the global system. And really, once you've figured out these things that you reuse throughout many apps, you end up with very little that doesn't have a defined place to live. There are still a few things that end up in utils.py, but by the time you've extracted all of this, utils.py for us is very often reusable business logic. So even with utils.py it's fairly obvious what will go in there, even though it's a bit of a catch-all kind of thing.

Moving on, let's talk about naming things. It's really important, as I'm sure you can imagine, to have a consistent style for naming things in a large project. Being able to grep your codebase effectively is really important. When you've got 100,000-plus lines of code, being able to find every usage of a particular thing is really useful.

So have Django style guides, and favour consistency. I encourage you to find a naming scheme that works all the way from your URL names through to views, your template hierarchy, your app names and your module names. We have a pretty consistent style with this, so if I'm in a view I could probably guess what the template is going to be called, what the URL name is going to be, and so on and so forth. This really does help.

I'd also encourage you to do the same thing with models. Name your models consistently: decide whether you're going to include the app name in the model name, for example. Find a defined style for how you name your relations and your fields. Do you set related names? Do all of the models need to have a related name? Do you use the _set names that are auto-generated by Django? We define related names for everything, and we have some style around that, and for fields as well; I'll give you an example of field naming in a second. And another one: use named URLs everywhere. This is critical to being able to grep the codebase effectively for all usages of a particular view, things like that. Make sure you name URLs so that it's easy to find them.

Here's an example of a model with some field names. It's entirely reasonable to write a blog post model that has author and published fields; that would make complete sense in the case of a blog post. But user and created are much more generic names that you can apply to many, many more models. When you're joining on to this table from, you know, three joins deep from the other side of your codebase, predicting what these fields were called on this model can be a real pain. So having consistent naming of the user field, or the created field, or other things you might have in your codebase, is really beneficial. This is one example where I think the consistency is worth more than the fact that it looks a little bit odd.

Next up, let's talk about models. Relational databases are great. I think it's worth getting the design right up front and trying to encode as much domain knowledge into your models as possible. I mean things like getting the uniqueness constraints right. There's Django's Meta unique_together; if you haven't seen it, it's a way of combining multiple fields so that you can get the right constraints. There are lots of other ways to help you model the domain you're working in, and I really encourage encoding as much of that as you can into your models, because it helps maintenance a lot later on.

Django migrations are great, but don't import things in them. Django migrations are about the time that you generate them, not about the time that you run them.
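
As a rough sketch of what "fixed in time" can look like in practice, the function below is the kind of thing a data migration might run. The app name orders, the Order model, its status field and the STATUS_PENDING constant are all hypothetical and not taken from the talk. The constant is copied into the file rather than imported, and the Django wiring is shown only in a comment so the sketch runs without Django installed.

```python
# Body of a hypothetical data migration for an app called "orders".

# Copied from orders/constants.py on the day this migration was written.
# Deliberately NOT imported: the live constant may be renamed or removed later,
# but this migration must keep doing what it did when it was generated.
STATUS_PENDING = "pending"


def backfill_status(apps, schema_editor):
    """Give every order with an empty status the value that was current back then."""
    # apps.get_model returns the historical model, not today's class in models.py.
    Order = apps.get_model("orders", "Order")
    Order.objects.filter(status="").update(status=STATUS_PENDING)


# In the real migration file the function would be wired up roughly like this:
#
#     from django.db import migrations
#
#     class Migration(migrations.Migration):
#         dependencies = [("orders", "0001_initial")]
#         operations = [
#             migrations.RunPython(backfill_status, migrations.RunPython.noop),
#         ]
```

The design choice is that nothing in the file depends on project code that can change after the migration is written.
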
You might run a migration months and months later, when you're trying to rebuild the database or something like that. So I really urge you to copy stuff in. It may seem a little counterintuitive; you want to import and reuse things. But copy constants in, name them, add a comment saying where they came from, and then the underlying code in the rest of your project can change, and that's fine, because that's the only thing changing; the migrations are fixed in time. And that helps: you don't really want to be maintaining all the migrations you wrote six months ago.

We avoid signals, and we avoid overriding save. I think we've got three signals in our codebase across 360-odd models; there are always exceptions to the rule. But they basically just make it difficult to know what is happening when you're saving or changing a model, and they can lead to performance issues and things like that. So we tend to avoid them, particularly for business logic. We would rather wrap up the core parts of that business logic into an API that explicitly does the steps, rather than relying on signals that link together all these different side effects.

And keep models small. We have a lot of models that have one or two fields. It's very easy to end up with a profile object that has 100 fields on it, but by splitting it out and having lots of small models, you can be more fine-grained with your caching, and you can also locate those models next to all the other code that deals with them, in the app that's specific to them. We have a dress code model; I think it hangs off auth User. It's got one field, which is the user's dress code, but it sits with the other code that deals with that, and it means we don't have another field that would end up on a profile. django-auto-one-to-one is a really handy little library for automatically creating those sub-models.

I think one of the best bits of Django is the architecture it defines for you, but it's not perfect, and sometimes when you're conforming to that view-model kind of structure, things just don't fit. I started learning Django on version 1.1 from a book called The Definitive Guide to Django; some of you may remember it. I pretty much remember one thing from it, and that was that Django is just Python. It repeats this loads of times and keeps hammering it home.

What I mean by this is: when Django's architecture isn't fitting, don't be afraid to create your own. There are things that don't fit into the Django world, so leave that behind. Create a Python package that maybe isn't a Django app, define a nice, well-defined API into the Django world, and you'll hopefully end up with something that's easier to understand, build and maintain.

You may end up with bits of business logic in views. Maybe when it's just 20 lines that's fine, but when you're at a hundred lines or more you might want to extract it. That makes it easier to extract out into a library later, if you want to do that, but there are also some other things you can do: you could separate it from the database, which might make it faster to test.

An example of this: we have about five hundred lines of business logic to select which tips to show to users, tips about dressing well. This is a completely self-contained module that's completely separate from Django, and essentially the API takes namedtuples. We construct those out of the Django models, but because that API is then separated from the database, it's very easy to test, and it's also deterministic: we don't have to worry about side effects it might be having on the rest of our system.

Django also has the concept of backends; you may have seen this with database backends and cache backends. I really encourage you to create your own where it makes sense. You may have a task backend for a queuing system, or an SMS backend, or a payments backend, or in our case a recommendations backend. By extracting these things out, it becomes really easy to swap in different implementations in different situations. So in local development you might back onto Stripe's test API rather than the production API, so that you can test making payments locally.

And the last example of this: don't be afraid to just go and build entire subsystems that are pretty much isolated from Django. They can maybe import the models they need to. We tend to use management commands as a nice entry point for this. We've got a scraping subsystem; I think it's about 50,000 lines of code, so it's really quite large, and it's pretty much entirely separate from Django, with a very nice API that we actually have contractors work to. So we're able to share the work there and maintain that scraping system quite well.

The last thing I'll talk about is testing. Write tests that write themselves. This may seem a bit weird, but we've got a number of checks and tests that we wrote a very long time ago that are still expanding and serving us well now. Things like testing that when you import a Python file it doesn't make any database queries, or testing that every HTML file in our codebase can be parsed. You only have to write these once, but they keep scaling up as your codebase grows. These ones are kind of handy, but they're mostly just little sanity checks.

I think this is more interesting when you start writing these for the bits of your own project where it makes sense. We do a lot of A/B testing on our registration form, the way that users sign up to Thread. There are a number of steps in this and we A/B test it a lot, so we've developed a test for all of our funnels that will programmatically create and run through each of the funnels. Every time we add a new funnel we get a suite of tests for free, essentially. We do the same with checkout: we have a number of different types of user, and we can run each type of user through our checkout system and check that at the end they've managed to place an order, and things like that. So I urge you to write tests that will grow automatically as your codebase grows.

The other thing we do a lot of here is custom assertions. There are lots of assertions built into unittest, and you can write them for pytest too. unittest provides assertions for working with the standard library types, Django has a few for working with Django types, and we have added even more. We've got things like "assert sent email" or "assert sent SMS", and these are really valuable. I don't think that many people really enjoy writing tests; I think a lot of people do it as a bit of a chore. So making your life easier means you end up writing more tests, and you end up writing better tests that assert more about the behaviour of your system. By writing these little bits of tooling to help you, I think it really pays off, and again I really urge you to find the places where it makes sense in your project specifically. We've got one in our warehouse system, "assert barcode scans successfully". It's about ten lines of code, but it's ten lines of code containing some great assertions that really check the validity of the system at that point in time. We now use that assertion probably 50 times in our test codebase, so it saves us a lot of time and makes our tests a lot stricter.

So, to summarise. Make more apps. Apps should do really just one thing, the smallest little unit of functionality. Be consistent: that's code formatting, it's naming, it's the way you use Django and which bits of Django you use. Try to automate as much of that as you can, ideally through linters, but have a style guide, not just for Python but for Django as well, and for how you use Git, or CSS, or whatever. Consistency, at least for us and I think in general, is almost always more important than the reasons for going one way or the other in any one case.

Extend Django's patterns and create your own. Create your own class-based view hierarchies; if you have mixins that you use a lot, do that. Create your own test utilities and assertions. And remember, it's just Python. When it doesn't fit, don't force things into the Django architecture. Set that aside, build something in Python that works really well, and then figure out how to make a nice API to join it back up to Django again.

Not everything I've spoken about will apply to every codebase, so evaluate it in the context of your own codebases. Find what works for you, but stick to it, be strict, and have a strong opinion, because that will serve you really well as your codebase grows.

So thanks for listening. Thread is hiring, and we do a lot of Django. We're a small engineering team of just eight, but we deploy changes around four times a day, and we have a strong focus on culture and transparency. We try to minimise politics and really maximise the impact of our developers and everyone in the company. So yeah, I'll be around outside to talk about any of this stuff, or Django in general. Thank you.
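
The two testing ideas from the talk, tests that grow with the codebase and custom assertions, can be sketched with nothing but the standard library. This is only an illustration under stated assumptions: the FUNNELS registry, run_funnel, send_email, the in-memory OUTBOX and the assertSentEmail helper are hypothetical stand-ins, not Thread's actual code.

```python
import unittest

# Hypothetical in-memory outbox standing in for a real email backend.
OUTBOX = []


def send_email(to, subject):
    OUTBOX.append({"to": to, "subject": subject})


# Hypothetical registry of signup funnels; adding an entry adds test coverage.
FUNNELS = {
    "short": ["email", "sizes"],
    "long": ["email", "sizes", "budget", "style"],
}


def run_funnel(name, user_email):
    """Walk a user through every step of a funnel, then send a welcome email."""
    completed = list(FUNNELS[name])
    send_email(user_email, f"Welcome! You signed up via the {name} funnel")
    return completed


class EmailAssertionsMixin:
    """Custom assertion shared by any test case that cares about email."""

    def assertSentEmail(self, to, subject_contains):
        for message in OUTBOX:
            if message["to"] == to and subject_contains in message["subject"]:
                return
        self.fail(
            f"No email to {to!r} with {subject_contains!r} in the subject; "
            f"outbox was {OUTBOX!r}"
        )


class FunnelTests(EmailAssertionsMixin, unittest.TestCase):
    def setUp(self):
        OUTBOX.clear()

    def test_every_funnel_completes(self):
        # One sub-test per registered funnel, so new funnels are covered for free.
        for name, steps in FUNNELS.items():
            with self.subTest(funnel=name):
                address = f"{name}@example.com"
                self.assertEqual(run_funnel(name, address), steps)
                self.assertSentEmail(address, "Welcome")
```

Saved as a test module, this runs under python -m unittest. Adding a third entry to FUNNELS would exercise it without touching the test class, which is the "tests that write themselves" idea in miniature.
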
Info
Channel: PyCon UK
Views: 2,116
Keywords: python, programming, conference, pyconuk, pyconuk2017
Id: jBBcORHhfV0
Length: 24min 59sec (1499 seconds)
Published: Sun Nov 05 2017