DjangoCon 2019 - Just Add Await: Retrofitting Async Into Django by Andrew Godwin

Captions
Thank you very much, Katie. Good morning, everyone. This is "Just Add Await"; it's maybe not quite as easy as the title implies. A brief introduction to me first of all: I am Andrew Godwin. Some of you may know me; I've worked on things like Django migrations, South, and Channels, and I've been around Django a while. My day job is working at Eventbrite, trying to fix their wonderful scaling issues and all manner of things on such a large website. If you want to find me on Twitter, or my personal website and my list of national parks I've been to, you can go there, but please, later.

What are we here for? We are here for one thing, which is that I can fit Django on one slide, and it's not fast. I'm joking. It is for this: the goal I had when I sat down five or six years ago and thought about what an async Django could look like. It's that you can sit down and write something that looks like normal Django, that you can immediately understand even if you've never seen asynchronous code before, but that runs asynchronously. While just presenting this to you in the abstract may seem either super easy or super hard, there's a very detailed plan behind this, the plan to bring async into Django, as Amber alluded to earlier in her keynote. So I'm going to go through, as well as I can in the time I have, the basic async landscape; then I'm going to go through the particular problems that come from being a big, mature framework like Django; and then I'm going to deep dive into what it means to actually implement some of this in Django's request path, the ORM, and other bits like that.

So first, let's talk about async in brief. You heard from Amber the basic premise of async, the idea that threads are preemptive, but let's go into a bit more of what that means in Python specifically. Python has a threading library; you can just import threading and run threads. But it's a lie: Python threads are not actually threads; it's
not a thing that runs on multiple processor cores. It's just an implementation of what's called time slicing. In Python, when you have threads, it just takes each of the threads you have and runs them for an equal time slice, no matter if they've got work to do or not. Even if a thread is idling, it will still be context-switched to, go "oh, it's idling", and be switched away from again.

By comparison, coroutines are cooperative. They only yield and give control back to the event loop when they hit an await. What that means is if you're not doing anything, you never even get run, but if you're doing a lot, you can actually suck up all that time and kind of waste it. The "cooperative" is part of the name: if you are an uncooperative coroutine, you can ruin everyone else's day. That's an important thing to think about as we go further down, when I talk about the dangers of asynchrony.

So we have these two different lands. Traditionally, people in Python have sometimes used threads; if you've gone near the threading module, you may have some appreciation of how difficult and how dangerous it can be to go there. Coroutines are different: in some ways they are much safer, in some ways they are harder to think about.

The key thing coroutines need is an event loop. Amber did mention this. This is the core, the brains of the operation. It's the thing that listens on the OS sockets, that works out whether there's a byte incoming from the network; it's where the program goes when a coroutine exits, to work out what it's doing next. You can imagine every coroutine runs, then gives back to the event loop; the event loop goes "oh, there's a new thing over here on this socket; who has this socket? It's this coroutine", and then gives control to that coroutine. That's kind of what it does. And the fun thing is the event loop is a thread. I had this dawning moment about two years ago where I was like, "oh, that's awful", when
you realize that you can have both threads and coroutines, and you can have multiple event loops in multiple threads. Don't do this.

So here's a visual illustration. Say I'm running a standard, normal Python application that is synchronous by default. I can spin up an event loop in my synchronous application and turn that synchronous thread into an asynchronous hosting thread. Basically, you call the event loop and it just blocks wherever you call it from; it absorbs that thread and uses it until it returns at the end. So you can see in this diagram that you can take that synchronous thread and turn it into something that runs multiple coroutines inside it.

But then, of course, you may want to call synchronous code, maybe legacy Python code, or things that just need to be synchronous or are simpler that way, from your asynchronous code. Now you have to make a separate thread and call the synchronous thread from your asynchronous thread. This gets really tricky, and we'll see how later. For now, just appreciate the idea that when we talk about threading in Python, a thread is either running an event loop, and it's an asynchronous thread, or it is not running an event loop, and it's a synchronous thread. That's the dual mode you can have; each thread is one of the two.

Now, why don't we just use threads? Threads are slow. As I mentioned, it's not a real OS-level threading implementation, so what ends up happening is the more threads you add, the more Python just naively cycles through each of the threads, and it gets slower and slower and slower. If you try to run 10,000 threads in Python, it will just crumble under the weight of constant context switching and never get any work done.

So the goal here is we want to have async. As Amber alluded to, async is fast as long as you're I/O-bound. Luckily, we write websites, and websites are pretty much I/O-bound the entire time: either you're waiting for
the user to upload stuff over HTTP, or you're talking to a database, or you're talking to an API. All these things are I/O. Very few sites in the world spend a lot of their time literally in the CPU doing pure calculation. Template rendering, for example, is usually that, but even in Django, template rendering often calls the database, so you've immediately got some I/O in it.

The other thing to consider is that while async may seem like a wonderful solution, there's a slight issue with the way Python was designed. This is no fault particularly of the Python core team; it is the way it evolved, because Python async, I think, came out of yield and yield from. There's a long history there. Async functions are different to synchronous functions; one function cannot be both. This is really awkward when you're designing APIs, and we'll touch on this later, but really what it ends up being is that you can't have one function that you can call from a synchronous context and that you can also call from an asynchronous context.

On top of that problem, if you do it wrong, it's dangerous. If you're in an asynchronous thread, as I mentioned, it's cooperative, so it's your job to do the minimum amount of work and then yield when you're waiting for I/O. If you have naive synchronous code in that coroutine that synchronously goes and does a blocking fetch, you've just locked up the entire event loop for the second or two seconds that remote web call takes. The event loop cannot cut you off; it cannot come in and stop you. You have literally just ruined the contract of being cooperative. So calling synchronous code in an asynchronous thread is dangerous, and it's also very easy to do by mistake; if you just forget to put an await in, that can be a problem.

These are all problems we face. Here's an illustration of that problem: I have an asynchronous thread, I accidentally make one of the coroutines synchronous, and nothing
happens for a full second. What you want is the thing I mentioned earlier, where you go, "ah, I have some synchronous code; I'm going to shove it into a separate thread, and then I can return control to my event loop and let the event loop do other stuff in the meantime", and when my thread is finished, it will signal the event loop to come back and resume my coroutine. That's the pattern you want.

If you haven't got the impression yet, it's very complicated. As we'll get into, async code is great, but I encourage you, even when Django has full support for it, to write synchronous code first. Understand your logic, understand where it falls down, and then take the parts that need optimization and make those asynchronous. That's the theme you'll hear throughout this talk: asynchrony is great, and it should be an optional add-on.

So let's talk about Django. Django is a big framework. I love it; it's been a big part of my career in programming, and one of the reasons I like it so much is that it's very stable and predictable. This of course brings with it a whole host of problems when we're saying, well, we want to totally change one of the paradigms it operates on.

As I mentioned, a function cannot be both synchronous and asynchronous, and this immediately presents a problem when you think of any API in Django. Let's take the caching framework. Caches are usually on the network. Say I'm talking to memcached or Redis: I have to go to a connection, ask for the get, and get it back. It's usually tens of milliseconds, which compared to normal CPU time is an eternity, so I should be using async here. But while Django has the top line here, the normal cache.get, we can't make the bottom line work; we can't have cache.get also be async-compatible. We can have a cache.get_async and have two different functions, but as you can imagine, that makes for a very ugly API. The proposal we have is to actually do cache.async.get and try to namespace them
under that kind of thing, but even that's a little bit ugly and a little bit of extra code to maintain.

And that's just the start. Django is built on a series of incredible third-party libraries from the Python community, specifically database drivers, things like the memcached libraries I talked about, things like the Python Imaging Library, or Pillow as it is now. These are all libraries that were written in a synchronous world. For things like databases we have a standard called DB-API 2; it's synchronous. There is no DB-API 3 that's asynchronous. There is no standard way for all the different async MySQL and Postgres libraries to present themselves to Django. Even sleep is different; it's literally a different import from a different place. Standards are really good, and one of the problems you have when you wander into this new world is that they're not there anymore.

The principal one of these, of course, was how Django presents itself to a server. WSGI ("whiskey", as it's often pronounced) is a wonderful standard that has maybe been one of the prime reasons Python is so successful as a web language, because you can pick any framework and any server and they will talk to each other. I can take Gunicorn and run Flask in it, I can run Django in it, I can run anything else in it; I can run the awful little app I wrote in a weekend that is its own framework. We just didn't have this for asynchronous stuff. There were a few proposals from different servers, and tight server bindings, but I was around before WSGI was standardized, and I remember the days of "oh, this web framework has to use this web server; you have no other option". You took them as a bundle deal.

So now we have ASGI. I will not go into ASGI in full; that is another full 45 minutes of presentation that would probably bore you all to death. But it is WSGI-like: it is as close to WSGI as we could get it while still
being asynchronous. The key thing is you make an object, a callable, that you give a scope, which is sort of like the environment ("here's your remote client address, here are your headers", things like that), and you get a send and a receive callable. What that means is, unlike WSGI, where you just get given the input when you're called, you can sit there and receive packets, you can do processing between them, and you can send back multiple packets. It is full duplex, as we say in networking. This is really useful for things like WebSockets, but also for HTTP: modern features in HTTP/2 and HTTP/3 require more and more communication between the client and the server than just a simple request and response.

And that's not all; the language itself kind of works against us too. Some language features that Django relies upon heavily do not have an async equivalent. You may all be familiar with the fact that you can, for example, do dot-something on a related model and get the ID. For example, here we have instance: you can do .foreign_key and then .name, and Django will happily, in the background, pause, go to the database, find the instance, load it into memory, and then give you the name. Now, while we can do an await version of, say, objects.get or objects.filter, and that works fine, what we can't do is asynchronous attribute access. If you are in an async thread and you call something like .name and it starts running the ORM, you've just run synchronous code; you've just blocked the entire thread for half a second maybe, and you've broken the contract. That's really tricky, and this is one of the things that brought me into Django: I saw, genuinely, "oh, I can just traverse things easily via dots". We'll talk about how to solve this one later, but it's really one of those little tricky things.

And finally, threads do matter. While most code in Python is generally thread-safe, some things are not, and SQLite is one of those things. Django and
SQLite kind of come together: most test suites are run, at least at small scale, on SQLite. If you take SQLite and you make a connection in one thread and then try to use it from another one, it will just complain at you and explode. So even though we're trying to get rid of threads, we still have to consider them, because asynchronous code is itself running in a thread, and if you call synchronous code, it runs in a different thread.

On top of all of this, we have to be backwards compatible. Django has been, with very few minor exceptions, backwards compatible since 1.0. For every release you can open the release notes and go, "oh, OK, I see, just these few small changes, and they were declared in advance, and I can get there". That means we have to keep that. I can't just give you all a version of Django that's like "oh, this is totally different; you have to rewrite from scratch", because, quite rightly, you would all say, "no, Andrew, that's stupid; I'm not going to use this". So we have to make sure that all of these features are accounted for, and that when you take an existing Django application, move it onto this new version of Django, and add one async view, all of the rest of it still runs perfectly fine.

So we have this: async has to add to Django; we can't replace it wholesale. It has to play well: you get to add one or two async views to your current project, and the rest still runs fine. Things should look familiar; they should still feel Django-ish. You should still have things like .objects and the filters and the models, views should work the same way, and middleware should seem familiar. And they need to be safe. Django has always stopped you from essentially shooting yourself in the foot; we try to make sure it ships with things that are safe by default. DEBUG is the one exception to that, and we really try to make you turn it off in production. But in general, things like the
authentication framework have things like constant-time password comparison, which people don't really think about until they get attacked. This is all the problem of what it takes to take Django and really pull it apart to make it async.

So let's look at some of those actual things in detail. If you have never dived into the Django internals, I'll give you a brief overview of how Django is laid out. This is obviously a very simplified version, but essentially Django has a couple of different pieces. It's built around a request path, where you have a WSGI server that calls a thing called the handler, which translates WSGI into what Django thinks of as a request, the request object we all use. The handler then loads and runs the middleware, then talks to the URL router, and ends up with a request that's been through middleware, and a view function. It takes those and runs the view function with the request. The view function is of course supplied by you, the wonderful developer, and there you can call the ORM, you can call templates, you can call forms. Then, when you return a response, the handler takes the response object, encodes it back into the WSGI layer, and hands it back to the server.

This is useful, because we can apply one of the key things I've learned about doing big rewrites, which is a phased approach. In particular, we can go outside-in. This is what I alluded to, and what Amber was talking about regarding what's in Django 3.0 and 3.1. There are three phases, as I've broken it down for simplicity. The first one is having support for talking not just WSGI but a different backend protocol, ASGI. Phase two is making the core part of the request path, the handler, the middleware, and the views, all async-capable. And the third phase is taking the ORM, the thing I mentioned at the top of this talk, that's
probably the best use of async, where you get the most efficiency out of it, and making that async as well.

These three each have their own benefit. The first phase, ASGI support, is maybe the least useful to you, the end developer, but it's a really important foundation for us to have in place, so that not only do we have the ability to build phase two, but we also tell the ecosystem, "hey, we are going to support ASGI; when the next release comes around, you should probably make sure you're going to work with this". Some servers are already thinking about adding support, beyond just the ones we have now.

In terms of timing, this has shifted in the last couple of months. My initial goal was Django 3.0 for both of the first two phases, but it's been quite a few months, let's say. So Django 3.0 does have ASGI support; when that releases later this year, it will run against an ASGI server. Phase two, which is async views and middleware, did not make the cut, but that should almost certainly make it into Django 3.1. The ORM work is the largest, most difficult, and most unbounded part of this. My hope is it makes it into Django 3.2, but I'm not going to hold myself to that at this early stage.

The other key thing I've learned for big rewrites is you have to plan for failure. If you've seen some of my other talks about engineering, planning for failure is very important. Even if we cancel at any point in this project, we have concrete benefits. If we cancel after phase one, which we haven't done, we still have the support in place to have somebody else come in in the future and make part of Django async. If we cancel after phase two and just have async views, that in itself is a huge benefit: people can now go and use things like async requests libraries and go and talk to things themselves. They can't use the ORM the same way, but they can do a lot of API calls easily, and a lot of modern web development is API-driven. And of course, if we do, like, half the
ORM, that still does bring performance improvements.

So let's talk about each of those phases in a bit more depth, and what it means to be outside-in. First of all, ASGI, and a fun bit of history here. Django predates WSGI. When I first came to Django, in around the 0.96 era, WSGI was this new sort of thing: "oh, we could have a standard that wasn't just tied to one server". One of my favorite examples of the almost weird furore around this back then, when people kind of thought Django was a bit full of itself, is this quote from our very own James Bennett, in a 2006 blog post called "Django and NIH". As he says: "Just so you know, Django is a smug, arrogant framework that doesn't play nice with others. Or at least that's the impression you'd get from reading the rants." I want to bring this up because there was a time when WSGI was, in its own way, controversial. There's history here.

What is wonderful about the fact that Django predates WSGI is that we kept the indirection between WSGI and Django in there. I dusted it off after basically a decade and went, "oh, we can still fit a new protocol on here". We left this junction point where you just had WSGI hooked up to it; we removed the old one years ago, but it was still there. That's one of the wonderful history things that has come full circle and lets us make things more easily.

There are other things too. We have our own request and response objects; again, this is useful because we can adapt those to either protocol. We have our own handler classes, as I said the perfect place to put this new abstraction. And of course we have custom middleware. For those of you who weren't around in Python ten years ago, which is presumably most of you, there was this wonderful grand vision of WSGI middleware, where you wouldn't need middleware anywhere else: you'd just write it all as WSGI apps, and then Django would sit
underneath all of it. There are many reasons WSGI middleware is a bad idea, but it was a huge argument at the time, and the fact we have our own middleware means it is now easier to adapt.

So here is what this looks like at a zoomed-in level. As I said, you have the server, and the server calls the handler class. The handler class basically reads the environment: in WSGI you get a big dictionary called environ, which has all the headers and stuff in it. It takes that and maps it into a request object. There is a subclass of request, and most of the logic is in that subclass: you give WSGIRequest the environ, and it decodes it a little bit and does the headers itself. It also does things like, if there's an input stream from an uploaded file or POST body, it takes that and wraps it in a file object, and it does a few other bits and bobs to clean up the request and make it a single, nice request object.

Once it's done that and has a generic request, it passes control to its superclass, which is called BaseHandler. BaseHandler has the generic parts of Django, the bits that, back in the day and now, happen on both of the different protocols. If you have an exception that doesn't get caught, it has a last-chance exception handler. If you've set "I want to have transactions around my views", it's the thing that wraps transactions around all your views. It is also in charge of taking your middleware setting, loading the classes, and then running the request through them in order. So basically it takes a request from its subclass, runs it through the middleware, ends up with a nice request that has all the stuff on it, passes it to the URL resolver, gets a view, wraps it in a transaction if it has to, and then calls the view.

You can take this whole idea and add in the ASGI side, and you can see here that there is some duplicate code, of course, but we've reused most of
that right-hand side. All I had to add is a new request subclass, which is a couple of hundred lines, and a new handler, which parses from the scope rather than the environ. In general, those two bits do their protocol-specific work and both hand over control to the base handler. In Django 3.0, these are the async parts: the ASGI server, the handler, and the request run natively asynchronously, and then as soon as the ASGI handler passes control to the base handler, it switches and runs in a thread in synchronous mode. That's nice, because it means I didn't have to touch the rest of Django. We just added the bits on the bottom left here, in red, and then it just sort of worked. It wasn't quite that easy, but relatively, it was fine.

One of the nice things as well is that ASGI is deliberately mostly WSGI-compatible. We made sure when it was specified that there's a pretty direct mapping between the two. For example, things that you might have in the environ or in META have direct counterparts in the ASGI scope.

There are some trickier parts; uploaded files are particularly fun. This here is a shortened version of what it takes to upload and ingest a file. In WSGI you get given a literal file object. ASGI gives you events: it gives you chunks of the file as it streams in from the server, so if you want to, you can actually run before the file is fully uploaded. But what that means is in Django we have to take those chunks and write them to a SpooledTemporaryFile, which is a Python 3 thing: you can shove bytes into it and they all live in memory, and if it gets too big it pushes down to disk, but it gives you a file object. We rewind it to the beginning and hand it over. All of that lets us basically pass control through to the existing base handler and not have to touch it. What that meant was that the first patch, while it was big, was fully self-contained. Now, you
may remember I mentioned at the beginning of the presentation that async calling sync is dangerous, and it is. But thankfully it is a danger you can understand and contain. What we have is a package called asgiref, or ASGI ref if you want. It has two things in it: a callable called sync_to_async and another one called async_to_sync, and as the names suggest, they map one world to the other. For example, if you call sync_to_async on a synchronous function, like the base handler's get_response, which is the thing we're trying to call, it wraps it in a thread pool, it handles exceptions well, it makes sure things like SQLite are happy, and then it runs the code. There's a lot of stuff in there; things like thread locals work too, because while Django tries to avoid thread locals, I think about half of all the Django sites I have ever seen shove a request into a thread local, so we have to handle that too. I know it's convenient; it's fine, we're going to handle it. So it does that stuff for you. If you peer into the box, it is honestly slightly worrisome, but you can close the box and just not think about it, and think, "well, Andrew's done it, and there's other people, and they've got tests, so it's probably fine". And that's the goal, right? I want to make it so you don't have to think about this kind of stuff every day, so that you can trust that there's a safe way of going between the two worlds.

The result of all of this is that Django 3.0 can speak ASGI; when it releases later this year, it will do that. It unfortunately won't have async views, much to my sadness, but it does set that groundwork, and if you really want to, it will let you write your own handler and start doing async things natively. Some big companies do do that.

But the really, really good part is phase two, and I'm excited about this; maybe you can't tell. So I said we're doing a phased, sort of outside-in
approach, and what that looks like is we take phase one here and we sweep asynchrony through Django and push it further in. In particular, we rewrite BaseHandler so it is also asynchronous, and so it can look at a view and ask, "is this view asynchronous?", which in Python is iscoroutinefunction on the callable; you can tell before you call it. If it's synchronous, we use sync_to_async and put it in a thread; if it's asynchronous, we can call it natively and have it run in our event loop. What this means is you've now got a fully asynchronous path from the ASGI server all the way through to your view, which means you have all the benefits of being natively async: you can run very concurrently, and you can handle thousands and thousands of concurrent connections without running out of memory and without exhausting threads. With some caveats; we'll get to those.

The other fun thing is that while I've said there are only two things here, there's really a third: the test client. The thing you use to call Django in tests is the third entry point into the base handler. When you call something in a test, it doesn't do a full call through the Django stack; it fakes a request object and then quickly pops into the handler with that fake request. What that means is we will basically forever have synchronous code, in either the test client or the WSGI handler, calling into the newly asynchronous base handler. In fact, you may end up with a case where you have sync calling async calling sync. Django is not dropping support for WSGI, and we're not dropping support for synchronous views; those are staying around. So this is going to be a standard part of the way the code runs. And the naive ways of doing those two transitions (asyncio.run is the one Python recommends for going from sync to async, and thread pools the other way around) are very naive; they do the thing I
mentioned, where the code on the far right there does not run in the same thread as the one on the left. If you've got middleware that makes a database connection and then leaves it on the request for something to use later, it blows up in the most spectacular fashion, and the bugs in the tests, oh, it's awful. It's not just SQLite, but that's the one you find earliest, when you run the test suite. I'm not going to tell you how I do it, because it's very, very awful, but when you call async_to_sync and sync_to_async, there is a mode called thread-sensitive. If you set that mode, it does some, let's say, magic (it's not magic as such, but it's kind of magic) and it runs both those synchronous things in the same thread. Again, there's a whole talk in this, and you'd all hate me for it. If you want to go and be surprised, please read the code; it's tested really well, I promise.

So remember I said there's a caveat: middleware. Middleware is really annoying. The old-style Django middleware is great: you had a class that had process_view and process_response; you called the middleware, it did some stuff, it left; then you ran the view, it gave you a response, you called the middleware, and you were out of it. The new style is also lovely, but it has a problem in terms of asynchrony. With new-style middleware, you give your middleware a function that says, "hey, call this to get a response". What that means is the middleware lives on the stack, and all the middleware is suspended above the view while it runs. Then, if the view returns or raises an exception, it runs back up through the middleware, and if you can catch the exception, you're great. Now, what this means is, if you want asynchrony, you've got an asynchronous get_response in the base handler, you've got asynchronous views, you may even have asynchronous middleware, but if you've got just one synchronous middleware in that stack, and of course all the middleware
that exists is by definition synchronous right now, you have to use a thread. And we've got that one piece of synchrony, that one piece of blue in our lovely red end-to-end async stack. What this means is, as long as you have synchronous middleware, you don't get the benefit of having thousands and thousands of connections without using any threads. We're pretty sure the only way around this is going to be to rewrite all of Django's middleware to be natively async, because I think you can run async middleware on sync views, and that's perfectly fine. But that's a really tricky one; we may have to say, well, if you want to have this high concurrency, you have to limit your middleware, or add some warning flags or detection about what kind of middleware is running. Again, this is in progress; we're talking about it on the forum. If you have opinions on this, I'd love to hear them. But it's a really tricky one: we want to keep compatibility, with some restrictions, but make sure you get some of the benefits too. Of course, the main benefit of an async view is you can call stuff asynchronously, things like databases and stuff like that, and even with a thread being used up it's still faster, but it's not quite as good as it could be. And there are other problems too. Class-based views are a huge issue: the whole problem I mentioned, that you can't have a callable, a function, that's both synchronous and asynchronous. Think of that, but across the whole generic view stack; that's the problem there. We have some ideas; again, some of them are not so pretty, but we also might not do that in the first patch. Templates are fun too. Templates are synchronous in Django; we're not going to touch that yet. But you may think, where's the template rendering in your handler path? If you raise a 500 error, or even the worst errors, caught right at the top, there's a template handler that springs into action and renders that 500. And so a lot of the test failures earlier on was
me missing that a piece of synchronous code was being called sneakily on the side by an error handler. So we just go through and add sync_to_async everywhere we could to bring down the number of failing tests. And of course tracebacks get kind of worse: all those sync_to_async calls take two or three lines up in every traceback, so we're trying to look at a way to make that prettier, but it may just be a thing we live with. But the goal of all this is that we have async def views in Django 3.1. I have a branch where they totally work and all but two of the tests pass, which is why it's not there yet: there are two tests that fail. But by all accounts, you can take that branch, you can write an async def view and do async little things in it, and it works perfectly fine. You can take an existing Django project, just add something to it, and it works, and that's always been the goal. Right, and then we get to the hard part, which is the ORM. Now, this is much less defined. I have not spent too much time here, because it's kind of beyond the horizon of the second phase, but I want to reiterate that API design, and being Django-like here, is crucial. You've got to have familiar but safe APIs. And what's nice is things like iterating over querysets: Python did give us async for, so we can just make a queryset work in both sync and async modes if you're iterating over it. .get() won't work; you'd need an async .get(), but we can do things like this that make it much easier to deal with. We can probably never do this: we can probably never have asynchrony work with calling and traversing models, but you should probably be using select_related anyway. So if you want to use the ORM from an asynchronous function, you have to use select_related; that's going to be basically the conclusion there. And again, this is all optional; you don't have to do this. And the same kind of phased approach works with the ORM: we start with a fully synchronous one, we make queryset have an async facade where it
sort of looks async and you can do stuff, but it just runs the rest of the ORM in a thread pool behind it. And then eventually we'll try and make the whole thing asynchronous, and then we'll have a synchronous facade on top if you want to call it from, say, a sync view. In the meantime, before we get to all of that, we did add one thing in Django 3, which is that the Django ORM is now fully aware of async safety. If you try and call the ORM from an async context in Django 3, it will complain at you non-stop; it will be like, what are you doing, why are you doing this, please stop. And this is the important first part: we have to make sure it's at least somewhat safe. So that's the one thing we did get done in the ORM, but it does need a lot more research. This is one of the ones I'd love to have people come and help with. I have never worked on the query side of the ORM; it is slightly terrifying and scary in there, and I'd love some help diving in and fixing all that. And that's kind of what it takes to dive into Django and break it apart a little bit. I want to spend a little bit of time here at the end of the talk looking ahead at what this means for Django in the future, and what this kind of big rewrite really means. So first of all, I really want to stress this: we are not removing synchrony from Django. Some things just don't need to be async. This is true of people's code; I personally believe 80 to 90% of a site should be synchronous and only 10 to 20% should be asynchronous. But it's also true of Django: the URL router is CPU-bound, it does not need updating to be asynchronous; we just leave it as it is and don't touch it. Forms are perfectly fine, probably, unless you're trying to do validation, a small problem there. But in general there are parts where we can say, well, this is perfectly fine, we can just leave this as is and come back to it. And really, for me, async views are the big cornerstone. When we get there, when we have a release of Django where you can go async
def view, we have unlocked all of this. You go from having to understand weird inside-Django stuff to where you can do a weekend project where you make an asynchronous request library, or your own small asynchronous ORM, and use it with Django. We open up Django to the wider async ecosystem, and honestly, for me, this is the biggest part. The ORM would be great, but this really opens up the ability to use all those wonderful libraries, like Amber showed you in her presentation, for example. But we've got to be careful where performance is concerned. We do things in Django to really tune performance; if you've got a signal that has no listeners, it doesn't get called, for example; it's a special edge case that does make it faster. And this is going to cause some slowdown to normal synchronous Django. It is my personal view that if we cause too much slowdown, we do not do this, and we have to make sure that it is a careful balance. We're not going to ship a Django 3.1 that makes your site run half as fast. Like 5%, 10%, I might take; any more than that is really pushing it. So we've got to be really careful about that, and about what it costs us to do all those async switches. And it's not just technical; there's people. I have at this point now done one and a half big rewrites of Django. I have some experience with burnout, and it's not good; it's not in the interest of me. These are projects that are very detailed, they are very specific, and it's not just about those of us who have been in Django forever doing it and buckling down and getting stuff done. That's not sustainable. What I love about this project is it's a perfect example of where you can get involved with Django. As an example, we need to rewrite the middleware, and I can give you, hey, here is a middleware that needs to be made asynchronous. You have a precise spec of how it works, you have full documentation, you know what asynchrony means, and you have a test suite. That is a very, very achievable and approachable goal for someone to contribute to
Django for their first time. And so my hope here is that we can take this work and not just spread it around and make it more reliable and more sustainable, but also really bring on new contributors to help with this project, and maybe also have some of you learn the true horrors of async when you open the box. Of course, you also shouldn't not be paid for your time. Funding is a thing that I am [unclear] on, but we need funding, and more on this will come soon; I have plans that are forming on this particular front. But if we are going to do a big project, we need to fund it and make sure it's not just people who have free time who can work on it; that's not good for anyone. And so, really, there are async experts whose time we want to pay for, but even for people who are new to Django and contributing, I don't want you to take a financial hit to take time away from, like, freelance contract work to work on Django. I want that to be a thing where you can go, no, this makes sense to me; maybe it's a small pay cut, like, I do open source work, but it makes sense. And on top of all of this, of course, this is really big; this is maybe one of the largest changes in all of Django's history. I can think of a few things, while I've been around, that have been very large patches. The aptly named magic-removal was just when I arrived into Django, which, as its name implies, removed a lot of magic: things like, settings used to be automatically imported, if I remember right. I'm really stretching here, but things just got magically imported to certain places, and sys.modules got fiddled with, and it was unpleasant, to say the least. But this is still pretty big, and we haven't done one of these in a while. And so one of the things when I was thinking about async was, I need somewhere to talk to people. The Django Forum, which we're running now in a test phase, was partially launched because I think I need somewhere to have long
conversations with everyone about weird async stuff, but not clutter up and annoy Django developers with lots of weird async stuff that goes on forever and ever. And that's kind of a difficult part of, how do you as a modern open-source organization do this kind of stuff? But I think, finally, the thing that really gets to me is why. People often come to me and go, Andrew, surely async is just a flash in the pan, right? It's the hot new buzzword, all the kids are talking about it, but why does Django need it? And it's a good question. There's lots of things I think have been buzzwords or a flash in the pan or things that just aren't important. I think async is different. Obviously Amber's talk earlier showed a lot of those advantages, but we live in a world of applications where pretty much everything we do is I/O-bound. I can't think of more than one site I've worked on that was CPU-bound, that sat there and used a hundred percent of its server. If you log into pretty much any Django server, physical or virtual, onto the OS, and run top, it is not at a hundred percent CPU usage. It is full of memory. We are memory-bound, because all those threads and all the processes use memory; we are not CPU-bound. It is my personal belief that a well-written asynchronous Django app could get a 5 to 10x efficiency and performance improvement if it was heavily I/O-bound, based on some numbers and some tests I've run. Obviously not for every app; obviously different apps do different things for different people. But it's such a huge advantage, even putting aside things like the fact that we live in a world of APIs and microservices, right? How many big sites are not just Django anymore? You call, like, two different Amazon services, and maybe a Google Cloud service, and maybe over here something from Azure, and then there's an API server up here, and then there's three microservices. If you did all those in
parallel, you'd be a lot better off. Your users would have much lower page latency, and lower page latency is a better user experience. At the end of the day, asynchrony directly comes back to user experience: we want sites that are responsive and quick, that our users enjoy using. Django has always, in my mind, been about that, right? Django is there to give you the ability to write beautiful, amazing websites that people love and that do not put too much effort on you. And again, when we add async to Django, the goal is that you can understand as much or as little as you like, that you can use as much or as little as you like, to get those benefits. And that, for me, really is the pitch. If you're curious about more, there are some links here. I'll post the slides on Twitter just after this if you don't want to jot them down, or take a picture of them. There is a blog post I have that goes more into synchrony versus asynchrony and what it means to switch threads, and it has code samples and some of the nasty things I talked about with threads. There's DEP 9, which Amber mentioned, which is the novella-length, I would say, proposal to get async into Django. And then we have a page on the wiki which has links to the forum, and where to go and help, and ideas for projects you could help with. And if you do want to help, please come talk to me here, come to the sprints, or even just come to the forum and chat; we'd love to hear from you. Thank you very much. [Applause] [Music]
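Two ideas from the talk, an async facade that runs blocking code in a thread pool and the latency win from calling several services in parallel, can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not Django's or asgiref's implementation: `call_service`, `call_service_async`, `load_page`, the service names and the 0.2 second delays are all hypothetical stand-ins for real blocking calls such as HTTP requests or ORM queries.

```python
import asyncio
import time


def call_service(name, delay):
    """Hypothetical blocking call (stand-in for an HTTP request or ORM query)."""
    time.sleep(delay)
    return f"{name}: ok"


async def call_service_async(name, delay):
    """Async facade: hand the blocking call to the default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, call_service, name, delay)


async def load_page():
    """Await three slow calls concurrently instead of one after another."""
    return await asyncio.gather(
        call_service_async("storage", 0.2),
        call_service_async("search", 0.2),
        call_service_async("billing", 0.2),
    )


def timed_load_page():
    """Run the page load and report how long it took in seconds."""
    start = time.perf_counter()
    results = asyncio.run(load_page())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_load_page()
    print(results, f"{elapsed:.2f}s")
```

Run one after another, the three calls would take roughly the sum of their delays; awaited together, the total is close to the slowest single call. Each call still occupies a thread here, which is the cost the talk describes for synchronous code living inside an async stack.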
Info
Channel: DjangoCon US
Views: 4,663
Keywords: django, djangocon us, djangocon, python, Async, backwards compatibility, Retrofitting, sync, Andrew Godwin, 2019
Id: d9BAUBEyFgM
Length: 44min 44sec (2684 seconds)
Published: Fri Oct 18 2019