Django Architecture - Connection Management

Captions
Django is a very popular Python-based web framework. It's used by a lot of companies, and a lot of developers love it, especially because you can get up and running really quickly. Django also ships with an ORM, which means it connects to the backend database for you and simplifies how a developer talks to it: instead of writing their own SQL, they can just work with objects, which is very familiar. In this video I want to explain that backend portion of Django, because this is the backend channel at the end of the day. I'll talk about Django's architecture when it comes to connection management to the database, and at the end I'll cover the advantages and also the disadvantages of this architecture. Let's jump into it.

Take this example: we have a backend database, we have our Django framework spun up with three worker threads available (and you can control that number, of course), and we have many clients that connect to the web framework. They send HTTP requests, GETs, POSTs, things like that, and those requests turn around and connect to the database to execute certain queries, get the results, and then build the response for the client. Classic three-tier architecture.

Now we get into the nitty-gritty details. First of all, the client needs to establish a TCP connection to Django. I'm not going into the front end in detail here, I want to leave that to another video, but there must be a listener that listens for connections. Whether that listener is single-threaded, or multiple threads listening on the same port using the SO_REUSEPORT socket option with the kernel load-balancing incoming connections across them, is outside the scope of this discussion. Let's just say we have one connection owned by one thread, and the client sends an HTTP request, say a GET, over it. The thread picks the request up, parses it, understands what is being asked for, and realizes: we're trying to serve a certain API, and I need to make a database query.

Here's what Django does by default: only upon receiving the request, and only when it actually needs to execute a query, does it establish a connection to the database. It does not establish it at startup. So in this case the TCP connection to the database is created right there, a three-way handshake, and if TLS is configured, a TLS handshake as well. The cost of connection establishment is incurred on the request itself, by default.

After that, Django sends the query, the database processes it, whatever the SQL is, and that takes some amount of time. Meanwhile the client is waiting. And let's be very careful here: not only is the client waiting, the thread is also waiting. The thread is doing nothing; it issued an I/O call, it isn't doing computational work, so in principle it could be serving other things while it waits.
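For context, a minimal view under that default behaviour might look like the sketch below; the `Order` model and the `myapp` package are hypothetical placeholders, not something from the video. The point is that the database connection is opened inside the request, the first time the ORM needs it, not when the worker starts.

```python
# views.py, a sketch of Django's default, non-persistent behaviour.
# "myapp" and "Order" are hypothetical placeholders.
from django.http import JsonResponse
from myapp.models import Order


def order_count(request):
    # The TCP handshake to the database (plus TLS, if configured) happens
    # right here, when the ORM runs its first query for this request,
    # not at worker startup.
    count = Order.objects.count()
    # With the default CONN_MAX_AGE = 0, Django closes that connection
    # again as soon as the request finishes.
    return JsonResponse({"orders": count})
```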
When the database responds, Django gets the response and writes it to the client socket: effectively it takes whatever your view code produced and writes it out as an HTTP response to the client. Very simple stuff. And once that response is written, Django closes the backend connection. It was designed to do this, and at first glance it looks very good, because the goal was to minimize the number of connections to the database; a lot of open connections increase the load on the database. That may have been true ten years ago, but I don't think it's as effective today, because people have moved to something called persistent connections.

Think about the HTTP model. HTTP/1.0 worked exactly like what I just described for Django: for every request you send, you establish a TCP connection, send the request, get the response, and then you must close the connection. That's how HTTP/1.0 was designed. But we quickly threw that away because it was so expensive: every time we sent a request we were paying for a TCP handshake followed by TLS. So we opted for keep-alive in HTTP/1.1, and then in HTTP/2 the connection stays alive and we send tons of multiplexed, streamed requests over the same connection. Persistent connections are the model today. Again, it depends on what you're trying to do: if you're chatty with the database, persistent connections are the way to go; if you know the number of connections to the database will stay very, very low, you can opt for the close-per-request approach. Sometimes that's not the case, which is why you really need to study your use case and your requirements and think about your architecture when you configure these things.

Django supports persistent connections today. You configure a setting called CONN_MAX_AGE, and when you set it to None the connection becomes persistent: the first request that needs the database establishes the connection, and it remains alive effectively forever, until it fails. If the connection fails, say something happened to the network and you lost the socket, Django closes it and the next request reopens it. So you can send many requests, the threads will keep picking them up and serving them over the same connections, and that's pretty good; persistent connections, we use them all the time.
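In the settings file that looks roughly like the sketch below; the engine, credentials, and host values are placeholders, and the only knob relevant to this discussion is CONN_MAX_AGE.

```python
# settings.py (excerpt), a minimal sketch assuming a PostgreSQL backend;
# the name, user, password, and host values are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "appdb",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "db.internal",
        "PORT": "5432",
        # 0 (the default) closes the connection at the end of every request.
        # None keeps it open indefinitely; Django only drops it when the
        # connection errors out, and the next request reopens it.
        "CONN_MAX_AGE": None,
    }
}
```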
But here is one limitation when it comes to Django. According to the documentation, and I didn't make this up, I'll reference it below in the comments, Django uses a one-connection-per-thread model: if you have one thread, you get one connection, and that's it. I didn't read anything else in the documentation, so that must be how it works. What does that mean? Say you have three clients, so three TCP connections spread across the three threads. Again, this isn't necessarily exactly how it plays out, it depends on your front-end listening model, and that's a video by itself, but assume each thread is accepting the connection for one client.

Each client sends a request, the thread takes it, opens its backend TCP connection to the database, and sends the query, and the query just spins on the database, because queries are not cheap; sometimes a query is expensive and takes one, two, three seconds. Another client does the same thing at the same time, concurrently, and then another one. So now the three threads have sent three queries to the database, all of those queries are being executed, the clients are waiting for responses, and the threads, as we explained earlier, are also waiting, which is not good. You never want your threads to be idle.

To be fair, a thread can still do things in this state, because the database call is asynchronous I/O from the thread's point of view; the thread is free to do other work. But what exactly? Say a fourth client connects; the thread accepts it, because hey, it's not doing anything. But if that request, say the pink one in the diagram, also wants to talk to the database, the thread has to say: wait a minute, I'm sorry, but the one connection I have is already busy; I cannot send multiple requests down the same connection.

That's a discussion by itself. You might ask, why not? Well, not all databases support pipelining, the concept where you pump multiple SQL statements down the same connection, one after the other, without waiting for a response. It's very dangerous to do that: if you send a SQL statement, the database starts processing it, and on the same connection you send another SQL statement, and the second query happens to be faster, the database responds with the second result first. How does the thread know which query that result belongs to, the first or the second? It doesn't; there is no identifier on SQL results, it all depends on the database's wire protocol, and most databases don't have this concept. Django would have to build something like that and the database would have to support it. Postgres does support pipelining, I think since version 13, where you can send multiple queries on the same connection, but most of the time you can't do this, so the safe thing is to just establish another connection. And that's the problem with Django's one-connection-per-thread model: that connection is now busy, you cannot use it for something else, so this client is just waiting.

Now let's say another client sends a request that has nothing to do with the database: it's just reading a file from the cache and responding with an HTML page, for example. The thread will happily process that. Why? Because the thread is technically not busy, it's just waiting on I/O. As long as the communication is asynchronous, and I wrote a Medium article about this, the thread isn't doing anything while it waits, so it can even do CPU-intensive work; you can ask it to compute a hash and that's fine. But if you are I/O-bound and those requests actually want to talk to the database, you're stuck; those clients will be blocked.
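To make the one-connection-per-thread point concrete, here is a small sketch you could run inside a configured project (for example from `python manage.py shell`); it only shows that each thread touching the ORM ends up with its own underlying connection object, which is why a connection that is busy in one thread cannot be borrowed by another.

```python
# A sketch illustrating the one-connection-per-thread model.
# Run inside a configured Django project, e.g. via `python manage.py shell`.
import threading

from django.db import connection


def touch_db(label):
    # Force this thread to open (or reuse) its own database connection.
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
    # connection.connection is the underlying DB-API connection object;
    # its id() differs from thread to thread.
    print(label, id(connection.connection))
    # Connections opened in extra threads are not cleaned up automatically.
    connection.close()


threads = [threading.Thread(target=touch_db, args=(f"thread-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```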
So what can you do? Well, you can simply spin up multiple threads for Django, and each thread will get its own connection, so you have more threads to serve clients. The problem with this is that you can only go up to a certain number of threads; after that you will see severe degradation in performance. Let me explain. The usual guidance, the same thing nginx, HAProxy, and Envoy recommend (and Django here is effectively acting like a proxy in front of the database), is one thread per core, per hardware thread to be specific. So if you spin up four threads, you want four cores, or, if hyper-threading is enabled on your CPU, eight hardware threads, so that every thread can sit on its own core.

Why do you want every thread pinned to a core? Context switching. If your thread lives on that CPU core and all its processing happens there, the scheduler won't keep moving it off the core to put other work on it. If you have a hundred threads on four cores, the CPU can only run a few of them simultaneously, so it keeps swapping: one thread gets put on a core, another needs to run, the first gets moved off, context switch after context switch, and the cost of all that switching will eventually kill your performance. That's why they say don't spin up 50 or 100 threads; it's not going to get faster. You're bounded by the number of CPU cores at the end of the day. If you have 48 cores, go nuts; but if you have 16 or 8 or 4, that's your ceiling, and that's assuming Django is the only thing running on that server. That's why I absolutely love system architecture and backend architecture.

All right guys, that's it for me today. If you enjoyed this content, hit that like button, and check out my Udemy course, Fundamentals of Database Engineering, where I talk about this stuff, I absolutely love this stuff, and check out my networking course if you're interested in networking. Thank you so much, I'm going to see you in the next one. Stay awesome, goodbye.
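As a rough translation of that sizing advice into deployment configuration, a gunicorn setup for a Django project might look like the sketch below; gunicorn itself and the project name are assumptions here, and the exact numbers always depend on your workload and on what else shares the machine.

```python
# gunicorn.conf.py, a sketch of the one-worker-per-core sizing discussed
# above; "myproject" is a placeholder for the actual project name.
import multiprocessing

# Roughly one worker process per CPU core keeps each core busy with one
# unit of work and avoids the context-switch churn that comes from
# oversubscribing the CPU with 50 or 100 workers or threads.
workers = multiprocessing.cpu_count()

wsgi_app = "myproject.wsgi:application"
```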
Info
Channel: Hussein Nasser
Views: 43,716
Keywords: hussein nasser, backend engineering, django python, django architecture, django web framework explained
Id: D-3WMlcv2i4
Length: 15min 9sec (909 seconds)
Published: Thu Oct 13 2022