How my backend leaked Postgres database connections and what I did to fix it

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
so yesterday i posted this comment on my community post and it got some some traction people seems to be interested in this problem that i ran into my nine-to-five job one one problem of my back end and people seem to want to know more about this and how i debugged and fixed it so how about we read this comment try to understand what happened and then show you the process of how i fixed it and show you some example code that i try to reproduce on my personal laptop here so i wrote this new back-end i built hangs after few requests turns out i was leaking database connections from the pool so new request we're just stuck in the database pool queue waiting which is which is a very a very simple fix for a very frustrating you know thing that happens right so a lot of people a lot of you guys saw okay how did you fix it did you use you're not stat or server logs all that stuff uh nothing that fancy to be honest right it really all depends on how how well you understand your code so you can find problems faster but how about we actually show you what the problem was so here is a simulation i'm not going to show you exactly what was the backing because it's from my work but i built something very similar with a to-do app right so here's the back-end that i'm reading the list of to-do's and i'm displaying on an html file here html page so we refresh the page you get the to-do's you refresh it again you get the to lose you refresh it third time you're good to lose fourth time get stuck more time get stuck there's no results so what is going on exactly so you keep refreshing and you don't get any any results so the first thing i did was okay what is going on so i went to tools your friend in this treasurer's time so when i refreshed i clearly saw that the request is being is just pending on the back end so what happened in this case it was pending in my actual original repro case i was behind a reverse proxy so the reverse proxy which was nginx timed out after i believe 60 seconds which is the time what i configured for the back-end if the back-end didn't respond to nginx for x amount of second please time-out so i actually quickly understood that it was a timeout not that the back end wasn't available the backend was there it's just it wasn't responding so the requests are received by the backend but the backend is getting stuck processing it doing something i don't know so uh the the next thing i did was just let's go ahead and restart the back end here's the back end let's go ahead and restart and behold i'm getting my responses back again one two three and four and we're stuck again it's always consistent you want things to be consistent once you have consistent stuff reproducible stuff you can easily debug what is going on the backend is doing something fancy so here's here's the simple backend that i try to reproduce here all right i have postgres on the backend as a database i have express as the front and http server and uh we have just one endpoint root that says okay to do's query the database and give me all the to-do's very bad query by the way right because there's there's no bounding it's if you have a million to do is you can return all of them bad idea okay this is just an example showing you stuff so here's what i'm doing here in my particular use case i want i wanted to do a transaction so i need a connection of the database and i need to begin this transaction i need to execute multiple queries and this particular one that is one but you get my point and then i commit and then here's the thing this was the problem this didn't exist before i was just executing the query and then immediately returning the client that i reserved from the connection pool was never returned back to the pool and what is this pool if you if you if you ask i made a video about connection polling check it out here if you want to learn more about it but the idea of a connection pool is is very useful if you want uh to to scale effectively right so you would you would create this idea of a pool and this is nothing but a group of connections of database connections if you don't use this then the alternative is to create a new connection every time a request comes right hey every time you come let's create a new connection to postgres execute a query and then destroy it bad idea because you have you have now the overhead of creating the tcp connection the overhead of establishing a database connection right that's an overhead on top of that and if your security if you have security on top of your database that's another tls handshake that idea you don't you never create connections in the request itself so we preheat them instead right and the the pg library in pulse and node allows it allows you to do that you create a pool and you say hey this is your server this is my password all that stuff the port and i want four connection maximum and this was exactly my configuration i had four connections so as a result you cannot exceed four connections in parallel that are all in use in the pool so what you do is you reserve one connection from the pool execute a query and here's what i did i never released that connection back to the pool so the pool thinks that i my connection is being in use as effectively i leaked the connection so if we debugged now and let's restart the back end here and then go back you refresh you get this debug request right hey give me a connection from the pool got it and then execute it all right let's do it again and then do another request give me a connection from my pool right got it executed notice that we have now two requests in flight that has been fulfilled but the database connections that we reserved we never actually returned back to the pool we never released them so what happened here is you have now two only two connections left from the four because you you leaks them effectively so the third one now says okay give me another connection right you can get it because we have four and then let's do one more that's the final one get it and then execute it we get and that's it probably we executed five now from the debug session but now any request now look at what it is it will get stuck right here it will never come to this next point we are now waiting for the pool to give us a connection and the pool says shut up wait all my connections are being used by other clients which is not true because those clients are done right but effectively because node.js is a single process single thread you didn't it's never terminated so the pull state is out of sync so the problem was we never actually released those clients back so the solution is so simple it's just you have to release those connection back to the pool whenever you're done with them and obviously i this line didn't exist in my code and then immediately when i saw the client connect up i saw that okay that's the problem immediately i know i know that was the problem so it didn't take me much more to to find out the problem but uh i'm not and i'm not saying this because i'm a great engineer or anything like that no because i happen to have a maximum connection if this was not set right if this i didn't if i didn't set this then the maximum will be i believe 20 by default then this would have existed on my back and i wouldn't have never known another bad thing that i didn't do is i have not set a timeout uh for waiting on a connection from the pool there should be another timer that i said i forgot what it's called but it's one of these properties does it does does this tell you there you go i love not just look at this connection strength database idle timeout idle is very interesting idols how long should the connection stay there idle before the pool kills it there must be a wait there's a query timeout statement timeout there's all bunch of timeouts here it's probably connection timeout right so there's so much timeouts i should have set these timeouts correctly so that when when a leak like this happens in the future i will never get into the situation again uh ideally when you have single statement like this that you don't need atomic transactions for then you can alternatively use pool dot query instead and then immediately do your select what this does is it immediately uses the pool to pick a connection randomly and then execute the query and then return back the query immediately bring back the connection back to the poll immediately so that's that's one of the best way when you have like a single query like this right but if you want to do transactions this won't work it's actually bad if you try to do transactions like this so let's say you try to do you try to do insert and and then update and another insert right if you start to do this using pull.query bad idea because this will use a connection to pick one connection to execute that query and then immediately commits this will use a transaction immediately commits you cannot roll back this will not be an atomic transaction if you do if you do this obviously so yeah guys this this was the problem effectively this was a very silly bug from my side but it could be devastating if gone undetected and i only discovered it because i had a maximum connections on my back-end and if you want to learn more i have a fullback full video i have a full video about database connection i'm pulling check it out i'm going to see on the next one you guys stay awesome goodbye and guys if you enjoy this kind of content make sure to go to database hosting also dot com and go get a discount for my introduction to database engineering udemy course i discuss a lot of cool stuff here a lot of details i have over 100 lectures doing into the deep of database engineering cool tracks best practices all sorts of stuff and then you guys we have a great community there where we have a q and a's make sure to grab that coupon okay just head to her database dot hussein also dot com thank you so much
Info
Channel: Hussein Nasser
Views: 8,264
Rating: 4.9847035 out of 5
Keywords: hussein nasser, backend engineering, postgres, postgres pooling, database connection, nodejs, node express, nodejs javascript, nodejs postgres, database connection leak
Id: KGbwkbaCwss
Channel Id: undefined
Length: 11min 50sec (710 seconds)
Published: Sun Oct 03 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.