Is there a Limit to Number of Connections a Backend can handle?

Video Statistics and Information

Captions
What is going on, guys? My name is Hussein, and this is a very interesting question that I thought I had to make a video about. I already answered it in the comment section of another video, I think the one on DoorDash moving from RabbitMQ to Kafka. The question was: what is the maximum number of TCP connections that a client can open or a server can receive, or something like that. I don't remember the exact question.

There are many parts to this question. From the server's perspective and from the client's perspective the answer is different, so we need to clarify some points. First of all, for a given server there is no theoretical maximum number of connections; it depends on the memory and CPU available to actually process the TCP connections. However, before you guys yell at me: for each client connecting to that particular server there is a limit, a theoretical limit that you physically cannot exceed. Why? Because of the TCP header. The layer 4 header has a port field, and that goes for UDP as well; it doesn't have to be TCP, it's just layer 4 in general. QUIC, I don't know if they fixed that. I hope they did, but I don't think so, because they're still using UDP. The port field is 16 bits, and that gives a maximum of about 65,000 connections, 64-ish thousand because some ports are reserved. That applies to one client: from a given IP address to the same server IP address and port, you can establish about 64,000 TCP connections, and that's it. You cannot exceed that, because where would you put the 64,001st? There's no room; the port field in the header is only 16 bits.

However, if you're a server, take WhatsApp for example (check out the video right here): they managed to do three million TCP connections per server. Why? Because usually a
client will not open 64,000 connections to the same server. It's just ridiculous; nobody does that. I did talk about the browser, in HTTP/1.1 to be specific, opening six to ten connections. That's just a hack, guys. Because of a limitation in HTTP/1.1, they could not send multiple requests concurrently on the same TCP connection. First they tried pipelining, and that failed because of head-of-line blocking, and then they resorted to opening multiple TCP connections to the same host. But even that will not exceed the maximum. It will put a little bit of load on your server, but it's not really a big deal. That's why you get all these connections: so you can parallelize requests.

So there's no physical limit on the server; only between one client and one server is there this limit of 64,000 connections. A server can handle far more. Here's the thing, let's think about this really clearly: a server is only listening on one port. Why do the server and a client have a maximum of 64,000 connections? The reason is that the server has a fixed IP address and a fixed port; remember, it's listening only on port 80, or 443 if it's HTTPS, or any other port. So as a client, from a single IP address, you can run out of combinations. That's the 64,000 connections from the client side. You can only open 64,000 because every time you connect you use a source port. There's always a source port, which the server uses to send information back to the client. If the source port doesn't exist, we don't know where to send the information, and that's what
labels the connection. That source port field gives 2 to the power 16 values, which is about 64,000 connections. So that's the limit here. But it's per client, per IP address. If there's a different IP address, here's another 64,000 connections; another IP address, another 64,000 connections. So, per client. But nobody's going to open 64,000 connections from the same IP address. Why would you do that? It doesn't make any sense, because a single TCP connection can handle so much. We fixed the pipelining and head-of-line blocking problem with HTTP/2. We still have head-of-line blocking with HTTP/2 at the TCP level, which we fixed with HTTP/3, but one TCP connection is enough now.

So how much can the server handle? As much as you want, as long as no single client opens a huge number of connections. Technically there's no limit: as long as connections come from different IP addresses, the sky is the limit. You have more memory? Be my guest. You have more CPU to handle these connections? Be my guest. I mean, WhatsApp did three million, and I'm not aware of any company that exceeded that per server. Again, I'm talking about one stinking server.

So let's talk about proxies, because proxies have a problem with this, guys. What is a proxy? What is a reverse proxy? We talked about that; check it out here. A reverse proxy is actually a client when it comes to the back end. This client connects to the back end in a pooled manner and opens a lot of connections to it, so there is a fear of actually exceeding the limit per server. Let's say you have two back-end servers and one reverse proxy, and there's a client that makes a request. At the client-facing side we don't worry much
because there are many clients from different countries, from different IP addresses. But at this side, the back end of the reverse proxy, which is also the front end of the back end, the reverse proxy makes a request to the back end. Let's say it picks a server: it opens one TCP connection, and once it has opened that connection, that connection is reserved. I'm talking about layer 4 proxying right now. Layer 4 proxying is a problem here, because when a client connects, the proxy terminates that connection and then creates another connection on the client's behalf. The reverse proxy is the source IP, and it establishes a connection to this back end. Then another client comes in with another IP address, and the reverse proxy connects as itself to the same target. So we just used the same source IP address, the reverse proxy's, with a different source port. To a given back end, then, the reverse proxy can only open 64,000 connections. That's the limit, and we will hit it, especially with a layer 4 proxy. It's not scalable. You will easily hit that limit with WebSockets. If you do WebSocket proxying in HAProxy, I believe it just downgrades from layer 7 to layer 4 proxying and streams the same connection. As a result, that TCP connection cannot be used for anything else, unfortunately. If you're doing a stateful layer 4 connection, like WebSocket or even just database traffic, you can't use the same pipe to send other stuff like HTTP requests.

That's why we have connection pooling, and hopefully QUIC. Envoy is doing a good job with this, and I really like what they are doing. Envoy is using HTTP/2 at the back end, and when you use HTTP/2 you can use one TCP connection and send as many requests as you want, given that you are a layer 7 proxy, because
if you're a layer 7 proxy, you don't have this problem. You can technically reuse the same TCP connection if you know what you're doing, because the request becomes almost stateless. If I have a request that comes in at layer 7, like HTTP, I'm going to terminate it, and then the request can go to this server or that server; it doesn't matter. You can send that request over any connection to any back end. That is absolutely powerful. And if you use HTTP/2, you can multiplex these requests onto the same TCP connection, and that's awesome, because now you don't have a limit anymore. You're going to open one, two, three connections, let's say, and that's it; you're not going to exceed that. Then you can serve hundreds of clients at the back end over these three TCP connections. This is called connection pooling, or more precisely this is multiplexing: HTTP multiplexing. QUIC will do the same job, and do it even better.

At the back end there is a cost, obviously. HTTP/2 has a cost and QUIC has a cost, because now we're not just using a beautiful TCP connection where the operating system is doing the job for us. It's us, at the application level, assembling these streams, figuring out which packet belongs to which stream, and reassembling them so the application can actually understand the data. This is a CPU cost, and Google and team are working on lowering it.

So yes, maximum connections exist; it depends on what you are doing with proxying. If you're doing layer 4 proxying you'll hit the limit very quickly, I believe. It really depends on how many back ends you have; that's why you load balance across back ends. Let's take a layer 4 example again. Let's say we have 120,000 clients and one back end, and you want 120,000 WebSocket connections, which
will turn into lower-level layer 4 proxying, where each client connection is terminated and I, as the reverse proxy, connect to the back end on behalf of the client. That's what a reverse proxy does. Now the back end knows me by my IP address, so I only have so many source ports to work with. You'll have this limit, even less than 64,000, because some of the ports are reserved. So that's a problem, and you're going to hit it quickly. That's why you add another back end. Now you have double: 64,000 times two, roughly 128,000. And then you will hit that level as well, so the solution is to add more back ends.

Obviously, now the reverse proxy is the bottleneck. What do we do? We know how to scale reverse proxies. I talked about that: active-active, active-passive, keepalived, all that stuff. You can basically do DNS balancing where you have multiple reverse proxies, which are technically stateless, so it doesn't matter which reverse proxy you hit; the load gets balanced across these back ends. You can do many tricks to balance the load as much as possible. We talked about all that: failover, and active-active with DNS.

There is another trick that HAProxy did at one point to avoid that maximum number of connections, and it is basically what a router does. If you have a lot of devices, all of them go through the same router, and that router technically makes the request on your behalf. But what the router actually does is keep a table, a NAT table. So technically it is just one TCP connection all the way to the destination, but the router keeps track of which client connected to which server and matches things up. It says, "Okay, you went to Google, so this is your source IP, your source client; let me forward it back to you." And it can
only do that if it is the gateway. If you're watching this on your phone or your computer, go to the Wi-Fi or LAN settings, click on details, and you're going to see something called the gateway IP address. The gateway IP address is the trick here. If your machine, your laptop, or your phone doesn't know where to forward a packet, it forwards it to the gateway, and that comes for free. So the destination IP is not the router, the way it would be with a reverse proxy; it is actually Google or whatever the destination website is. But the MAC address in the layer 2 frame is destined to the router, and the router knows what to do with it.

If you can build a configuration that acts like a router, you can just eliminate that limit. You don't have this limitation anymore, because you have a table, and you can put as many entries in it as you want. It's very difficult to achieve, though, because you have to put your clients in the same network and set their gateway IP address to the reverse proxy, acting as the router. Once you do that, it's as if the client is talking to the back end directly. The reverse proxy is just acting like a router. There isn't one TCP connection between the client and the reverse proxy and another one between the proxy and the back end; no, it's just one. It's just like when you go to google.com on your phone while you're on the Wi-Fi network at Starbucks or anywhere else: you don't have a connection between you and the Starbucks Wi-Fi, with the Starbucks Wi-Fi creating another TCP connection. No, it just swizzles the packets. That's how it works.

All right, guys. This video was supposed to be four minutes, but I ended up talking about proxies and the maximum number of connections, so it was a very good
discussion. I want to make videos like this almost on a daily basis: go through and answer a question a day. I like talking to the camera like this. Hopefully I'll get a decent camera, because I cannot use my phone when I do this, and I cannot do searches because I'm using my phone to record. So guys, I love you so much. If you enjoyed this content, subscribe to this channel. I talk about the back end, mostly networking and security. I specialize in back-end technology. I love, love, love the back end, so most of the content on this channel is back-end technology. Love you so much, gonna see you in the next one. You guys stay awesome. Goodbye.
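
The port arithmetic from the captions can be sketched in a few lines of Python. This is only an illustration and is not shown in the video: the helper names, the IP addresses, and the narrower 32768 to 60999 port range are all assumptions chosen for the example. It shows that with the source IP, destination IP, and destination port fixed, only the 16-bit source port can vary, and it estimates how many back ends a layer 4 proxy with a single source IP would need for 120,000 connections.

```python
import math

PORT_FIELD_BITS = 16  # TCP and UDP headers both carry 16-bit port fields


def usable_source_ports(low: int = 1, high: int = 2**PORT_FIELD_BITS - 1) -> int:
    """Count source ports in the inclusive range [low, high] (port 0 is not usable)."""
    return high - low + 1


def backends_needed(connections: int, ports_per_backend: int) -> int:
    """Back ends required when one proxy IP must hold `connections` layer 4 connections."""
    return math.ceil(connections / ports_per_backend)


# A connection is identified by (source IP, source port, destination IP, destination port).
# For one proxy IP talking to one back-end IP and port, only the source port can vary.
proxy_ip = "10.0.0.5"        # hypothetical proxy address
backend = ("10.0.0.9", 443)  # hypothetical back-end address and port
distinct = {(proxy_ip, port, *backend) for port in range(1, 2**PORT_FIELD_BITS)}

full_range = usable_source_ports()                # every port except 0
narrow_range = usable_source_ports(32768, 60999)  # assumed OS-imposed ephemeral range

print(len(distinct), full_range, narrow_range)
print(backends_needed(120_000, full_range))
print(backends_needed(120_000, narrow_range))
```

With the full port range, two back ends are enough for 120,000 connections; with the assumed narrower range, the same load needs five, which is why the usable limit in practice can be well below 64,000 per back end.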
Info
Channel: Hussein Nasser
Views: 10,568
Rating: 4.9359999 out of 5
Keywords: hussein nasser, backend engineering, max tcp connections, backend connection limits, tcp max
Id: o-EkdZW4zbA
Length: 18min 42sec (1122 seconds)
Published: Tue Sep 08 2020