Load Balancing with NGINX

Captions
G'day, and welcome to this video. My name is Jay Desai, and I'm a Solutions Engineer / Solutions Architect for NGINX, based in Melbourne, Australia. In this video, as the title suggests, I'll talk about NGINX with a focus on load balancing. What I intend to cover in this short video is, of course, load balancing: the methods available with NGINX, and how you can configure session persistence with NGINX Plus. I'll also look at all these ways of load balancing from a technical point of view, so I'll jump into a VM and show you how the functionality actually works. Let's get on with it.

Before I jump into load balancing, I'd just like to highlight that NGINX is an extremely lightweight, performant, resilient and easy-to-use all-in-one tool, which you can use as a reverse proxy, load balancer, web server, API gateway, or Kubernetes Ingress Controller (KIC). You can run NGINX pretty much wherever you want. I've done a few videos covering NGINX as a reverse proxy, one on the API gateway, and one on KIC. In today's video we're going to focus on load balancing.

So with that, let's jump directly in, and the first question we get is: what exactly is load balancing? If you go out and Google the definition, you'll come back with these specific words: load balancing is defined as the methodical and efficient distribution of network and application traffic across multiple servers in a server farm. A lot to digest. In simple words, a load balancer sits between the client and your backend service; it receives requests from the client and, based on the incoming request, passes it on to upstream servers or instances that are capable of fulfilling that specific request. So that's the easy definition.

To set up load balancing with NGINX, you define a group or pool of servers that will handle these incoming client requests. This is managed by the proxy_pass directive, which passes the requests on to the pool. Servers within the pool can be defined by a combination of IP address and port number, by a domain name, or even by a Unix socket. For HTTP load balancing in NGINX, what we use is known as the upstream directive, and the upstream directive sits in the http context. If you're not familiar with the contexts within the NGINX configuration file, I've done another video on that which you can view; I'll post the link in the comments below. What you need to remember for this section is that the upstream directive sits within the http context.

In this example, our upstream directive defines a server pool (upstream) named backend_service, and within it two specific servers or backend applications are listed. The crucial part of the load balancing configuration is the proxy_pass directive: requests received by this server are forwarded via proxy_pass to the backend_service upstream. NGINX selects the appropriate server based on the load balancing method you have defined. In this example no method has been defined, so NGINX uses the weighted round robin algorithm as the default.
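As a rough sketch of that slide (the upstream name backend_service comes from the talk; the two server addresses below are placeholders I've made up), the configuration shape is:

```nginx
http {
    # Pool of backend servers; no method is specified,
    # so NGINX defaults to weighted round robin.
    upstream backend_service {
        server 10.0.0.1:9001;
        server 10.0.0.2:9002;
    }

    server {
        listen 80;

        location / {
            # Forward incoming client requests to the pool.
            proxy_pass http://backend_service;
        }
    }
}
```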
Talking about the default method of weighted round robin: in this method, each request is distributed evenly across the pool of backend servers. NGINX uses a mechanism of assigning a weight to each upstream or backend server. In the first scenario (the one in the middle of the slide), each server has a weight of one, adding up to a total of three units, so the load is divided equally across the three endpoints.

However, in the example at the bottom, I've manually defined the weight of each individual endpoint. In this case, for backend one (the application on port 9001), I've defined a weight of two. The way NGINX calculates this is by adding up the total weight across all the upstream instances: five plus three plus two, a total of ten units. So two of every ten requests are routed to server number one (backend one, port 9001), three of every ten are routed to port 9002, and five of every ten are routed to application number three. These requests are distributed across the servers in proportion to their weights.

NGINX can also take a server out of rotation if it fails a certain number of times. The two key parameters for this are max_fails and fail_timeout. The max_fails parameter specifies the number of consecutive failed connection attempts before a server is marked unavailable, and the fail_timeout parameter specifies two things: the time window for those consecutive failures, and the duration that the server is marked unavailable for. In this example, backend one (port 9001) has max_fails set to 10, so the server is allowed to fail 10 times within 90 seconds before it is marked unavailable. Once marked unavailable, NGINX does not send the server any new requests until the 90 seconds have passed.
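At the configuration level, that bottom example might look like the sketch below. The ports and the 2/3/5 weights are from the slide, and the max_fails/fail_timeout values are the ones quoted for backend one; the addresses are placeholders:

```nginx
upstream backend_service {
    # Receives 2 of every 10 requests; marked unavailable for 90s
    # after 10 failed attempts within a 90s window.
    server 127.0.0.1:9001 weight=2 max_fails=10 fail_timeout=90s;
    server 127.0.0.1:9002 weight=3;   # 3 of every 10 requests
    server 127.0.0.1:9003 weight=5;   # 5 of every 10 requests
}
```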
Listed on screen are the load balancing methods provided by NGINX and NGINX Plus, and the server weights we talked about on the previous slide can be applied across all of these methods. The very first method, the hash method (or generic hash method, as we know it), uses the MD5 hash algorithm and is based on a key that you specify. Based on that key, client requests of a certain type are forwarded to the same server they originally connected to. This method is often used on the backend, behind a firewall, rather than on front-end requests. While this method can persist session information, you should note that every time you add or remove a server from the pool, the hash keys will most likely be lost and the session information becomes invalid; that's a point to note there. The second method, ip_hash, is very similar to the hash method, except that it explicitly uses the client IP as the key. The least connections method, as the name suggests, sends a request to the server with the fewest active connections. The least time method selects the server with the lowest average response time and the fewest active connections; this algorithm is only available with NGINX Plus, so that's a point to note. Last but not least, the random method passes requests to a randomly selected server. With the optional two parameter, NGINX randomly selects two servers based on the server weights, then picks one of those two based on the load balancing method you've selected and routes the request to that specific server. More details on these methods are available on the website, and I'll make sure the link is in the comments down below so you can read up more about them.

Now let's look at these methods in slightly more detail, in terms of what the configuration construct looks like. In the first example we're using the generic hash method with the client request URI to build our hash key. What I've specified here is $request_uri, a variable in which NGINX captures the request URI. In other words, a client requesting the URI /example will always route to the same server, and a client with a different URI routes to a different server; for example, /test would always go to, say, backend one, and /test2 would go to backend two or three, or whatever it may be. In this example we aren't showing any weights, but weights can be added to affect the initial server selection as well. If changes occur in the upstream, the hash keys are recalculated, as per the point I mentioned earlier.

Next we have the ip_hash method. It uses a very similar algorithm to the hash method we used previously, but in this case the key is already determined to be the client IP address, which can be either an IPv4 or IPv6 address. Although this guarantees that requests from the same client are sent to the same server, note that if you've got a reverse proxy or similar setup in front of these application servers, it's possible that all your requests end up routed to the same backend server: if the requests originate from a proxy, the IP hash would treat every single request as coming from your reverse proxy, and they would all be routed to the same server. So please make a note of that.

In the next example, we're using the least_conn directive to spread the load more fairly across all the backend servers. If there are multiple servers with the same number of connections, the weighted round robin method is applied to those servers. If we give server one a weight of one and server two a weight of two, NGINX treats server two as able to handle twice as many requests as server one. Here's a concrete example: if server one has five active connections and server two has eight, NGINX will consider server two to have the fewest active connections, because it is configured to handle twice as many connections as server one. Mind you, this is in the scenario where we've set server two with weight 2 and server one with weight 1.
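Roughly, those three method configurations look like this; the server addresses are placeholders, and the least_conn weights reflect the spoken example (the actual slide, as noted below, omitted them):

```nginx
# Generic hash keyed on the request URI
upstream backend_hash {
    hash $request_uri;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
}

# Hash keyed on the client IP address
upstream backend_ip {
    ip_hash;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
}

# Fewest active connections, adjusted by server weight
upstream backend_least {
    least_conn;
    server 127.0.0.1:9001 weight=1;
    server 127.0.0.1:9002 weight=2;
}
```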
So that was a weighted least connections configuration; however, I just noticed I haven't got the weights on the slide here, but you get what I mean.

Next, the least time method. In this example we're using the least_time directive with the header parameter to determine which server responds to the request fastest. This is a calculated value based on the average response time and the number of active connections. Besides the header parameter, you can also use the last_byte parameter, optionally with inflight. More details about this method are also available on our website, and I'll make sure I include a link in the comments below.

Now this takes us to the session persistence side of things. The easiest way to enable session persistence with NGINX is by using a sticky cookie. In this method, NGINX Plus adds a session cookie to the very first response from the upstream group, identifying (tagging) the server that sent the initial response. The client's next request back contains the cookie value, and NGINX Plus routes that request to the same upstream server that responded to the very first request. In this example, we're sending a sticky cookie with the name srv_id, and we've defined a few optional parameters: the expiry (in this case one hour), the domain it applies to, and the path you want it to consider. The expires, domain and path parameters are all optional.
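A minimal sketch of that sticky cookie setup, assuming NGINX Plus (the cookie name srv_id and the one-hour expiry come from the slide; the domain and path values, and the server addresses, are hypothetical placeholders):

```nginx
upstream backend_service {
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;

    # NGINX Plus only: tag the first response with a cookie so
    # subsequent requests return to the same upstream server.
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}
```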
Another way to configure session persistence is with sticky route. In this case NGINX Plus assigns a route to the client when it receives the first request; all subsequent requests are then compared to the route parameter of the server directive to identify the server the request is proxied to. The route information is taken either from a cookie or from the request URI. There's a lot more detail to this specific method, so I highly recommend you go to the website and read more about how to configure it. Similarly, sticky learn is a way you can configure NGINX Plus to find the session identifiers by inspecting requests and responses; once again, more details on this method are available on the website, and I'll point you to that.

Some other things to note when you're considering load balancing with NGINX: you also have a max_conns parameter you can configure, which sets the maximum number of concurrent connections a server can handle. Here we've set backend one to have max_conns of 300. In conjunction with max_conns, you can also use the queue directive, which places unprocessed requests in a queue when the upstream servers exceed their maximum connection limits. So, for example, if server one in this scenario was serving 400 active connections, 100 of them would be put in the queue. The timeout parameter on the queue directive controls how long a request waits in the queue before a 503 error is sent back to the client.
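A sketch of that connection limiting setup; the max_conns value of 300 is from the slide, while the queue size and timeout below are placeholders I've chosen to match the spoken example (note that queue is an NGINX Plus directive):

```nginx
upstream backend_service {
    server 127.0.0.1:9001 max_conns=300;
    server 127.0.0.1:9002;

    # NGINX Plus only: hold up to 100 excess requests for up to
    # 60 seconds, then return 503 to clients still waiting.
    queue 100 timeout=60s;
}
```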
So that's all I have from a slide perspective on the configurations; now it's time to jump into the demo, so let's go ahead and do that.

What I have here is a very, very simple setup: an Ubuntu virtual machine, and a browser on my local machine (the VM is running on my local desktop/laptop). nginx -v will show the version of NGINX I've got running on this specific box, and I have a few configuration files already pre-created. If I do an ls here, you'll see I've got dashboard.conf, loadbalancer.conf and webserver.conf; I've put a GitHub repo link to these files in the comments below as well. So let's go ahead and look at these configuration files.

The dashboard.conf is stock standard, a very simple dashboard. Once again, the dashboard is only available with NGINX Plus; you cannot configure or view a dashboard with NGINX Open Source. The dashboard runs on port 8080, so let me find the IP address of this box and come back to the browser: it's already serving a request, and what we want to see is port 8080, where we have our dashboard. Perfect, this is what we want to see. At the moment, we have nothing configured.

Before I jump into the load balancer configuration, let me show you the local applications I'm serving from this box, from webserver.conf. If I cat webserver.conf, you'll see the configuration for my web servers: in this case, the same instance of NGINX that's acting as the load balancer is also acting as a web server, serving three static files on ports 9001, 9002 and 9003. To make things simple for some testing we'll do a bit later on, I've inserted a custom header for each of the applications, set to Application 1, Application 2 and Application 3 respectively. I'll show you when we use that feature.

Now let's go ahead and look at the configuration we have for the load balancer and cat loadbalancer.conf. It's a very, very simple load balancing file. I've got three upstream servers; in my case I'm serving them from the local machine, so localhost or 127.0.0.1, but in your case these could be sitting on a different box, or on three different VMs or three different endpoints. I'm listening on port 80 on my local machine for example.com, and setting proxy_set_header; all this information is essential to forward details about the original requester to the actual backend. And I've got the proxy_pass directive here, pointing the traffic to the backend servers.

Why am I not seeing these servers in the dashboard? Ah, that's because I don't have a zone directive defined, so let's go ahead and fix that: we need a zone directive here; save and close. There we go, we've got the HTTP upstreams; this is exactly what I was intending to see. Now you've got the three endpoints here, and you can see the W column, which is weight. By default we haven't defined a weight, so the weight is the standard one. I can manually go in and change the weight from the dashboard as well: edit the selected server, change the weight, set the max number of connections, all from the dashboard. At this stage it's not going to persist anything; it applies the change live, and the moment I reload the configuration all that data is gone, because I don't have a state file I'm writing it to. That's very simple to fix: if you want these changes to be persistent, you write them to a state file and they'll be saved. But we'll leave that as is for now and come back to our configuration.

So, looking at the config now: we have not defined a specific load balancing method, so by default, as I mentioned earlier, it's going via round robin, and since we haven't defined any weights, the weight it takes by default is one, one, one. If I go to the box's IP and hit refresh a few times, you'll see the load distributed evenly across applications one, two and three, and in the dashboard the number of requests routed to each endpoint is exactly equal.

At this stage I can apply weights to these endpoints, either persistently by making a change in the config, or directly from the dashboard. For now, let me do it from the dashboard: edit the selected server, make it a weight of five, click save, and click OK. From the slides I presented earlier, you'll have noted that with a weight of five defined here, the way NGINX determines where to route requests is by adding up the total weight: five plus one plus one, a total of seven. So five out of seven requests will be routed to server one, and one out of seven each to servers two and three. Let's go ahead and test it: we should be seeing a lot of application server one when we hit refresh, and as you can see, most of the time it's application one.
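Pulling the pieces of that walkthrough together, loadbalancer.conf presumably looks roughly like the following, including the zone directive added during the demo. This is a reconstruction, not the repo file itself; the upstream name, zone size and header choices are placeholders:

```nginx
upstream backends {
    # Shared memory zone so the NGINX Plus dashboard can
    # display and edit this upstream group at runtime.
    zone backends 64k;

    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
    server 127.0.0.1:9003;
}

server {
    listen 80;
    server_name example.com;

    location / {
        # Pass the original client details on to the backend.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://backends;
    }
}
```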
To test it out, I've got a terminal window open, and I'll do a curl on the application server. This is where the custom header field comes into play: if I simply curl the endpoint, I get a lot of HTML back, so I'm going to filter that out and capture just that specific header field. I'll run that command: there we go, Application 1. Run it again, and most of the time the result should be Application 1; out of seven requests, five should go to application one and one each to applications two and three, and that's quite evident here. Let me run this in a loop so we can see the changes happening live (I'll also include the script in the comments below). There you go, now it's running in a while loop: application one, one, one, two, one, three, one, with much more weighting on application one. I can come out here and change this on the fly: set the weight back to one, save, OK, and you can see it's already changed; it's going one, two, three, one, two, three, and the requests are now being routed equally, 34, 12, 12 becoming 35, 13, 13. Perfect. So we've seen how the weights across these endpoints work.

Now we want to try a couple of the algorithms, so let's look at the hash scheme based on the request URI. I'll stop the loop, go into the config, and insert a simple change: hash on $request_uri. Save, close, and sudo nginx -s reload. The configuration is reloaded, and now you can see the curl requests, based on the request URI, are all routed to application three, because every request is going to the / location. If I come back to the browser and hit refresh a couple of times, you'll see that because the hash selected application three for /, most of these requests are served by application three, because that's what was selected at the start.

Let's try the ip_hash method, which routes a request based on the client's IP address. I'll comment the hash line out, add ip_hash, save, close, and sudo nginx -s reload. Now suddenly it's gone from three to one: the new algorithm has been applied, and application one has picked up the requests, because it has associated application one with the IP address the requests come from. If I come out here and hit refresh a few times, my IP is routed to application one at the backend. You can see it's the same in the terminal, because it's running from the same machine, same IP, so it's being served by the same backend application server, which is exactly what we configured.

Now let's try the least connections method. In our case it's not going to demonstrate very well: I can show it working, but because I'm running on a local machine I don't have a production-like workload, so requests will still be routed to all three servers, since the other two servers won't have any load on them. Let's save that, sudo nginx -s reload, come out here and hit refresh a few times. There we go: one, two, three, one, two, three, alternating between applications one, two and three based on how the requests are being routed.
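For reference, the test loop itself isn't shown in the transcript, but a shell loop of this shape would produce the output described. The header name X-Custom-Header and the VM address are assumptions; substitute whatever webserver.conf actually sets:

```sh
# Fire a request every second and print only the custom header
# that identifies which backend application answered.
while true; do
    curl -s -I http://192.168.0.10/ | grep -i 'x-custom-header'
    sleep 1
done
```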
Mind you, I've also got another terminal window firing off requests at the same backend, so if I look at the dashboard overall, you'll see the requests are still spread fairly evenly even with the least connections method, because we don't have a performance test or a real load of traffic hitting these endpoints.

Awesome. Now this takes us to the example of how session persistence works, and the example we'll look at is the sticky cookie. Let's come back to our load balancer configuration, comment out the previous method, and add the directive we use for that: sticky cookie. Let's give the cookie a name; what do we call it? We'll call it lb_example, and it expires in one hour. Save that, close, and nginx -s reload. You should always run nginx -t first, but I'm a cowboy and I'm running this in my local dev environment; if you're running in production, please run nginx -t before you push the change out.

With the change pushed, let's go and make a request. You'll see in the bottom terminal that because the request is coming from curl, there's no way for it to store a session cookie, so those requests are still being routed to all the servers at the backend: one, two and three. However, the requests coming from the browser, you can see, are being routed to the same upstream server. Let me open developer tools and find that cookie: refresh, here's the request, and beauty, there is the Set-Cookie header. We called it lb_example, there's the cookie value, and it expires Friday, so our one-hour expiry was set, and there it is in all its glory. That's exactly what we wanted, so the sticky cookie works for us in this example.

OK, so in terms of showcasing examples and live configuration, this is where I'm going to stop. I think this is enough to give you a hands-on look and feel for how load balancing works with NGINX. Give it a shot in your dev environments, and if you have any questions, reach out to me at j.desai@f5.com. It's been a pleasure; until next time, have a good one.
Info
Channel: NGINX
Views: 60,240
Keywords: NGINX, NGINX Plus, NGINX Controller, NGINX Load Balancer, Load Balancer, NGINX API Management, NGINX Unit, NGINX Open Source, NGINX OSS
Id: a41jxGP9Ic8
Length: 30min 3sec (1803 seconds)
Published: Thu Jul 01 2021