Traefik vs. Nginx performance benchmark

Captions
In this video we're going to compare the Nginx and Traefik reverse proxies. We'll measure CPU and memory usage as well as latency and throughput. We'll run multiple tests, including routing plain HTTP traffic to the upstream service, HTTPS (which uses the HTTP/2 protocol), and gRPC. To be honest, I was slightly surprised by how they handle traffic at the breaking point; in particular, I didn't expect huge differences in handling gRPC compared to regular HTTP traffic.

Let's go over the setup first. We'll use Terraform to create a VPC and provision a few EC2 instances as a backend for the proxies. For the first two tests we'll use the Golang Fiber framework to create a simple HTTP API; for the third test we'll use a gRPC framework and create a simple RPC service. Then we'll configure Nginx and Traefik to proxy those requests to the backend, and we'll use the k6 testing framework to measure latency. Of course, to collect metrics for CPU and memory usage, we'll deploy Prometheus and scrape the targets.

Now, for the first plain HTTP test, the client initiates a connection with the proxy using the HTTP/1 protocol. The proxy terminates that TCP connection and creates a new one toward the application. That's why, if your application needs to know the actual IP address of the client, we need to add it to a header on the proxy side; otherwise our application will see the source IP of the proxy in the request, not the client's. The proxy then establishes a new TCP connection with the service, also using HTTP/1.

Next, when we use HTTPS from the client, we can actually upgrade from HTTP/1 to the HTTP/2 protocol; HTTP/2 is almost always established over the secure HTTPS protocol. The proxy terminates HTTPS, decrypts the payload, and sends it to the application as an unencrypted message using HTTP/1. Finally, when we use gRPC, it also uses the HTTP/2 protocol, but the most significant difference here is the connection between the proxy and the application: it can also use HTTP/2, but via a different implementation called h2c, which does not require TLS to establish the HTTP/2 connection.

As always, for each new video I upgrade all Terraform providers, libraries, and third-party packages, so if you want access to the code and to be up to date with all the new technologies in the cloud space, subscribe to my channel.

To create the REST API, we use Golang and the Fiber framework with a single endpoint that returns 10 devices in JSON format. Now, gRPC uses Protocol Buffers instead of JSON; it's a binary format, and messages can typically be almost twice as small as the equivalent JSON. To create the gRPC service, first we need to define the proto messages. We have a Device message with a UUID, a MAC address, and a firmware version. Then we need to create a request object; for example, we can ask for a specific device using its UUID. Finally, the Manager service is the equivalent of a REST endpoint in some ways: it accepts the request and returns the device. To use it in code, we need to generate Go code based on this definition; to do that, you can use the protoc compiler and point it at the proto definition. Then we get the Device struct to represent the hardware devices, as well as the GetDevice service endpoint and associated methods to create the gRPC server. The gRPC framework can also load-balance requests, has a retry mechanism, and offers other useful features. To create the gRPC server, we use the generated code and define the GetDevice method, in which we simply return the device that we created in the init method.

Now let's take a look at how to route all these requests using Nginx. For the first test we'll define a server block to listen on port 80 and match the API hostname. When we get a request, we forward it to the backend Golang application using either its IP address or a hostname. For the second, HTTPS example, we not only need to listen on port 443 with the ssl option, but we must also include http2 to force Nginx to use the HTTP/2 protocol; by default, even with TLS, it will use HTTP/1. Since I use self-signed certificates, I need to provide them here and forward the request to the Golang application. By the way, here you can add additional headers, for example to forward the IP address of the client to the Golang app. You can obtain certificates automatically using Let's Encrypt if you want; for that, you need to install Certbot, which will detect the server block and convert it to TLS. Finally, we define the server block for the gRPC service. Here I also have self-signed certificates, but to forward to the backend we use the grpc_pass directive. By default it assumes that your upstream gRPC service does not use TLS; if it does, you need to use grpcs as the prefix of your upstream URI.

Next is the Traefik proxy. First, I want to enable the API dashboard (but do not expose it to the internet in production environments). Then we define the entry points, similar to the listen directive on the Nginx proxy. We want to expose Prometheus metrics on port 8082, then expose port 80 for HTTP and 443 for HTTPS. Notably, Traefik is capable of using the same 443 port for both gRPC and HTTPS, while in Nginx we need two separate ports when we use TLS. Let's also disable telemetry and the version check. This file is called the static configuration in Traefik terminology. To discover upstream services, Traefik has a bunch of providers, but we'll use the simplest one, the file provider. For the config file, which is called the dynamic configuration, we can configure Traefik to watch that file periodically as well.

Now, when it comes to Prometheus metrics, we need to be very careful when defining histogram buckets. Since I tested it locally and on average each request takes about 900 microseconds, I'll define corresponding buckets here. In one of the previous videos, we created a custom Prometheus Nginx exporter to extract service metrics from the access log; I will use the same histogram buckets there as well. You can find it in the GitHub repo.

Next, let's take a look at Traefik's dynamic configuration. First of all, we need to define a service where we want to forward requests, then create a router with rules. To use HTTPS, we need to include a TLS section with certificates, or use the built-in Let's Encrypt support to issue certificates automatically. For HTTPS we also need to create another router and specify the TLS section to indicate that we want to terminate TLS on the router and forward a plain HTTP request to the upstream service. Finally, it's similar for the gRPC service: the proxy-to-upstream connection uses h2c, and if you used TLS on the gRPC service itself, you would use https instead. We also need another router to forward the gRPC requests, with the TLS section to terminate TLS. You can find more examples in the repo folder to reproduce the setup, not only for benchmarking but for day-to-day operations.

To run the tests, we'll use the k6 load-testing tool for both HTTP and gRPC. In the first test we'll spin up 5,000 virtual users in parallel to simulate a high load for 5 minutes. For the proxies I use t3a.small EC2 instances, for the Golang upstream service I use t3a.large, and for the client where we run the tests I use t3a.xlarge. I have a similar test for Traefik with the same parameters, just a different URL. Let's go ahead and run both of them in parallel.

What I noticed immediately is that the memory usage of Nginx stays at the same level, but Traefik's memory usage grows much higher. The CPU usage looks almost the same, but by the time we reached 4,000 virtual users, Traefik's latency skyrocketed to 400 milliseconds, compared to one or two milliseconds for Nginx. It looks like Nginx can handle a larger number of requests without degradation in performance, but actually they handle load entirely differently at the breaking point. Now, I know that you can optimize and configure these proxies for a specific workload, but the majority of people, even in large corporations, will use default settings, as I do here in these tests. If we switch back to the terminal, we can see that Traefik tries to serve every single request it gets, even if that causes huge latency; Nginx instead drops requests to keep normal operation and keep latency low. That's the most significant difference in how they handle high loads.

If we wait until the end of the test, we can see that the maximum CPU usage for Traefik increased to about 82 percent, while the CPU usage for Nginx is only 76 percent, and of course there is a big difference in memory usage. At the end of the test we can see that P90 and P95 for Nginx are much smaller. As a reminder, P95 represents the duration under which 95 percent of all requests completed. So for Nginx, P95 is 130 milliseconds, while for Traefik it is 445 milliseconds. The huge difference is mainly because Nginx simply drops requests when it knows it's overloaded.

What I found is that the custom Prometheus exporter actually consumes a lot of CPU, because it needs to process all 5,000 log entries every second and parse metrics from them. On a large instance that could be fine, but for these tests it makes a huge difference. Now let me disable the Prometheus Nginx exporter and rerun the test. Since that means we're not going to get latency from Nginx anymore, let's make the CPU and memory graphs larger; we can still get latency from the k6 client, which is actually much more accurate since it's calculated on the client side. Let's rerun the test without the exporter. Now you can notice that the difference in CPU usage between Traefik and Nginx is significant; the blue line is Traefik and the green line is Nginx. At the end of the test we see similar proportions between Nginx and Traefik, but in this test Nginx didn't actually drop any requests.

Next, let's run the benchmark test for HTTPS. It should be a more CPU-intensive test, so I limited the number of virtual users to 1,000, down from the 5,000 of the previous test. In this test the proxies must decrypt every single request and then forward it to the upstream service. The memory usage is lower simply because we decreased the number of clients, but the CPU usage difference between Nginx and Traefik is still significant when it comes to terminating TLS connections. In the results you can see that Nginx's P95 is only 3.5 milliseconds, while for Traefik it is around 16.5 milliseconds.

Finally, let's run the test for gRPC. Both proxies still need to terminate the TLS connection and then use the HTTP/2 protocol to forward requests to the upstream gRPC server. Here I was surprised, because with the same 1,000 virtual users it takes much more CPU to process the equivalent number of requests compared to just JSON over HTTP/2, even though gRPC messages are much smaller.

Here are the tests: plain HTTP with the Prometheus exporter, the same test with the Prometheus exporter disabled, HTTPS (which requires TLS termination), and finally gRPC. Well, I wasn't surprised by Nginx's performance, but I didn't expect that gRPC would take so much more CPU to process requests. Maybe it's because in the HTTPS test I return 10 devices as an array, but in gRPC I return a single device; still, since k6 distributes load similarly over the 5 minutes, gRPC should take even less CPU to process requests, since the requests are much, much smaller. Maybe I need to run more tests in the future to compare JSON over HTTP/2 and gRPC. If you have any ideas for designing the tests, please let me know. I have another video where I compare Golang versus Rust, and Golang versus Node.js. Thank you for watching, and I'll see you in the next video.
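The proto definition is also only described verbally, so the following is a hedged reconstruction; the message, field, and service names are assumptions that match the description (a Device with UUID, MAC, and firmware; a request by UUID; a Manager service with a GetDevice RPC):

```proto
syntax = "proto3";

option go_package = "github.com/example/device;device";  // hypothetical module path

// Device carries the same fields as the JSON payload.
message Device {
  string uuid     = 1;
  string mac      = 2;
  string firmware = 3;
}

// DeviceRequest asks for a specific device by its uuid.
message DeviceRequest {
  string uuid = 1;
}

// Manager is the rough equivalent of the REST endpoint.
service Manager {
  rpc GetDevice(DeviceRequest) returns (Device) {}
}
```

Running protoc with the Go and Go-gRPC plugins against this file generates the Device struct and the server interface whose GetDevice method the video implements.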
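The three Nginx server blocks described above might look like the following sketch. The hostnames, upstream addresses, ports, and certificate paths are placeholders, not the values from the video:

```nginx
# Plain HTTP: forward to the Golang upstream by IP or hostname.
server {
    listen 80;
    server_name api.example.com;            # hypothetical hostname

    location / {
        proxy_pass http://10.0.1.10:8080;   # placeholder upstream address
        # Pass the real client IP along to the application.
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

# HTTPS: "http2" must be stated explicitly; with only "ssl",
# Nginx still speaks HTTP/1 to the client.
server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate     /etc/nginx/certs/self-signed.crt;   # hypothetical paths
    ssl_certificate_key /etc/nginx/certs/self-signed.key;

    location / {
        proxy_pass http://10.0.1.10:8080;
    }
}

# gRPC on a separate port. grpc:// targets a plaintext (h2c) upstream;
# use grpcs:// if the upstream itself speaks TLS.
server {
    listen 8443 ssl http2;
    server_name grpc.example.com;

    ssl_certificate     /etc/nginx/certs/self-signed.crt;
    ssl_certificate_key /etc/nginx/certs/self-signed.key;

    location / {
        grpc_pass grpc://10.0.1.10:50051;
    }
}
```

The separate 8443 listener illustrates the point made above: Nginx needs distinct ports for HTTPS and gRPC with TLS, while Traefik can serve both on 443.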
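A sketch of the Traefik static and dynamic configuration files described above (two YAML documents, separated by `---`). Hostnames, upstream URLs, and the histogram buckets are assumptions; the ports follow the captions:

```yaml
# traefik.yml -- static configuration (sketch)
api:
  dashboard: true                # do not expose publicly in production

entryPoints:
  metrics:
    address: ":8082"             # Prometheus metrics
  web:
    address: ":80"
  websecure:
    address: ":443"              # shared by HTTPS and gRPC

metrics:
  prometheus:
    entryPoint: metrics
    buckets: [0.0005, 0.0009, 0.002, 0.005, 0.01]  # ~900us average; assumed values

global:
  checkNewVersion: false
  sendAnonymousUsage: false

providers:
  file:
    filename: /etc/traefik/dynamic.yml
    watch: true                  # re-read the dynamic config on change
---
# dynamic.yml -- dynamic configuration (sketch)
http:
  routers:
    api:
      rule: "Host(`api.example.com`)"      # hypothetical hostname
      service: goapp
      tls: {}                              # terminate TLS on the router
    grpc:
      rule: "Host(`grpc.example.com`)"
      service: grpcapp
      tls: {}
  services:
    goapp:
      loadBalancer:
        servers:
          - url: "http://10.0.1.10:8080"   # placeholder upstream
    grpcapp:
      loadBalancer:
        servers:
          - url: "h2c://10.0.1.10:50051"   # h2c = plaintext HTTP/2 to the upstream
```

An empty `tls: {}` section falls back to Traefik's default certificate; the video instead supplies self-signed certificates (or the built-in Let's Encrypt support) in the TLS section.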
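The k6 test script itself is not shown either; a minimal sketch of the 5,000-virtual-user, 5-minute HTTP scenario might look like this (the URL and ramp-up split are assumptions; only the hostname would differ between the Nginx and Traefik runs):

```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  // Ramp up to 5,000 virtual users, then hold for the rest of the 5 minutes.
  stages: [
    { duration: '1m', target: 5000 },
    { duration: '4m', target: 5000 },
  ],
};

export default function () {
  // Placeholder URL; the Traefik variant changes only the host.
  const res = http.get('http://api.example.com/api/devices');
  check(res, { 'status is 200': (r) => r.status === 200 });
}
```

k6 reports client-side latency percentiles (P90/P95) directly, which is why the captions note it is more accurate than the proxy-side exporter.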
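The REST endpoint is only described in the captions, never shown. Here is a minimal sketch of a handler that returns 10 devices as JSON; it uses only the Go standard library rather than Fiber so it is self-contained, and the Device field names, JSON tags, and sample values are assumptions based on the proto fields mentioned above (UUID, MAC address, firmware version):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Device mirrors the fields described for the proto message; the JSON
// tag names and sample values are assumptions, since the actual struct
// isn't shown on this page.
type Device struct {
	UUID     string `json:"uuid"`
	MAC      string `json:"mac"`
	Firmware string `json:"firmware"`
}

// devices builds the static list of 10 devices the endpoint returns.
func devices() []Device {
	out := make([]Device, 0, 10)
	for i := 0; i < 10; i++ {
		out = append(out, Device{
			UUID:     fmt.Sprintf("b0e42fe7-31a5-4894-a441-%012d", i),
			MAC:      fmt.Sprintf("5F-33-CC-1F-43-%02d", i),
			Firmware: "2.1.6",
		})
	}
	return out
}

// devicesHandler encodes the list as JSON, like the single REST endpoint.
func devicesHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(devices())
}

func main() {
	// Exercise the handler via an in-process test server so the sketch
	// terminates; the real service would call http.ListenAndServe(":8080", nil).
	srv := httptest.NewServer(http.HandlerFunc(devicesHandler))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/api/devices")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.StatusCode) // 200
}
```

In the benchmark, the proxies would forward requests to whatever port this service listens on.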
Info
Channel: Anton Putra
Views: 30,938
Keywords: nginx, traefik, Nginx vs Traefik, Traefik, Nginx, traefik vs nginx, nginx tutorial, nginx reverse proxy, traefik tutorial, monitor nginx with prometheus and grafana, monitor traefik with prometheus, prometheus, grafana, aws, ec2, terraform, devops, sre, public cloud, anton putra, nginx performance benchmark, traefik performance, traefik performance vs nginx, terraform aws, aws terraform
Id: bgcfEW_Yh7E
Length: 12min 37sec (757 seconds)
Published: Sat Jan 07 2023