Tech Talk: Server Scaling in Node.js

Captions
Hey guys, so this afternoon I'm going to be talking to you about server scaling in Node.js. There are a couple of limitations with the way we have our basic Node.js apps right now, and we can actually address them pretty easily without npm installing anything. This is already built into Node, and you could do it pretty much this afternoon if you wanted to. It's really easy.

So let's just talk about a couple of the limitations we have so far. Here's your project, your stackathon. It's awesome, everybody wants to come and look at it, and all of a sudden you have tons of HTTP requests coming in and your server can't handle it, because by default Node is single-threaded. It's asynchronous, but it can only handle so much at once.

Second bottleneck: you have some really computationally expensive endpoint. Your API is doing a lot of, say, image processing, or some other algorithm it's running, and it's going to hold up a lot of the further HTTP requests that are coming in, because they're waiting for it to return. It's blocking. So that's another bottleneck we can address.

And the third bottleneck: maybe it's literally a bottleneck on your server. It just goes down, you can't do anything about it, you've got to restart the whole thing, and that takes time. We can address this issue as well. Well, actually, let me rephrase that: if it's all on one machine, maybe you can't resolve that situation the way I'm going to tell you, but abstractly, if a version of your server goes down, there's a way to build some resiliency into your programs.

So here again, just to review the three bottlenecks. High number of requests: by default we have a single-threaded process that's going to block and have a lot of trouble processing them. Node does pretty well with concurrency, since it farms out the slow I/O operations; it puts them on the event loop and they're handled by a thread pool under the hood. But that's not at the JavaScript level. At the JavaScript level you still have that single thread, which, again, with a computationally slow endpoint, means you're going to run into that limit. And server failure: when it's a single thread, a single process, and it goes down, there's no backup for it.

Again, just a little breather: we have a single-threaded process, but we have these nice expensive laptops here that you spent a lot of money on, and we bought all these eight cores, and we want to be able to take advantage of them. Because if you actually run a Node process and fire up your Activity Monitor, you're going to find that only one of your CPUs is running at full tilt if it's continuously churning. And it's not like web app servers are necessarily grinding away at 100% CPU power, because you're mostly processing HTTP requests and you're not doing anything crazy like a big compute, but you might be, so it's just something to be aware of.

Right, so let's talk really quickly (that's my little animation) about concurrency versus parallelism. It's a little bit of a technicality, but technically correct is the best kind of correct. Here we have two processes on the left, and we'll see them running concurrently, as Node actually already does, and as processors have in general; I'm not going to put a date on it, because that's going to be wrong, but for a very long time they were all single-threaded anyway, and they all work concurrently. Because you're running multiple programs on your computer, even though it can all be single-threaded, there are many programs running at the same time. Versus parallelism on the right. The difference is that concurrency means many things are running in the same period of time, while parallelism means two things can be happening in the exact same moment. What that means is, for example, these little dots represent some instructions for this particular program. So here, maybe the CPU runs the first two, and then decides, okay, it can run the next two of the second program, and it just switches back and forth like that, and you won't necessarily notice a difference, but the two programs aren't literally running at the same time. Whereas in parallel, we actually get both: each clock cycle of your CPU, it's running two instructions, or multiple instructions, at the same time. And these don't even have to be on the same computer; this could be two separate computers, like two laptops, or a cluster of servers, for example. That's the way it's been done for a long time: you just buy more computers, buy more servers.

All right, so let's get to it. The first way to scale the application, to deal with HTTP bottlenecks, is with the cluster module in npm... I'm sorry, Node, which you don't actually have to go to npm for, so we can just do this with the cluster module. What it's actually going to do is basically just copy your server: take the program and make a whole bunch of copies. There's going to be one process that's called the master, and by default it will use a round-robin scheduling algorithm, which just says, as HTTP requests come in, the first one goes to the first worker, the second one goes to the second worker, and so on down the line, and then it just loops back around. And that does okay for what you want to do.

So let's just see what that looks like in the code, if I can get this to not explode. So here it is: here's a basic Express app on the left, with no cluster. It's basically what we do in every other app we've done in this boot camp, where you just hit some endpoint; here on line 12 I just simulate some work, do some stuff, and then send it back, and process a bunch of routes. Pretty basic. In our clustered version, we require this cluster module, and then we just say: if this process is the cluster master, it will call cluster.fork and fork off one worker per CPU. You can query the CPU count using the os module, or just type it in manually; you could put as many as you want, but there's not really any benefit to scaling beyond what's actually physically available. And if you're a worker process, you basically are just the HTTP server, so you can do basically everything that was in the single-threaded app.
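A minimal sketch of the master/worker pattern described above; the captions don't include the actual code from the talk, so the endpoint, port, and simulated work here are illustrative assumptions:

```js
// cluster-server.js: a sketch of the cluster pattern described above.
// The endpoint, port, and simulated work are assumptions, not the talk's code.
const cluster = require('cluster');
const os = require('os');
const express = require('express');

if (cluster.isMaster) {
  // Master: fork one worker per CPU; forking more gives no real benefit.
  const numCpus = os.cpus().length;
  for (let i = 0; i < numCpus; i++) {
    cluster.fork();
  }
} else {
  // Worker: just be the HTTP server, same as the single-threaded app.
  const app = express();
  app.get('/', (req, res) => {
    // Simulate a little work, then respond.
    let sum = 0;
    for (let i = 0; i < 1e6; i++) sum += i;
    res.send(`Handled by worker ${process.pid}, sum=${sum}`);
  });
  app.listen(3000, () => console.log(`Worker ${process.pid} listening`));
}
```

The master distributes incoming connections to the workers round-robin by default, so no other code changes are needed.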
So why would you want to do that? Well, here's the demo of that situation. So here, let's do no-cluster: here's our single-threaded server, and the server's listening, great. I'm going to be using an HTTP benchmarking utility called siege, which will just simulate a bunch of concurrent connections coming in, so we're running 100 concurrent users. I prepared a script to save typing, and we'll just run for 10 seconds; you'd probably want to benchmark for longer than 10 seconds, but essentially, that's roughly the performance we get for your basic Node server. So now if we do npm run with the cluster (if you can type "cluster", there you go), it forks once per CPU, so now you can see it's actually copying that server and just running 8 copies of it. And now over here, if we do the same 10-second test (oh my god, it's eating my time... there we go), you can see the performance difference: with the single-threaded one on the left, the transaction rate we're getting is 448 transactions per second, whereas with the 8 forks you get over three times the performance. And, I mean, you have to look at your specific app and run the actual benchmark on your actual app, but you get a lot better throughput. But yeah, I'm running out of time, so I want to get to the other two real quickly.

Okay, right, so our second bottleneck is computationally expensive endpoints, where you just have some huge calculation. It might not even necessarily be a frequently hit endpoint in your API, but once it's hit, it slows everything down behind it and just holds up everything on your server, if you have the compute on your HTTP server, as in our naive case. So to fix that, we're going to use something called child_process. Where is it... I'll just show you the one with fork. So basically, here's a slow computational endpoint, and on this line, to save time, I'm not going to show you the normal version; just trust me that there's a normal version of this. And we see that we just call the fork function with the computation that we want it to run, so instead of doing it on that process, we just start a separate process.
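A minimal sketch of that fork pattern; the file names, route, and message shape are my assumptions, since the captions don't show the talk's actual code:

```js
// server.js: sketch of offloading a slow endpoint to a child process.
const { fork } = require('child_process');
const express = require('express');

const app = express();

app.get('/slow', (req, res) => {
  // Start a separate Node process for the heavy computation, so the
  // main event loop stays free to serve other requests.
  const child = fork('./compute.js');
  child.once('message', (result) => res.send(`Result: ${result}`));
  child.send({ n: 1e9 }); // tell the child what to compute
});

app.listen(3000, () => console.log('Server listening on 3000'));
```

```js
// compute.js: the child process runs the blocking work and reports back.
process.on('message', ({ n }) => {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += i; // CPU-bound busy work
  process.send(sum);
  process.exit(0);
});
```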
So let's kill that... no-fork, okay. So, I know I'm out of time, but I just want to show this last thing. Okay, so that's without forks; now with forks, let's do it. You can ask me after class about the details of the actual benchmark, but just to save time, let's just do it for ten seconds. So you can see on the left, once it hits a computationally slow endpoint and you only have one thread handling that, you're only processing 54 transactions, whereas on the right you're able to process about 400. And it's really easy to do. I actually was going to do a demo, but I'm out of time right now to show it; I just took one of the workshop apps, literally just put in the cluster module and forked it off, and you can get the same kind of performance if you benchmark that server.

Server availability: again, there was going to be a demo of this, but you can use the same cluster module to register an event listener for any workers that crash unexpectedly, and when you detect that, you can just spin up another worker by forking one off, so it can continue to handle your HTTP requests. (There's a minimal sketch of that pattern just below.)

To go faster than that: probably just buy more hardware, or use a front-end server that faces the internet, like maybe nginx or Apache or something, that actually does something maybe more sophisticated to take advantage of multi-core. You can use child processes to call native libraries if you have more computationally expensive endpoints. Whatever you want to do, it's going to depend on your use case, but there are options for you. So let's just leave it at that. Good, thanks. [Applause]
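A minimal sketch of the worker-respawn pattern described in the availability section above; the event wiring is my assumption based on the talk's description, since the demo wasn't shown:

```js
// resilient-cluster.js: sketch of respawning crashed workers.
const cluster = require('cluster');
const os = require('os');
const http = require('http');

if (cluster.isMaster) {
  os.cpus().forEach(() => cluster.fork());

  // When any worker dies unexpectedly, fork a replacement so the
  // server keeps handling HTTP requests.
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died (${signal || code}); restarting`);
    cluster.fork();
  });
} else {
  http
    .createServer((req, res) => res.end(`Hello from ${process.pid}`))
    .listen(3000);
}
```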
Info
Channel: Fullstack Academy
Views: 64,477
Rating: 4.9206533 out of 5
Id: w1IzRF6AkuI
Length: 12min 5sec (725 seconds)
Published: Mon Sep 25 2017