The Hidden Cost Of GraphQL And NodeJS

Captions
The hidden performance cost (I forgot that word existed) of Node.js and GraphQL. Okay, so: Node.js and GraphQL are popular technologies for building web applications, but in my experience they come with certain scaling and performance tradeoffs to be aware of. Wait, hold on, I thought Bun made JavaScript faster than Rust, so what's this thing he's talking about? Incorrect take. Bad take.

GraphQL's modular structure generally leads to code that instantiates excessive promises, which degrades per-request performance. Benchmarks show as much as a 2 to 3x latency increase. Yeah, yes. About three weeks ago I tweeted that promises are actually really bad for performance.

Okay, Node.js is known for its non-blocking I/O operations. Yeah, that's what everyone keeps telling me. If you say something like this to the modern JavaScript engineer, if you say the phrase "don't use a promise", you will literally get people to be like, "You mean you're going to write sync code? Isn't that slower?" And it's just shocking that there's not a middle ground, like, "Oh, you mean you want me to do async without promises?" It doesn't exist in the brain at this point. It's shocking.

All right. All JavaScript work in Node.js happens over an event loop thread, other than a few isolated multi-threaded features like worker threads. And of course garbage collection largely happens async, if you read the, what is it, Orinoco? Orinoco, something something, is the name of their sweet garbage collector. It's awesome. When the event loop is managed well and the I/O is the true bottleneck, Node.js can be a very efficient, scalable technology. In general JavaScript is very efficient if it doesn't have to do a lot. It's like your classic way to fix things: if JavaScript isn't doing a lot, then of course it's pretty fast.

"What would be async without promises in JavaScript?" A callback. This may be hard to comprehend, but taking out
a few promises, taking out a few promises and going back through the event loop, can actually really help. People forget about continuation-passing style. People forgot about every... I know. "Are callbacks really async if your date never does them?" Shut up. Oh my goodness.

On the other hand, if a request does a lot of processing on the event loop, it will block other requests on that container. Node.js applications are particularly susceptible to sporadic performance issues due to noisy neighbors. A lot of the sporadic performance issues actually typically come from garbage collection. Let's see: other heavy request handlers that overly consume the event loop. Additionally, GraphQL's resolve... like, if you run on a single core, so if your instance is single core (yeah, concerned about performance? Don't use Node.js, it's really that simple), but if your instance runs on a single core, you get massively hampered by garbage collection, right? Garbage collection is so efficient these days because it can run on more than one thread.

Additionally, GraphQL's resolver structure can result in more promise overhead compared to REST endpoints, which may cause suboptimal user-perceived latency if not managed carefully. Fair.

GraphQL... oh no, it's that thing here, "continue reading". Just let a man read. GraphQL enables a modular design for APIs. For example, we define a type in our schema and define the one resolver for that type, regardless of where that type appears in the graph: user query, query the user. This modular design is great for developer experience but leads to promise-heavy code. Okay, yes, we know. Each promise adds a minuscule but non-zero amount of work for the event loop, which is discussed here.

To demonstrate, let's say we want to write a feature that retrieves a user's items (wow, so many S's right there) on a shopping site. We might build a REST endpoint like this: user, items, details. This would be
powered by a few SQL queries: get some user, get this. You know, typically maybe we would do a join. You know, I always say we should just write SQL instead of using an ORM. I might be mistaken. A well-structured REST endpoint would have some relatively simple code that makes these database queries and massages the data back into a desired format. We would have no more than a few promises involved and resolved in the request life cycle.

In GraphQL, we would be encouraged to write a query like this: user, items, ID, details, item, other fields. If we have it well structured as GraphQL resolvers, we might have type resolvers for users and item details. Yep. By the way, this is called creating a chatty protocol. For those that don't know chatty protocols, it's where you start making a bunch of small requests to something else, and so chatty protocols tend to have a distributive, degrading performance problem, right? Because obviously making one small request is not a big deal. Not chatty. Not with a D, "chaddy", right? Not giga chatty. No one would say that, unless your protocol is very chatty.

Executing a GraphQL query with nested fields will result in a promise per field being created, such as: user, get user; item, user and items; const ID, other fields; get item by details. Oof. As we use DataLoaders to prevent the N+1 query problem, this would translate to the same SQL queries as we described in the REST endpoint case, so the I/O cost would be as optimized as possible. But we would create one promise per item in the loop, and each promise adds work to the event loop. A promise per field is pretty crazy.

"The funniest thing I've seen with GraphQL is the fact that it ultimately ends up as a JS API: common queries, command F." If only we had something like that. If only we had some sort of structured language in which we could make queries. That's really what I think people want,
you know, a structured queries language. I'd be squealing for it. SLQ. Oh yeah, let's call it SLQ, Structured Language Query. I like it. I like this, TJ, we're on to something. You know a lot about language servers; we could develop one together.

I've written a benchmark of a GraphQL server that returns users to demonstrate the impact. The overhead increases as we increase the number of promises involved. We choose two GraphQL servers: Apollo Server plus Express, and Mercurius. Mercurius, Mercurius, oh, I'm so curious. By the way, Express is actually the worst framework ever created. Like, I understand it was the first, but Express is so bad at performance. It is shocking how bad Express is, like Express just doing basic requests. Not a hot take; there's nothing hot about that take. It's crazy.

"My innovation would be that instead of putting SELECT before FROM, you'd put it after FROM." And SQL would be so... oh dude, it'd be so [unclear]: from this field, select that out. I do agree, that actually is the superior way.

"Are you saying that Express is not express at all?" Dude, it's not. Express is absolute doo-doo. And I'm not saying that the people who invented Express are doo-doo; I'm just saying the performance that has been created in Express is doo-doo, okay? Because it is. It just is.

All right. The benchmarked queries return the same data, but one wraps every field response in a promise and the other returns data synchronously. We return 100 items per user. Let's see: number of users returned, sync users, milliseconds, okay; DataLoaders plus promises, okay. I'd like to investigate this more, because I've done some playing around with this and it can be really bad. This is very interesting though. Like, look at this. That's pretty wild. This is pretty wild. Mercurius, oh, I'm so Mercurius. Okay, I mean, again, I never trust other people's benchmark numbers, but I think there's at least something to be said that it's
much, much larger, right? Maybe it's not good, maybe it's not great, but this shows that there's a huge disparity and that there's probably a big problem there. We see that wrapping each user and item in a promise causes a two to three times increase in request latency.

An invalid criticism here is that real-world GraphQL resolvers perform I/O, so the overhead will reduce significantly as a percentage of time taken by the resolver. A well-tuned database can perform two SQL queries to return 10K items in less than 100 milliseconds (you could just do one, baby), which is a reasonably small percentage of the high... okay, I'm actually a little stuck on that whole two-SQL thing. I'm really stuck on that. I'm going to let it go, everybody. We're letting it go together. Everybody in chat, can we all quote, is it Elsa? Can we all quote Elsa right now and just let it go? Let's just let it go. Let it go. All right.

An invalid criticism here is that real-world GraphQL resolvers perform I/O, so that the overhead will reduce significantly as a percentage of time taken by the resolver. A well-tuned database can perform two SQL queries to return 10K items in less than 100 milliseconds, which is a... I mean, all of this doesn't make any sense. Like, this phrase doesn't make any sense in general, right? Which is a reasonably small percentage of the high latency caused by the GraphQL server here, regardless of Express or Mercurius. Real-world code is even messier: we might check feature flags or perform other async work in a resolver, which further increases the number of promises the event loop has to process.

All right, so can I give you a quick reason, one reason why the event loop can be a little bit difficult, and people don't really understand why it's bad? Can I give you a quick little understanding of it? Let's just talk about this. So the event loop, how it effectively works, looks something like this. Okay, let's go like this,
let's pull this thing. Oh gosh, I'm no master at Excalidraw, but we're okay at it. All right, so the event loop does something like this, right? It looks a little something like this, where it pulls the next task off the queue and checks the microtask queue, right, or empties the microtask queue. So you can starve your threads by having this: if you have a bunch of microtasks, or things that run right away, you can kind of starve yourself, right? And so, you know, it can be bad. It's a loop, if one would say.

Now what does the task queue look like? Well, every time you do something like setTimeout, what will happen is that there's a linked list that exists somewhere, and every time there's a setTimeout it does this, right? It adds an item to the list. So let's go like this: I'm going to put four items in here, right? We're going to put four little items in here. There we go. And let's say that you are this item and you are the first to be executing, okay? You get this beautiful chance to be executing, and the rest of you are going to be red items.

All right, and let's say that you are going to do a promise, and this promise actually resolves synchronous, cached work. Okay, so all you do inside of your little promise is you have a little async func that checks for some sort of cached value, like, you know, if cached, return, right? You get the cached value, it returns it back out. Pretty simple. So what that means is that when you do this, you go check your cached value. All right, you got a cached value, return it. Well, what's going to happen? Promises resolve next tick. So this guy's going to be, let's see, let's go like this, let's take this guy, let's take him off. The next time you get to run is now at the back of the queue. So now this person's going to run, this one's going to run, this one's going to run, and then now you're going to run again, and you're going to have another chance. That's how the event loop works.
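A minimal sketch of the cached-value case just described, assuming a hypothetical in-memory `cache` and helper names (`getUserAsync`, `getUserSync`, `runDemo`) that come from neither the video nor the article. It records the order in which things run: the `await` on an already-cached value still yields, so its continuation only runs after the current synchronous code and any microtask queued ahead of it. In Node.js that continuation is queued as a microtask, so in this sketch it still runs before the pending timer task.

```javascript
// Hypothetical in-memory cache, used only for illustration.
const cache = new Map([["user:1", { id: 1, name: "Ada" }]]);

// Marked async even though the cached path does no asynchronous work.
async function getUserAsync(key) {
  return cache.get(key);
}

// Plain synchronous lookup of the same value.
function getUserSync(key) {
  return cache.get(key);
}

// Resolves with the order in which each step actually ran.
function runDemo() {
  const order = [];
  return new Promise((resolve) => {
    // A timer callback stands in for another task waiting in the task queue.
    setTimeout(() => {
      order.push("timer task");
      resolve(order);
    }, 0);

    // A microtask that is already queued before our await happens.
    Promise.resolve().then(() => order.push("other microtask"));

    (async () => {
      // The synchronous lookup completes immediately, in place.
      order.push("sync lookup: " + getUserSync("user:1").name);
      // The value is already cached, but await still defers the continuation.
      const user = await getUserAsync("user:1");
      order.push("async lookup: " + user.name);
    })();

    order.push("end of current synchronous code");
  });
}

runDemo().then((order) => console.log(order));
```

The async lookup lands after both the remaining synchronous code and the microtask that was queued first, even though no I/O happened. Each extra `await` or `.then` in a chain adds another trip like this.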
And so what ends up happening is, with really simple items like this, you throw an async on a function that doesn't need to be async, and guess what? It will run significantly slower, because you could have a bunch of people in line ahead of you. A ton of people. They cannot resolve in the, like, immediate queue. If they resolved in the immediate queue, you could starve yourself. You could sit there and just starve yourself over and over and over and over again.

And so a callback, right? Like, if you use a callback to know when, say, you write out to... oh, "transactions stay open too", yep, it's going to stay open for a while. So if you use a callback to write a file to disk, or you use a promise, what you get is: when writing a file is done, it calls the callback. That callback is called synchronously, right? So you maintain owning the process event loop for the duration of that callback. So you do your extra work and then you write more to a file, and until that file comes back, you're not going to be called. But once it's done, it's called back, and then you now are at the back of the queue, and once you hit it, then you can start using it again.

So this is the big inefficiency with this stuff, right? Because if each one of these takes, say, very little time, a half millisecond, whatever, right, each one of these takes a half millisecond, but there's also another half millisecond in between each one of these to be called to the next one, or whatever it is, a quarter millisecond, whatever it's going to be, you could add in an extra, you know, three, four milliseconds of just lag. And so every single time you go back to the process tick, it happens again, it happens again, it happens again, it happens again. And so this is why you can all of a sudden get these huge amounts of latency: because you just happen to keep going to
the back of this queue over and over and over and over again. So, you know, understand some things, understand why these things can happen, because it is really important. It can really add a bunch of stuff. Hey, Indifferent Ghost, how you doing?

All right, let's diagnose the problem. It's useful to diagnose this problem in certain operations. First, we should confirm that our application is actually blocked on the event loop. Node.js exposes a handful of perf hooks to measure event loop utilization. Oh, I didn't know. How much of this is true? I haven't played with their... I didn't realize that Node.js offers some hooks for that. That'd be kind of fun to play with. I'm going to have to play with that.

Next, we should confirm that our event loop isn't blocked by code we control. In my case, I confirmed this by inspecting CPU profiles. If the event loop is occupied for more than 50 milliseconds with no obvious culprit in sight, the culprit is likely in the runtime. Okay, fair.

Next, we can confirm how promise-heavy our code is through the following code snippet. Each GraphQL operation should increase the number of promises created and give us a clue how promise-heavy our code is. All right, so we're going to do a little async hooks: hooks, create hook, init, something, type promise, count plus plus, hook enable. Oh, interesting. I didn't realize you could do async hooks like that. That's kind of interesting.

Another practical approach to determine whether the event loop is blocked is determining the difference between client-reported database query latency and database-reported query latency. For example... I mean, this is actually a very true source, right? If you can query a database and you get a certain latency, then you query your application and you get very different latency, you've got some things you can at least make some judgment about. Obviously the hard part is: where are you located, where's the database located, where's the database located in comparison
to your application, versus where you're located. You have to take a lot of those things in. For example, I noticed that the client-side reporting of certain database queries was often greater than or less than 100 milliseconds... no, greater than 100 milliseconds, sorry, even though we were making an indexed query on a table with less than a thousand rows. As expected, we couldn't replicate such slow performance when manually querying our database. The slowdown was because the event loop was overwhelmed after making database requests. So even though the database responded to certain requests very quickly, the web application did not get around to processing the responses until after a significant delay. If you've forgotten why there's a significant delay, remember the graph that I showed you, and remember: every single time an await happens, it does it once; every single time a .then happens, it does it once. So if you .then, .then, .then, you will have three going-to-the-back-of-the-line operations.

Since async/await only affects request throughput in certain promise-heavy conditions, most open-source code is not heavily optimized to prevent unnecessary promises. For example, GraphQL Shield, one of the most popular GraphQL auth libraries, assumes every field resolver is async; therefore it constructs a promise for every field in the GraphQL response, which further amplifies the number of promises created in the life cycle of a request. That's crazy. It's crazy.

TypeScript and JavaScript do not prevent developers from unnecessarily marking functions as async. Yes, this is true. So we need ESLint rules like require-await to avoid unnecessary async/await calls. Dude, I literally found a performance problem in some code, and I kid you not, it was because a function was marked as async that was not async. It increases garbage collection, it increases the time it takes, right? It's actually a real problem. It's wild. Yeah, exactly: promise
explosion equals memory explosion, which equals more GC interrupts. Dude, it's wild. An accidental, unneeded async function can add milliseconds to response times, which means... I mean, think about how many async functions you could go over, right?

All right: APM and promises. Actions per minute, everyone's favorite thing, us StarCrafters. Finally, we can incredibly slow down promise execution if we use async hooks, a deprecated but widely used Node.js feature. Async hooks help us track asynchronous resources. For example, a tracing library might desire to track a request across callbacks and promises. Unfortunately, any code we import may rely on this feature and can auto-enable it. dd-trace, Datadog's APM library, and likely many others, uses this feature to provide traces across promise executions. Oof. When your tracking library slows down your entire universe, that's not good.

"This makes me want to join Marvin, uh, Hagemeister and start optimizing open-source libraries." It's a pretty big win. The thing is that often a program's not slowed down by a single issue, right? Oh, by the way, that's my favorite tweet, which is... I'm gonna tweet that. I'm gonna tweet this. It triggers everybody, which is my favorite thing. My favorite kind of tweet is the one that nobody wins. Let's do this. Nobody wins here. Nobody wins. Boom, post it. Oh my goodness, it's Annie. If you don't know Annie, you don't know about Twitter, okay? I've tweeted this a few times, some version of this. Oh man, it's the greatest. People lose it. All right, anyways, fantastic.

Async with hooks, yep. None of this is surprising. Obviously adding tracing to anything you do asynchronously, of course, by its very nature, is going to cause a huge slowdown.

"What is this Twitter?" It's the place you go to. So when you go to x.com, you actually get redirected to a place called twitter.com. So we all
use Twitter. I don't know what this X thing is, okay? I don't know what this whole X thing is you keep talking about. You guys keep telling me about X and that I keep using it wrong, yet I keep going to twitter.com. I don't understand why you guys keep telling me this. Okay, I know X is going to give it to you; it just hasn't given it to me yet, okay? I haven't got it yet. I'm waiting for it. All right, anyways.

Okay, we see that it just gets worse, obviously. We see that async hooks add significant amounts of latency. I mean, it's no different than, say, forEach for an array: forEach obviously adds latency, or adds processing, in comparison to a for loop. Totally reasonable, right? "Does this article discuss resource pooling?" No, they don't do any of that, because it's not about that. It's just about promises, which I think is really great. This is a great topic, by the way.

DataLoader with no async hooks, DataLoader with async hooks, async hooks overhead, log scale. Yeah, I mean, it makes sense. We roughly see that async hooks add roughly 3 to 3.5x overhead to resolvers. Datadog engineers are diligently working to reduce this overhead by contributing to Node.js and V8 features. However, improvements in this area are critical to get right and take time to be implemented.

In general, we want to reduce the overhead of promises and reduce the number of promises we invoke. Reducing promise overhead: to reduce promise overhead, we want to minimize promise introspection features. Yep. Reducing the number of promises: to reduce the number of promises involved, we have a few areas to consider. We could remove the use of GraphQL middleware (let's go, let's go), especially the ones that assume every field is async. We could also rewrite GraphQL queries to use fewer async type resolvers. Just rewrite it. Just rewrite it, bro. Fork it. Just fork it. A single resolver that manually queries the database and returns all the data needed for performance-sensitive queries,
rather than relying on GraphQL to hydrate nested fields' type resolvers. By the way, just rewriting things is difficult. It is very hard to be able to see the... the problem about easy is that easy is hard. Do you know what I mean? Easy is hard. Easy is truly hard. And so GraphQL gives you the promise of easy.

I know, I see the pinned message. Fine. OMEGALUL, OMEGALUL. "Twitter, the Facebook, same energy." If it's the same energy, why does x.com take me to twitter.com? Okay, what am I supposed to call it? I can't read. Did you put a spelling joke in there? You know I can't read. You know I literally cannot read, okay? All right, what kind of joke is that? "Oh, I'm making fun of Prime for not being able to read. What a loser, can't even read. What a loser." Thanks. I guess dyslexia is a cool thing you can make fun of now. You know, out of all... gosh. I should be able to say, okay, TJ, I'm gonna quit saying I have dyslexia and that I'm neurodivergent, and then guess what? When you make fun of me, you're making fun of neurodivergence, and that is pretty offensive, TJ. I mean, I would say we're probably in cancel territory. Disgusting. Absolutely disgusting. Shook. Hashtag not a safe space. "American education." Kona. "I can't read." Kona. "Can't even blame him." Okay, okay, say "I'm making fun of the US", it's okay. Okay, you're lucky you did get by, by making fun of the US.

So we could write a one-shot resolver that implements the entire query. Okay, wait: find user by ID... dude, I'm just so triggered by this. I am literally so triggered by this chatty protocol. Find item details. I am so effing triggered. Why do you got to do this? Return all the items: items map, const details, find... So okay, I'm going to say something completely different. I want you just to look at this for a second, okay? I want you to look at this and ask yourself: why does my endpoint have so many large latency spikes? Why am I garbage collecting all the time? Let
me just, like, regardless of the fact that you're doing two queries like this, let's just talk about something different. First off, an await obviously causes a promise, which causes multiple callbacks, which cause a whole chain thing to be set up, blah blah blah. Inefficient, causes memory. Do it again, causes memory. You also create an object right here. Not only do you create an object, you also have an array; you copy the array, boom. Okay, so you create an object and an array, and you create a closure, and you create a lambda function. Okay, you create four things that have to be cleaned up. After that you return an object. Inside this object, you do yet another map over items. Okay, so now you're at seven objects in here. We're going to do a find; a find creates two more, yay. Then you're going to return an object that has an inner object that creates the details by ID, which literally is details by ID. I want you to look at this: it's details by ID, and then it creates another copy of the details. You copy the details. It's like 12, 15 pieces of memory. So every single time that's called, something has to go and collect all this, right? There's so much going on here. This is massively... you know, it's just a lot. This is a lot of memory. This is why I get triggered so easily by JavaScript, because it's so easy to create memory. It's so easy. You could just create it all day on accident, and then garbage collection is wild. It's like 15% of your application. If you're on a single-core machine, it could be well over 15%.

Instead of multiple batches of promises, we fetch the user, items, and details in one shot. That brings up the meta question, why use GraphQL in the first place, but that's for a larger conversation at a separate time. Okay, anyways, I do agree with the whole GraphQL thing: do you really need GraphQL? What are you buying out of GraphQL? I understand the benefits of GraphQL. I did literally
write Falcor. I still think parts of Falcor are a good idea to this day. But I also see the downsides of this: the easiness of creating chatty queries, the fact that we're looking at two select things that clearly should be one, right? You see all these things that end up happening when you break up your API into these really fundamental small little pieces. You can always accidentally create hyper-chatty protocols, and this is a great example of those hyper-chatty protocols. And so it's emotional. It's emotional, right?

But I love the point of this article, which is that promises cause so much more overhead. I really wish this article would have went over this, because, I mean, I didn't do a really great job, I didn't do a lot of justice here on this, but this is really good to think about: anytime you resolve, you go to the back of the line. So if you have 15 requests per second, then you are literally potentially sitting behind 15 requests every single time you do a promise, right? It could be a lot, depending on how many things are running at that time, how many promises are running internally. If you do a Promise.all, you still have all those being added to the queue, right? So it's not just a singular promise; a Promise.all could have you do it a bunch of little times as well. Anyways, just something to think about.

The name is "I really wish I didn't concern myself so much with memory, but it's an emotionally bruising situation and a sensitive topic", okay? A-gen.
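The promise-counting snippet read out earlier in the transcript (async hooks, create hook, init, type promise, count plus plus, enable) could look roughly like the sketch below. This is a reconstruction under assumptions, not the article's exact code: `countPromises`, `syncField`, and `asyncField` are hypothetical names, and the two "resolvers" are stand-ins for GraphQL field resolvers. It uses Node's `async_hooks.createHook` with an `init` callback and counts resources whose type is `"PROMISE"`.

```javascript
const asyncHooks = require("node:async_hooks");

let promiseCount = 0;

// init() is called for each new async resource; only promises are counted here.
const hook = asyncHooks.createHook({
  init(asyncId, type) {
    if (type === "PROMISE") promiseCount += 1;
  },
});

// Hypothetical helper: counts promises created synchronously while fn runs.
// Enabling the hook adds overhead of its own, so treat this as a diagnostic only.
function countPromises(fn) {
  promiseCount = 0;
  hook.enable();
  try {
    fn();
  } finally {
    hook.disable();
  }
  return promiseCount;
}

// Stand-ins for field resolvers: same result, one needlessly marked async.
const syncField = (item) => item.id;
const asyncField = async (item) => item.id;

const items = Array.from({ length: 100 }, (_, id) => ({ id }));

const syncCount = countPromises(() => items.map(syncField));
const asyncCount = countPromises(() => items.map(asyncField));

console.log({ syncCount, asyncCount });
```

Resolving 100 items through the needlessly async function creates at least one promise per item, while the synchronous version creates none. That is the per-field promise growth the article measures.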
Info
Channel: ThePrimeTime
Views: 156,493
Keywords: programming, computer, software, software engineer, software engineering, program, development, developing, developer, developers, web design, web developer, web development, programmer humor, humor, memes, software memes, engineer, engineering, Regex, regexs, regexes, netflix, vscode, vscode engineer, vscode plugins, Lenovo, customer service
Id: i0YfiQlzv6M
Length: 28min 34sec (1714 seconds)
Published: Fri Oct 20 2023