Finding memory leaks and CPU bottlenecks with Node.js debug tools - Vladimir de Turckheim

Captions
Right now we are waiting for the next speaker, and his name is Vladimir de Turckheim. Yeah, we don't see you... now it's better. So tell us maybe a little about yourself and we are good to go.

Okay, so I'm Vladimir de Turckheim. I do Node.js security full-time for a company named Sqreen. I've been developing with Node.js for the last six years now, and that's a technology I really like. Two years ago I was really humbled and honored to be nominated to get Node.js commit rights, so for the last two years I have been a Node.js collaborator.

Okay, good. So I suppose you may start. As usual, after the speech we are going to have a Zoom chat where you can ask him any questions you want, and I'll send a link to the chat. So for now, you may start.

Wish me luck. Thank you very much. Let me start by sharing my screen, and it should be good. Awesome, thanks. So let's go. Thanks a lot for coming today. I will give a talk about finding memory leaks and CPU bottlenecks in Node.js, because it's an important topic and I realized that most people learn they need to know about it when it's too late. Enough people ping me and ask me about these topics when they start to have issues in production, and they are usually in a panic. So the goal of this talk is to give you the basics of worst-case-scenario debugging. I hope it never happens to you, but if it eventually does, you will know what the first step should be to save your production.

As mentioned, I'm Vladimir de Turckheim. A quick word about Sqreen: Sqreen is a tool that provides security for web applications. We protect Node.js applications, as well as other technologies, and it's really easy to install. So if you have any server in production, you might want to check what we do; it's probably interesting and it's pretty easy to use. We block attacks and detect attackers. Also, feel free to follow me on Twitter. I'm trying to get more followers than my CEO.
I'm 104 short, so please help me win that fight.

Disclaimer: this is my first online talk. It's the first time I give a talk from home, the first time I do this kind of talk, and I will do live demos, so according to Murphy's Law anything can happen. Let's hope for the best, but I planned for the worst.

What is this talk about? That's a good question. This talk is about applications crashing, maybe; slow web applications, probably; weird stack traces (we will see what that is about); and unicorns. Because I'm locked down in my apartment, my only friend is this stuffed unicorn, so I figured you would want to see my only friend for the last few weeks.

We'll start with the first part of the talk, which is probably the most important one: an intro to the Chrome DevTools. I won't even tell you how they work and how to use them in this part; I will tell you the only thing that's important: how to connect a Node.js instance to the Chrome DevTools.

So let's start with a story. Something weird is happening in your application. It can be anything: the application being slow, the application crashing. Something weird happens, you want to know what's happening, and you have no idea. You just have a few symptoms, maybe logs, maybe a monitoring solution telling you things, and you don't know more than that. What do you do then? That's probably when you need to know about the Chrome DevTools. They're available in Google Chrome, and they use a communication protocol named the Chrome DevTools Protocol. This protocol is fully documented online, it's pretty interesting, and it enables you to debug anything that happens in Google Chrome, meaning you can debug the DOM, the HTML page, or V8. Basically, when you open the developer tools with right-click and Inspect in Chrome, that's what happens: it opens a client for the Chrome DevTools Protocol.

Right now you must be wondering why I am talking about Chrome, because this is a talk about Node.js in a Node.js track,
and you're right. But one of the main components of Node.js is V8. V8 is the JavaScript engine that runs JavaScript in Node.js, Chrome, and Chromium, meaning that when you run Node.js, all the JavaScript code you write is actually interpreted by this piece of software developed by Google named V8. That's the virtual machine where our JavaScript runs. It's responsible for calling functions, for garbage collection, and for handling the lifecycle of every object. So we will use the Chrome DevTools, CDT in short, to debug the V8 instance of a Node.js process.

So we need a Node.js process to use here, and you can't just take any Node.js process and connect it to the DevTools. You have to somehow tell the Node.js process: "Hey, I want to use you with the DevTools, would you mind collaborating with them, please?" There are multiple ways to do that, and the easiest and most straightforward one is to start the process with the flag --inspect. So instead of starting your process with "node server.js", you would start it with "node --inspect server.js". At this point you will see two more log lines than your usual application prints. The first one will be "Debugger listening on ws://..." and the second one "For help, see: https://nodejs.org/en/docs/inspector".

What's interesting here is the first line: "Debugger listening on ws://localhost:9229". This means that the Node.js process is now opening a WebSocket server, like you would do with Socket.IO or the ws module, but this WebSocket server is expecting someone to talk to it using the Chrome DevTools Protocol. Meaning that if you've got any WebSocket client, which is basically an upgraded HTTP connection, and you know the Chrome DevTools Protocol, you can start to debug the instance. And that's what the Chrome DevTools do: they connect through WebSocket to an instance of V8 that listens on this WebSocket.

Before we go to the demo of this part, I wanted to show you another way to tell Node.js to go into debug mode. What if the process is
already started? In the previous example we started the process with the --inspect flag, but sometimes the process is already started and you cannot restart it. Let's say you are in a cloud or Kubernetes environment and you don't want to change your deploy mechanism to make it debuggable. Or maybe you want to debug an issue that happens only under certain conditions, and you are not sure what they are, so when you detect it on an instance you want to jump on it and start to debug it. But you can't start all your applications with the debug flag, because it has a performance impact. So what you want to do is send a signal to the process to tell it: "Hey, you are running in a usual setup, can you transform and become a process running in a debug session?"

I apologize right now to all the Windows users: this talk is really oriented toward Linux or UNIX for this part. If you want to tell a running Node.js instance to go into debug mode, you first need to know its PID, its process ID, the unique ID of the process on the machine. So you would use the ps command in UNIX: "ps aux" gives you all the info you need, and I usually pipe it into "grep node" so I only see the Node.js instances running on my machine. That gives you results over multiple lines, including the one you see in the second chunk of code, which reads like "vdeturckheim 6035", a lot of numbers, then "node server.js". Basically, vdeturckheim is the name of the user owning the process, in this case my user on my laptop. Then the number is the PID of the process, the unique ID used to communicate with this process on the UNIX machine. Then you've got things like the CPU usage and the memory usage, and eventually the last thing on the line, "node server.js", is the command used to start the process.

And then we will kill it. No, just kidding. To send a signal to a process in UNIX you can use the kill command and pass an argument to tell it which signal you want to send. So here we will use
kill with -USR1, meaning: don't send the kill signal, don't kill this instance, but send the signal named USR1, whose meaning is defined by the software. In Node.js, USR1 means "switch the process to debug mode". Then you put the PID, so the system knows to which process the signal must be sent. You will then see in the logs of the application a line saying "Debugger listening on ws://..." and so on, meaning that the process has actually gone into debug mode, which is what we want. When you have that, you can open the Chrome DevTools by going, in Google Chrome or Chromium, to the URL chrome://inspect.

I think it's time for the first live demo of this talk. Because this is a remote talk, I'm trusting my internet connection, and I still want to take some risks and work without any safety net, I will do that on a remote instance of Node.js. So instead of just starting a process locally with --inspect, I will connect to Heroku, take an instance that is running normally on Heroku, and start a debug session from my laptop to the instance running in the cloud. Let's go.

Can you all see my screen? Here I've got a Heroku application. It's a standard Node.js application. I did not go with the free plan because I wanted it to be running when I did the demo, but otherwise it's a standard application. I just open the logs, because we want to see them. Here you can see the app started. If I run a couple of queries on it, I get more logs for the GET requests on different paths. As you can see, it started, and nothing says that the debugger is listening, and if I go to chrome://inspect, it doesn't detect any target. So what I need to do first is connect through SSH to the server where the application is running.
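The PID lookup and signal steps described earlier in the talk can also be sketched in Node.js itself. This is a minimal illustration under stated assumptions, not what the speaker ran: `process.kill` and the `SIGUSR1` behaviour are real Node.js features, but the helper names (`parsePsAux`, `findNodeProcesses`, `activateInspector`) and the sample `ps aux` output are hypothetical, and the approach only applies on Linux or macOS.

```javascript
// Sketch: find Node.js processes in `ps aux` output and ask one to open its inspector.
// Assumption: a Linux/macOS host; SIGUSR1 activation is not available on Windows.

// Parse `ps aux` text into { user, pid, command } records (hypothetical helper).
function parsePsAux(output) {
  return output
    .split('\n')
    .slice(1)                          // drop the header row
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      // Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
      const fields = line.split(/\s+/);
      return {
        user: fields[0],
        pid: Number(fields[1]),
        command: fields.slice(10).join(' '),
      };
    });
}

// Keep only the processes whose executable is `node` (hypothetical helper).
function findNodeProcesses(output) {
  return parsePsAux(output).filter((p) => {
    const executable = p.command.split(' ')[0];
    return executable === 'node' || executable.endsWith('/node');
  });
}

// Equivalent of `kill -USR1 <pid>`: despite its name, process.kill only sends a signal.
function activateInspector(pid) {
  process.kill(pid, 'SIGUSR1');
}

// Hypothetical sample output, for illustration only.
const sample = [
  'USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND',
  'alice     6035  0.0  0.3 564320 41200 pts/0    Sl+  10:02   0:01 node server.js',
  'alice     6101  0.0  0.0   6432   720 pts/1    S+   10:05   0:00 grep node',
].join('\n');

const nodeProcesses = findNodeProcesses(sample);
console.log(nodeProcesses);

// On a real machine, one would feed in the actual `ps aux` output and then call:
// activateInspector(nodeProcesses[0].pid);
```

The parsing is kept separate from the signalling so the lookup can be checked on canned output without touching a live process; the `grep node` row is filtered out because its executable is `grep`, not `node`.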
To do SSH on Heroku, it's actually pretty easy. You will need a tool named the Heroku Toolbelt, which is basically the CLI for Heroku, and then you run a command named ps:exec. That's Heroku-specific; it basically says to Heroku: "Hey, can you start an SSH session between my laptop and one dyno of this application?" Then you put --app to tell it which application you want to connect to. Of course, I have logged in to Heroku with the CLI first, so Heroku knows what my user account is.

So I do that, and now it starts to get risky, because I'm doing things on a remote server in a talk, things you should not do. It says: "Hey, you are connecting to web.1." That's the name of the dyno; if you have multiple dynos, you need to select which one you want to connect to. The rest is the name of the application I created just for this talk.

Remember, we want to do "ps aux", because we want the list of processes in order to identify the PID of the Node.js process. So here we've got a list of processes. First we've got the user account, then the PID, CPU, memory, a few more fields, and then the command used to start the process. We are looking for a command starting with "node"; we are looking for "node server.js". We check the PID, and the PID is 52. So now we can send the signal: "kill -USR1 52". And... nothing happens. Okay, nothing happens, but maybe that's expected. Let's see the logs of the application. If I go into the logs of the application, oh cool, I can see that there is a new log line: "Debugger listening on ws://localhost:9229", the URL, and "For help, see" the documentation. So we managed to change the state of the Node.js process: it went from running in the usual way to running in debug mode, meaning you can now connect the inspector to it. But here it's still not available in the remote targets, and that's actually expected. So let me log out. There is
something interesting here: the remote targets can listen on any URL you give them, but I can't just give the URL of the Heroku instance to Chrome, because Heroku does not open that port to the outside world. The only ports Heroku opens to the outside world are the HTTP ports, so I cannot use a WebSocket client to connect directly to the server. What I need to do is find a way to make this port on the remote server available on localhost on my local machine. Thankfully there is a solution through SSH for that, and Heroku exposes it in its command-line client too: it's called port forwarding. What you want to do is forward port 9229 from the remote instance to your laptop, so you use the command ps:forward and then the number of the port.

If I do that, it still takes a few seconds to connect, and then it tells me that it opened a server on port 9229 that is forwarding to web.1 (remember, that's the name of our dyno) on port 9229. That means that on my own laptop, my MacBook, port 9229 is currently open, and any connection I make to it will be forwarded through SSH to Heroku; the client does not need to know anything more. So now I go to Chrome to see whether there is a Node.js instance available. Once again I'm on chrome://inspect, and hey, I see that there is a Node.js instance connected to my local computer. If I connect to it and run something in the console, say I try to get the hostname of this instance, it tells me it's this long string. That's not the hostname of my laptop, it's the hostname of the Heroku server, which is running in the AWS cloud. So I actually managed to connect remotely through the DevTools, and I can now browse the source of the code, get the memory, get the profiler information, and run anything I want in the console. I can do anything I want on this instance from my laptop. If I go back to the Heroku logs, I see that the console.log "hello" I did is available directly in the logs
of the instance. When I've done what I want to do on the Heroku instance, I restart it, because I don't want it to stay in debug mode, for performance reasons. And that's the first part of my talk, with the live demo of connecting to Heroku. It was the hardest demo and we survived it, so, woo!

There is an alternative way to connect to a debug session: using the inspector module. The inspector module is available directly in Node.js, so you do require('inspector') and it will be available. It's actually a client for the Chrome DevTools Protocol. It enables you to speak the DevTools protocol from Node.js, for instance if you want to debug Node.js locally or remotely, like another Node.js instance, or even Chrome or Chromium: you can debug them from Node.js with this module. So if you want to write CLIs or tools that automatically detect memory leaks or CPU bottlenecks, with Node.js, on another Node.js application, that could be a follow-up for this talk.

Now let's go to the practical part, because we know in theory how to connect to a Node.js application with the debugger, and now we need to know what we can do with it. The first example will be debugging memory. Let's say you are checking your monitoring tool and you see that the server restarts a lot. Since you are a good developer doing DevOps, you decide to check the logs, and the logs are actually really helpful, because you see that the application crashes with these logs. Okay, these logs may be a bit complex. At the top we've got a section named "Last few GCs", GC for garbage collection. Then we've got a section named "JS stacktrace". We would expect some JavaScript code there, but there is no JavaScript, only C and C++ code, so it's not really helpful. And there is actually one line that is extremely important in this log, the one we really want to see: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed -
JavaScript heap out of memory. The important part, the part you want to know about, is "JavaScript heap out of memory". This means that the process crashed because there were too many things in memory, so there is probably a memory leak or some big memory exhaustion. We need to know what's happening in the memory of the process, and we know it's the JavaScript memory that is involved. We want to understand what's inside, so we can understand which part of the code generates too many objects that fill the memory and that the garbage collector cannot clean up. So let's collect what's called a heap dump.

What's a heap dump? Glad you asked. A heap dump is a file showing the contents of the heap memory of V8. Basically, it contains all the JavaScript memory of the process. It might contain sensitive data: let's say you are comparing passwords and you decipher them, or you hold them before you hash them; if they are not garbage collected, they will be there. Every piece of memory of your process will be available in a heap dump, but we will see how to make them a little bit safer.

My math teacher in high school used to say that an example is better than a speech, so let's go for a second live demo. I'm expecting all of you to be cheering at home, and the people around you being surprised. So let's go to this live demo. Let me just start a local server, and here it is. In the previous talk you were introduced to autocannon; I will use another tool that does basically the same thing as autocannon, named ab, for ApacheBench. Basically, it makes a lot of requests on an HTTP server. I use the -c argument for concurrency and the -n argument for the number of requests. This means: "Can you make 15 concurrent requests until you reach 10k requests on this server?" So let's do this. Now it's running requests; we can see it's already halfway done. I will use this time to drink a bit. And it's done, and we have the percentage of requests served within a certain time. Okay, that's interesting, but that's still
not a memory dump. [unclear] Okay, okay.

So here I started my application with the --inspect flag. I can see that it is actually running and available in the network targets, so I will open the DevTools and go to Memory. Okay, it's pretty empty. What we want to do is click this button that says "Take heap snapshot", but before that we want to click this little garbage can, which means: "Can you please trigger a garbage collection?", meaning: "Can you remove all the content of the memory that is actually removable?" That's pretty important, because first of all we can remove secrets that are not supposed to be leaked, and we can also remove noise from the file we will collect. So we click on the button to collect a heap snapshot, and it takes a bit of time. That's pretty much why I did that on a local instance, because if you do that on a remote Heroku instance it takes about three times as long.

The first thing we need to look at, if you see my mouse, is the size of the heap snapshot. Here it's 70 megabytes, and we have all its content. Inside it we've got everything in the process, sorted by constructor. For instance, we know that there are around 143K objects in the process, we have 15 instances of Set, and we have lots of URLs inside the process.

We can check the columns quickly, because they are interesting. The first column is the size of the item: if I take this object, I know it has a size of 24 bytes, but that's its own size. It also has what we call a retained size. The retained size is the size of all the objects that can't be garbage collected because this object has a pointer to them, meaning that if you are able to garbage collect this object, you will be able to garbage collect this amount of memory. The second part of the panel is interesting too, because it shows the retainers, meaning that if you select an object you can check
every object that has a pointer to it and prevents it from being garbage collected. For instance, this object: I don't know what it is, I have no idea. It looks like it's from a library. Okay, it's a library, and it's held by another object.

Okay, so we are still trying to find a potential memory leak. The second thing we will do is add more traffic to the application, so we run ab again. That's enough, and we will collect a new snapshot. This one takes a bit longer to collect because there are even more things in memory, because, remember, there is probably a memory leak in this example, otherwise my talk would be useless. So it takes a bit more time to compute.

Let's see the size. You see, it's almost 100 megabytes; we were at almost 70 megabytes. Once again, we don't directly see what's leaking, so we use Comparison. When we do a comparison with snapshot 1, it tells you which objects are in the new snapshot that were not in memory in the first one. So we have new columns: the number of new items, the number of deleted items, the delta of items, and also the size delta for the items. What we want to do when we look for a memory leak is find the biggest size deltas. It's probably not the arrays, because those arrays are internals of V8. We see that we have a lot of new objects, so let's check these objects. Okay, we see that we have a lot of objects that are 2,088 in size, so that's probably the same kind of object. Let's quickly look inside. Okay, it seems to be an HTTP request. We can recognize it because of the class IncomingMessage. You need to know a lot about Node to do that, but it's pretty coherent with the fact that I've also got a lot of IncomingMessage in my memory. IncomingMessage and ServerResponse: you'd better know them in Node.js. They are usually named req and res: req, the HTTP request, is an instance of the IncomingMessage class, and res, the HTTP response where you will write a response, is an instance
of ServerResponse. So I see that a lot of them are staying in my application, and that's actually really weird, as if they were kept in memory. Let's click this one and try to find its retainers. We can see that it has a retainer here, and each time we've got items, we have the line of code where they are created. This is Node internal, this is Node internal... okay, this one doesn't look like Node internals, this looks like user code, the code of my application. We click on it and it brings me to the source code of the application. Here I see I've got an Express controller, and each time a request comes in, it calls getObject on the req and gets a UID back. So let's see what getObject does. It creates a new object with the request as its first property, plus another field, and then it adds it to a store. But I don't see any code where things are removed from the store. So that's where the memory leak happens: we are writing into this store, but we are never removing from it. So here we've got our memory leak.

I will just quickly show you something really interesting that could have made our life even easier. I just benchmarked another endpoint, which does exactly the same thing with one small difference: it creates the object differently. I will explain what this means when the snapshot is available; it will take a few more seconds. That's why live demos are dangerous: you lose time in your talk. And now we do a comparison with snapshot 2, and we can see that there is a new class named Item, and there are a lot of them, like 2,000 of them. If I click on it, I see that this is the class that is leaking, that the objects instantiated from this class are leaking, and I already have the trace to the code here. It's exactly the same code as before, but instead of creating the object with just brackets, we create the object with a class, and it makes our life way easier in terms of debugging
because we directly see what is leaking inside the application, since heap dumps are sorted by class.

Sorry, I stopped sharing my screen; that was not what I wanted to do. So let's go back to the slides and move to CPU bottlenecks as the third part.

Let's start with another story. You see in your monitoring tool that your server takes a long time to answer on an endpoint. Don't forget that in Node.js, in the list of best practices, the first one is always "don't block the event loop". Node.js is single-threaded, meaning that if you've got one single synchronous operation that takes a lot of time, your server will be blocked doing only that, and it could jeopardize the whole application. When this happens, what we want in order to debug it is a CPU profile.

What is a CPU profile? It's a file that tells you which functions are running at a given point in time. During the collection time, the profiler will sample, every n milliseconds, the name of the currently running function and where it's running. This will highlight functions that are slow, but the fastest functions won't show up, because it's a ticking system: if a function runs quicker than the tick, it will be missed. It follows the JS execution, meaning that asynchronous operations will actually have an impact on it. Once again, I think it's better to go directly to a demo.

So let's go back to the application. I will run an HTTP request on a certain endpoint, and I use the time command to know how long it takes. It looks pretty slow. It should be done now. Please don't give me the demo effect by running forever. Come on. It should be done now. Okay, so that's a total demo effect; something had to go wrong. It ran for 22 seconds, so let's reduce the size of the string. Okay, let me do something quickly. Okay, better. Now it should run in approximately seven seconds. Yes, seven seconds. Okay, so I just restarted the instance. Let's connect back to the debugger, and this time I will use the Profiler. The profiler is not a
one-off situation. The profiler, you start it and you stop it, because you record what's happening over a time frame. So we will just start it here: we start CPU profiling. We will trigger a request, because we need something to happen while we are profiling; you need some activity, otherwise you won't see anything. We wait, we wait, and now we stop the profiling, and it gives us this weird thing that we call the flame graph. I will just zoom in on the very first part of the flame graph.

A flame graph is actually a stack trace with time. It tells you here that the function onconnection has called the function Socket, which has called the Duplex constructor, which has called the Writable constructor, which in turn has called EventEmitter. Then you see there is an asynchronous operation, so the stack trace is broken, and we need to go there. It has called emit, which has called all these functions. We have another asynchronous operation, and we get to that. And we've got the time: here I can see that it has been running from 3.5K milliseconds up to 12.5K milliseconds, and I know that the functions here have been running for a very long time. What's important is the function at the bottom or at the top of the flame graph; in our case it's at the bottom, because it's a top-down flame graph. So once again, this function has been calling this function, which has been calling this function, and so on, until this function has been calling this function. Notice that this function is named module.exports, and that's probably not a great name if we want to debug it; it would probably have been better if I had given it a real function name. But at least this function has been calling a function named validateInput. So let's click on it, and here Chrome tells you: "Hey, this function has been running for 10 seconds and 90 milliseconds, it's probably really slow." And it's actually this line that has been really slow. If I, or anyone who is used to Node.js security, look at
this code, what we see is that this regex is actually what we call a backtracking regex, meaning it contains a ReDoS. ReDoS is not the goal of this talk, so feel free to look it up. What you need to know is that it's a regex that can run for a very, very long time. Since it runs for a very long time and JavaScript is synchronous, the server is blocked running this exact regex for a very long time and cannot do anything else. To fix this problem you either have to update the regex or time-bomb it. I recommend checking the module named vm, in the core of Node, which allows you to time-bomb a synchronous piece of code. That's really powerful but highly underused. So thanks to the CPU profile we were able to detect where the leak of CPU happened. And here, looking very small because the other function has been so slow, we have the methods that were called for handling the HTTP request.

Okay, let's go back to the slides, because the talk is almost finished. So let's wrap up. I hope you all love the Chrome DevTools now, because they're probably your best friend when something goes wrong. In the wrap-up I usually give a few pieces of advice.

The first advice I can give you is monitoring, monitoring, monitoring. If you remember, both nightmare scenarios I gave in this presentation started with monitoring telling us something weird. Monitoring told us the application was crashing; monitoring told us the endpoints were slow. Monitoring, monitoring, monitoring: you have to monitor performance, you have to monitor security, you have to monitor infrastructure, because otherwise you don't even know the first thing about the health and the security profile of your application.

Next: use classes. Don't create objects out of nowhere. Quick parenthesis: all these tips work not only for Node.js but for any JavaScript application, because the debugging methods I showed you today are also
available for front-end applications in the browser. You can't connect remotely, but you can always open a web page and open the Chrome DevTools for this page. So use classes. I know it's unpopular amongst React people, but use classes; it will save your life in times of memory leaks.

Name functions. Naming functions is the best way to find them easily in the CPU profile. Here I created dummy code examples and it was actually pretty easy to find the problems, but in the real world your code is much more complex than my dummy applications that don't do anything. So name things, give them class names, and you will be saved.

Don't panic, and trust the engine. V8 is one of the best pieces of engineering ever made, and you can trust it to save you, and the tools around it to help you. You will write code, you will make mistakes. This will happen to you in production, maybe not today, maybe not tomorrow, maybe not in Node.js, maybe in Python, maybe in Java, but these principles are the same for all languages and all platforms. This will happen. For instance, you can read a great blog post from the Discord team that explains why they switched from Go to Rust: it had a different memory management model, and they had to understand the memory management model of Go to understand why the application was slow. Also, if you're interested in the performance of Node.js, you can check the blog posts by Netflix; they've got one of the most amazing teams in the world regarding these topics.

And that's the last quote of this talk: debugging is like being a detective solving a crime, but you are also the criminal. Or if you are not the criminal, it's one of your coworkers. So when you write code, don't forget to write it as if the person who will have to fix the bug is a maniac who knows where you live. And when you write code, think about what happens when things go wrong, because that's where you will lose time, you will lose money, and you will lose customers. Thanks so much for
your attention. I assume you have been an amazing crowd; it's really hard to say, but I'm sure you have been. So let's keep in touch: you can email me or contact me on Twitter, I'm very present on Twitter. It's really hard to give swag at a remote conference, but I tried. There is a Node.js security checklist: it's a free checklist of the things you must check for security if you have an application using Node.js. Check it out, it's on the Sqreen blog. The slides are directly online, so feel free to grab them. And here is the link for the Q&A.

I think it's in the chat, I sent it to the chat, so I suppose people have already seen it. I loved your spontaneous jokes, really. Thank you so much, because I understand how hard it is to speak to basically no one, you know, and you tried to liven it up, and it was amazing. So thank you for such an amazing speech, you did a great job. I suppose you may go to the Q&A room and wait till people go there and ask you some questions.

Thanks so much for having me today.

Thanks so much for being an amazing person. Have a great day, everyone.
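The leak pattern from the memory demo (a store that is written to on every request and never cleaned up), together with the "use classes" tip, can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the speaker's demo code: the names `Item`, `store`, `getObject`, and `releaseObject` are hypothetical, the request objects are plain stand-ins, and deleting the entry after use is only one possible fix.

```javascript
// Sketch of the leak pattern from the memory demo (all names are hypothetical).

// A named class: instances are grouped under "Item" in a heap snapshot,
// instead of being mixed into the generic "Object" constructor bucket.
class Item {
  constructor(req, createdAt) {
    this.req = req;
    this.createdAt = createdAt;
  }
}

const store = new Map();
let nextId = 0;

// Leaky version: every call adds an entry and nothing ever removes it,
// so each request object stays reachable and cannot be garbage collected.
function getObject(req) {
  const uid = String(nextId++);
  store.set(uid, new Item(req, Date.now()));
  return uid;
}

// One possible fix: release the entry once the request has been handled.
function releaseObject(uid) {
  return store.delete(uid);
}

// Simulate 1,000 requests without cleanup...
for (let i = 0; i < 1000; i++) {
  getObject({ url: `/leak/${i}` });
}
const leakedEntries = store.size;

// ...then 1,000 requests where each entry is released after use.
store.clear();
for (let i = 0; i < 1000; i++) {
  const uid = getObject({ url: `/fixed/${i}` });
  releaseObject(uid);
}
const entriesAfterFix = store.size;

console.log({ leakedEntries, entriesAfterFix });
```

In a heap snapshot comparison, the first loop would show up as a growing count of `Item` instances retained by the store, which is the kind of signal the talk describes looking for.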
Info
Channel: Geekle Official
Views: 2,731
Rating: 5 out of 5
Keywords: node.js, node, debug tools, debugging, CPU bottlenecks, memory leaks, geekle, online, conference, sqreen
Id: F_qshjijxlE
Length: 42min 59sec (2579 seconds)
Published: Tue Apr 07 2020