All you need to know about "child_process" in Node.js

Video Statistics and Information

Captions
Node.js has this very cool module called child_process. It basically lets you spin up new child processes from your current process for various reasons. For example, maybe you want to create a new process for some heavy computation so that it doesn't affect your existing process, or maybe you want to do something completely isolated. There can be many reasons, and we're going to take a look at exactly how to do that. We're also going to talk about the difference between worker threads and child processes, because I already have a video on worker threads; these two concepts are very similar, and there are some implications for both of them. So if you're ready and curious about learning child processes, let's get started.

Okay friends, before we look at the code of child processes and how to actually spin them up and use them, I would really suggest going to the blackboard to first understand the difference between worker threads and child processes, because I'm pretty sure you're already asking yourself this question.

So, worker threads first. With worker threads we have a process, and the process has the main thread of Node.js. We can also spin up additional threads like this one. This is not a main thread; it is just a newly spun-up thread. And of course they can communicate with each other, so threads can talk to the main thread and exchange information, and you achieve this by using worker threads.

With child processes, what we're doing is spinning up a complete new process. The new process is going to live side by side with the old process. Or rather, let's put it a bit lower, because it is a child process of the main process, so it's not equivalent. It's going to live somewhere here, and all of them have main threads as well.

All right, so what are the use cases then? Whenever you create a new process here, you achieve complete isolation, so this
process is completely isolated from the process that we had before. So this is the process that we had before, and this one has its own memory, its own place in memory. The operating system treats it as a completely separate process. What that means is that whenever this process crashes for some reason, it's not going to affect the process that's already running here. It's completely isolated, which also suggests that whatever logic you put here should be isolated as much as possible from all these three different processes.

Compare that to worker threads: worker threads run within the same process, so they should be related, and if one worker thread crashes it can affect the whole process. So when you need complete isolation or want to utilize external processes, you go with child processes. Choose worker threads for parallel computation within a single process: if you simply want parallelization within the process, you go with worker threads.

Second, resource utilization. Child processes typically have higher resource overhead compared to worker threads, due to process creation and management. As you know, creating a new process is not a piece of cake; it is some work that the operating system has to do. So be careful, and use worker threads, like here, if you simply want to do some parallel computation and don't necessarily need isolation between processes.

All right, and complexity. Child processes are simpler to implement and manage compared to worker threads, and you can probably guess why: worker threads require careful handling of shared memory and synchronization. As we said, these guys require synchronization because they are within the same process and tightly coupled, while these guys don't need much synchronization. There's a way to exchange information between them, as we're going to see later in the video, but still
they're isolated, meaning there shouldn't be much communication overhead, which leads to less management from our side.

All right, now let's go back to the code and see what this child process is all about. There are four ways of creating a child process. The first one is called exec, and you can import it from the child_process module. What exec does is accept a command, which is going to be a string, and of course you can pass a callback as well. The command is going to be this one: ls, which as you know just lists all the files that we have. The callback looks as follows: we have an error, the standard output, and the standard error as well. The error is set when the command itself fails, for example when it cannot be started or exits with a non-zero code; standard output is whatever the command writes to stdout (for a Node script, what we console.log); and standard error is whatever it writes to stderr.

So let's run this exec and see what we're getting in the console. I'm going to say node index.js, and we see that it simply lists all the files that I have in my directory. Now, this didn't spin up a new Node process, because we're using exec simply to run a command in the command line. We're not targeting any Node process or Node script here.

Okay, so this is one way, but a more common way that you would see is actually the second type, called execFile. execFile looks a bit different: we have a path resolver to resolve this file name, which also lives here, the execFile processor (this is what I called it). We're going to use node, give it the path, and of course do all the error handling if needed. But let's look at this file, the execFile processor: it has a simple counter up to 1 million, and it's going to console.log that the processing is finished when it's done. So if I save this and I run
index again, we see that it did the processing and finished. So basically, whenever you want to run an external script, a Node script, you can use execFile. And if I go to the documentation of child_process, it's actually pretty good documentation, so you can come here and see what different options you can pass. You can pass an encoding option, you can give environment variables, a timeout, a buffer size, you can also use a signal to pass an AbortSignal, and so on. So it's pretty good.

All right, we're going to go back here just to have a good overview, and I bet this one is understood: we're simply calling a separate file.

Now there's another one, called spawn. What is the difference between spawn and exec? Because so far they look very similar. With spawn we're going to use find, also a command-line command, nothing big, no rocket science, and we're going to get some data. We're saying: on data, print it out; on error, print the error out; and so on. Let me simply run this and show you the difference. I run this, and we're simply finding all the files that I have in my directory. Very simple.

But there is one big difference. As you can see, this one is event driven: it actually listens for the data event. Whenever you run spawn, you again give it a command, but it returns a child process object that emits events, so you can listen for errors and you can listen for data. It doesn't hand everything to a callback at the end like exec, where you have to define a callback; this one listens for different events, which means spawn is good for dealing with large amounts of data. If you have a lot of data here and you use exec to invoke a script that
deals with a lot of data, exec buffers the whole output in memory and only hands it to your callback at the end, so you're holding all of that data at once and can run into the buffer limit. With spawn, on the other hand, the output is streamed: it's event driven and asynchronous, so it's going to get the data in chunks, one by one. The data handler takes one chunk, then a moment later it takes another chunk, and prints the chunks one by one. Well, here it printed everything in one go because we really don't have that much data, but it's really good for dealing with, as I said already, big data or network requests. And of course you can spawn a web server with spawn and listen for the events that are incoming, maybe file uploads and so on, in a different process, which can also be good. I'm not going to show that because you can easily do it: just put node and a file name in here, and you're going to create your new Express server and listen for the routes just the way you normally would.

All right, the last way of creating a process is fork, and this one is also very interesting. fork looks very similar. We have a fork processor, and what it's going to do is listen for a message, and it can also send a message. So this is the one where the processes can actually communicate with each other. As I said, if you remember, processes can communicate with each other, and this is done when you are using fork. So fork is here. First of all, you can listen for a message. Let's go inside the fork processor: you're going to see that we have this process.send, so this is going to send the counter to the parent process. And you can also send this hello world object into the child; the child is also going to listen for a message and say "message from parent process". All right, now let me save this and execute it. As you can see, this is coming from the parent, and all these counters are coming from the child. How cool is that?

Now let's talk about some of the implications. As we saw, whenever you're dealing with large data, make sure to use spawn, because then you're utilizing an event-driven, asynchronous architecture. Basically, you avoid holding all the data in a buffer the way exec does, which would slow you down.

The second one is related to security. Whenever you're dealing with child processes, and I also mentioned this in my previous Node security best practices video, really don't put user input into your child processes. First of all, don't invoke processes based on user input, don't let the user specify the direct path to your file, and also don't send any data from user input to your child process. Hackers can really exploit this.

All right, I think that's pretty much it. We learned everything that we needed: we know the difference between worker threads and child processes, and we know some of the implications when it comes to performance and security. You're good to go, guys. I will see you in the next one. If you liked the video, of course smash like and subscribe so that you don't miss this kind of video, and good luck on your coding journey.
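
The four creation methods walked through in the captions (exec, execFile, spawn, fork) could be sketched roughly as below. This is a minimal sketch and not the code shown in the video: the helper names (`runExec`, `runExecFile`, `runSpawn`, `runFork`), the inline scripts passed with `-e` and `-p`, and the temporary `fork-processor.js` file are all assumptions made so the example runs on its own. It also uses `process.execPath` (the running Node binary) instead of `ls` and `find`, so that it does not depend on the operating system.

```javascript
// Sketch only: helper names and inline child scripts are illustrative
// assumptions, not the code from the video.
const { exec, execFile, spawn, fork } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const node = process.execPath; // absolute path of the running Node binary

// exec: runs a command string through a shell and buffers all of its output.
function runExec() {
  return new Promise((resolve, reject) => {
    exec(`"${node}" -p "6 * 7"`, (error, stdout, stderr) => {
      if (error) return reject(error); // command failed or exited non-zero
      resolve(stdout.trim());
    });
  });
}

// execFile: runs an executable directly (no shell) with an argument array.
function runExecFile() {
  return new Promise((resolve, reject) => {
    const script = 'console.log("processing finished")';
    execFile(node, ['-e', script], (error, stdout) => {
      if (error) return reject(error);
      resolve(stdout.trim());
    });
  });
}

// spawn: returns a ChildProcess whose stdout is a stream, delivered in chunks.
function runSpawn() {
  return new Promise((resolve, reject) => {
    const script = 'for (let i = 1; i <= 3; i++) console.log("line " + i)';
    const child = spawn(node, ['-e', script]);
    const chunks = [];
    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.on('error', reject); // e.g. the executable could not be started
    child.on('close', (code) => {
      resolve({ code, output: Buffer.concat(chunks).toString() });
    });
  });
}

// fork: starts another Node script with an IPC channel for message passing.
function runFork() {
  // Hypothetical child script, written to a temp dir to keep this self-contained.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
  const childPath = path.join(dir, 'fork-processor.js');
  fs.writeFileSync(childPath, `
    process.on('message', (msg) => {
      process.send({ fromParent: msg, counter: 3 });
    });
  `);
  return new Promise((resolve, reject) => {
    const child = fork(childPath);
    child.on('message', (msg) => {
      resolve(msg);
      child.disconnect(); // close the IPC channel so the child can exit
    });
    child.on('error', reject);
    child.on('exit', () => fs.rmSync(dir, { recursive: true, force: true }));
    child.send({ hello: 'world' });
  });
}

async function main() {
  console.log('exec     ->', await runExec());
  console.log('execFile ->', await runExecFile());
  console.log('spawn    ->', await runSpawn());
  console.log('fork     ->', await runFork());
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});
```

Saved as a single file, this should run with `node` and no extra dependencies. Because exec passes its string to a shell, the security advice from the captions applies most directly there: prefer execFile or spawn with an argument array, and keep user input out of the command.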
Info
Channel: Software Developer Diaries
Views: 4,034
Keywords: software development, software developer, programming, software engineering, javascript, web development, coding, nodejs, nodejs child process, nodejs worker threads, nodejs async, nodejs parallelism, nodejs event driven development, nodejs scaling
Id: C1v4MXGhpcM
Length: 12min 37sec (757 seconds)
Published: Wed Apr 17 2024