Node.js Stream Tutorial - The Power and Simplicity of Node.js Streams

Video Statistics and Information

Captions
Thank you so much for inviting me to speak at FSA Conf 2016. I'm really just thrilled to be here, and I'm here to talk to you about how to use Node.js streams. Today we're going to talk about streams: what are Node.js streams? I seem to work in Node just fine without streams, so why would I ever need them? And, most importantly, what happens if you cross them? That's the only question I'm not going to be answering today. I really have no idea. You probably shouldn't try it.

But seriously, folks, why do streams matter? Streams are basically a pattern that converts big operations into manageable chunks. They're really a core feature of Node, they're the only way we have to manage async I/O operations efficiently, and they're totally necessary for running a server at scale. Fortunately, they're also really fun. This is a ten-minute talk, so I'm definitely not going to have time for a deep dive into streams. Instead, my goal today is just a high-level survey of what streams are and how they can be useful, and at the end I'm going to give you a list of resources and tutorials that, if you're interested in working with streams a little more, will teach you how to use them in your own workflow.

I want to start with a pretty simple code example. This is just a basic vanilla Node server. All it does is use the async fs.readFile to load data.txt from disk and then serve it as a response using res.end(data). It may surprise you to learn that you're actually already using streaming when you run this code, and that's because res.end is an HTTP method which uses streaming to communicate over the Internet, because that's just the best way to communicate, right? Things happen in small chunks rather than all at once. Nevertheless, there's still a lot of room for improvement here, because what you're doing is loading data.txt in its entirety into memory before you begin streaming it to your users. If you have lots of users with slow connections, you're going to be loading
data.txt into your server's memory thousands of times simultaneously, and to boot, your users are not going to see anything for a couple of seconds while you load data.txt into working memory before you start streaming it down to them. Pretty lame, right? Fortunately, there is a better way, and it's a pretty small difference: instead, we use fs.createReadStream. This is much better, because now we're streaming not only from us to the user, but also from the disk up to our working memory. Our users immediately start receiving little pieces of data as we read them from disk, rather than waiting for the whole thing to be loaded up into RAM.

I'd like to pause here for a second and talk about what I think is one of the primary confusions about streams; I know it was for me. "Well, it's an async operation, so who cares, right? Node is concurrent; it's going to take care of all of this, and I don't have to worry about it." The fact is that even if we as programmers, and I'm definitely guilty of this myself, think of an async operation as just happening sometime in the indefinite future, it still has to get done. It's still a job that your server has to take on. Streams ensure that when that async moment arrives, our async operations use memory and bandwidth as efficiently as possible. Or, in the words of James Halliday, who's kind of one of the big heroes of Node: streams make programming in Node simple, elegant, and composable. I think that sounds pretty cool, so let's talk a little bit more about how to actually use them.

This is a pretty simple line of code, but we're going to be using it a lot, so take a good hard look: source.pipe(destination). These are both streams. We're taking a source stream, which is something readable, like the data.txt that we were reading a little piece at a time, and the destination is a writable stream, such as an HTTP response. You can think of these streams as a succession of garden hoses that you're kind of
hooking together as needed to perform arbitrarily complex tasks, hence the word "pipe": we're piping from one stream to another. This is a lot like the pipe character in the Unix operating system, which we looked at during Foundations. That's a lot of the beauty of streams: you can have small, simple streams that each do one job well, and then hook them together to perform really complicated, powerful tasks.

Streams come in three delicious flavors. There are readable streams, writable streams, and duplex streams, which are both readable and writable, plus a special variety of duplex stream called a transform stream. Let me talk about each of those really briefly. Readable streams can act as the source, but not the destination, in source.pipe(destination). Remember that we're mostly using streams for refining our async operations, and usually that means I/O, so this readable source is our input. It might be on disk, it could potentially be in Norway, anywhere, but it's something that we're going to read and do something with. A great example of this is fs.createReadStream. Writable streams are the opposite: they act as the destination but not the source; fs.createWriteStream is an example of that. Duplex streams are both readable and writable. You can think of these as being kind of like a telephone, because they can receive data and also send data; they can even enter into conversations with each other on different servers, if you have them set up that way. A transform stream is a special variety of duplex stream which is most useful when you set it up in between two pipes, from a readable stream to a writable stream. This is where piping really comes into its own, because you can use a transform stream to perform any kind of operation. You might be doing something simple, like converting your source stream to all caps, or something really complicated, like compression or video encoding, but transform streams can
handle all of that.

Now I want to talk a little bit about why streams are so useful for performance and scalability. Writable streams, at least the way they're currently implemented in Node, have to actively send a signal back to the readable stream that's piping to them when they're ready for more data. This might sound a little confusing, but it's actually their most valuable feature, because it means that if the writable stream isn't ready to receive more data, then the readable stream isn't going to be sending any more data to it, which is exactly how we'd want it to work. Why would you want that data in memory before you're ready to send it down the pipeline? The even cooler thing about this is that it's configurable: you can decide how full that stream's buffer is allowed to get before it says, "Whoa, please don't send me any more data." That could be 40 KB, it could be 40 MB; it's whatever makes the most sense for the amount of working memory you have at your disposal. We call this the high watermark, which I think is a great metaphor, because it's like asking: how full is that stream pipe allowed to get before it says, "Please stop, no more input for right now, until I can deal with some of it, and then you can send me a little bit more"? You can really use this to specify exactly how you want your stream to work.

As an example, I want to talk about a project that our very own Joe Alves did at his last job, which was uploading large video files on a regular basis, encoding them, and then hosting them. These were huge video files, in the gigabytes range, so how were we going to do this efficiently? Well, the answer is streams. The other thing that makes this kind of challenging is that all of these things happen at very different and kind of unpredictable speeds. Uploading from the client to the server in Norway happens very slowly, because we have HTTP running
internationally, right? Hosting, which means going from our server up to AWS, is also slow and unpredictable, because it's also happening over HTTP. In the middle we have to do the video encoding, which is actually pretty fast; it's not a very time-consuming transformation. But we want to make sure all of this is paced well, so we don't suddenly end up with, like, a hundred gigs of material that we have to hold on our server. That would stink, right? The way we do that is by using high watermarks to specify exactly when we're ready for more data. Since the hosting connection is slow, we can set a fairly low high watermark on it; as soon as it's full, we stop video encoding, and as soon as the video encoder fills up with queued data that it isn't ready to send, it sends a signal back down to the client: "Please stop uploading for right now; we're a little bit backed up over here in Norway with our AWS upload." What this means for the users is a really fantastic user experience, because your upload bar is actually a good representation of when your video is going to be hosted. Your bandwidth is throttled downwards, so you're only uploading a little piece at a time, and as soon as the upload finishes, the encoding is already done, which means the hosting must already be done too, because these high watermarks were all hooked up to each other. Within seconds, your video is hosted. Also, nobody's eating up your connection to upload a video really quickly and then sitting for hours waiting for it to be encoded. It's pretty awesome, I think.

Here are a couple of further resources that you can use. In particular, I think the Stream Handbook, hosted on GitHub, is really valuable; it's kind of the Bible of streams, and it was definitely my main source for this talk. I highly recommend it. So go forth, stream early and often, but most importantly: never, ever cross them. Thank you guys so much.
Info
Channel: Fullstack Academy
Views: 43,480
Keywords: node.js streams, how to use streams, intro to node.js streams, Node.js Stream Tutorial
Id: GpGTYp_G9VE
Length: 10min 3sec (603 seconds)
Published: Wed Mar 09 2016