Deep dive on how static files are served with HTTP (kernel, sockets, file system, memory, zero copy)

Captions
Today I'd like to do a deep dive on how web servers and web frameworks serve static files: how do Node, Bun, Nginx, and your own hand-rolled Go server actually serve static files? It's a very mundane, simple thing. We never talk about it, we take it for granted, it works, and we never think about it. But how does it actually work under the hood? When you understand that, you can question the things we take for granted, and as a result maybe you can optimize even further. Yes, even a process that we think is simple can be further optimized. So let's zoom in, go into the depth of this, and learn how static files are served.

To set expectations for what we're going to discuss: I'm going to cover the low-level concepts of a TCP connection, that is, the receive buffer and the send buffer. These are kernel data structures kept on the connection for that file descriptor. I'm also going to discuss how a read from disk happens, talk a little bit about the file system cache, and cover how the data is written back to the client, the data being the static file: how we read it, how we manipulate it, and then how we write it back. Let's jump into it.

All right. I love to use Canva, and I just learned of this feature called whiteboard, so I'm going to be using it. I apologize for the white background; I know some people prefer a darker one. What we're interested in here is serving static files. For the overview, we have a back end here and we have a client, so we draw a laptop. The static file that we'll be serving is located, for simplicity, on the back end's local storage disk. We're going to make a request, let's do it in purple: I'll be sending an HTTP request and I'm going to get back a response. So this is GET /, let's say I'm
retrieving index.html. This is usually the very first page that we get back. And of course, since this is HTTP, we're going to get a bunch of headers too: headers, plus the body, which will have the content of index.html. The headers are things like the content length and the content encoding (is it compressed or not), things like that. So this is the overview.

But what really happens? First we establish a TCP connection. I'm assuming here that we're using either HTTP/1.1 or HTTP/2. That connection is established using the TCP SYN, SYN-ACK, ACK handshake. Optionally, and I'm not going to go into it here, we can also establish a TLS session, which makes our communication encrypted, but maybe I'll cover that later. Then we send the request inside this TCP connection.

The reason I'm saying that is because when we establish a TCP connection, something happens on the back end, in the kernel. I'm going to draw it in green so we know it is all related. The kernel creates a connection for us, dedicated to that client. Assuming we're going to port 80 here, we get two queues: the receive queue and the send queue.

The receive queue is where data lands when the client sends you the request. Technically there is nothing called a request when it comes to TCP. It doesn't know what a request is and it doesn't know what a response is. To TCP it is just a stream of packets going one way or the other, that's all. So if you look underneath, that request is broken into a bunch of packets, and the group of those packets represents the request. These will end up in the receive queue. So
let's say I have three of those, and they end up here. Now it gets interesting, because where does my back end come into the equation? As far as we know, this is actually the kernel. So let's make this drawing a little better: this will be the kernel, the OS, and this will be the back end process that is running. We call it the user process to differentiate it from the kernel, but technically this is your Node application, your Go program, et cetera.

When a request is sent, there is a part of Node.js or Bun or Go or your C program that constantly asks the kernel, "Hey, is there anything for me?" This communication between the kernel and user space is called a syscall. You're making a system call to the kernel to get something, because that's kernel memory and this is user space memory. So the process says "go and read", or receive; we actually call recv, and this brings the data all the way over here. So that purple thing ends up in the process's memory. That operation, this blue line, is a physical memory copy, because this memory is different from that memory: we copy from the receive buffer all the way there.

But what is this? It's just a bunch of bytes. If we're using unencrypted HTTP, then it's almost ready to be consumed, but not quite. The request has the HTTP headers; if it's a GET request it usually doesn't have a body. So we take it as it is and the process starts parsing it. Once we have parsed the request, we understand it. Only then, after the copying, after the parsing, after understanding what the headers are, do we get the request object, because that request object, for Node, doesn't exist before that. Node actually
builds this object for you based on the data it read from the kernel, and then it fires off your on-request function. That gives you the nice event that says, "Oh, someone just made a request," and you have code that says, "All right, let's read the requested file from disk and then send it back." So let's go through that part. Let's draw it here, because technically the kernel does the reading.

Now things get interesting. Issuing a read from disk is mostly a blocking operation, at least until io_uring is fully implemented everywhere. The process that issues the read cannot do anything else; whoever sent that read request is blocked. You might say, "No, Node is actually asynchronous," but the read itself is blocking. So if I'm reading from disk, I ask the kernel, "Hey, read from this disk." The kernel puts your process to sleep and issues the read operation to the disk. When the read completes and the kernel gets the data from the disk, it puts it in the file system cache, and then the user process gets a copy of the file in user space memory.

But it goes through the file system. Let's draw this as the file system, the FS. Why? Because we only deal with the disk through the file system. There are a bunch of files, and these files have metadata, last updated and all this stuff, which has to be maintained by the kernel. So if you're reading, it has to go through the kernel and through the file system. So we're putting it here, and then we're copying it right here. I know
my drawing is rough, sorry, but you get the point. So eventually that's a copy. The beauty of this is that if you issue a read and we happen to already have the blocks in the cache, we don't have to go to the disk. Because when you read something, you don't really read index.html. You say, "Read this file, from this position to this position," and the file system converts all of that into a bunch of file system blocks. That means "read LBA 72 and LBA 73", something like that, and it is sent to the disk controller. The disk could be NVMe, SATA, whatever. But the file system is actually talking to a driver here, which is also in the kernel, and that driver talks to the actual drive controller in the SSD. And there's a tiny cache here as well. So there are caches everywhere: a cache here in the drive, another cache in the file system, and of course your process holding that index.html. You've got to remember what we're trying to do here, so let's keep that.

So now index.html, that lump of text, the bytes, is in my process memory. I copied it. So how many copies are there? We moved this from the disk to here, and that's usually done through something called DMA, direct memory access. The CPU is involved in the tiny stuff, like the keyboard: if you hit a key, that issues an interrupt, the data is copied to a CPU register, and from the CPU register it's copied to memory through the memory controller. So there the CPU is always involved. But when copying large data like files, DMA usually takes care of it, and the controller just moves the data directly into memory. This is a complex process, because there's a bus involved here and the CPU will sometimes take over the bus, but it's a complex
operation. Essentially, what we need to understand is that we're copying something into the kernel, and then copying again from the kernel into the process memory. So there are essentially two copies. And we want this in the file system cache, because a future process might ask for the same file, and then we don't need to go to the disk. Likewise, if you're writing to the file, you write to the file system cache first and it's flushed periodically to the disk. That's basically how it works.

I'm going to keep this page and create a brand new page. Now, what are we going to do? We read the file. You might say, "Hussein, I never read a file. I just said send file." You used Express.js and said, "Here's index.html, send it. I don't care how you do it." But to send it you need to read it. Files don't fly directly to the network. So it has to be read from disk, which we did. So let's keep this lump here as well. The purple thing is the request, and this blue thing is index.html.

Now the process. Of course, if you're using JavaScript this is way more complicated, because your code is interpreted in this case. There is another piece of memory holding the bytecode that your JavaScript is converted to, which the engine executes, but I'm skipping all of that. And for optimization there's also the JIT, which does just-in-time compilation and produces compiled machine code that lives in the heap. Because if this is Node or Go, there is a text area in the process that has the code, and it's read-only; nobody can touch it. But new JIT code cannot go in the
same text area, because that holds the code of Node itself. The compiled version of your JavaScript goes somewhere else, into the heap of that process, because the process has the stack, the heap, and the text, plus other stuff like static variables.

But now what are we going to do? We're sending the file back, writing the content. Well, you cannot just write the file directly to the user's socket. We need to write the headers first, and to write the headers you need at least the simple ones, like content length and content type. What is this lump that I just read? You have no idea. You could say, "Oh, it's HTML, because the extension told me it's HTML." That's not always a good idea, because someone might have named it .html but it could be something else, like an executable, who knows. So either you as a developer explicitly say what the content type of index.html is, which is faster, or the process has to guess what it is. It will either use the extension, and I don't know if that's always the case, or, even worse, sniff the content to determine what it is. You need to understand all of these simple things, because otherwise you have no idea what this black box is doing. It does all sorts of things, and we don't have answers without looking at the source code.

So you have to write the headers first. I'm going to do the response in orange. First I write the headers. I didn't mention this, but writing and reading from sockets, which is like this
connection, these buffers essentially, can be done asynchronously, meaning non-blocking. You can write and move on to do your own thing. That's just how the kernel solved it, because you're not doing much: it's literally a copy of memory. So now we're copying things again. We add the headers, and then that blue thing, that lump in the process memory, needs to go into the send queue. So we write the headers, and then we write the content itself. This could be a single packet or it could be 10 packets; we don't know, we just write, write, write, and it's the kernel's job to break this big blob into segments and packets.

Now you might say, "If I'm writing this anyway, why not copy it straight from the cache?" That's a good question. There might be a way to say, "Hey, I didn't need this in my process. Why did you copy it all the way into my memory?" At least in this case I didn't need to; sometimes you do need the copy in the process, sometimes you don't. There is a method for that, though I don't know much about it, to be honest. It's called sendfile. It's literally a system call: you give it two file descriptors and it sends from one file descriptor to the other.

By the way, to open a file you have to have a file descriptor, and everything in Linux is a file. A file is a file, a directory is a file, a socket is a file, a connection is a file. And a socket is different from a connection: a socket is the listening thing, and you can have many connections to the same socket, like one server with 100 connections from different clients. All of these are files. So there's something called sendfile that does that, but I'm not
going to discuss it in detail here. It optimizes things: the data just goes there immediately. But how does that work if you're going to add headers? You add your own thing, so it's not as simple as we might think; we have to write the headers ourselves. We could have done this: don't read the file at all, just write the headers. Let's assume I know the length and all this information because I read the metadata. You write the headers, and then you say, "OK, sendfile." This special system call reads the content through the file descriptor, and if it's in the cache, bonus points, boom, it goes directly to the send queue.

The send queue, by the way, is where the connection's outgoing data lives before it leaves to the NIC, the network interface card, and from there goes to the client. So that's what's happening here. But in this case we had to read it here, into memory, and that's what most of the stuff out there does.

I want to talk more about this, but I'm afraid I'll forget: when else do we have to read it into memory? Compression. If we support compression, like gzip, which most clients advertise in the Accept-Encoding header, then sendfile won't work, because you have to read the file here in memory so that you can produce a new copy. Can I change the color of this thing? I suppose not. I'll just color over it. This yellow thing is the compressed version. To compress, you need the CPU: you need the data in memory and you need to execute code that knows compression, and that's all CPU, like building the Huffman tree and producing a brand new output. So you need more memory for compression first, and then you shove the result into the send queue. That's how you do it with compression. So let's do
this: I'm going to keep this. Can I duplicate? Yeah, perfect. So that's with compression, and that's without compression: send content, no compression. We have this memory in user space for the request, we have this for the file, and then we have the compressed copy.

And if you want to make it even more complex: encryption. TLS is almost always enabled on the web today. If you want to encrypt after compression, you have to do another copy to actually encrypt it. So this sendfile, which by the way is zero copy, because you're not copying to user space, you're sending straight to the socket, is something you can rarely pull off. I should say there is a way to do sendfile with SSL in the kernel, where you hand your key to the kernel and say, "All right, I want you to read, but encrypt on the way, in kernel space, and send it." I don't know which kernels or which versions support it, but if you can do that, you get this benefit.

Now let's go all the way back to this puppy. We did all of that, and I think we didn't leave anything out. So now we send the response. How? It's basically these packets. If we support encryption, they will be encrypted as well. So this is the response, and now we're sending it back to the client. It could be one packet, it could be 300 packets, depending on how large your content is. The client receives them and assembles them in its own receive queue, because in the client kernel there is also a receive queue and a send queue. So in this case you get it in your receive queue as a client, that's the front end, and then you read and read and read, the way we did here. We're going to read, which is all
asynchronous, thank God. All these operations are asynchronous. There's something called epoll, if you've ever used it, along with poll and select. They all ask, "Is there anything in this socket that I can read?" and the kernel says, "Yes, there is something," so you go ahead and read. Because if you attempt to read from the connection, that is, from the receive queue, and it's empty, you're going to be blocked. I have to mention that. So what you do is call epoll and ask, "Are any of these file descriptors ready?" Say you have multiple connections; ready means they have content. And the kernel may tell you, "No, actually, we don't have anything, so try again." It's actually very chatty: "Is there anything? Is there anything? Is there anything? Is there anything? Oh, yes."

io_uring is basically the holy grail that solves all of this. I might make another video on io_uring. You submit requests to the ring, and there's shared memory between user space and kernel space that both can read and write simultaneously. Of course there's protection, synchronization and stuff like that, but that's how it works.

It only took me I don't know how long to explain all of that. I know people complain, "This could have been done in 3 minutes and you talked for 30 minutes." I know, I'm sorry, I'm slow. I don't know what to do; I like to go through the details, and I enjoy this stuff. I hope you did too. See you in the next one, goodbye.

And yeah, I've got a plug. If you like this stuff and you love what you see on this channel, go to backend.win. I made that domain to point to my Udemy course, Fundamentals of Backend Engineering, where I explain this stuff and more. So if you like this stuff, head
there and enjoy the course. See you in the next one, goodbye.
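
To make the copy path from the talk concrete (request copied out of the receive buffer, file copied out of the file system cache, response copied into the send buffer), here is a minimal Python sketch. It assumes a local socket pair stands in for the TCP connection; `build_headers`, `serve_with_copy`, and `demo_copy` are hypothetical names for illustration and do not come from any framework.

```python
import os
import socket
import tempfile


def build_headers(length, content_type="text/html"):
    """Build a minimal HTTP/1.1 response header block (illustrative only)."""
    return (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {length}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def serve_with_copy(conn, path):
    """Hypothetical handler: pull request and file into user space, then send."""
    # recv() copies the request bytes from the kernel receive buffer
    # into this process's memory.
    request = conn.recv(65536)
    # read() copies the file bytes from the file system cache
    # into this process's memory.
    with open(path, "rb") as f:
        body = f.read()
    # sendall() copies headers and body from user space into the
    # kernel send buffer of this connection.
    conn.sendall(build_headers(len(body)) + body)
    return request


def demo_copy():
    """Serve one tiny file over a local socket pair and return the raw response."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.html")
        with open(path, "wb") as f:
            f.write(b"<h1>hello</h1>")
        client, server = socket.socketpair()
        with client, server:
            client.sendall(b"GET /index.html HTTP/1.1\r\nHost: example\r\n\r\n")
            serve_with_copy(server, path)
            server.shutdown(socket.SHUT_WR)  # signal end of response
            chunks = []
            while True:
                data = client.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `demo_copy()` returns the header block followed by the file bytes. A real server would also parse the request line and handle partial reads, which this sketch skips.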
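
The point about the content type (declare it explicitly, or let the server guess from the extension) can be sketched like this. This is only an assumption about how a server might do it, using Python's standard `mimetypes` table; `content_type_for` is a hypothetical helper, and content sniffing is deliberately left out.

```python
import mimetypes


def content_type_for(path, explicit=None):
    """Return the Content-Type to send for a file (illustrative helper)."""
    # Fast path: the developer stated the type, so nothing has to be guessed.
    if explicit is not None:
        return explicit
    # Otherwise guess from the file extension only. The extension can lie,
    # which is the risk mentioned in the talk.
    guessed, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"
```

For example, `content_type_for("index.html")` yields `text/html`, while an unknown extension falls back to `application/octet-stream`.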
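
The sendfile idea from the talk (write the headers yourself, then let the kernel move the file to the socket) might look like the sketch below. It relies on Python's `socket.sendfile`, which uses `os.sendfile` where the platform supports it and otherwise falls back to ordinary sends, so the zero-copy behavior is an assumption about the platform. `serve_with_sendfile` and `demo_sendfile` are hypothetical names.

```python
import os
import socket
import tempfile


def serve_with_sendfile(conn, path):
    """Hypothetical handler: headers from user space, body via sendfile."""
    # Only metadata is read here; the file content is never loaded
    # into this process's memory by our code.
    size = os.stat(path).st_size
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {size}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    conn.sendall(headers)
    with open(path, "rb") as f:
        # Where available, the kernel moves bytes from the file system
        # cache to the socket send buffer without a user space copy.
        return conn.sendfile(f)


def demo_sendfile():
    """Serve one tiny file over a local socket pair; return (bytes_sent, response)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.html")
        with open(path, "wb") as f:
            f.write(b"<h1>hello</h1>")
        client, server = socket.socketpair()
        with client, server:
            sent = serve_with_sendfile(server, path)
            server.shutdown(socket.SHUT_WR)  # signal end of response
            chunks = []
            while True:
                data = client.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return sent, b"".join(chunks)
```

As the talk notes, this only works when the bytes on disk are exactly the bytes to send; compression or user space TLS forces the data back into process memory.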
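
The compression case can be sketched as follows: the body has to be in user space memory so the CPU can produce a new, compressed copy. This is a simplified assumption about Accept-Encoding handling (it ignores `q` values), and `compress_body` is a hypothetical helper.

```python
import gzip


def compress_body(body, accept_encoding):
    """Return (payload, content_encoding) for a response body (illustrative)."""
    # Simplified parsing: split the header on commas and drop any ";q=..." part.
    offered = [item.split(";")[0].strip() for item in accept_encoding.split(",")]
    if "gzip" in offered:
        # A second buffer is allocated here: the compressed copy of the body.
        return gzip.compress(body), "gzip"
    # Client did not offer gzip, so send the bytes unchanged.
    return body, "identity"
```

The compressed payload is what then gets written to the send queue, with `Content-Encoding: gzip` and the new length in the headers.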
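
The "is there anything?" readiness check can be sketched with Python's `selectors` module, which picks epoll on Linux and another mechanism elsewhere. This is a rough illustration rather than how any particular server does it; `ready_to_read` and `demo_readiness` are hypothetical names.

```python
import selectors
import socket


def ready_to_read(sel, timeout=0):
    """Ask the kernel which registered sockets have data waiting."""
    return [key.fileobj for key, _events in sel.select(timeout)]


def demo_readiness():
    """Poll before and after the client writes; return (before, after, data)."""
    sel = selectors.DefaultSelector()
    client, server = socket.socketpair()
    with client, server:
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ)
        # Nothing has been sent yet, so the receive queue is empty.
        before = len(ready_to_read(sel))
        client.sendall(b"GET / HTTP/1.1\r\n\r\n")
        # Now the kernel reports the socket as readable.
        after = len(ready_to_read(sel, timeout=1))
        # Safe to read without blocking, because readiness was reported.
        data = server.recv(4096)
        sel.unregister(server)
    sel.close()
    return before, after, data
```

The first poll reports no ready sockets and the second reports one, which is the chatty ask-then-read pattern described in the talk.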
Info
Channel: Hussein Nasser
Views: 19,980
Keywords: hussein nasser, backend engineering, kernel linux, bun, nodejs, static files
Id: rIcahiIklSk
Length: 27min 23sec (1643 seconds)
Published: Thu Oct 05 2023