Building an upload backend service with vanilla JS (with Progressor, No forms, No libraries)

Captions
What is going on guys, my name is Hussein, and in this video I want to build an upload service from scratch. The goal is to build it with almost no libraries. On the front end I'm going to use built-in browser functionality: the FileReader, which reads a file from disk (one the user has explicitly picked), and once I have the bytes I'm going to stream them to the back end and store the file there. To transmit the content I'll use the Fetch API, which is also a standard, so my HTML references no external third-party libraries at all. On the back end I'm keeping it simple as well and using the vanilla http module, and that's it for a completely functional upload service.

Let's test what we're planning to build. I pick my nginx.png file here, for example, then hit upload, and you can see there is a progress indicator. It's slow because I'm uploading 1,000 bytes per request; obviously you can control that and increase it, it's all configurable at the end of the day. Once the file is completely uploaded, if you go to the back end you can see the file is effectively there. And there is no limit: you can upload literally any file size, because we're chunking it up and sending it in small portions. Obviously this code is not going to be perfect; it can be improved, for example to add resumability so you can resume in case of failure. All of that can be built and added as features once you understand the basic fundamentals of how this actually works. Let's also upload a movie wallpaper; that's a little slower, which is expected because we're sending small chunks, and as you can see it's almost done. Here's what's happening: when I pick a file, I read it using FileReader, I get an ArrayBuffer, I chunk it up (my default chunk size is 1,000 bytes, which I can obviously change), and I send each chunk to the server, where I immediately append it to a file of my choosing.

Okay, let's go ahead and start from scratch and just enjoy building this. Start a brand fresh project, and maybe begin with the back end. I'm going to write index.js, and my back end needs an HTTP library, so const http = require("http"); that's pretty much the only library, and it's built into Node.js. Then let's build a server with http.createServer, call listen, and add a listening event handler that logs something like "listening" (we'll pick the port in a moment). When a request comes in we get two beautiful parameters, the request and the response, and here's what I want to do: if req.url === "/", the root, then I want to immediately read an index.html page and return it. So let's create that index.html, an HTML5 skeleton for this file uploader, with the title "my file uploader". There you go, my friend. Now, how do we send the response? I want to just end the response with the contents of that page, which means, okay, I lied: I'm going to need one more built-in library.
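Here is a minimal sketch of the back end at this stage, under my own assumptions: port 8080 (the port used later in the video) and an index.html sitting next to index.js. The exact code in the video may differ.

    // index.js -- minimal sketch of the back end at this stage
    const http = require("http");
    const fs = require("fs");

    const httpServer = http.createServer((req, res) => {
      if (req.url === "/") {
        // serve the upload page from disk
        res.end(fs.readFileSync("index.html"));
        return;
      }
      // the /upload route comes later
      res.end();
    });

    httpServer.on("listening", () => console.log("listening on 8080"));
    httpServer.listen(8080);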
That library is fs, because I need to read index.html from disk on the back end. So fs.readFileSync("index.html"), end the response with it, and return immediately; that ends the handler for the root. Then httpServer.listen on a port, fix the listening handler so it's actually a console.log, and run it: "listening". Open a new browser tab and there it is, "my file uploader", nothing fancy. The first piece, serving the HTML file, is there.

Now let's build up the actual HTML file. We really need a file picker, and there is an element for that: an input with type="file" and an id of "f", because we're going to read that thing. I'm also going to need some sort of a div element, a div with id "output", that will effectively display the progress, how far along we are. Look at this beautiful front end, guys. We might actually need a button too, with an id like "buttonUpload" and the label "read and upload", because it's going to read the file and then upload it; none of the logic is wired yet. So we need a script: buttonUpload = document.getElementById("buttonUpload") so we can wire the event, the output div, and the file input itself. These are the elements we need, plus an event listener for when someone clicks the upload button.

Inside that handler we declare a FileReader: a new FileReader(), which is available to us because we're in the client; it's what opens a file. But here's the thing: you cannot just willy-nilly read any file in the browser, that would be a big security flaw. You have to ask the user to actually select the file through the input, and then you read that file. So let's get it; I believe it's f.files, if I'm not mistaken, which is effectively an array of the selected files. We could even support uploading multiple files if we wanted to, we'd just do a loop. Let's call the first one "file"; it's an object with its own properties, a name, a size, and so on. Up until here you haven't actually read the file, you only have metadata about it.

So how about I show you what we can see with this file? If I refresh the page we see my beautiful buttons; I put a breakpoint in the handler (I think I can put a breakpoint there), select my nginx file, click, and immediately we hit the breakpoint. Here's what we have: the file selector gives you the name, it gives you the size (which is pretty useful), it gives you the type of the image, and it gives you the last modification date. So without reading the file you already have access to almost all the metadata you'd want. But that's not what we want; we want to upload the thing.
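Here is a sketch of what the page might look like at this point; the element ids and the handler body are my guesses based on the narration, and the reading and uploading logic gets wired in next.

    <!DOCTYPE html>
    <html>
      <head><title>my file uploader</title></head>
      <body>
        <input type="file" id="f" />
        <div id="output"></div>
        <button id="buttonUpload">read and upload</button>
        <script>
          const buttonUpload = document.getElementById("buttonUpload");
          const divOutput = document.getElementById("output");
          const f = document.getElementById("f");

          buttonUpload.addEventListener("click", () => {
            const file = f.files[0]; // the file the user picked
            // metadata is available without reading the file itself
            console.log(file.name, file.size, file.type, file.lastModified);
            // reading and uploading are wired in below
          });
        </script>
      </body>
    </html>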
So we need to actually read it, and here's how, though I do have one complaint about this API. You can readAsArrayBuffer, readAsBinaryString, or readAsText if you know it's text; I'm going to choose to read it as an ArrayBuffer because I'm treating it as a bunch of bytes, I don't care what it is. You call it and specify what to read, which is the file, and that's an asynchronous call: it fires callbacks as it reads, and those callbacks are, I believe, onloadstart, onprogress, onloadend, and onload. They tell you how far along you are. Remember, this is still just reading the file from disk into the browser's memory, so if you have a large file it's worth tracking the progress of the read itself.

My problem is that, unfortunately, the progress event doesn't hand you the actual bytes it read. It would be really nice if it said "by the way, here is what I just read", because then I could immediately send that chunk, and that would be true streaming from disk straight to the network. Instead it only tells you how much it has read. What we do get is onloadend, or just onload, I believe, which says "hey, I'm done, and here's your file", and you get an event in that callback. That's the one thing I don't like; I don't really want to read the whole file into memory only so I can upload it next. Does that make sense? I'd rather stream it from disk to the back end as I read it, especially if it's a huge file. Regardless, that's the limitation we have, unless I'm missing something.

Let's do a console.log in that handler so I can show you exactly what we have. This is all HTML, so all we have to do is refresh the page; I don't need to restart the server. Choose a file, click read, and... did we actually read it? Oh, there you go, it's done. So now we have access to this ev thing, the ProgressEvent; there is something called target, I believe, and in the target there is result: look at this beautiful ArrayBuffer. What is that, "reveal in memory inspector panel"? That's new, I've never seen that before. Anyway, we have access to the full thing and we can play with it.

So the event to trigger the upload on is the finished load: once we finish loading, we can actually upload the thing. But since it's one big load, what do we do? We chunk it up. We cannot just upload this whole file to the back end in one request; you can try, and you're going to fail, because for a large file most routers and proxies in the middle will not successfully deliver that request, and most back ends and proxies have request timeouts: "your request is just too huge." I'm not going to send seven gigabytes in one request. It's not a good idea to upload a huge file in one shot; it often won't even let you. That's why you have to break it up.
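As a sketch, the reading step inside that click handler might look like this; using onloadend and the logging shown here are my assumptions based on the description above.

    // inside the click handler, after grabbing the picked file
    const reader = new FileReader();

    reader.onloadend = (ev) => {
      // ev.target.result is the whole file as an ArrayBuffer
      const buffer = ev.target.result;
      console.log("read successfully", buffer.byteLength, "bytes");
      // chunking and uploading happen here (next step)
    };

    // asynchronous: starts reading the file from disk into memory
    reader.readAsArrayBuffer(file);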
All right, so let's start chunking this up and sending it to the back end. Back to our beautiful HTML page. In the load handler let's log "read successfully", and ev.target.result is the actual data; result.byteLength is the total length in bytes.

Now for some arithmetic. Let's create a constant called chunkSize and make it 1,000 bytes, almost a kilobyte, not quite. Then a beautiful for loop: let chunkId = 0, and we loop while chunkId is less than the number of chunks. So how many chunks do we have? chunkCount is literally the byte length divided by chunkSize. That gives us some number of chunks, but there's almost always a remainder, an extra partial chunk at the end. Say you have 1,001 bytes: divided by 1,000 you get one chunk, but there's also one byte left over. Or say you have 3,001 bytes: you get three chunks, one, two, three, but one byte remains, and we need to send that last piece too, so we're going to account for that. I loop chunkId from zero while chunkId is less than chunkCount, chunkId++, but I go one extra iteration, effectively chunkCount plus one, to account for that last chunk with the remainder.

And here's the actual chunk itself, the content: ev.target.result has a neat function called slice, which slices the ArrayBuffer, which is one huge thing, into whatever range you want. You tell it where to start and where to stop. The important detail is that the second argument is not how many bytes to take, it's the byte position you want to stop at: slice(0, 1000) gives you 1,000 bytes, and slice(0, 2000) gives you 2,000 bytes.
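To make the slice arguments concrete, a quick sketch (assuming a file larger than 2,000 bytes; the end position is exclusive):

    const buffer = ev.target.result;
    // the second argument is the position to stop at, not a byte count
    console.log(buffer.slice(0, 1000).byteLength);    // 1000 (positions 0..999)
    console.log(buffer.slice(1000, 2000).byteLength); // 1000 (positions 1000..1999)
    console.log(buffer.slice(1000, 1001).byteLength); // 1   (just position 1000)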
So if you pass 1,000 and 2,000, it means start at position 1,000, go to position 2,000, and give me everything in between. If you do the math, the start is really just chunkId times chunkSize, which is 1,000 in this case, and the end is exactly the same position plus chunkSize. That's the math, and I think I got it right; let's walk through the loop. If chunkId is 0, the start is 0 and the end is 0 plus 1,000, so 1,000. Awesome. If chunkId is 1, the start is 1,000 and the end is 1,000 plus 1,000, so 2,000. So we read the next thousand, and the next thousand, and so on.

Now we have beautiful chunks that we just need to send to the back end. How do we send stuff to the back end? First of all, I need to make this callback an async function, and I'll tell you why: because I'm going to use await here. We await a fetch to localhost:8080/upload. Obviously there's no /upload route yet; to the browser it's just a path, so we'll need an if statement on the back end to capture it. So what are we sending? The first thing to specify is the method, which is POST, because hey, we're sending stuff. The next thing is the headers, and the final thing is the body, and I guess the body is the easiest part; can you guess what it is? The body is the chunk, baby.

For the headers, we need to tell the back end, and not just the back end but any proxies in the middle, "hey, by the way, I'm sending raw bytes." That's the Content-Type, which is application/octet-stream (I always misspell that one), and the Content-Length, which is the chunk's length: "this is how much I'm sending you." Beautiful.

So we're looping and sending a bunch of requests, but they're kind of stateless; they don't know about each other. We want to tag each request with some unique identifier, and that's the file name. I want to generate a unique file name, and the easiest way is really just to start from the file's own name: the file is up there, and I believe the property is called name, which is the whole name with the extension. But what if you uploaded the same file again? We want something completely unique, so I'm just going to add Math.random() times 1,000 to create some randomness. Are there better ways to do this? There are always better ways to do anything, to be honest, but I think something simple works here. So now we have a unique upload identifier, think of it that way, and I'm going to send it as an extra custom header that I'm making up, literally called file name. And there you go, the file name goes along with every chunk.
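Putting the chunking loop and the fetch call together, here's a sketch of the load handler. Math.ceil stands in for the "one extra iteration" described above, the custom header is written as file-name, and the random number is prepended to the name; those details, and the variable names, are my assumptions rather than the video's exact code.

    reader.onloadend = async (ev) => {
      const buffer = ev.target.result;                 // the whole file in memory
      const chunkSize = 1000;                          // bytes per request
      const chunkCount = Math.ceil(buffer.byteLength / chunkSize);

      // a "unique enough" name so re-uploading the same file doesn't collide
      const fileName = Math.floor(Math.random() * 1000) + file.name;

      for (let chunkId = 0; chunkId < chunkCount; chunkId++) {
        const start = chunkId * chunkSize;
        const chunk = buffer.slice(start, start + chunkSize); // end is exclusive

        // wait for each chunk to be acknowledged before sending the next one
        await fetch("http://localhost:8080/upload", {
          method: "POST",
          headers: {
            "content-type": "application/octet-stream",
            "content-length": String(chunk.byteLength), // browsers typically set this themselves
            "file-name": fileName,                      // custom header tagging the upload
          },
          body: chunk,
        });
      }
    };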
So that's the front end: we loop and send one request per chunk, and we await each one because we want to stay in the loop waiting; we don't want to move to the next iteration and send the next request before we get some sort of response from the server. If I remove the await, this becomes fully asynchronous: the entire loop, however many chunks there are, would send a flood of requests to the back end. We'd have no idea whether order can be maintained, it would be a mess, and you'd run out of resources on the client. So this way we're almost serialized: we don't send the next chunk until we get a response that the previous chunk was received. I'm not handling any error cases or anything like that. Can this be optimized? Of course; you could send, say, seven chunks at a time, but you need some finesse to do that and it's out of the scope of this video.

All right, back end, what do you have for me? The root path is for requesting the page, but what if the request is for /upload? If I'm uploading, I need some data. First of all, I need the file name; can I get that from you? Very simple: req.headers, and there's our beautiful custom header, "hey, that's the file name." Then there's a nice function in fs that appends content to a file synchronously, and if the file doesn't exist it creates it for us.

Now here's the trick: we need to read the body of the request, because it's a POST request. So how do we do that? If you look at it, the request is an IncomingMessage, and if you go to the help (the help is always your friend), it tells you that this is actually just another stream. And if it's a stream, you can essentially just read it: there's an event called "data", and that data is the body itself. So we receive the chunk, maybe first just write a console message that says "received chunk", and then write that chunk to the file: fs.appendFileSync to a path (I don't care, same directory here, using the file name), writing the bytes of the chunk. Boom, receive chunk, slam it into the file, and literally once you're done with this, respond "uploaded". The "data" event keeps firing depending on how big the body is; one request can carry many, many pieces, and the bigger it is the more likely it is to time out, so keeping each request's body small helps in this case.
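Here's a sketch of the /upload branch inside the request handler, under the same assumptions; Node lower-cases incoming header names, so the custom header comes back as req.headers["file-name"], and responding on the "end" event is my choice here rather than something shown in the video.

    if (req.url === "/upload") {
      const fileName = req.headers["file-name"]; // the custom header from the front end

      // the request body arrives as a stream; each 'data' event is a piece of it
      req.on("data", (chunk) => {
        console.log("received chunk of", chunk.length, "bytes");
        // append to the file; appendFileSync creates it if it doesn't exist yet
        fs.appendFileSync(fileName, chunk);
      });

      req.on("end", () => res.end("uploaded"));
      return;
    }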
The 1,000 bytes I'm sending is too low, by the way. Now we have to restart the back end, and I have no idea if this is going to work on the first try, so let's refresh and select my nginx image. Read it: how many chunks do we have? 330.954. We read successfully, so let's continue. We loop, get the chunk, a beautiful 1,000-byte ArrayBuffer, and then we call the back end, POST to /upload. We sent something; did we get a response? Looks like we did, nice. Now let's just let it run, go to the back end, and there's the file: the beautiful nginx image.

What we forgot to do is show some progress on the front end, so how about we do that? We're already looping, and the progress should be really easy if you think about it: we have the output div, and once a chunk is actually sent, we set its textContent to chunkId divided by chunkCount times 100, which is the percentage. Let's see if that's right. We don't even need to restart the back end. Boom, let's pick something a little bit larger; I want something large. Boom again, and you can see there's a percentage, but we didn't round it; I don't really care about all those decimals, so Math.round it and do it again with a large file. Boom, upload, and we have a beautiful progress readout. Nice. Let's try my podcast logo; it's huge, like seven megabytes, sheesh. If I upload it, it's so slow, but the beauty is that Visual Studio Code actually draws the image for you as the bytes arrive. Does this give anyone flashbacks to the 1990s, back when we browsed pages like this? That's some upload service right there.

This is ridiculous, though, so let's increase the chunk size, guys; 1,000 is too little, how about 5,000? Five kilobytes; you don't need more than five kilobytes. (Otherwise we're going to be here forever.) Refresh, take the podcast logo again; even five kilobytes is too low, but sure: eight megabytes, done, and we received 1,650 chunks. You get the idea, guys. Can you upload zip files? Sure: upload this zip, the source code of my nginx course. Where is it? There you go, reveal in Finder, extract, and look at that, the nginx source code.

So yeah, that's what I wanted to show you, guys. It works, but there are a lot of flaws; it's far, far from perfect. I'm going to push the code, so feel free to edit it. How many lines of code do we have? Let's take a look: in the HTML we have 58 lines of code to do an upload, and on the back end somewhere around 22 to 26 lines. Obviously you'd want to add resumability, which you can, because with this model you're just appending: if there is one failure and the next chunks keep coming in, the file will be corrupt. You'd have to retry that same chunk, which means you need idempotency (what is that word? I can never pronounce it), so each chunk should effectively carry a unique identifier.
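That part isn't built in the video, but as a rough sketch, one way to tag each chunk would be an extra, hypothetical x-chunk-id header inside the upload loop, so the server has enough information to recognize a retried chunk and keep the append idempotent:

    // inside the upload loop -- hypothetical extra header, not part of the video's code
    await fetch("http://localhost:8080/upload", {
      method: "POST",
      headers: {
        "content-type": "application/octet-stream",
        "file-name": fileName,
        "x-chunk-id": String(chunkId), // lets the server recognize a retried chunk
      },
      body: chunk,
    });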
What else? Another thing I did that isn't great: I sent the file name as a custom header. Usually you'd send something like that in the body or as a URL parameter, because headers are handled on a hop-by-hop basis: nginx, HAProxy, or any proxy in the middle, if you're proxying, might drop a header it doesn't recognize ("what is this? I don't know it") unless you configure it to pass those headers along. I believe experimental headers are supposed to start with x- or something like that. It works here because it's just us talking directly to the back end, with nothing in between.

Another thing I want to discuss: can this be used to build a scalable upload service? What do I mean by that? Today this is a stateful service, because each server writes the chunks it receives to its own disk, so if my next request goes to another server, the next chunk gets written on that other server, and that obviously breaks the whole thing. One solution, and what I would do, is to have a database behind the back end and write the chunks to a blob table, each uniquely identified by a chunk id. When the client says "okay, I'm done", one of the servers reads all the chunks from the database and writes them to a persistent location like S3. In that case it's a completely scalable upload service: each chunk the client sends can go to a completely different server, and that's okay, because no server writes to its own disk, which makes them stateless; they all write to the database, which is a separate, third thing. Yes, the system as a whole is stateful because we have a database, but the back-end application is stateless: it doesn't care about anything, you can destroy it and restart it and it doesn't matter, because the client will retry and hit another server, and that server writes to the database. So there's a lot of cool stuff we can do to make a scalable upload service.

And on the front end, again guys, we can do more things, because this version is not exactly high performance: I'm awaiting every single chunk. Let me actually show you what happens if I don't. If I remove the await and just loop through and send everything, this thing is going to break. Let's do it. Whoa, look at that: "insufficient resources", because that's just insane; you're doing a loop and sending a flood of requests almost in parallel (not quite, but almost), firing the next request even though you haven't received a response to the previous one. And look at the connections: we're using HTTP/1.1, not HTTP/2, so the browser opens six TCP connections and reverse-multiplexes the requests across them, pulling them back together on the back end, and God knows whether they arrive in order; they will not. You could get a correct file by chance, but this needs a lot of work. You could take advantage of those parallel connections, though: send four or five chunks at a time, play with that a little, and make sure they still end up applied in order on the back end.
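The video doesn't implement that either; a rough sketch of limited parallelism, reusing the names from the earlier sketches and sending four chunks at a time with Promise.all, could look like this. It still does not guarantee that the server applies the chunks in order, which is exactly the caveat above.

    // rough sketch: keep a few chunks in flight at once instead of one at a time
    const parallel = 4;

    for (let chunkId = 0; chunkId < chunkCount; chunkId += parallel) {
      const batch = [];
      for (let j = chunkId; j < Math.min(chunkId + parallel, chunkCount); j++) {
        const start = j * chunkSize;
        batch.push(fetch("http://localhost:8080/upload", {
          method: "POST",
          headers: { "content-type": "application/octet-stream", "file-name": fileName },
          body: buffer.slice(start, start + chunkSize),
        }));
      }
      await Promise.all(batch); // wait for the whole batch before starting the next one
    }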
Why does order matter so much in my particular case? Because I'm appending to the file immediately as chunks arrive. If it were a database and I had a sequential chunk id, the order wouldn't matter: I'd write each chunk to the database with its sequence number, and if chunk number two arrives before chunk number one, that's fine, because I can order them at the end using the field that represents the order of the chunks. All right guys, that's it for me today. I'll see you in the next one; I hope you enjoyed this video. You know, guys, I just love to break these things down and go back to the basics and fundamentals from time to time. Stay awesome, goodnight, bye.
Info
Channel: Hussein Nasser
Views: 14,457
Rating: 4.98102 out of 5
Keywords: hussein nasser, backend engineering, node js upload, javascript upload, upload file progressor, upload file with progress, upload large files
Id: Ix-c2X7dlks
Length: 37min 31sec (2251 seconds)
Published: Fri Sep 24 2021