File (NAS) vs. Block (SAN) vs. Object Cloud Storage

Captions
When you're developing a mobile or web application, or even just a website, one of the biggest concerns you'll have from both a cost and a performance standpoint is what type of storage you're using. You could use storage for a multitude of reasons. If you're developing an application, you may have users uploading photos or videos, and you'll need a place to actually store those. If you're developing a website, you might want a content delivery network or even just site backups. Attaching external storage is not all that hard, but understanding the different options you have is a little confusing, especially if you're not an IT professional.

I wanted to put this tutorial together for exactly that reason. If you go on Google and type in the different types of storage (file versus block versus object, which are generally the three you'll find), you'll get a lot of explanations that are just so-so. Most of them are pitching that company's storage solutions, so they don't really go above and beyond to explain the differences between them, especially for non-IT folks. If you're a developer like myself, I don't care to understand the true nuances of data storage. I just want to know what my options are and what some of the advantages and disadvantages are. That's what this video is going to be for, and you don't really have to know anything about storage to understand it.

Before we get into the different types of storage, we need to understand why storage is important in the first place. That is because when you're working with a virtual machine, it actually has something called ephemeral, or non-persistent, storage. What do I mean by that? To truly understand it, you have to take a step back and think about what these big companies like Amazon Web Services, Google Cloud, or Microsoft Azure are actually offering you from a cloud computing
standpoint. If you walked into one of Amazon Web Services' facilities (they've got tons; I think they probably have seven or eight main facilities across the globe), you'd find they are basically a bunch of servers running in a single room. You can think of each individual server as kind of like an apartment complex. An apartment complex has several units within it, and when someone moves into a new unit, they can put whatever they want in there: a bed, a couch, a TV, whatever you may want to add to your unit. With these servers it's kind of the same thing. Each server has plenty of space, so these companies rent out sub-units of these servers called virtual machines. When you rent a virtual machine, you pick the size of the storage that you want, and once you log into it you can store whatever files you'd like, run servers, host a website, all those kinds of things. But when you move out of that server, or that virtual machine, just like an apartment, you've got to take all your stuff with you. It's not going to be persistent, so once you leave, it's all gone.

To explain this a little better, I'm going to show you in real life what this looks like. I know this might be a little too much for some people, but I wanted to show you more tangibly what I'm talking about here, since I feel like it's often left out when explaining these concepts. So we'll go over to DigitalOcean, which is the service that I use. It's basically the same thing as Amazon Web Services, but it's geared more towards individual developers and small teams. It's really easy to use and it's priced pretty competitively, so I really like that. I'll leave a link in the description if you want to check it out.

We'll create what they call a droplet. It's basically a virtual machine. Remember, each server is like an apartment complex with multiple units, i.e., multiple virtual machines. So right now, when we click Get Started with
a droplet, what we're doing is picking what kind of virtual machine we want. Let's pick the cheapest, since I'm going to delete this right after this tutorial, so we'll pick the five-dollar-a-month one. All we're asking for is 25 gigabytes of hard disk space, which is just a tiny portion of each of DigitalOcean's servers, so we're basically renting a studio apartment here. We'll go down; I don't need to add any of this yet. We'll run it out of, I don't know, New York, and I'm going to use this SSH key. Don't worry about all these specifics. I'm not really trying to explain how to work a virtual machine, just demonstrating the storage stuff.

Once your virtual machine has initialized, basically what has happened is that I'm now paying for it. I just signed a lease with this server for an individual unit, a little apartment unit of this server, and it's indicated by this IP address. We'll copy that and then go into the terminal to log in. We'll just type ssh -p 22, for port 22, and then give it the IP address that we're using. This is basically just going to log us in. You'll see that we are logged in now, and this terminal is now operating a virtual machine that sits on a server out in New York.

So let's create a file. For those who are not familiar, this is how we create a file: type the touch command, and we'll say newfile.txt. We'll open that file up and add some text. In real life you'd probably add an entire website, a bunch of code here, but I just wanted to add that one file, newfile.txt, to demonstrate what's happening.

We'll stay logged in and go back to DigitalOcean, and now I want to attach some block storage. I know I haven't explained what block storage is yet, but just bear with me; we'll get to it eventually. So I'll add a volume, which is the equivalent of block storage. I only need, we'll say, five gigabytes (again, I'm going to delete this in a second), and what I'm going to do is attach it to this virtual machine that we just made
and I'll call it "my block storage". We want to automatically format and mount it, and we'll use the ext4 filesystem, which is essentially what most Linux distributions use for their file systems. It tells me here that the volume has been attached, so we can go back to our terminal, where I haven't touched anything, and we'll see that if we go to the mount directory, right here we have my block storage. We can go into it and create another file, put some stuff in that file, and now we have stored some data on this volume, or block storage.

So why did I go through this exercise? The reason is to demonstrate what ephemeral storage is. We have two files that we created. One of them is just sitting on the virtual machine, and the other is sitting in that new volume that I just created, so we have them in two different places. Now suppose I had just stored stuff on the virtual machine. If I come back to DigitalOcean, go to our droplets, and delete the droplet entirely, we destroy the virtual machine that we just created. That droplet has been destroyed, and we have no possible way to get back the file that we made called newfile.txt. That is gone forever. But we still have access to the file that we created in our mounted block storage, or volume, as DigitalOcean calls it.

So let's create one more droplet. Again we're going to choose the cheap one, and this time we're going to add some block storage. We'll say Add Volume, then Attach Existing, and you see right here we have the separate block storage that we originally created. Since we created it in New York, it's only going to let us use the New York one. We'll add our SSH keys and call it "new virtual machine". Now we have our new virtual machine, which should have that block storage attached to it. We'll again get the IP address for it, go back to our terminal, and log in. Then we'll go to our mount directory, and you'll see that we've got nothing
there. That's because we have to configure our block storage. If we go back to our volumes and look at this block storage, it's already attached to the new virtual machine that we made, but we need to configure it. We'll go to the config instructions, which tell you exactly how to do it. I'll just copy that, come back to the terminal, and type in all those commands, and now you'll see that we have the block storage. If we go into it, you'll see we have the file that we created on the last virtual machine.

The point I'm trying to make is this: if you leave all of your files on the virtual machine's file system (maybe you have your users uploading files to it), it's going to be fine as long as you never terminate that virtual machine and your data center never crashes. So you're pretty safe, but you're definitely not safe enough for comfort, because it would be a complete nightmare if you lost all of your users' data. That is the reason why we need to attach additional storage to our virtual machine and separate out those concerns.

Hopefully by now you understand why you need all this storage attached to your virtual machine. But what are your options? There are four types of storage that I'm going to be covering. The first two, DAS and NAS (you'll learn what those mean in a second), are not necessarily applicable to web development, in terms of a web app, a website, or maybe even a mobile app, which is mainly the stuff that I cover on this channel. So they're not necessarily relevant, but still good to understand. The second two, block and object storage, are going to be much more relevant, and we'll dig a little deeper into them.

The first one we'll cover is called direct attached storage, or DAS. Most people know what this is already, because you've used it. You've got a laptop in front of you (you're probably watching this video on a laptop, maybe a phone), and there's a little hard drive that
sits within that laptop. This is called direct attached storage because it is directly attached. This is great because it's cheap and it's easy to use. I think I bought my six-terabyte external hard drive for just a hundred or two hundred bucks, something like that, so it's extremely cheap for what you're getting. The cons: it's not shareable. If you wanted to share your data with someone else, you'd have to either upload it from your device to the cloud or walk it over to that person's computer, so that's definitely not very useful in some scenarios. Also, as I said, it's not really used in cloud computing environments, because the people working at services like Amazon Web Services, Azure, or Google Cloud are not going to be walking around with little external hard drives, plugging them into all the different servers that they're running in their warehouses. So this is not really a cloud computing thing, but it's important to understand.

The next one is NAS, which means network attached storage. By many descriptions NAS seems very complex, but it's really not that hard. This picture on the screen right here is a NAS device. There are really only two things, maybe three, that you have to understand. Number one, that box has network connectivity. We'll talk about it in a second, but that box is connected to some local area network. Number two, you've got these hard drives sitting inside the box; on this one there are five of them. And number three, these hard drives can be configured into what we call a RAID configuration, which means a redundant array of independent disks. In other words, you can set up those five hard drives to replicate data in various ways. So maybe you do a RAID 1 configuration. Say there are only four of them, four hard
drives, and two of them are replicated amongst each other and the other two are replicated amongst each other. That would be a RAID 1 configuration. I'm not going to get into all the different configurations. Just know that you can set those hard drives up to store data all independently, a little bit replicated, or fully replicated, so you have some sort of assurance that if one of them fails, you're not going to lose all of your data.

Although this is going to be more expensive than direct attached storage, it's still pretty cheap. I think that box right there (I haven't looked it up or anything) is around a thousand bucks, and then you have to purchase the hard drives to go into it, so a couple hundred bucks there. It's not exactly cheap, but it's not super expensive if you want to have this at home. It's also really great for collaboration if you've got a bunch of files that a bunch of people are working on. A perfect example would be an office building where maybe you have 10 employees who all need to work on the same drive. NAS is perfect for that, because when you see it on your computer, it shows up as a single drive. It's also nice because you have centralized control of all the files and can set permissions on who can see what in the network. And finally, as we talked about with the RAID configuration, you can replicate data and make sure that you have backups of that data.

The one thing that's a bit of a downside is that network attached storage is connected to a local area network, which basically means that if there's a ton of activity on your network, it's going to slow this down a little bit. If you've ever heard of a shared drive (Google Drive, Dropbox, and iCloud are all shared drives), this is basically the equivalent of a shared drive. You can see the NAS device attaches to a router, which then attaches, well, the router is the
network, or kind of facilitates the network, so all the client computers can connect to that Wi-Fi network and get files to and from the NAS device.

The next type of storage is called block storage, but we're not quite ready to get into that yet. What we just saw with NAS is also considered file-based storage, and you'll see a lot of comparisons online that try to distinguish between file storage and block storage. I think it gets really confusing, because everyone's talking from different contexts. If you look at something like a hard drive, this hard drive right here, that is actually a block storage device, which basically means that it is broken up into partitions, which then store the files of a file system as little 512-byte blocks. In this little diagram I've shown an example hard drive. Chances are you're not going to have this many partitions, but you can see that we can run the ext4 filesystem, which is basically the Linux file system, on partitions 1 and 3, the Apple file system on partition 2, and the Windows NTFS file system on partition 4.

If you plug this into your computer, and maybe you're running a Windows computer, you could access partition 4, which is going to have all the Windows files. Maybe you have, I don't know, an Excel document that you've opened up, and we'll say it is 200 kilobytes. 200 kilobytes is about two hundred thousand bytes, and if we say that each block in here, these little gray blocks, is 512 bytes (not about, they are 512 bytes), I think I calculated this out: it's about sixty to sixty-two and a half blocks that you're going to need for that particular Excel file. One of the great things about this block storage device is that when you edit your Excel file, maybe you make changes to cells A1 through A4, I don't know, that's only going to affect a few blocks within that entire file. So say the file is 62
and a half blocks. Maybe you only have to edit four of them, and when you press save, it's only going to find the four that it needs to edit, edit them, and then the file is saved. You don't have to rewrite it all at once, so it's really efficient in that way.

The confusing part comes when we're comparing file versus block storage, because in the end what we see as users is always going to be this hierarchical file system here on the left. It's very simple: you've got the parent folder, some files, a child folder, another file. This file system is read from each of these partitions, but the partitions are storing the files as blocks.

This concept is in the context of general computing. When we bring it into the context of cloud computing and talk about block storage, it actually means something a little different. Block storage in cloud computing is equivalent to the SAN, or storage area network. The reason it's associated with the SAN is that the SAN is basically a network of hard drives, which are block-based storage. It's taking those direct attached storage devices that we saw way back here and putting them in a network. So it's kind of similar to the NAS, but there's one major difference: the SAN, or storage area network, also called block-based storage, is going to be much more efficient, and that is because it's connected over fiber-optic cables rather than a local area Wi-Fi network. If you know how fast fiber-optic cables are, you can run at up to something like 128 gigabits per second, while a common download speed on a Wi-Fi network is going to be something like 96 megabits per second. That basically means the fiber-optic cables are about a thousand times faster than your, not even average, above-average Wi-Fi network. So you can see how this is going to be much higher performance. It's
also really scalable: just add more hard drives. And as we talked about, since it's on fiber-optic cables, it's great for a lot of read/write operations. Maybe you have a database that is constantly updating; that's going to be a great use case for this. There are a few downsides to the storage area network. It's very expensive. It's not going to be ridiculously expensive if you're using it for cloud computing, but if you try to set it up on your own, it's going to be expensive, it's going to be complex, and it's just tough to set up. Those would be the downsides, but overall the SAN is probably one of the more common types of storage that you'll see in cloud computing. You actually saw this a little earlier, when I was demonstrating ephemeral storage and we went over to DigitalOcean and created a volume. You can see that we can enter any storage size and easily scale this up or down. If we needed more storage, it's as easy as coming in here and typing, oh, I need 500 gigabytes. Okay, $50 a month. So not the cheapest, but it's very convenient and very nice for a lot of different use cases.

The last type of storage, called object storage, is a little different from everything we've talked about before. It's kind of in a whole different realm. It's a slightly newer type of storage, and it works a lot differently. There are several big differences. Number one, you have objects rather than files or blocks. Basically you have a bunch of unstructured data objects that have three parts. You have an ID. You have metadata, which might be the authors of the file, the date the file was created, permissions on the file, and so on. And then you have the blob, which is the unstructured data. Maybe that's a picture file or a large video file, and that's what is stored in the blob. Every time you go to update one of these objects, you have to update the entire thing. So rather than, as we
talked about earlier, where if you've got this Excel file and you only update maybe blocks one through four out of 62, it only edits those four blocks, with object storage, if you wanted to make any sort of change (maybe you just wanted to cut out the first five seconds of your video), that basically creates an entirely new object. You can't do these piecemeal operations, and therefore this type of storage is best for a very specific use case: storing lots of unstructured, static data. In other words, we write once and read many times. A great example of this would be YouTube videos. Once the author has uploaded a video, they're not really going to change it that much, so that's a perfect use case for object storage.

Another distinguishing difference between object storage and block storage and all the other stuff is the way the data is accessed. With NAS, we talked about how it's on a local area network: it's transferred over Wi-Fi, but it's actually going from machine to machine. With block storage, you're getting it over directly connected fiber-optic cables. With object storage, we're actually going to be making HTTP requests. This is good and bad. The good: it's easy for developers to work with object storage, because you can just integrate it right into your code, make an HTTP request, and you've uploaded your video or whatever file you're uploading. The bad: it means you're going to have to be online, and it's affected by the network that you're on, because an HTTP request is basically going over the Internet.

As we said, this is going to be great for lots of static storage. If you had a content delivery network, user data like photos or videos, or maybe even backups of your website, this is going to be great for those use cases. One downside is that it's pretty expensive. It's not the cheapest storage to buy. I know I said block storage was also
pretty expensive, but on the scale this is probably going to be the top tier, just because it's very efficient and it's also pretty new in terms of storage.

The last thing I wanted to get through before wrapping up this video is the storage product names that all these different services give their products. I know marketing departments do their best to market their product as the best thing ever, but in the end all of these products are going to be the same regardless of what they are called. You can pause the video and look through this chart. For example, S3 is the Simple Storage Service that Amazon offers, but really it's just object storage, nothing more than that. So pause the video and look through this if you want. Otherwise, that is the end of the video. If you liked it, do me a big favor and hit the subscribe button and give this video a like. Until next time, I will see you later.
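
The block-versus-object contrast described in the captions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `blocks_needed` and `blocks_touched` and the in-memory `ObjectStore` class are hypothetical and are not part of any real storage API. Working the numbers through, a 200,000-byte file at 512 bytes per block occupies about 391 blocks (the figure of roughly 62 quoted in the captions corresponds to a file of about 32,000 bytes), but the point being made is unchanged: a small edit rewrites only a handful of blocks, whereas changing an object means storing the whole blob again.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 512  # bytes per block, as in the hard-drive diagram described above


def blocks_needed(size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Whole blocks required to hold size_bytes (rounded up)."""
    return (size_bytes + block_size - 1) // block_size


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> int:
    """Blocks overlapped by a write of `length` bytes starting at byte `offset`."""
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


@dataclass
class ObjectStore:
    """Hypothetical in-memory stand-in for an object storage service."""
    objects: dict = field(default_factory=dict)
    bytes_written: int = 0

    def put(self, object_id: str, blob: bytes, **metadata) -> None:
        # There is no partial update: each put stores a complete new object
        # (ID + metadata + blob), so the whole blob is written again.
        self.objects[object_id] = {"id": object_id, "metadata": metadata, "blob": blob}
        self.bytes_written += len(blob)


if __name__ == "__main__":
    size = 200_000  # the 200 KB spreadsheet from the example
    print(blocks_needed(size))         # blocks the file occupies
    print(blocks_touched(1_000, 100))  # blocks rewritten by a 100-byte edit

    store = ObjectStore()
    store.put("video-1", b"x" * size, author="hypothetical")
    store.put("video-1", b"x" * (size - 5), author="hypothetical")  # small change
    print(store.bytes_written)         # both full blobs were written
```

In a real service the `put` call would be an HTTP request to the provider's endpoint rather than a dictionary write; the in-memory version is used here only so the sketch runs without credentials or a network connection.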
Info
Channel: Zach Gollwitzer
Views: 46,675
Rating: 4.9103446 out of 5
Keywords: nas storage, san storage, network attached storage, storage area network, object storage, cloud computing storage, cloud storage, cloud storage types
Id: 3r9RGJ0_Bls
Length: 29min 28sec (1768 seconds)
Published: Sat Mar 30 2019