RAID Levels Explained RAID 0,1,5,6,10 - Which one is right for you and Why?

Captions
all right how's it going y'all so today i'm doing a video that i really probably should have done a while ago because it's talking about the basic raid levels and what that means so this is going to be going over how all these different raid levels work and what raid level is right for you and your nas or just if you have a bunch of hard drives together this is going to be really just talking about the basic raid levels that you'll ever actually see so i'm not going to be talking about like raid 4 which i have never really seen implemented ever because of some issues and we are also going to be talking about some of the more different raids so for example there is going to be synology's shr1 and shr2 and i'm also going to be quickly talking about zfs's raid z1 and raid z2 but i'm not going to be going into their like full in depth why they're so different it's going to be very much a beginner video all right so first off what is raid so raid stands for redundant array of independent disks or inexpensive disks there's just two definitions to it either one of them is valid and even the name redundant is not true for all raid levels well just one raid level so essentially what raid is is stacking a bunch of hard drives together to make it into one big hard drive with performance benefits over a single hard drive as well as generally fault tolerance though not all raid levels have fault tolerance raid 0 is the only raid level that does not have any fault tolerance and by fault tolerance i mean you can lose a disk so you can have a mechanical failure somebody can accidentally pull out the drive anything and the raid will still function and you will be able to rebuild off of that and obviously because you can't get something for nothing you can't somehow be creating data out of nowhere you actually do pay for how many drives you can lose out of the total storage capacity you've got and so it's a very effective way to not only significantly increase the 
capacity of a single drive but also give it some protection so if a drive fails it's not going to take down everything and stop you working now i should also mention really quickly right up here at the top of the video raid is not a backup raid is a redundancy so if you have a raid 1 configuration of 2 drives that means that you can lose any one of those drives and you will still have all of your data that being said raid will not protect you from a hacker raid will not protect you from fire damage raid will not protect you from a dumb delete or a format because any command is issued to both drives period even if it is something that is wrong and so that's really where a backup differs from a redundancy in that raid is a redundancy it's got extra redundancy so if a mechanical drive fails you're still okay whereas a backup is really designed to recover from those so normally you want what's called a disaster recovery backup so that means that everything in the world can go wrong and you can still get your data back and that is not what raid is it's not a bad place to start but for your really crucial files back them up not going to talk about that anymore here but that's just one thing to know all right so now that we know that raid is really just combining drives into one bigger drive we can go through and talk about the different levels there are and we're going to first talk about the standard raid levels so it's going to be raid 1 0 5 6 and 10. those are going to be by far the most common raid levels you'll ever experience and we're going to talk about their advantages and disadvantages because there are a lot there all right and so now to start this off let's go ahead and talk about the very most basic one that we're ever going to see and it's going to be raid 1. 
raid 1 is very simple and you generally see it with two drives raid 1 is set up where anything that's on one drive is on all the drives and so right here is a great diagram that really just shows you so i'm just going to be showing these a lot these are from wikipedia which is awesome and what they've got here is you see a1 a2 a3 a4 and so those are different blocks of data so those are data that goes into your hard drive and as you can see these are two different disks and the way raid 1 works is anything that's written to the raid is written to every single one of the disks so if i had 12 disks here which you would very rarely see in a raid 1 configuration because you'd only get the hard drive space of a single disk i would have every single one of them having all the data on it and so raid 1 does have some advantages one it is very easy to recover from and it's very easy to start with so say i was to pull out disk one here i just ripped it out of the system i could easily one keep going because disk zero is still there and it has a full copy of all the data nothing needs to be done there and if i put in a new disk it could automatically copy over the contents of disk 0 to disk 1 and it would not have to do any math or anything fancy it would just have to copy those over because it's just well here's all the data just copy it on over now raid 1 also does come with a penalty because it has the storage capacity of a single drive and it's because it's all mirrored i should also say all the standard raid levels that we're going to talk about today other than shr but that's a different story all assume that every single hard drive in the raid array is the size of the smallest disk so that means that if you had 12 disks of eight terabytes and one disk of one terabyte every single one of those drives in the raid array for all this raid calculation math would be considered one terabyte and so that is often a limitation with raid in that you have to kind of buy all 
the same size disks now synology has shr which does a whole different black magic but for all of the standard raid levels that is true okay so that is raid 1 and so this is used if you have a two drive system but it has a 50 percent storage limitation so if you have two drives you only have one drive of space now if you had like 12 drives for some reason in a raid one array you'd have one over 12 of the theoretical capacity which is just the capacity of a single drive where the advantage of it is say you've got a very busy file server and people are always reading files from it since the data is written on both of the drives identically if you've got a ton of people trying to access data on there they can read the data from disk 0 or disk 1 and that means that you have pretty much double the overall read speed of a single hard drive and that works with a lot of people accessing it which is normally something that slows them down because any one of these can access the files and so that's where raid 1 really shines is those random accesses with a large hard drive and it's just very basic to start with your write speeds are going to be the speed of a single drive's ability to write here because any data that's written to these has to be written to all the drives and so that's raid one very very very basic i generally recommend it for a setup with only two bays if you do want some redundancy so that is going to be raid one as you can see very very very simple to use but has a pretty steep storage capacity limitation and so now let's talk about the other most basic raid level and that's going to be raid 0. so raid 0 is kind of the exact opposite of raid 1. 
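The mirroring behaviour described above can be sketched in a few lines of Python. This is a toy model, not how any real RAID controller is implemented: the `Mirror` class, its method names, and `mirror_capacity_tb` are hypothetical names invented for illustration, and each "disk" is just a dictionary.

```python
class Mirror:
    """Toy RAID 1: every write goes to every disk, a read can use any disk."""

    def __init__(self, disk_count):
        # Each "disk" is a dict mapping block number -> data (an assumption
        # made for illustration only).
        self.disks = [dict() for _ in range(disk_count)]

    def write(self, block, data):
        # The same block is stored on every surviving member.
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def read(self, block):
        # Any surviving disk holds a full copy, so take the first one found.
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("all mirrors lost")

    def fail(self, index):
        # Simulate pulling a drive out of the system.
        self.disks[index] = None

    def replace(self, index):
        # Rebuild is a plain copy from a surviving disk, no parity math.
        source = next(d for d in self.disks if d is not None)
        self.disks[index] = dict(source)


def mirror_capacity_tb(disk_sizes_tb):
    # Every member counts as the smallest disk, and all members hold the
    # same data, so usable space is one smallest disk.
    return min(disk_sizes_tb)
```

With two disks, writing a block, calling `fail(1)`, and reading still returns the data from disk 0, and `replace(1)` simply copies disk 0 back. `mirror_capacity_tb([8] * 12 + [1])` gives 1, matching the twelve 8 TB plus one 1 TB example.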
raid 0 goes through and any data that's written to the raid is striped across every single one of the drives so this is a case where we've just got two drives here but this also works if you have 20 drives here and so you can see here that any of the data that's being written is only being written to a single drive and so what happens is anytime a write is coming in it is being basically chopped up into all these small sections and then distributed out to every single one of the disks and so that means you can get insane speeds out of raid 0 because you can read and write from every single one of these drives at the exact same time now this is kind of outside the scope of this video but raid 0 does not perform as well with random reads as raid 1 because data does still have to be read from all the drives and so it does not give that advantage that raid 1 had where it can randomly read data from two different drives instead when you're randomly reading on raid 0 you get basically the random read performance of a single drive outside the scope of this video i'm not going to talk about that too much but that's just one factor that the enterprise really cares about now raid 0 does come at a big danger cost because any single drive in the entire raid 0 array that fails completely kills the array and no data is recoverable off of it that's because every single file has parts that are on every single drive and so that means it's not like you can just go say oh well the majority of my files will be fine no every single one of your files will have a crucial component on that disk that failed and therefore you lose one drive completely toast now that being said you also do get as much storage as you put into it and so if we had 12 drives we would have the storage capacity of 12 drives and that is the one nice thing about raid 0 and it is very fast and so raid 0 is very good for scratch data data that is somewhere else and if the drive failed it's not a big deal if you 
lost all the data on there it's backed up somewhere always but you just want something that's really screaming fast and so that's often what raid 0 is used for and i use it for certain things all the time especially things where it's like okay if it fails it's not that big of a deal i'd much rather have faster performance than anything else so that is raid zero but just remember if you're setting up in raid 0 and you lose a single drive your entire thing is toast all right so raid 1 and raid 0 are our two most basic raid levels and as we can see they're kind of polar opposites from each other raid 1 everything's written to every disk raid 0 everything's written to only one disk at a time and so now let's go over to kind of the more complex raid levels we're going to start with combining raid 1 and raid 0 together with a raid 10 array and so unfortunately wikipedia does not have an easy to show diagram for raid 10 and so i'm just gonna have to talk it through it's really very simple honestly and it is pretty much just the combination of raid 1 and raid 0. 
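A minimal Python sketch of the striping idea, to show why one dead drive takes every file with it. The function names and the 4-byte chunk size are assumptions chosen for illustration; real arrays use much larger chunks.

```python
def stripe(data, disk_count, chunk_size=4):
    """Split data into chunks and deal them out round-robin across the disks."""
    disks = [[] for _ in range(disk_count)]
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for number, chunk in enumerate(chunks):
        disks[number % disk_count].append(chunk)
    return disks


def unstripe(disks):
    """Reassemble the data; fails if any disk is missing (None)."""
    if any(disk is None for disk in disks):
        raise IOError("a RAID 0 member is gone, nothing can be reassembled")
    longest = max(len(disk) for disk in disks)
    out = []
    # Walk the disks row by row, which is the order the chunks were dealt in.
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)
```

Striping `b"ABCDEFGHIJKLMNOP"` over two disks puts `ABCD` and `IJKL` on the first disk and `EFGH` and `MNOP` on the second. Every disk holds a piece of the file, so setting either entry to `None` makes `unstripe` fail outright.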
and so you in a sense get the benefits of both so the way you set up raid 10 is you go through and you've got a bunch of different raid 1 arrays so say i had 8 disks i would then combine each of those 8 disks i'd pair them up until i had four total raid 1 arrays and so i would have four groups of disks that are all mirrored and so then i take those four raid arrays and i would stripe them with a raid 0 and so the way that would work is first the raid 0 would happen and so any file is going to be chopped up into four different pieces and it's going to be given to each of those raid 1 arrays and that way i get the speed and performance of raid 0 with the fault tolerance of raid 1 and so i could lose any one of those drives and still keep running and so the advantage of raid 10 is really kind of twofold there's two reasons why you see it in use the first is its speed of rebuild and this is going to be way more important in a minute here when we talk about five and six but raid 10 is very easy to rebuild that's where you lose a drive and have to rebuild it from its parity or in this case its mirror and so if i lose any one of those eight drives i can easily rebuild it by just sticking in a new drive and copying the information from its pair there is no fancy math that has to be done and only one drive has to be accessed to rebuild it and the other thing is you're pretty fault tolerant so you could in theory lose four drives in that raid 10 array because there were eight total drives and therefore there were four mirrors however you could also lose two drives and lose everything it's kind of weird in that sense so remember we've got those four groups of raid 1 arrays and they're striped in a raid 0 array so i can lose one drive from any one of those four raid arrays those raid 1 arrays i could lose one drive and still have all the data because it's got its twin however say i lost two drives in one raid one array so say i lost both the drives in one of the raid one 
arrays that would mean i lost all my data is that likely to happen probably not but that is the one kind of risk of raid 10 is you can be either very fault tolerant or not that fault tolerant but its advantage is huge copying over all that data from one drive to another is just so simple it doesn't require any math and so it's got very fast rebuild times which is very important when you have massive arrays of disks and it can take a very long time to rebuild them especially if they're getting hit really hard and so that's why a lot of enterprises will use raid 10. the other advantage and the other reason why enterprises like raid 10 though a lot of them are moving away as well is because it is very simple and it's got great random read performance in enterprise you're almost always doing a lot more reads than you're doing writes and so by having the ability to randomly read from this massive array of disks very quickly that's very advantageous and the reason you've got these great random reads is because you can actually read from half the drives independently and so either one of the drives if one of the drives in a raid one array is busy the other drive will perform that read and that makes it very very very fast for random reads specifically and that is the primary reason it is used in enterprise i do not recommend running raid 10 for pretty much any users the advantage that it's got for the very fast rebuild time is not an issue for most consumer nases because realistically a raid 6 array will be fine because you're not using your nas 100 percent of the time you're not the enterprise and so raid 10 has a lot of disadvantages because it's got 50 percent storage capacity half your drives are used for mirroring and you still have a real risk of losing two drives in one of the mirrors the other issue it's got on synology and i don't believe anyone else has the ability to do this either is raid 10 cannot be expanded you can only add hot spares to raid 10 because it's striped across and since 
there is a stripe a raid 0 array at the very top you cannot expand a raid 0 array and therefore you cannot expand a raid 10 array on synology or as far as i know anywhere else and so because of those two reasons i do not recommend raid 10 for pretty much any users unless they actually are going to be having these drives running at 90 percent utilization almost the entire time that's the only reason why i would recommend using raid 10 and i have yet to find a single client who actually has that requirement especially in the era of ssds raid 10 is not nearly as useful as it once was because if you need great iops you can just get an ssd which has almost unlimited iops compared to a hard drive and they just scale a lot better in arrays because they don't have that horrible random read performance and so i do not recommend raid 10 for most users but it is there as an option all right so now we've covered the three really most basic ways you can raid disks because none of these have required any special math whatsoever it's just chopping up data and sticking it in different places we're now going to go over raid five and six which is way more complex but i'm going to give a simple example on how it works and you're just gonna have to kind of trust the process here so we're going to start with raid 5 and really i'm just going to explain raid 5 and then it's just a continuation to raid 6. 
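The "very fault tolerant or not that fault tolerant" point about raid 10 can be checked by brute force. The sketch below assumes the eight-disk layout from the example, with disks 0 and 1 forming the first mirror, 2 and 3 the second, and so on. That numbering and the function names are assumptions made for illustration.

```python
from itertools import combinations


def raid10_survives(failed, pair_count):
    """Disks 2k and 2k+1 form mirror pair k; losing a whole pair kills the array."""
    failed = set(failed)
    return not any({2 * k, 2 * k + 1} <= failed for k in range(pair_count))


def fatal_fraction(pair_count, failures):
    """Count how many failure sets of the given size destroy the array."""
    disks = range(2 * pair_count)
    cases = list(combinations(disks, failures))
    fatal = sum(1 for case in cases if not raid10_survives(case, pair_count))
    return fatal, len(cases)
```

For four mirror pairs, `raid10_survives([0, 2, 4, 6], 4)` is true (one drive lost from each pair), while `raid10_survives([0, 1], 4)` is false. `fatal_fraction(4, 2)` returns `(4, 28)`: of the 28 possible two-drive failures, only the 4 that hit both halves of one mirror lose the data.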
so raid 5 works by taking a section of data and chopping it up the way raid 5 works is raid 5 allows you to lose one disk and still rebuild the entire array and it also takes away that one disk of storage capacity so it's pretty simple to understand so you can lose any one disk in a raid 5 array and you will still be able to recover all of your files the way raid 5 works is it writes all the data across the first x number of drives so let's for this example we will just say we've got four drives and so whenever data is being written it's chopped up into three different sections that's because one of these is not going to be used to actually store the data but this thing called parity and so it's going to write our first three pieces of that file to the first three disks then it is going to do some special magic math called parity and stick the result of that math in the fourth drive and so this parity bit right here allows you to reconstruct any one of these three sections as long as you have the other two so it will allow you to figure out what was in each of these as long as you've not lost more than one drive and so that is how raid 5 works it is very fast for sequential reads and writes because all the data is being written to all the drives except for one of them and so you get very good overall performance for that but raid 5 does not scale with random reads and random writes because all the data has to be striped across all of them and so any read has to be read from all the disks and so that's just how raid 5 works but it's a great setup here the other thing about raid 5 is as you can see as we go down the blocks we can see that the actual disk that gets the parity bit changes that just keeps it from having one single drive that has all the parity on it which is not a great setup to have instead it's always shifting down and so every single drive has some parity on it that way when you're reading you're reading from all the disks rather than 
all the disks minus one and so this raid 5 setup allows us to have as many disks as we would like and be able to lose any one of them and recover all of our files not even recover just keep going with all of our files your end user would never know that a single drive had failed because it just keeps working you can keep reading and writing from the file system as much as you'd like and it allows us to rebuild now raid 5 is great because it's very storage efficient too so if you have eight disks you will now have the storage capacity of seven disks while still being able to lose any one of them that's why raid 5 is great and what i recommend for a lot of users there is one issue with raid 5 though raid 5 has a very long rebuild time that's because it has a lot of complex math going on around here as you can see there's all this parity calculation going on for every single piece of data on a single drive and what makes it really slow for a rebuild is that data is on every single drive and so to rebuild a raid 5 single drive every single drive has to be read constantly now for home and small business users who are not using their nas 100 percent of the time this is not that big of a deal because this raid rebuild can happen in the background and so when you're not using the drives this raid rebuild can happen fairly quickly the reason enterprises really stay away from raid 5 in most cases is because in enterprise it's very likely that that drive is being written or read almost a hundred percent of the time and so because it's being slammed like that the rebuild process is going to be on the bottom of the priority list and so that can make a rebuild take a very very long time and so because of that a lot of enterprises once they get to a lot of disks will go up to raid 6. 
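The way the parity block moves from disk to disk can be sketched as a small layout table. The exact rotation order differs between implementations; the one used here (parity starts on the last disk and moves one disk to the left per stripe) is an assumption for illustration, as are the function names.

```python
def raid5_layout(disk_count, stripe_count):
    """Return one row per stripe: 'P' marks the parity disk, 'D' a data disk."""
    rows = []
    for stripe in range(stripe_count):
        # Assumed rotation: parity starts on the last disk and moves one disk
        # to the left on every following stripe, wrapping around.
        parity_disk = (disk_count - 1 - stripe) % disk_count
        rows.append(["P" if d == parity_disk else "D" for d in range(disk_count)])
    return rows


def raid5_usable_disks(disk_count):
    # One disk's worth of space goes to parity, whatever the array size.
    return disk_count - 1
```

For four disks, `raid5_layout(4, 4)` gives `D D D P`, `D D P D`, `D P D D`, `P D D D`, so every drive carries some parity and none of them is a dedicated parity drive. `raid5_usable_disks(8)` is 7, the eight-disk figure from the example.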
raid 6 is going to be the exact same thing as raid 5 but now there's two parity bits so that means you can lose any two drives and you also lose two drives of storage capacity the math on it is way more complex but that's how raid 6 works all right so now i'm just going to do a really quick example showing how this actually works because it's kind of interesting so this is called parity so i'm just going to set up four drives and we're going to show how we can actually recover the data that was in any one of them okay so we've got four drives of data right and we're just going to say this is the the first piece of data and this is going to be a simplified example that's not actually how it works but this will help communicate exactly how the entire process works because i thought it was really weird that this was possible but it's actually quite simple so we've got four drives right here in a raid five array and so our fourth drive we're just going to pretend is parity once again the super simplified thing and we're going to go down to the smallest possible level and we're going to talk about a single bit a bit is either a one or a zero and it's how a hard drive ssd anything stores data is in bits so now we're going to say that one zero one right so these are just three random pieces of data that are either one or zero and they're on the three different drives of data so now we have to put the parity bit in the very fourth disk and so there are a lot of different algorithms you can use for parity there's a bunch of different ones all across the board we're going to go with by far the simplest here and we're going to be calling this odd parity so odd parity is very simple sum up all of the bits and make sure that the result is odd that is all we have to do here and so we are going to ensure that no matter what's in here the parity bit is odd so right now we are not even going to care about what the total result of all these this data is this could be 50 drives i 
would not recommend that but it could be 50 drives and so we're going to stripe this across right now and we are going to just make sure that when we sum this up it's all odd so right now we're going to start even because we start at zero so we're going to start even add one makes it odd add zero keeps it odd add one makes even oh now we're at the end so now we are currently even we need to make this odd all right one so now when we actually sum this up right here it's going to be odd and so now say we go and lose drive two we've lost drive two it's completely gone we know that the result of all of this has to be odd and so if we sum it up really quick let's see what there is so we're going to start at zero so we're currently even add one we're now odd we don't know so we're still odd add one we're now even again add one more we're now odd again and we know that all the data across here has to be odd and that means that the only option for drive two is going to be a zero and that is the most basic form of parity you can have there are way more complex algorithms out there that can detect fault tolerance and actually detect more issues than they can recover from using some pretty cool hamming equations but this is the most basic form of parity calculation and so by using this very simple algorithm we can always figure out the data that is on a missing drive as long as it's just one so that's how raid 5 works and raid 6 is just a continuation of this it is just an extension of this where we've now got two different parity bits meaning we can lose two different drives and recover from everything it is a lot more complex of math and so it does tend to be slower but it has the advantage of actually having two different drives of fault tolerance so now let's talk about when you should use raid 5 and when you should use raid 6. 
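The odd parity walkthrough above translates almost directly into code. This is a sketch of the simplified single-bit example only. Real arrays typically compute parity with XOR over whole blocks, and the function names here are made up for illustration.

```python
def odd_parity_bit(data_bits):
    """Pick the parity bit that makes the total number of 1s odd."""
    return 0 if sum(data_bits) % 2 == 1 else 1


def recover_missing(bits):
    """bits holds the data bits plus the parity bit, with None for the one lost drive."""
    known = sum(b for b in bits if b is not None)
    # The full row must sum to an odd number, so the missing bit is whatever
    # value makes that true. This only works when exactly one entry is None.
    return 0 if known % 2 == 1 else 1
```

With data bits 1, 0, 1 the sum is even, so `odd_parity_bit([1, 0, 1])` is 1. If drive two is then lost, `recover_missing([1, None, 1, 1])` sees an odd sum from the three survivors and returns 0, which is the value that was on the missing drive.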
and i'm also going to say for all intents and purposes right now raid 5 is shr 1 for synology and raid 6 is shr2 for synology and so there is a case where that's not true so if you have an shr one array and you've only got two drives it actually becomes a raid one array and so that's one little tidbit but fundamentally it's the same shr one is going to be raid five and shr2 is going to be raid 6 for any kind of disc math we want to do so raid 6 is useful for businesses who need constant uptime and are okay losing two drives of storage data i normally would not do a raid 6 array with fewer than 4 drives just because it's not very useful you're not going to lose the drives probably if it's crucial and you want to be able to run for months without actually getting a replacement drive you can do that but normally i tell people to start really looking at it around six to eight drives is where i'd start to look at it and anything past 10 drives you definitely want to be using raid 6 just because of how many drives there are out there obviously you need to be having a backup as well but backups are slow to recover and so by having fault tolerance on your drives you can keep going for much longer and not have your employees sitting around while you're waiting on restoring a drive and so that's where i would kind of choose between raid 5 and 6 or shr 1 or 2. 
now the only difference between raid 5 and shr 1 is well one if you've only got two drives and it's raid one but more than two drives is it's got this cool ability to allow you to add drives and mix sizes as long as you're always adding larger drives so say you start off with four drives of eight terabytes and then you add four more drives of 10 terabytes with raid 5 all those drives are going to be considered to be the smallest drive which was 8 terabytes however with shr-1 it will actually allow you to combine those together and get more storage out of it the way shr-1 works for a total like disk storage is find the largest disk in the array and pretend like that one does not exist then sum up the sizes of all the other ones and that's going to be your total data size with raid 5 the way you calculate it is you find the smallest disk in the array then you multiply that by the number of disks in the array minus one and that's going to be your size and so that is why shr-1 can be very useful if you want to combine larger drives later on but it does have some weird performance stuff i'm not gonna go in here then raid six do the exact same things i just said but where i said one say two and that is how raid 6 works and so finally i cannot forget about our zfs people and i'm going to butcher this but it's going to be essentially the same thing that you need to understand so with zfs it's a little different and it's not the same i'm going to do a dedicated video on it but when you've got a zfs mirror that is going to be a raid 1. 
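The two capacity rules of thumb described above, sketched in Python. These follow the spoken description only (drop the largest disks for shr, multiply the smallest disk for raid 5 and 6) and are not Synology's actual allocation algorithm. The function names and the `parity_disks` parameter are assumptions for illustration.

```python
def raid_parity_capacity_tb(sizes_tb, parity_disks=1):
    """RAID 5 (parity_disks=1) or RAID 6 (parity_disks=2): every disk counts as the smallest."""
    return min(sizes_tb) * (len(sizes_tb) - parity_disks)


def shr_capacity_tb(sizes_tb, parity_disks=1):
    """Rule of thumb for SHR-1 / SHR-2: ignore the largest disk(s), sum the rest."""
    largest_first = sorted(sizes_tb, reverse=True)
    return sum(largest_first[parity_disks:])
```

For four 8 TB drives plus four 10 TB drives, the raid 5 rule gives 8 × 7 = 56 TB while the shr-1 rule gives 72 − 10 = 62 TB. Passing `parity_disks=2` gives 48 TB for raid 6 and 52 TB for shr-2 under the same rules.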
if you've got a raid z1 it is going to be a raid 5 which is effectively you can lose one drive and then raid z2 you can lose two drives and they actually have a raid z3 which you can lose three drives there's a lot of different stuff there that can be very different and that's a whole different video but that's the simplification of it and so now we can go down here and just kind of talk about to regroup on all this our different space efficiencies and our fault tolerance and our performance and the read and write performance are a little bit of an asterisk on there due to the fact that raid 5 and 6 require the special parity math and that can take a while not as much as it used to but it can take a while cpus have gotten a lot faster so we're going to start off with raid 0 and raid 0 requires a minimum of two drives technically it is what it is i mean you could also argue that any single drive is a raid 0 but not really so raid 0 you just stripe across all of them so you have no fault tolerance whatsoever but you've got perfect space efficiency and your sequential reads and writes will scale with your number of drives so say you have five drives you're going to have five times the sequential reads and writes of a single drive now that is not going to be true for random reads and writes for random reads and writes you will actually have the speed of a single drive generally in terms of random iops it's a little bit more funky and it's going to be a little bit better because it doesn't have to read as much data but for true complete random reads and writes it's not going to scale then we've got raid 1 which is just everything on every drive and so it's got a horrible space efficiency of 1 over n and so in most cases this is just gonna be two drives and so you can lose either drive but if you had ten drives you could lose nine of them and still have all of your data but you're only gonna have the space efficiency of a single drive your read performance is going to scale 
linearly and so if you have 10 drives in a raid one array you're going to have 10 times the read speed and you're also going to have 10 times the random read speed too and that's actually why raid 1 was pretty big back in the enterprise before ssds is because it just had screaming fast random read performance for databases and then your write performance is going to be the slowest of all of your drives because every single one of your drives has to write every single bit of data and so it is by far the least storage efficient but is the most simple and so then raid 10 they once again don't have it on here because they don't like it apparently it's just because it's a combination one and so raid 10 you are going to i'm assuming you're setting up in the normal manner where you've got a bunch of two drive mirrors in which case you've got a raid 0 array composed of a bunch of raid 1 arrays and so the way that's going to work is your minimum number of drives is going to be 4 then your space efficiency is going to be 50 percent your fault tolerance is going to be between 1 and half your number of drives depending on where that drive failure is your read performance is going to scale linearly for sequential reads then for random reads it's actually going to be two times a single drive a little different there and then write speeds are going to be half the overall speed you could get if they were all together so if you had 10 drives you would have the write speed of five drives because there are five groups that both have to write the exact same data and then your random write speed is going to be the speed of a single drive most likely so that is going to be raid 10. 
now we're going to talk about raid 5 and so minimum number of drives for raid 5 is going to be 3 and your space efficiency is just going to be 1 minus 1 over your total number of drives and then overall the amount of space you have is just going to be your total number of drives minus one you could lose one drive of it and your read performance they say is n it's not so here your sequential read performance is going to be n minus one because of the way that parity works and the parity is striped so really what you're going to get is n minus one then your sequential write performance is going to be the same for your random read and your random write performance it is both going to be one at best and raid six is everything i just said minus two and so those are how all these different raids work it's worth it to open up a raid calculator like this and just type in some numbers here and kind of understand everything you've got because these are very useful for kind of understanding what's going on synology also has a great raid calculator right here so if we go down to synology and i believe it's like right here right here this will help you understand exactly what's going on with shr and it'll also help you understand what is going on with your different raid levels and so it can be very useful and it's a great graphic here so i would definitely check it out and so that's another great resource to use all right well that's going to be it for this tutorial or i guess overview go and leave any other tutorials you'd like to see me make in the comments below and subscribe if you haven't already alright and have a good one bye
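The space and fault tolerance figures from the recap can be gathered into one small Python helper, in the spirit of the raid calculators mentioned above. It is a sketch only: `raid_summary` and its return keys are hypothetical names, all disks are assumed to be the same size, raid 10 is assumed to be built from two-drive mirrors, and the four-drive minimum for raid 6 is the standard figure rather than something stated in the recap.

```python
def raid_summary(level, n):
    """Summarise a RAID level for n equal-size disks.

    Returns usable disks, space efficiency, and the number of drive
    failures the array is guaranteed to survive.
    """
    # (minimum drives, usable disks, guaranteed failures survived)
    rules = {
        "0": (2, n, 0),
        "1": (2, 1, n - 1),
        "10": (4, n // 2, 1),  # can survive up to n/2 if failures hit different mirrors
        "5": (3, n - 1, 1),
        "6": (4, n - 2, 2),
    }
    minimum, usable, tolerated = rules[level]
    if n < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    if level == "10" and n % 2:
        raise ValueError("RAID 10 with two-drive mirrors needs an even drive count")
    return {
        "usable_disks": usable,
        "efficiency": usable / n,
        "guaranteed_failures": tolerated,
    }
```

For example, `raid_summary("5", 8)` reports 7 usable disks at 0.875 efficiency (1 minus 1 over 8), `raid_summary("10", 8)` reports 4 usable disks at 0.5, and `raid_summary("1", 10)` reports a single usable disk that survives 9 failures.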
Info
Channel: SpaceRex
Views: 47,084
Id: 4J7iSumiJNk
Length: 34min 1sec (2041 seconds)
Published: Wed Sep 14 2022