Choosing The BEST Drive Layout For Your NAS

Video Statistics and Information

Captions
Should I get four drives, or what about eight? Should I use RAID-Z1 or RAID-Z2? What about striped mirrors? These are all questions you might be asking yourself if you're trying to set up a NAS for your home or maybe even a small business, and it can get a bit complicated. So in this video I'm going to dive into a lot of the things you need to know when deciding how many and what type of hard drives to purchase, as well as how to configure them. I'll cover how to get the capacity and level of resilience you're looking for, as well as what performance to expect, and I'm even going to run some real-world benchmarks. So let's get started.

Trying to plan out a NAS can be a bit of a pain, but not as big of a pain as trying to plan out all of your meals. Fortunately there's HelloFresh, the sponsor of today's video. Whether you're looking to save money, stress less, or eat healthier, HelloFresh is here to help. All you do is pick from over 45 recipes each week and then wait for the meals to show up at your door. The recipes are easy to follow and everything is pre-portioned, so there's less waste and less hassle. The food tastes great thanks to the farm-fresh ingredients and chef-crafted recipes, and while I love having dinner with the family, sometimes it's nice to just whip up something quick for lunch or even a snack. That's easy to do thanks to the 15-minute recipes and quick-and-easy market meals. Right now HelloFresh is giving all new subscribers free breakfast for life, with one free breakfast item included in each HelloFresh order. If you're interested, go to hellofresh.com and use my code HARDWAREHAVENFREE to get free breakfast for life; that's one breakfast item per box while your subscription is active. Again, that's code HARDWAREHAVENFREE at hellofresh.com for free breakfast for life.

Now, before we delve too deeply into this, let me outline what this video will encompass. I'm going to be discussing smaller NAS deployments, from four to eight drives, using ZFS, specifically with TrueNAS Core. This should apply to any NAS running ZFS, like TrueNAS Scale or even Unraid now that it supports ZFS. If you're not using ZFS, some of the topics will still be relevant, but not all of them. I'm going to cover the basics of pools and vdevs in ZFS, and the advantages and disadvantages of RAID-Z, mirrored, and striped vdevs. We'll also look at the differences to expect when you move from four to six to eight drives in terms of performance, capacity, and resilience. What I won't be covering are things like datasets and zvols, compression, caching, or what specific hardware to use.

With all of that out of the way, let's talk about some ZFS basics, starting with pools. Pools are sort of the base storage in ZFS; they're where you create datasets and volumes to share with other devices. Pools aren't comprised of drives, though; they're comprised of virtual devices, or vdevs. The pool essentially sees each vdev as a single drive and then stripes, or shares, the data across all vdevs. Vdevs can be comprised of one or, usually, many drives, and can be set up as a stripe, a mirror, or in something called RAID-Z. But what do those things mean? Well, when a vdev is striped, it just means that all of the data is distributed across all of the disks, so data that exists on one disk doesn't exist anywhere else. The benefits here are that no drive space is wasted on parity data and you can get incredible performance, but you lose all of your data if just one drive fails, so stripes are rarely used in data vdevs.
When a vdev is mirrored, all of the data is identical across all of the drives. Mirrors are often comprised of just two drives, but they can also have three or more. The big downside here is that your usable capacity is cut in half with a two-way mirror, or even cut down to just a third with a three-way mirror.

RAID-Z gets a bit more interesting, and I should note that while RAID-Z in ZFS is similar to something like RAID 5 or RAID 6 on other systems, it isn't exactly the same, so take all of this with a bit of a grain of salt. In RAID-Z, or as I prefer for clarity's sake, RAID-Z1, one drive's worth of capacity is used for parity data. This parity data doesn't exist on just one drive; it's split across all of the drives. Because that parity data exists, if a single drive fails, the vdev can be rebuilt without losing any data. This is very similar to RAID 5 if you're not using ZFS. RAID-Z2, like RAID 6, is the same idea as RAID-Z1 but uses two drives' worth of parity, allowing two drives to fail in a vdev before data is lost. There's also RAID-Z3, which I'm not really going to cover in this video, but it's exactly what you would expect, with three drives' worth of parity data.

When deciding on the number of drives to use, the capacity of those drives, and the configuration, there isn't really a wrong answer; the goal is just to find the best option for you and your target workload. When deciding the configuration, there are three things you can optimize for: capacity, resilience, and performance. You can imagine these as a triangle where typically you can optimize for two of them. You could build a pool with great performance and capacity at the cost of resilience, or a very resilient NAS with lots of capacity at the cost of performance, or a pool with great performance and resilience but a lot less capacity. There's also a fourth factor, cost, so it's almost more like an upside-down pyramid: you can always get more capacity, resilience, or performance as long as you spend more money. So really it's all just a calculation based on your needs and your budget.

I think this calculation is a little easier to understand if we look at a hypothetical NAS with four hard drives. In ZFS there are realistically four ways we could configure a pool with four drives. We could stripe them, with each drive set up as a single-drive vdev, but that's a terrible idea in most situations since you wouldn't have any redundancy; I'll keep it in here for now, though. You could also set it up in either RAID-Z1 or RAID-Z2, or you could set up two mirrored vdevs, which is also commonly called striped mirrors. With these four configurations, let's start looking at the pros and cons of each, beginning with capacity.

Just to keep the math simple, let's imagine we used 1 TB hard drives. The striped vdev would be able to use its entire 4 TB capacity, but that's because it has zero redundancy and should pretty much never be used in a NAS. The RAID-Z1 vdev would be limited to 3 TB of usable capacity due to the 1 TB of parity data, but that still makes it the winner here. The RAID-Z2 vdev and the two mirrors would be limited to just 2 TB of usable space. Quickly, I do want to mention one benefit of mirrors over RAID-Z: in ZFS there are two ways to expand a pool. One is by slowly replacing each drive in a vdev with larger hard drives, and the other is to add more vdevs. With mirrored vdevs only needing two drives, it's a lot easier to just chuck in two new hard drives and have more capacity in your pool.
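To make those capacity numbers concrete, here's a minimal Python sketch of the same idealized math, assuming 1 TB drives. Real ZFS pools lose a bit more to metadata, padding, and reserved space, so treat these as rough figures rather than exact usable capacity.

```python
TB = 1  # drive size in TB for the example

def usable_tb(layout):
    """layout: list of (vdev_type, drives_in_vdev); returns idealized usable TB."""
    total = 0
    for kind, n in layout:
        if kind == "stripe":
            total += n * TB          # no parity, full capacity (and no redundancy)
        elif kind == "mirror":
            total += 1 * TB          # only one drive's worth per mirror vdev
        elif kind == "raidz1":
            total += (n - 1) * TB    # one drive's worth of parity
        elif kind == "raidz2":
            total += (n - 2) * TB    # two drives' worth of parity
    return total

print(usable_tb([("stripe", 4)]))                 # 4 TB, zero redundancy
print(usable_tb([("raidz1", 4)]))                 # 3 TB
print(usable_tb([("raidz2", 4)]))                 # 2 TB
print(usable_tb([("mirror", 2), ("mirror", 2)]))  # 2 TB
```

Swapping the drive size or the layout list makes it easy to compare the six- and eight-drive options discussed later in the same way.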
Now, what about resilience? Well, the striped setup is the obvious loser here, as it has zero redundancy. The RAID-Z1 and RAID-Z2 configurations can lose one and two drives respectively without losing any data, but the mirrored setup is a little trickier. Technically it can lose two drives without losing data, so long as those drives aren't in the same vdev; if two drives in the same vdev are lost, there's no way to recover that data. It's also important to note that in ZFS, if you lose one vdev in a pool, you lose the entire pool.

That covers parity, which is important when talking about resilience, but there are some other factors to consider as well. When you replace a failed drive, it takes time for the NAS to put all of the correct data back on the disk. This process is known as resilvering, and it can be kind of scary: if you're in RAID-Z1, for example, and you need to resilver a failed drive, you can't afford for another drive to fail during that process, so ideally you want the resilvering process to be fast. Obviously things like whether the drive is an SSD or a spinning disk, or what RPM the drive is, affect this, but the amount of data on the disk can drastically affect resilver times as well. Unlike traditional RAID, the entirety of a replacement drive doesn't need to be rewritten in ZFS; you only have to rewrite the amount of data that was stored on the failed drive. So, for example, rebuilding a 20 TB disk that only had 2 TB of data on it would take roughly as long as replacing a 4 TB disk with 2 TB on it. If you go with more, smaller disks, the resilver time might be much shorter, which could potentially prevent you from losing another disk, and the entire pool, during the process. The configuration of vdevs also affects how long a resilver takes, with mirrors being the fastest. I didn't have time to benchmark this, but I came across some good benchmarks from louwrentius.com (not sure if I'm pronouncing that right) that I'll link in the description. It seems that small RAID-Z1 vdevs are roughly on par with mirrored vdevs, but as the vdevs get wider and you move up from RAID-Z1 to RAID-Z2 or even RAID-Z3, you start to see much longer resilver times.
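As a rough illustration of the point that resilver time tracks the data stored on the failed disk rather than its raw capacity, here's a back-of-the-envelope Python sketch. The sustained rebuild rate is an assumed number, and real resilver times also depend on vdev layout, fragmentation, and pool activity.

```python
def resilver_hours(data_on_failed_disk_tb, rebuild_mbps=150):
    """Very rough estimate: only the data that lived on the failed disk
    has to be rewritten. rebuild_mbps is an assumed sustained rate."""
    bytes_to_rewrite = data_on_failed_disk_tb * 1e12
    return bytes_to_rewrite / (rebuild_mbps * 1e6) / 3600

# 2 TB of data resilvers in roughly the same time whether the disk is a
# 4 TB or a 20 TB model, because the empty space isn't rewritten.
print(f"{resilver_hours(2):.1f} hours")   # ~3.7 hours at the assumed rate
```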
OK, so with our hypothetical four-drive NAS we've covered resilience and capacity, but what about performance? This always gets a little tough, because there isn't just one metric of speed: realistically you have bandwidth, IOPS, and latency, though I'm not covering latency in this video. There's a really good PDF from iXsystems, which I'll also link in the description, that goes into depth about the theoretical limitations of different pool configurations in terms of bandwidth and IOPS.

To quickly cover the key takeaways: with striping, you pretty much get the full read and write bandwidth and IOPS of all the drives. For some really easy math, assume each drive can deliver 100 MB/s read and write as well as 100 IOPS read and write. With four striped drives, you could theoretically get 400 MB/s sequential reads and writes and 400 IOPS for random reads and writes. With mirrors, the same applies to reads, but for writes you're limited by the number of vdevs; so with two vdevs of two-way mirrors, you could get 400 MB/s and 400 IOPS when reading, but only 200 MB/s and 200 IOPS when writing. With RAID-Z, IOPS are pretty straightforward: each vdev has essentially the IOPS of a single drive, so a RAID-Z1 or RAID-Z2 vdev of our four hypothetical hard drives would only get about 100 IOPS, whether reading or writing. For streaming bandwidth, though, you get the capability of all of the drives minus one drive's worth of bandwidth per parity level. So with RAID-Z1 you have a parity level of one, which means you lose 100 MB/s, giving a total of 300 MB/s read or write; the same goes for RAID-Z2, except you lose 200 MB/s, so 200 MB/s read and write.

With all of that, we've covered the capacity, resilience, and hypothetical performance of four drives, but what if we were to add two more? This opens up a lot more options. We could still have a single RAID-Z1 vdev, a single RAID-Z2 vdev, or striped mirrors, although this time with three mirrored vdevs instead of just two. We could also set this up as two RAID-Z1 vdevs with three drives each. Assuming we're still using 1 TB drives, a single RAID-Z1 vdev now provides us with 5 TB of usable space, with RAID-Z2 we would have 4 TB, and with the two RAID-Z1 vdevs we would also have 4 TB. Because two-way mirrors cut our usable capacity in half, we're left with just 3 TB. As I mentioned earlier, there are other options I'm not going to cover, like a single RAID-Z3 vdev or two three-way mirrors; in my opinion most of those don't make a lot of sense outside of some very specific applications, so I'm skipping them today.

Looking at resilience, nothing really changed with RAID-Z1 or RAID-Z2. However, it's important to note that while we still have the same parity levels, we don't have the same degree of resilience: as we add more drives to a vdev, the odds that two or three drives fail go up as well. This is why you typically don't want what are called wide vdevs; a lot of the time it's better to have multiple smaller vdevs. With the two RAID-Z1 vdevs, we can lose two drives as long as they aren't in the same vdev, and with the three two-way mirrors we can now technically lose three drives, but once again only if they aren't in the same vdev.

As we add more drives and configurations, theoretical performance gets more interesting. RAID-Z1 and RAID-Z2 IOPS would be exactly the same, but we would now be looking at streaming bandwidth of 500 MB/s or 400 MB/s respectively with our hypothetical hard drives. With the two RAID-Z1 vdevs, we would get double the IOPS but only 400 MB/s of streaming bandwidth, as we still have two drives' worth of parity. With the three two-way mirrors, we would only get the write performance of three drives, but for read IOPS and bandwidth we would theoretically get the performance of all six.

When moving up to eight drives, I added one more configuration: two vdevs in RAID-Z2. This cuts usable capacity in half, but we can also lose two drives in each vdev without losing any data. The most capacity-friendly option, a single RAID-Z1 vdev, now provides an impressive 7 TB of usable space. However, that comes with drawbacks: if two drives were to fail, we would once again lose the entire pool, and with eight drives the probability of that happening is significantly higher. For performance, all of the calculations are pretty much the same, giving us a lot of streaming bandwidth for the really wide RAID-Z pools, but we get substantially higher IOPS when moving to two RAID-Z vdevs, and even more when moving to four two-way mirrors.
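Here's a small Python sketch of those rules of thumb, assuming the same hypothetical 100 MB/s and 100 IOPS per drive. These are theoretical ceilings in the spirit of the iXsystems math, not measured numbers.

```python
DRIVE_MBPS = 100   # assumed per-drive streaming throughput
DRIVE_IOPS = 100   # assumed per-drive random IOPS

def vdev_perf(kind, n, parity=0):
    """Rough per-vdev estimates following the rules of thumb above."""
    if kind == "stripe":   # no redundancy: everything scales with drive count
        return dict(read_mbps=n*DRIVE_MBPS, write_mbps=n*DRIVE_MBPS,
                    read_iops=n*DRIVE_IOPS, write_iops=n*DRIVE_IOPS)
    if kind == "mirror":   # reads come from all disks, writes land on every disk
        return dict(read_mbps=n*DRIVE_MBPS, write_mbps=DRIVE_MBPS,
                    read_iops=n*DRIVE_IOPS, write_iops=DRIVE_IOPS)
    if kind == "raidz":    # streaming loses one drive per parity level; IOPS of one drive
        return dict(read_mbps=(n-parity)*DRIVE_MBPS, write_mbps=(n-parity)*DRIVE_MBPS,
                    read_iops=DRIVE_IOPS, write_iops=DRIVE_IOPS)

def pool_perf(vdevs):
    """A pool stripes across its vdevs, so totals are just the sum of the vdevs."""
    totals = dict(read_mbps=0, write_mbps=0, read_iops=0, write_iops=0)
    for v in vdevs:
        for key, val in vdev_perf(**v).items():
            totals[key] += val
    return totals

# Four drives as a single RAID-Z1 vdev: ~300 MB/s streaming, ~100 IOPS
print(pool_perf([dict(kind="raidz", n=4, parity=1)]))
# Six drives as three two-way mirrors: ~600/300 MB/s and ~600/300 IOPS read/write
print(pool_perf([dict(kind="mirror", n=2)] * 3))
```

Plugging in the six- and eight-drive layouts reproduces the figures above, for example 500 MB/s for a six-wide RAID-Z1 vdev and 400 MB/s with 200 IOPS for two three-wide RAID-Z1 vdevs.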
I find it interesting to look at the theoretical performance, but I also wanted to take a stab at some real-world benchmarking. Fortunately, iXsystems sent over a TrueNAS Mini R for me to borrow, and with 12 hot-swappable drive bays and 10 Gb Ethernet it made for a really great testing platform. All of my tests were run on TrueNAS Core with 10 TB Western Digital Red Plus drives, and for each configuration of vdevs we've covered so far I ran two groups of benchmarks. First I used fio (or "F-I-O", I'm not really sure), running on a Minisforum MS01 with Debian 12 installed. The MS01 has dual 10 Gb SFP+ NICs, an Intel i9-13900H, and a U.2 NVMe SSD, which should all help eliminate any bottlenecks. With fio I ran four tests in total. To simulate multiple users or applications reading and writing a lot of small files at once, I ran one random read and one random write test with a block size of 64k, an I/O depth of eight, and eight threads running in parallel. Then, to simulate large sequential reads and writes, like what you might expect when streaming media or editing large video files, I ran one sequential read and one sequential write test, both with a block size of 256k, an I/O depth of one, and just a single thread.
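In rough terms, those four fio runs would look something like the Python wrapper below. The target directory, per-job file size, and I/O engine are assumptions for the sketch, not the exact options used in the video.

```python
import subprocess

# Hypothetical mount point of the NAS share being tested
TARGET = "/mnt/nas-test"

TESTS = [
    # (name, rw pattern, block size, iodepth, jobs) per the description above
    ("rand-read",  "randread",  "64k",  8, 8),
    ("rand-write", "randwrite", "64k",  8, 8),
    ("seq-read",   "read",      "256k", 1, 1),
    ("seq-write",  "write",     "256k", 1, 1),
]

for name, rw, bs, depth, jobs in TESTS:
    subprocess.run([
        "fio", f"--name={name}", f"--rw={rw}", f"--bs={bs}",
        f"--iodepth={depth}", f"--numjobs={jobs}",
        "--size=4g",               # per-job file size: an assumption, not stated in the video
        f"--directory={TARGET}",
        "--ioengine=libaio", "--direct=1",
        "--group_reporting",
    ], check=True)
```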
I also ran another very simple benchmark using something called DiskBench, which I used to test how long it would take to transfer files between the Mini R and my Windows 10 desktop. For this, I wrote a 20 GB video file to the NAS three times, then copied it back to my desktop three times, and measured how long each transfer took. I also did the exact same thing with a folder full of random config files, to hopefully simulate some random reads and writes.

Now, if I had just run all of these tests as-is, performance would probably look really similar across each configuration, at least in the read tests, thanks to the ARC in ZFS. The ARC is a cache that can drastically increase read performance by storing commonly or recently accessed data in memory. With 32 GB of RAM, I was going to have well over 20 GB available for the ARC, which would probably make a lot of these configurations perform very similarly. I could have tried testing with a massive dataset to overwhelm the ARC, but I didn't want to be running benchmarks for a week straight, so instead I ran each test with the ARC enabled and then reran each test with the ARC disabled. I'm pretty much only going to cover the tests with the ARC disabled, and I'll show you why in a bit. Also, I don't do a lot of reviews and benchmarking and I'm not a statistician, so take all of this with a grain of salt.

We'll start by looking at the fio results for the four-drive configurations with ARC caching disabled. For sequential reads and writes, I think it makes the most sense to look at bandwidth. The striped pool takes the cake in terms of performance but, once again, is fairly useless in terms of redundancy. The mirrored vdevs performed really well here, in both read and write performance, compared to the RAID-Z pools. The ARC was disabled, but ZFS still does write caching with transaction groups and the like, and I think that's the reason our write speeds are much higher. When moving to six drives, things change a little: the single RAID-Z1 vdev now outperforms the three two-way mirrors and the two RAID-Z1 vdevs in reads, but loses by a pretty decent margin in writes. With eight drives, the two RAID-Z1 vdevs had the best read speeds and the four two-way mirrors the best write speeds; I'm not quite sure why the mirrors didn't perform better in read performance in this specific test. Looking at these results across all of the configurations, it's pretty clear that both sequential reads and writes scale reasonably well with the number of drives, although I would have expected a bit more of a bump moving from six to eight. I figured the striped mirrors would perform better here, but it seems like, at least with six or eight drives, two RAID-Z vdevs have the best balance of read and write performance. Really quickly, here are the fio results for sequential reads and writes with the ARC cache enabled, and you can see why they're not very helpful: my workloads weren't large enough to overwhelm the ARC, so read performance was pretty much the same regardless of configuration. This is important to know, because the ARC really is helpful and can add a lot of performance, so realistically, with most of these configurations, real-world results will land somewhere in the middle as the ARC helps out.

When looking at the random read and write results, I think it makes more sense to look at IOPS rather than bandwidth. These are the results from the fio random read test for all configurations, and the big takeaway is that more vdevs equals more IOPS. While there's a good amount of variance compared to what you might expect from the theoretical math, the IOPS roughly scale with the number of vdevs; for example, in the six-drive configs, one vdev gets around 30 IOPS, two get around 60, and three get roughly 90. With random writes, things got kind of messy. I think this is because of how ZFS uses memory caching and transaction groups, and this test didn't really show what the drives were capable of, so sadly I think I just have to write this one off, no pun intended. And once again, here are the random read results with the ARC cache enabled.

As I mentioned earlier, I used DiskBench from my Windows machine, and you can see pretty good scaling when transferring a large file to the NAS multiple times. This is an average of how long each transfer took in seconds, so lower is better. It seems that IOPS were still important here, even with a large video file, as configurations with more vdevs tended to perform better than wider vdevs; looking at just the four-drive configurations, the striped mirrors were substantially quicker than the RAID-Z configurations. When reading that file back to my Windows machine, the wider six- and eight-drive RAID-Z vdevs actually performed best, but with just four drives the striped mirrors once again outperformed the smaller RAID-Z vdevs. Sadly, my DiskBench test with a folder full of small text files didn't pan out very well. I imagine that with writes to the NAS, memory caching and transaction groups once again smoothed everything out, but I'm not quite sure what happened with the read test; maybe it was a limitation of my SSD or my network connection. Either way, that test doesn't seem very useful.

At this point we've covered quite a bit in terms of resilience, capacity, and performance, but what about that fourth factor, cost? There's not a ton to cover here, because most of it comes down to how much you need to spend to get the drives and configuration you want. For better performance and shorter resilver times, buying more, smaller hard drives is better than buying a few massive drives, but that comes at a cost in two ways. First, while really high-capacity drives can get expensive, you're almost always going to get more terabytes per dollar with large drives versus smaller ones. The other small cost that's still possibly relevant is power draw: this TrueNAS Mini R drew 44 watts from the wall at idle, but with four drives it consumed around 69 watts, and 97 watts with all eight drives, which would indicate that each drive consumed somewhere between 6 and 7 watts. Twenty-five watts or so might not seem like much, but if you're paying really high prices for electricity, it could easily add up to a difference of $70 to $100 a year. If you're planning to run your NAS for, say, four years and you're paying over 40 cents per kilowatt-hour, buying fewer, larger hard drives might be worth the penalties to performance and resilience.
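As a sanity check on that electricity estimate, here's the arithmetic in a few lines of Python. The 25 W figure is the approximate difference between the four- and eight-drive measurements above, and 40 cents/kWh is the high-cost scenario mentioned.

```python
def yearly_power_cost(extra_watts, usd_per_kwh):
    """Cost of the extra draw from running more drives 24/7 for a year."""
    kwh_per_year = extra_watts * 24 * 365 / 1000
    return kwh_per_year * usd_per_kwh

cost = yearly_power_cost(25, 0.40)
print(f"${cost:.0f} per year, ${cost * 4:.0f} over four years")  # ≈ $88 / ≈ $350
```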
Knowing what we know now, figuring out which drives and which configuration to get should hopefully be a little easier. If you're like me, for example, and mostly deal with a lot of large video files, having tons of operations per second isn't quite as important as having good streaming performance, so something like two four-drive vdevs in RAID-Z1 or RAID-Z2 might be a good balance of capacity, resilience, and performance. If I was building a server purely for backing up mission-critical files, RAID-Z2 or even RAID-Z3 might be a good option, and if I needed more operations per second for multiple users, I could set up a pool with two RAID-Z vdevs. For a NAS that's running a database or hosting volumes for virtual machines, you'll want tons of IOPS; in that case striped mirrors might just be a no-brainer, and for more resilience you could go with three-way mirrors, allowing for an extra drive failure per vdev.

Regardless of how you configure your NAS, it's always a good idea to have a backup of your data on a different NAS. This can be tough, especially if you're on a budget, but it's the best way to ensure your data isn't lost to a hardware failure or, more likely, human error. With TrueNAS and ZFS there's a lot more to play around with than just drive layouts, and if you want to learn more about things like caching, datasets, and all the other fun stuff to configure, I'll have lots of helpful links down in the description. This video was quite a bit different for me, but I hope you guys still found it helpful; if so, maybe give it a thumbs up and tell me about your NAS configuration down in the comments below. That's about it for this one, though, so as always, thank you guys so much for watching, stay curious, and I can't wait to see you in the next one.
Info
Channel: Hardware Haven
Views: 108,236
Keywords: raid, zfs, drives, how to setup raid, which raid should i use, raidz vs raid, raid5 vs raid6, raid10, mirror, stripe, truenas, how to setup nas, how to set up nas, how to set up truenas
Id: ykhaXo6m-04
Length: 21min 42sec (1302 seconds)
Published: Fri Feb 23 2024