RAID: Obsolete? New Tech BTRFS/ZFS and "traditional" RAID
Video Statistics and Information
Channel: Level1Enterprise
Views: 312,436
Keywords: RAID (Invention), Btrfs (Software), ZFS (Software), Tech
Id: yAuEgepZG_8
Length: 32min 56sec (1976 seconds)
Published: Thu Mar 19 2015
There are some things here that need a little touching on. He mentions that Google relies on their software to correct errors on their hardware, and that they use many cheap SATA drives to store the volume of data they need. While the part about the cheap SATA drives may be true, I have a very lengthy case study in PDF form (which I can post if I can find it) on Google's handling of large drives: RAID rebuilds simply take too long, and arrays suffer further failures during the rebuild. Their solution is replication, not correction. They're not rebuilding/correcting; because their arrays are so large, with such large disks, they no longer rely on rebuilding at all. They simply take a bad array offline and repopulate it in the background from replicas. It's a very wide and flat scheme.
I read a blog post that made a good point, even if it was a little tongue-in-cheek: run large arrays in RAID0. You get the benefit of speed, and you won't be living on the assumption that you can rebuild the array, so your backups and replication will be PERFECT. An edgy idea, but not bad overall: when it comes to data preservation, ZFS/butterfs/etc. are all great, but nothing beats replication to many redundant arrays.
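For illustration, a minimal sketch of what replication-first protection looks like with ZFS send/receive; the pool "tank", the dataset, the snapshot names, and the host "backuphost" are placeholders, not anything from the video or the comments:

    # Take an atomic snapshot of the source dataset.
    zfs snapshot tank/data@2015-03-19

    # Full replication of the snapshot to a second pool over SSH.
    zfs send tank/data@2015-03-19 | ssh backuphost zfs receive backup/data

    # Later: send only the blocks that changed since the last snapshot.
    zfs snapshot tank/data@2015-03-20
    zfs send -i tank/data@2015-03-19 tank/data@2015-03-20 \
        | ssh backuphost zfs receive backup/data

With copies on several independent arrays, losing one array means re-replicating from a replica, not rebuilding in place.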
+1 for the Tek Syndicate guys
Very interesting, I still like mdraid as a low-end option though.
Those zones at the end of the video didn't seem clickable; I was interested in buying him a beer.
There are some glaring problems with this presentation.
First, a NAS is "network attached storage". This is very different from a SAN (storage area network). In the former, the storage device is typically given a single IP address and a filesystem is exported over the network (NFS, CIFS, etc.). The latter is a dedicated network of multiple disks or storage devices, typically exported over iSCSI or Fibre Channel, where the storage is accessed as raw blocks. While the end goal may be the same, they are very different, and I feel he didn't explain this well at all.
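To make the distinction concrete, a minimal sketch; the paths, subnet, device names, and IQN below are hypothetical examples, not from the video:

    # NAS: export a filesystem over the network (an /etc/exports line for NFS).
    # Clients mount it and see files; the server owns the filesystem.
    /srv/storage  192.168.1.0/24(rw,sync,no_subtree_check)

    # SAN: export a raw block device over iSCSI (targetcli on Linux).
    # The client sees a disk and formats it with its own filesystem.
    targetcli /backstores/block create name=lun0 dev=/dev/sdb
    targetcli /iscsi create iqn.2015-03.com.example:target1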
Second, he treats software RAID as a second-class citizen in storage without addressing any of the hardware controller issues. Even worse, he brushes off Linux software RAID as something no enterprise serious about data integrity would even consider.
Given the abundance of CPU, bus, and RAM resources these days, software RAID will usually outperform hardware RAID controllers. Linux software RAID also supports TRIM if the underlying disks are SSDs, which is virtually unheard of in hardware. Linux software RAID supports external metadata formats as well, allowing the use of "fake RAID" (firmware RAID) disks.
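As a concrete sketch (device names are placeholders, and the IMSM line assumes an Intel firmware-RAID platform):

    # Mirror two SSDs with Linux MD RAID.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb

    # TRIM issued by the filesystem passes through the md layer to the SSDs.
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt && fstrim -v /mnt

    # External "fake RAID" metadata (Intel IMSM) starts with a container;
    # the actual volume is then created inside it.
    mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=2 /dev/sdc /dev/sdd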
Hardware RAID may have a battery backup, but unless you're willing to front a good deal of cash, the controllers are slower and prone to failure.
Thirdly, ZFS and BTRFS do not need battery-backed controllers to keep the filesystem in a consistent state. Both are atomic in nature, meaning you get all of the write or none of the write. In the event of a power failure, your filesystem will still be consistent, just old. With persistent external SSD write caches, which both ZFS and BTRFS support, the transaction will be present on the next boot and flushed when the pool is available. There is no need for battery-backed controllers with ZFS or BTRFS.
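A minimal sketch of the ZFS side of this (pool and device names are placeholders):

    # Add a mirrored SSD log device (SLOG) so synchronous writes survive
    # power loss; the intent log is replayed on the next pool import.
    zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1

    # Verify the log vdev is attached and healthy.
    zpool status tank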
Fourthly, ZFS does not need tons of RAM unless you're using deduplication or a large L2ARC filled to capacity. ZFS gets by fine on 2 GB RAM installs on laptops or workstations. Yes, it will use all the RAM you allow it, but this isn't a bad thing, and it's tunable.
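For example, on ZFS on Linux the ARC ceiling is a module parameter; the 1 GiB figure below is just an illustration:

    # Cap the ARC at 1 GiB across reboots.
    echo "options zfs zfs_arc_max=1073741824" >> /etc/modprobe.d/zfs.conf

    # Or change it at runtime, no reboot needed.
    echo 1073741824 > /sys/module/zfs/parameters/zfs_arc_max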
Fifthly, ZFS and BTRFS are true software RAID managers. Yes, they're filesystems, but they employ legitimate software RAID. The RAID1, RAID5, and RAID6 in BTRFS and Linux MDRAID are standard RAID levels. In ZFS, RAIDZ1, RAIDZ2, and RAIDZ3 are certainly nonstandard, but it is legitimate RAID taking advantage of all the disks in the array. Nothing is being fudged, in any case.
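A quick sketch of what those levels look like in practice (disk names are placeholders, and the zpool lines are alternatives, not a sequence):

    # ZFS mirror, the RAID1 analogue.
    zpool create tank mirror /dev/sda /dev/sdb

    # RAIDZ2: double parity across four disks, roughly comparable to RAID6.
    zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

    # BTRFS: mirror both data and metadata across two disks.
    mkfs.btrfs -d raid1 -m raid1 /dev/sde /dev/sdf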
Finally, the large advantage of software RAID over hardware is the ability to migrate disks from one system to another. ZFS, BTRFS, and Linux MDRAID make this virtually painless. Not so with hardware RAID controllers: with proprietary controllers, you are locked in to that vendor when migrating disks. This is partly the reason SANs are so expensive. You have to have a controlling head that will talk to the disk shelves, and the disks cannot be migrated from one vendor to another without data loss or multiple copies.
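Migration really is this short, since the metadata lives on the disks themselves (pool and device names are placeholders):

    # ZFS: cleanly detach the pool, move the disks, re-import on the new box.
    zpool export tank        # on the old system
    zpool import tank        # on the new system; scans attached disks

    # Linux MD: scan attached disks and assemble any arrays found.
    mdadm --assemble --scan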
2 stars.
Isn't this just some dude reading me Wikipedia?
Seth Rogen speaking RAID, awesome
I'm pretty sure this guy is either oversimplifying or flat-out wrong in many cases. I think he's really overplaying the occurrence of data corruption from the "write hole" - this is an issue with single-disk desktops too, yet you can, 99 times out of 100, pull the power cord from a Windows box, plug it back in, and boot OK. IME, and yes, this is just my experience, MD RAID very rarely if ever kills an array because you pull or lose power and then power it back on.
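For what it's worth, a hedged sketch of how you'd check an MD array after an unclean shutdown (the device name is a placeholder):

    # Inspect array state after a crash or power pull.
    cat /proc/mdstat            # arrays, members, and any resync in progress
    mdadm --detail /dev/md0     # per-array state and failed/spare counts

    # A write-intent bitmap limits the post-crash resync to recently
    # written regions instead of the whole array.
    mdadm --grow /dev/md0 --bitmap=internal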
That said, I do like ZFS and use it at home, and again, I can run it fine on a home system with 2 GB of RAM, and for videos at least I can pull power whenever and not lose any noticeable data. The system is pretty old, though I did just upgrade the disks. When I rebuild the whole thing in the future, I will likely go ECC, add more RAM, etc.