Intro To Software Defined Storage! Hardware vs. Software Raid & ZFS!

Video Statistics and Information

Reddit Comments

This guy's a legend

👍︎︎ 101 👤︎︎ u/ssonicmail 📅︎︎ Aug 08 2020 🗫︎ replies

Man I'd love to have one of those google appliances, or even just the faceplate

👍︎︎ 19 👤︎︎ u/siscorskiy 📅︎︎ Aug 08 2020 🗫︎ replies

In Wendell we trust!

👍︎︎ 33 👤︎︎ u/aviftw 📅︎︎ Aug 08 2020 🗫︎ replies

No mention of CEPH, etc...

👍︎︎ 7 👤︎︎ u/SilentLennie 📅︎︎ Aug 09 2020 🗫︎ replies

Love Wendell

👍︎︎ 13 👤︎︎ u/onmf 📅︎︎ Aug 08 2020 🗫︎ replies

What aboot...CEPH?

👍︎︎ 6 👤︎︎ u/Cuntable 📅︎︎ Aug 09 2020 🗫︎ replies

I can't believe a legend like this was being held down by the nut that is Logan, and not that long ago.

👍︎︎ 3 👤︎︎ u/QuickShutter 📅︎︎ Aug 09 2020 🗫︎ replies

LTT, take some notes.

👍︎︎ 7 👤︎︎ u/jonboy345 📅︎︎ Aug 09 2020 🗫︎ replies

Is there already some kind of information about the "mix different drives" feature available?
The last few ZFS talks I watched basically only talked about vdev expansion.

👍︎︎ 2 👤︎︎ u/floriplum 📅︎︎ Aug 09 2020 🗫︎ replies
Captions
I'm back with our Gigabyte R181 dual Xeon server again: a massive amount of memory, dual redundant power supplies, only 1U. You might think that means it's got limited expansion capability, but that's not really true; this actually makes a really great platform for software-defined storage solutions. Strictly speaking, "software-defined storage" is a marketing buzzword that doesn't really mean anything by itself. Then I got to thinking about RAID and the different disk configurations that are available in this chassis, and I thought we could talk for a second about software-defined storage, what that means for RAID arrays, and why RAID controllers are generally an older, obsolete technology now. There's really not a lot to it.

If you look at our chassis here, we have two CPU sockets, so a lot of CPU horsepower with the dual 28-core processors, and a ton of room for memory expansion. The front panel is configured with ten 2.5-inch drive bays. It's configured for SATA right now, but it could just as easily be Serial Attached SCSI for high-end SSDs; SAS is a little bit yesteryear, but it also offers some more advanced features than SATA. There's also an NVMe front-panel option, so if you want 2.5-inch NVMe drives, not a problem, you can do that with a different chassis configuration. Those OCP slots at the back are low profile and high bandwidth, for things like 25 or 100 gig network connections, and they won't get in the way of any of your regular PCIe expansion, so you can use the regular slots for external interface cards or for NVMe. If we want to connect a lot of old, slow spinning rust, we can do it, but we can also mix in new, fast, modern storage like NVMe and flash, just solid-state storage, even though it's only a 1U chassis.

So, let's talk a little bit about RAID. The way you should think about a RAID controller is that it's a computer on a card; it's turtles all the way down. On a RAID card you take a bunch of mechanical or SSD disks, or maybe both: some of them will let you mix SSDs and mechanical disks and use the SSDs for caching, like LSI CacheCade (other cards call it something else), where a little bit of flash memory caches the storage. It could be SAS, could be SATA, there are a few that do support NVMe, and maybe even spinning rust, and it tries to do everything with one piece of hardware. It takes that amalgamation and presents it to the host server as if it were one piece of storage. This is how a RAID card has traditionally worked.

The problem with that is the scalability is kind of limited. You can only scale within this one chassis, maybe an external chassis or two for disks, and failover and redundancy aren't that great. If you imagine a hypothetical scenario where I've got two of these 1U servers and they're managing my storage array, it actually gets really complicated to handle that at a hardware level with a hardware RAID controller, because each one of them needs a physical piece of hardware that connects to both sets of disks, and they need to do a lot of intricate communication. There are ways to do that, but almost all of them don't involve physical PCIe RAID cards running inside the machine. For that we need to get a little more complicated on the software side of things to manage the complexity, and that's the "software" in software-defined storage. Let's take a look at the Level One storage array, which uses ZFS and also flash.
Oh, and we'll bring this with us. There we are: Ol' Yeller, the Google server. This is currently the main front end for Level One's main storage, 172 terabytes of raw capacity, but we physically split things up: one set of drives holds the data here, our videos and everything else, and then there's another replica off-site. It's really a pretty mundane, pedestrian setup; ZFS is not doing anything particularly magical here. ZFS is a file system, but it really combines device management, volume management, and a file system all together. I've got a ton of mechanical hard drives here, but there are also 2.5-inch drives and PCI Express flash storage. ZFS historically doesn't do the best job of combining high-speed devices and low-speed devices. They are working on that, it is going to be a thing, but for now you can have a high-speed ZFS pool and a lower-speed mechanical ZFS pool, and that's about it.

So I want to talk about this in the context of Linux's LVM. LVM is a piece of software on Linux. Say I add my shelves of hard drives here, 48 mechanical hard drives in this case, although there are rack-mount cases like this that are two drives deep to a sled, so when you pull one sled out you get two hard drives, not just one, so 96 hard drives instead of 48. Because LVM is software, these drives present to the server individually. If you ask a server operating system, historically something like Windows, "hey, manage all of these hard drives," it's really not gonna do a great job; it's not super optimized for that, so you think: hardware RAID controller. These actually are NetApp shelves. This is older technology, but the NetApp shelves were exactly that: they were built to be network-attached storage or Fibre Channel-attached storage, and it was really just computers. It wasn't a hardware RAID controller, it wasn't a computer on a small card, it was full rack servers just like this. That's why I can still use these, even 10 years later, and still get a lot of mileage out of them: because they were so well designed. I can even mix in this brand-new Gigabyte R181 server, which has dual Xeon Gold 6280s and 768 gigabytes of memory, with this older technology. This is my slow storage tier, and because the software has advanced, the hardware is still valuable, because it's still useful, because it is just a software construct.

Back to LVM. What LVM does is let you just add hard drives to a pool. So if I were using LVM with this instead of ZFS, I would be able to just tell LVM, "hey, here are 48 hard drives, add them to the pool." Now in LVM I can create volumes, and this is a little different from the partitions you might be familiar with. Partitioning a hard drive means taking one hard drive and slicing it up like a pie, and it's a little destructive when you do that. With partitions, in the traditional vernacular, you say two-thirds of the disk is one partition, a third of the disk is another partition, and there might be just a few hundred megabytes or a gigabyte for a third partition, but you've set in stone how that disk is going to be used from now until the end of time. With a volume manager, you just add all of the disks to be managed by the volume manager, and then at volume creation time you specify how each volume should behave (a minimal sketch of that workflow follows below).
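As a rough illustration of that pool-then-volumes workflow, here is a minimal sketch using the standard LVM tools. The device names, the volume group name "tank", and the volume sizes are hypothetical, and the script assumes root on a Linux box with LVM installed; it is an illustration, not the setup described in the video.

```python
#!/usr/bin/env python3
"""Minimal sketch of the LVM workflow described above.

Device names (/dev/sdb ...), the volume group name "tank", and the
volume sizes are placeholders. Run as root; pvcreate/vgcreate will
destroy data on the listed disks."""
import subprocess

DISKS = ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]  # hypothetical drives
VG = "tank"


def run(*cmd):
    """Run a command, echo it, and fail loudly on error."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


# 1. Mark each disk as an LVM physical volume.
for disk in DISKS:
    run("pvcreate", disk)

# 2. Add them all to one volume group: the pool of raw capacity.
run("vgcreate", VG, *DISKS)

# 3. Carve logical volumes out of the pool at creation time.
#    Unlike a fixed partition, a logical volume can be grown later
#    with lvextend as long as the pool has free space.
run("lvcreate", "--name", "videos", "--size", "2T", VG)
run("lvcreate", "--name", "scratch", "--size", "500G", VG)

# Redundancy can also be requested per volume (covered next), e.g. a mirror:
run("lvcreate", "--type", "raid1", "-m", "1", "--name", "important", "--size", "1T", VG)
```

Each logical volume then shows up as a block device such as /dev/tank/videos, which, as the transcript gets to below, still needs a file system on top.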
At volume creation time you can say, "hey, I want you to pre-allocate a certain number of terabytes," or "I want to make sure the performance is always this in terms of I/O," or "I want this level of redundancy." That last one was something recently added to LVM on Linux. So, wait a minute, redundancy? Yeah. Think about a traditional RAID controller: I'm gonna create a RAID 5 array or a RAID 6 array, or I might have a RAID 1 on the controller for some of the drives and RAID 5 for some of the other drives. RAID 1 is just a mirror, the easiest thing. RAID 5 gives you one drive's capacity worth of redundancy, but that's about it. RAID 6 gives you two drives' worth of redundancy. Not that one drive only stores redundant information; you just get one drive's worth of extra capacity used for redundancy. So if you have five drives in RAID 5, 20% of the space is used for redundancy information, which happens to equal the capacity of one drive, but that redundancy information is spread around on all of the drives. With six drives in RAID 6, two-sixths of the space is used for redundancy information, and that's spread across all of the drives.

ZFS is kind of similar. You can have a ZFS pool made up of multiple vdevs, and you can elect what level of redundancy you want for each vdev. Then, inside the pool, you can also create datasets that have the properties you want, including things like compression and some other stuff. The vdevs determine the level of redundancy, and the datasets determine the access parameters (see the pool sketch after this passage). So ZFS and LVM don't really line up exactly, but that's neither here nor there.

Another really important thing to understand and keep in mind, hopefully without too much from the information firehose, is that LVM is just a volume manager. You create the volume and it shows up as a block device with whatever redundancy you specified. So with RAID 1 and a whole bunch of hard drives you're gonna have a mirror, and it can be an n-way mirror: a three-way mirror, a four-way mirror, a five-way mirror. With RAID 5 or RAID 6 or RAID 10, it's gonna change the underlying strategy it uses to allocate data across all of those drives. All of that just gives you a block device, though; you still need to create a file system, so ext2 or 3 or 4, or XFS. btrfs is kind of an attempt at bringing some of the ZFS features onto a file system that can deal with multiple hard drives. One of the quirks of that, originally, was: I create a btrfs volume that's gonna be RAID 1, and then ask, what's my free space? It reports the raw free space, which is twice the usable free space, because files are twice as big as you think they are: when it writes one file, it writes another copy to another drive.

The problem with the LVM approach is that LVM doesn't care about the file system; it just creates a block device, and it's up to LVM to manage the redundancy information, the CRCs, and things like that, so it can be not quite as efficient. Whereas something like ZFS is built from the ground up for redundancy, and, okay, I can't believe I'm saying this, but you have less overhead with ZFS than the sum total of the overhead you have between the file system, the volume manager, and all of the other componentry. And so this whole enchilada is the building block of software-defined storage.
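To make the vdev/dataset layering above concrete, here is a minimal sketch of creating a pool and a couple of datasets with the standard zpool and zfs commands. The pool name, device names, and property values are hypothetical, not the actual Level One configuration.

```python
#!/usr/bin/env python3
"""Sketch of the ZFS layering described above: vdevs set the redundancy,
datasets set the access properties. Names and devices are hypothetical;
run as root on a machine with ZFS installed."""
import subprocess


def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


# One pool built from two RAID-Z2 vdevs; each vdev survives two drive failures.
run("zpool", "create", "tank",
    "raidz2", "/dev/sda", "/dev/sdb", "/dev/sdc",
              "/dev/sdd", "/dev/sde", "/dev/sdf",
    "raidz2", "/dev/sdg", "/dev/sdh", "/dev/sdi",
              "/dev/sdj", "/dev/sdk", "/dev/sdl")

# Datasets inherit the pool's redundancy but carry their own properties,
# for example compression and record size tuned per workload.
run("zfs", "create", "-o", "compression=lz4", "tank/videos")
run("zfs", "create", "-o", "compression=lz4", "-o", "recordsize=16K",
    "tank/databases")
```

Redundancy lives at the vdev layer here, while per-dataset properties handle the access side, which is exactly the split described above.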
Now, I mentioned backup. ZFS has a really amazing capability built into it: because it journals everything, you can create a remote data set, a copy of all of our ZFS stuff somewhere else. Normally, with a traditional file system, you have to scan the entire file system to look for changes and send those changes to the remote system. Because of the way ZFS creates a transaction log and tracks transactions, it doesn't have to do that. The algorithmic efficiency of that is basically O(1), a constant, because everything that has been added to the transaction log since the last transaction the remote file system has seen is exactly what has to be sent over the wire. ZFS can just look at its transaction log history, look at the blocks that have changed, and linearly go through the list of data that has to be sent to the remote file system. This is really awesome, but it's not something that happens at the file system level or even at a hardware level; there are actually cron jobs and things like that involved. There's a software layer that happens in user space, and no matter what kind of software-defined storage system you're looking at, those kinds of things happen.

With ZFS, I use tags on the ZFS datasets to determine how backed up each one is (a sketch of that tag-driven layer follows at the end of this passage). So things like our template projects and some of our really important stuff will actually get another extra copy, and it's basically immutable, sort of: if somebody tries to accidentally delete it, ZFS says, okay, I'll let you think you deleted it, or I'll mark it as deleted, but I'm never actually gonna remove the data set that contains the snapshot of that information. And so the software-defined storage can actually create a whole bunch of those arrays. You can say: these are SQL databases, these need this level of redundancy, we need this many off-site backups, we need to store this many backups in cold storage. Because the datasets you store the data in carry those tags, you don't have to worry about explicitly managing any of that.

In an environment where you have a lot of virtual machines, for example on a modern NetApp system or a modern VMware system, software-defined storage means you store your VM wherever it needs to be. Maybe we've got a pool of terminal servers that are all generated by script, so the dataset they're stored in doesn't even need to be backed up, because they can be regenerated at a moment's notice from our installation script and added to our remote desktop or Horizon client storage pool or whatever it is; we really don't need a lot of redundancy there. That's where vSAN and all the different vendors' special sauce come into the picture.

So that's what I mean when I say software-defined storage is a little bit of a marketing term. For us, with ZFS in the open-source world, it's more transparent, because I just create a script: here's my backup, here's the thing that's running, here's how it's gonna do its thing. With LVM you can configure the same kind of thing with rsync and backups, using the stuff the volume manager supports, the things the file system supports, and the things your vendor-level software supports. Maybe you can't make a clean copy of things like MySQL databases without hooking into MySQL's replication setup, but depending on what you do with your VMs and how your deployment scripts are set up, you may be able to automate those kinds of things.
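Here is a minimal sketch of what that user-space, tag-driven layer could look like when driven from cron. The level1:backup user property, the snapshot naming, and the remote host are hypothetical (ZFS user properties just need a colon in the name); zfs snapshot, zfs send -i, and zfs receive are the standard commands, but this is an illustration rather than the actual Level One script.

```python
#!/usr/bin/env python3
"""Sketch of a tag-driven ZFS replication job, as described above.

The "level1:backup" property, snapshot naming, and remote host are
hypothetical. Assumes the remote pool uses the same dataset paths."""
import datetime
import subprocess

POOL = "tank"
REMOTE = "backup@offsite-host"   # hypothetical replica target


def zfs(*args):
    """Run a zfs command and return its stdout."""
    result = subprocess.run(("zfs",) + args, check=True,
                            capture_output=True, text=True)
    return result.stdout


def datasets_tagged_for_backup():
    """List datasets whose level1:backup user property is set to 'yes'."""
    tagged = []
    for line in zfs("get", "-H", "-r", "-o", "name,value",
                    "level1:backup", POOL).splitlines():
        name, value = line.split("\t")
        if value == "yes":
            tagged.append(name)
    return tagged


def replicate(dataset, prev_snap, new_snap):
    """Send only the blocks written since prev_snap; no full-tree scan."""
    send = subprocess.Popen(
        ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"],
        stdout=subprocess.PIPE)
    subprocess.run(["ssh", REMOTE, "zfs", "receive", "-F", dataset],
                   stdin=send.stdout, check=True)
    send.wait()


if __name__ == "__main__":
    today = datetime.date.today().isoformat()
    for ds in datasets_tagged_for_backup():
        zfs("snapshot", f"{ds}@{today}")
        # Finding the newest snapshot both sides already share is left out;
        # a real job would track that and then call:
        # replicate(ds, last_common_snapshot, today)
```

Tagging a dataset is just "zfs set level1:backup=yes tank/projects"; datasets without the tag report "-" for the property and are skipped.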
And just by virtue of a machine being in a certain class, or having a certain tag, or however you've got your system set up, those things happen automatically so that an administrator doesn't have to worry about it. You, as the senior administrator, can give your junior administrators in another office access, and just by virtue of their machines being tagged a certain way, or being in a certain organizational unit, or running in a certain VMware cluster, or whatever your vendor-specific mechanism is, just by virtue of it being in the right place, you get this wonderful checklist of all the stuff that's gonna happen with the data on that system. Microsoft even has a whole separate product in System Center just for managing data and continuity, because it's another way to bill people, right? But really it's just data assurance. That's all it is.

So I'm really looking forward to getting this 1U server from Gigabyte set up to replace our aging Google server; it's got a little bit more horsepower. You may think that doing all of this in software, doing all of this with the CPU, has a lot of overhead and is generally not a good thing. Well, Intel in particular has Virtual RAID on CPU (VROC). These are hardware extensions that basically let all of the RAID stuff, all of the device-level stuff that's physically happening with the storage attached to the server, run with basically no real performance penalty. It eats a little bit into memory bandwidth, but usually you've got memory bandwidth to spare, and while the CPU is doing regular computations it can also be doing VROC computation in parallel, so the performance hit is actually very small. On competing platforms like AMD EPYC, most of the SIMD stuff can also run in parallel; it costs a little more of the power and thermal budget on the AMD side versus Intel, so it depends on what your power budget is. Even if you're running a lot of these RAID-type calculations against spinning rust, because spinning rust is so slow and the CPUs are so much faster than storage, even with 48 or 96 disks attached to a single node it's really not that much CPU overhead to manage all the throughput and I/O.

But if you were to try to do that on a PCIe card, oh, it's a nightmare; you shouldn't even bother. That's why we haven't seen NVMe RAID cards: the kind of CPU horsepower you would need to manage even a four-way NVMe array on a little PCIe card is not a thing; the card is gonna get hot and be unreliable and just terrible. It's much better to manage that in the CPU, and that's why we have things like VROC, and that's why there's not really much of a marketplace for RAID controllers that involve NVMe for anything other than a cache. And thus was born software-defined storage (a sketch of CPU-side NVMe RAID follows below).
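On Linux, CPU-side RAID usually means the md layer, which is also the stack Intel VROC drives (with IMSM container metadata on a VROC-enabled platform). Here is a minimal sketch with plain mdadm over hypothetical NVMe namespaces; it is an illustration of the idea, not a VROC setup guide.

```python
#!/usr/bin/env python3
"""Minimal sketch of CPU-side (md) RAID over NVMe, as discussed above.

Device names are hypothetical. Plain mdadm is shown; Intel VROC uses the
same md stack but with IMSM metadata and a VROC-capable platform.
Run as root; this destroys data on the listed drives."""
import subprocess

NVME = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]

# Single-parity stripe across four NVMe namespaces, computed by the host CPU.
subprocess.run(
    ["mdadm", "--create", "/dev/md0",
     "--level=5",
     f"--raid-devices={len(NVME)}",
     *NVME],
    check=True)

# The array appears as one block device, /dev/md0, with no RAID card involved;
# the parity math rides on the CPU's SIMD units alongside regular work.
```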
And there you go: that's a fairly dense, fairly quick rundown of all the stuff that gave us software-defined storage, and ZFS. ZFS is really awesome, and ZFS is gonna be amazing when it properly supports mixed speeds of disks. It actually has some stuff built in to manage pools of disks that are a little bit different in speed, but it really struggles when you've got a pool of NVMe plus a pool of SATA plus a pool of mechanical hard drives and you just want one big volume you don't have to think about. ZFS doesn't do that yet. That's the promise of ZFS, but it doesn't do that yet. It's the promise of ZFS in general: I don't want to have to think about how my data is stored at a bit level, just make it happen, and make it happen across servers. It can do that, but we're not quite there yet in terms of the performance tuning and some of the memory overhead; there are a lot of things to worry about with ZFS. There are proprietary solutions, like NetApp's modern proprietary offering, and it is actually very nice. It does a lot of this stuff; they've spent billions of dollars of engineering to figure it out, it works really well, and it is the least horribly proprietary storage solution that's out there. I mean, we could talk about EMC, but no. I'm Wendell, this is Level One, I'm signing out, and I'll see you later.
Info
Channel: Level1Techs
Views: 57,390
Keywords: technology, science, design, ux, computers, hardware, software, programming, level1, l1, level one
Id: uBfXdJGmWoM
Length: 17min 49sec (1069 seconds)
Published: Sat Aug 08 2020