Level1: We bootstrapped our own ZFS storage server: 172 TB, extremely low cost

Well, thanks for joining us for another episode of the news. Be sure and check us out next week, same time, same channel. Be sure to hit the little bell at the bottom, that'll be good. See you guys. See ya.

That was a good episode. It was a good episode. I don't understand... hopefully it recorded correctly. Yeah, I'm just gonna copy this to the network drive. You go grab the Star Trek action figures, we'll get this party started. Yeah, okay. All right, excellent, excellent, excellent. I, uh, I'm gonna be Troi this time, let's agree to that right now. That's fine, okay, as long as you can put Worf's head on Data's body, like, turn again. I mean, uh... wait, wait, wait, the network drive is full. How could it be full? What? Oh, what's going on with this? I thought you were gonna add storage. I did, I added a terabyte. A terabyte, when, three days ago? How do we use up that much storage in that amount of time? Let's figure it out.

It's a bunch of encrypted files in Krista's folder. We've talked about this with her before. You know, ever since we started the Level1 thing... it used to be just us and her friends and some of the Twitch viewers, but now it's YouTube commenters, it's other YouTubers. The stalking, it's gotten out of control. She's really, really good at it. It's really scary. I mean, this is terabytes and terabytes of data. I've tried to talk to her about it; she just hisses at me and tries to claw my eyes. I haven't seen her in a while. I think she's in her lair. Ah, yeah, she spends most of her time in there. But I don't know what we're gonna do about this. I mean, how can we solve this, brother, if she's using a terabyte every week, every three days? Well, you know what, I've never really had good luck confronting her with that. Why don't we just build the storage array to end all storage arrays and just call it a day? It's gonna have to be like a hundred terabytes. Well, I mean, we could probably do 170. 170? That should buy us like a month, maybe more. Well, it's gonna have
to buy us at least six months, because I don't think we're going to have the Patreon money to get something larger for a long time. Amazing. With compression, we might as well do it. Let's do it. Let's build a storage array.

Well, if you don't get bored and run away, we're going to show you sort of the mindset of a system architect. We came to this video a bunch of different ways, but ultimately thought: system architecture, what does it mean to be a system architect? As a career, you guys may be interested in that. And approaching our video storage solution for Level1, we kind of were like, yeah, this is kind of a system architecture problem; maybe we can give people a quick overview. This could also be a jumping-off point for a whole bunch of different subjects, not just for the Level1 channel but also for the Linux channel, maybe even the enterprise channel, because this is all enterprise-grade hardware that we're talking about. Krista's data hoarding problem aside, we're dealing with a problem in typical nerd fashion: sort of passive-aggressively, by adding more storage rather than confronting the problem head-on. But, ah, it's probably fine. We're all data hoarders, I guess.

If you can stick with me, you can sort of run through the mindset, the thought process, of a system architect. So we picked up some stuff from surplus. Well, actually, this is from a different project, but we picked up some stuff from surplus. Data storage is a real problem, because we don't really want to spend a lot of money, but we want something that's good. So what's the biggest bang for our buck? Like a lot of things, it's probably getting some used enterprise gear rather than buying a whole bunch of new stuff. And so what does it really mean for video storage? Well, let's turn it around and talk about it in terms of system architecture. If I were a system architect brought in to look at
this, and I wanted to spend as little money as possible but end up with a reasonable result, I'm going to have to look at what the requirements are. The real requirements are good storage of video, a lot of video potentially, because even just recording this, this project is going to be 50 gigabytes, probably, per video by the time everything is said and done, even edited down to a 10-minute video. And there may be more videos in store, who knows. There are other kinds of projects that we do: we do stuff with Linux, we do stuff with virtual machines, we do stuff with other types of storage. There's only going to be five people hitting the system at once. An enterprise system with only five people sounds like a developer organization or something crazy like that. We want it to be as fast as possible. We want people using this storage to feel like they're using storage on their local computer. We want whatever operations are running on the storage to feel as fast as they possibly can, without also spending an insane amount of money.

So what are some of the low-hanging-fruit options that would go through my head as a system architect, if I've got my system architect hat on? Well, a SAN, a storage area network, is one of the first things to pop into my head. There are probably some of you guys out there in the enterprise where your companies, the places you work for, are still running these 6510s, or maybe you have this as an EqualLogic unit. It's got 48 physical hard drive slots in it. It will support SSDs, 15,000 RPM SCSI hard drives, the whole nine yards. It's got 40 gigabits of connectivity: four 10-gigabit interfaces in two redundant controllers. This thing's a beast. It's completely nuts.

The first thing that you have to learn as a system architect is where the bodies are buried, what the lies are. And the first lie is that there's no such thing as
hardware RAID. What I mean by that is that somewhere there is a computer doing software for your RAID, and it is important to understand exactly what that software is written to do. So it's like, wait, this is a hardware RAID controller? Yes, and we've got some lovely B-roll of this hardware RAID controller. This is an LSI SAS controller. It's older, but this is, quote-unquote, a hardware RAID controller. But it's not really. I mean, what do we mean by that? It's a computer on a card. There's a processor on this PCI Express card with some RAM and a battery backup unit, so this computer on this card will keep running even when the host computer it's connected to through PCI Express is not supplying power anymore. What the software on this computer does is make sure that everything that was written to RAM on this card makes it to the physical disks the next time they're turned on. So if there's an unexpected power outage or crash or something like that, and this RAID controller is cut off from its disks, whatever it had not written to disk will hang behind in memory on this card because of this battery, and whenever the disks get turned back on, it will be written to disk.

Same with this guy. This guy has a computer in it, and it has a battery. Each computer has a battery, actually, and it's got 2 gigabytes of RAM. I mean, these are DIMMs; this should look pretty familiar. This guy has two 10-gigabit GBIC interfaces and an Ethernet interface, but it's really just a computer on a card, and the enclosure has two of them. It's really just the software.

Well, now, what about these guys? What do they have for a computer? Well, they don't have a computer. They're disk shelves. They're disk shelves that include a port multiplier, which is another important component of the system (I'll come back to that in a minute), but they're really just Serial Attached SCSI interfaces to a controller or something. All right, well, what are we going to use for a
computer in that case, to manage the data, to manage the array? Glad you asked. We're going to repurpose the Google Search Appliance, because it's not without a certain sense of irony that we do that, into actually hosting and maintaining and controlling the data.

Okay, what file system are we going to use? ZFS. ZFS is probably the file system that they're using in the enterprise. It is an incredibly advanced file system. It does have a ridiculous amount of overhead: you lose performance compared to the raw hardware. There is a performance penalty associated with ZFS because it does so much stuff. But, as a system architect, this is going to be our primary storage. We need some data integrity checking, we need some control over that, so the penalty doesn't really bother me that much.

I'm thinking in the back of my head: what are the options for interface speed? 10 gigabit, dual 10 Gigabit Ethernet, 40 Gigabit Ethernet maybe, and that's pretty much it. Thunderbolt is an option. I took a look at a QNAP with Thunderbolt ports; that is a really clever, really low-cost option to deliver a lot of bandwidth to a relatively small number of editing workstations, and honestly that wouldn't be a terrible solution in this scenario because of the bandwidth. The software support is the only thing that's really lacking in that solution. With this, we're talking about 10 Gigabit Ethernet: a little over a gigabyte per second in real-world performance. That's going to be faster than any SATA SSD, but it's not going to be quite as fast as a high-performance NVMe SSD. So, faster than SATA but slower than NVMe. 10 gigabit, I can live with that, especially just starting out. Now, if we end up with multiple editing workstations and multiple this, that, and the other, multiple 10-gigabit interfaces would make more sense and be more utilized. We could even use them in an LACP group if our switch is sophisticated enough. But for now I'm just thinking, 10 gigabit,
that's probably fast enough. As a system architect, you're going to sort of have all the pieces, but you're still going to have to do experiments in order to figure out the best arrangement of those pieces to get the best possible performance.

Well, our Google enclosure is really a Dell R710. Right now it's outfitted with 96 gigabytes of RAM and two six-core X5670 CPUs, I think; X5670 or X5690. But anyway, two six-core CPUs, 12 cores, 96 gigs of RAM. Now, I've got twelve 16-gigabyte DIMMs; it'll run with up to 192 gigs of RAM, and that's what's going to be our final configuration.

We've also got five of these enclosures. Each one of these enclosures has 24 two-terabyte hard drives, with 24 gigabits of Serial Attached SCSI connectivity to the host. So what I'm going to do is daisy-chain two enclosures off of two controllers so that we've got a balanced, you know, four-path system. But one option that we have is maybe to only run one or two enclosures for now and leave the other two off until we actually need the storage. With ZFS, we can add those other enclosures as we need storage, so we can keep them off, save money on electricity, and also save wear and tear on the enclosures. That actually works out really well for us. Now, we are planning to keep one enclosure, the fifth enclosure, back just for spare parts: spare drives for ones that might die, spare power supplies, spare controllers, whatever we might need. So we are going to have one enclosure that's just completely off.

Another option would be that we could actually use two servers, like two Google-type servers, and mirror between them, or we could set up mirrored pools, all this kind of stuff. These are experiments that, as a system architect, you should do with your real workload to see how it works. ZFS is a hugely complicated file system. It supports multiple independent datasets with different record sizes and things like that. So think about RAID where the stripe size is kind of variable depending on the subcomponent of the RAID that you set
up. So, let's say you're running virtual machines on ZFS. Virtual machines on ZFS is one of the types of workloads that it performs the most poorly at, but if you really know your workload and you really know your setup, you can tune that ZFS pool to be able to do that well. The Google appliance also has eight two-and-a-half-inch bays, so I've got some 500-gig SSDs and some four-terabyte mechanical hard drives. We could set up a second, high-performance ZFS pool physically in the Google server and use that for our virtual machine storage and for high-performance storage, and then use the external enclosures strictly as an archive.

Now, I've added a 400-gig Intel NVMe as a separate log device. I'm really only using about 100 gigabytes of that separate log device. I've tweaked the ZFS parameters to keep more stuff in memory for a little bit better streaming performance. Streaming performance, in the ZFS vernacular and the FreeBSD vernacular, is performance where you've got a stream of data. It's like throughput; it's not I/O operations per second. We're dealing with mechanical hard drives: operations per second is like 250. And you guys probably know that those Intel NVMe disks can get 400,000, 500,000 IOPS at the peak on those physical devices. So, yeah, 250 versus that many IOPS from SSDs makes a huge world of difference. But again, we're not spending thousands of dollars here on just storage. We're trying to thread the needle with this.

So what sorts of other fun secondary uses can we come up with for this? It seems like a waste to use the entire Google server just as a front end for the storage, just on a relatively slow 10-gig Ethernet connection. What's some other stuff we can do? Well, we can run virtual machines, and we can run something called Docker. Docker is sort of a paravirtualization technology, and it works on Windows, Linux, FreeBSD, you name it. Docker is great because
it's one of several different containerization technologies where you have a recipe file, and the recipe file is like: go and get me Linux, go and get me Apache, go and get me PHP, go get me this, that, and the other. The virtual machine is built according to that recipe, and then your data is stored independent of that recipe. That makes it really awesome when you need to migrate things, or when you need to separate your system from a project that you may be running, where the project requires configuration of Apache, or of the database system, or of whatever. There are lots of Dockerfiles out there for things like email servers and network boot servers and stuff like that.

So there are two big things that we can run on our Google storage appliance that will save us time and reduce our headaches. One is a netboot workstation. You guys watch a whole bunch of other channels, so you're probably familiar with this: when it's new board time, or when it's time to do whatever, that means somebody's reinstalling Windows like 50 times for 50 different sets of hardware. No. What we want to do is run a Windows virtual machine that has no drivers or anything of any kind. That Windows virtual machine is always on, it's always receiving updates, it's always a hundred percent up to date. When we get new hardware in and we want to do a benchmark, or whatever, we're going to image that instance of Windows to the physical hardware and then install drivers. I think we can automate that with PowerShell scripts. I think we can automate that to the nth degree, honestly, because I do a lot of that kind of stuff, and I think that will make our lives easier and reduce the amount of time it takes for us to, you know, deal with benchmarking and things like that. It's not ready in this video, but we'll get there, and it probably would make interesting subject matter. So it's like, oh, I need to reinstall Windows? Just boot off the network and then
install Windows, done. Or image Windows to the physical machine and set up the drivers. We might need one or two virtual machines, maybe one for Intel RST and one for AHCI, something like that. But because it's not going to have drivers on it, setting up AMD or NVIDIA drivers from clean will be no problem, and we don't have to wait for the machine to do updates. You guys have probably noticed that some people have just sort of squirrelly, all-over-the-place benchmarks, and when you look into it, it's like, oh, Windows was doing something in the background, or Windows was updating, or Windows was whatever. You can change your preferences on that, but Windows 10 has been terrible about maintaining your preferences. It's like, these are my preferences, I don't want this, I want this other thing; Windows 10 doesn't really work with that.

The other thing is Steam. It really doesn't make sense for us to download GTA V 50 times a week to be able to do benchmarks. Well, there's a Docker image for Steam, for steamcache, I think. And Lord knows we've got plenty of storage, so we can just carve off 10 or 15 terabytes on this thing and make a 10 or 15 terabyte Steam cache, and then anything in our local Steam cache will come from that system. So I think that will work out really well for us.

Damn, a hundred and seventy-two terabytes formatted. I really didn't believe you, but that's amazing. Never underestimate the power of over a hundred hard drives. So the problem is solved, too? Yes, and great 10-gig Ethernet. I mean, it's FreeNAS at its core, but we're able to run Windows virtual machines or Linux virtual machines or pretty much anything we want. And of course we've thrown, you know, thousands of dollars of hardware at it to get the good performance, but, you know, yeah. So I guess at this point somebody just needs to go help her move her files onto this. Yeah. You think we could just slip a note under the door? They usually just come back out shredded.
Want to rock-paper-scissors you for it? God, you win those like 90% of the time. Rock-paper-scissors-lizard-Spock? No, now, you win that a hundred. Traditional. You know, the last time I went in there, I got tetanus. Well, hopefully your booster shot's still good. It wasn't that long ago; that's been like six months. But, uh, if I'm gonna do this, I'm gonna take a camera. Okay. And I'm gonna need some kind of light source, because she doesn't allow... oh, great, great, you've got one. All right, good. I had to get... listen, let's do this. I guess I'm going to stay far, far away. All right. Yeah, right, thanks.

Oh my God, it's like 90 degrees in here, but there's no heat. How is she generating heat? Oh, it smells like pumpkin spice and burnt hair. Look at this trash. What is she... what's on the wall? God, I wish there was light in here. She stripped out all the copper. Don't know what she did with it. Ah! Oh God, I stepped on something. I'm not gonna look, but I want to know. Krista, are you here? I've got some good news. Can you just, uh, please don't try to scratch me, please. Oh, what is this? Oh my God, it's not just digital files. This is darker than I ever imagined. Human hair, booze... what is she doing? These people's lives are in danger. We're gonna have to contact them. Ah, I shouldn't be here, I shouldn't be here, but I have to know. Where's all this going? Oh God, what was that used for? What is she trying to... oh no. Oh God, she knows.

So that's the ZFS setup, and it works really well for my needs. So good to have all that storage, you know, just limitless storage for all the things that I need. And, you know, nothing's out of place, everything's where it belongs. What are you doing there? Nothing crazy. Doing nothing. And we can expand it as much as we want; we can have unlimited storage forever. Well, you can't just add a single drive with ZFS. You'll have to add multiple drives, preferably a shelf at a time, but we could definitely do that. You better start adding shelves, then. Yes, ma'am.

You guys should all like, favorite,
and subscribe, because if you don't, I'll find out about it.
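
The back-of-envelope numbers quoted in the episode (10 Gigabit Ethernet, roughly 50 GB per video project, shelves of 24 two-terabyte drives added one at a time) can be checked with a short Python sketch. This is only an illustration, not the hosts' method. The function names are hypothetical helpers invented for this note, link speed is treated as ideal line rate with no protocol overhead, units are decimal, and capacity is raw drive capacity before ZFS parity, metadata, or compression, so it is not expected to reproduce the 172 TB formatted figure.

```python
"""Back-of-envelope storage math for the figures quoted in the episode.

All function names are hypothetical helpers, not part of any ZFS or
FreeNAS tooling. Units are decimal (1 TB = 1000 GB).
"""


def link_gb_per_s(gigabits_per_s: float) -> float:
    """Ideal throughput of a network link in gigabytes per second.

    Assumes 8 bits per byte and ignores protocol overhead, so real-world
    throughput will be somewhat lower.
    """
    return gigabits_per_s / 8


def transfer_seconds(size_gb: float, throughput_gb_per_s: float) -> float:
    """Time to move size_gb gigabytes at a sustained throughput."""
    return size_gb / throughput_gb_per_s


def raw_capacity_tb(shelves: int, drives_per_shelf: int = 24,
                    drive_tb: float = 2) -> float:
    """Raw capacity of the powered-on shelves, before any ZFS overhead."""
    return shelves * drives_per_shelf * drive_tb


if __name__ == "__main__":
    ten_gbe = link_gb_per_s(10)
    print(f"10GbE ideal throughput: {ten_gbe} GB/s")
    print(f"50 GB project over 10GbE: {transfer_seconds(50, ten_gbe)} s")
    # Growing the pool a shelf at a time, as suggested at the end of the episode.
    for shelves in range(1, 5):
        print(f"{shelves} shelf/shelves powered on: {raw_capacity_tb(shelves)} TB raw")
```

At ideal line rate, 10GbE works out to 1.25 GB/s, so a 50 GB project moves in about 40 seconds, and four powered-on shelves come to 192 TB raw. Usable space depends on the pool layout, which the episode does not spell out.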
