Tuesday Tech Tip - Intro to Ceph Clustering Part 1 - When to Consider It

Captions
Welcome to 45Drives tech tips. My name is Doug Milburn, I'm co-founder of 45Drives, and I'm really honored that our crew here at our tech tips production has allowed me to actually talk, so thank you, folks. So look, what I want to talk about today is something that I absolutely love, which is Ceph clustering. [Music]

We make our Storinator line of servers, which are often used in a stand-alone configuration, typically with Linux on them. Linux and ZFS is what we recommend to most people, although lots of people put Windows on and things like that. But you get to a point in an organization where you look at that single-server solution, as everybody's faced with, you know, rapidly growing, rapidly changing storage needs, and chasing it by adding single servers. When you're a small organization, that's the way to go for sure. As your organization gets bigger, you run into some real problems. So what I want to talk about in this tech tip is when to consider changing from a single-server solution to satisfy your data growth, and when you would consider moving to a cluster.

So, single servers. Let's go to an example, and let's see if I can make this feel a little real. I'll tell you a story; sorry, this story's not true, okay? So there's a company called Exciting Technology Company, ETC Limited, etc.com, and it started up five years ago. They have a really cool product that they sell over the internet, and they get all kinds of data in there. And "all kinds of data" means different things to different people in different organizations; critical point.

So these guys start off, and, hey, they left me a whiteboard here, cool, thank you, and a few colored markers I have. So they start off with their startup money and end up with a rack, okay? And in it, right at the start, comes a server, and that's a storage server. They felt pretty good at the time, and they got a near-enterprise-grade four-bay storage server, and
let's say, back five years ago, they were quite pleased when they got some 4-terabyte enterprise drives. So they got four times 4-terabyte drives. So ETC Limited, when they started up, were thrilled: they got this gear and they got a total of 16 terabytes raw, and that served their needs quite a while. Go back five years, in your small company, everything's great.

Anyway, two years later, what do they do? They need some more storage, so they buy another server. Hey, these guys had no money at the time; that was Windows Home version, and they put it on there and shared it out over the network, and everything was great. All of a sudden they say, okay, we've got to get a little more sophisticated to get to the next level. So they got an eight-bay and they put in 8-terabyte drives, and they said, we've got to get some RAID here, folks, because we need a little bit of data security. So they put in an eight-bay, they got 64 terabytes raw, and let's say they did a RAID 6 array there, so they got six of those as usable capacity, so they got 48 terabytes in there. Hey, that took them for quite a while.

They came up to today, okay, and now all of a sudden they're looking at this and they're going, I need to buy another server and I need some more space. Now, at that point, when you're a company and you've got data, and you know what I'm talking about, say maybe they have 25 terabytes of good-quality data on there and a bunch of other stuff they just keep around, okay? Let's say once you get over 20 terabytes, you really should start thinking about clustering, to see whether it's worthwhile for you. Here's the fork in the road that you've got to look at if you want to find the easiest path for your organization and for its future.

So option number one is pretty simple: we add another server in there. Let me put another server in. So this is the future; I'm going to change colors here. Let's look at this. So say I go for this option: I
put in another server. Let's say I spend, I'm going to use the magic number, 25K to buy an enterprise-grade server: 16 bays, 14-terabyte drives. I get all kinds of space, I build some RAID arrays in there, I get my data security on there. That's pretty cool. But, you know, I'm getting more sophisticated now. This has, you know, CentOS Linux on it, the enterprise-level Linux, and it's got software RAID and whatever else. These other systems are now legacy; they're off on different operating systems. I'm starting to get a zoo at this point. Now, I can do this, okay? I can go down this track and I'm going to meet my storage needs, but I get a zoo.

What are my problems? Problem number one: I've got physically defined, separate storage spaces, so I've got to remember where I put stuff. This is not a single storage space from a user point of view.

Here's problem number two: my newest data is typically going to go on my newest server. Guess what: time localization of data access. It's generally the latest data that we access most often, so now all of a sudden my growing organization is going to be preferentially hitting one server, okay? That can create a performance bottleneck.

Okay, here's the other problem. If you're in IT infrastructure, if you're an IT professional or manager, you've got to keep this stuff up and going. That stuff needs to come down for maintenance every now and then, okay? But these things are all mission-critical to you. They're single points of failure: if any one of those things goes down, your organization has data it can't access. So what's that mean? It means you're coming in at night. You know the ritual: "make sure you close your files before you go home, because we're coming in for maintenance on the weekend." And it's a hassle, right?

So you get a zoo, you get single points of failure, you get performance bottlenecks, but it's a viable path. So here's the other fork, okay? This organization can look at clustering. So what would
clustering look like? Clustering would look like the following. Instead of putting in a single server here, they come to, well, I'm going to put us in the equation here, they come to 45Drives and they buy a starter cluster package. They're going to buy three servers, because three is the minimum for a Ceph cluster, okay? They put three servers in there, and they've got Ceph clustering software that manages all the storage inside those servers. Three servers is a lot of slots, so let's say they start off with our smallest. And by the way, you can get a starter cluster package from us from about 20K. So for the price of a good enterprise-grade 16-bay server with support, you can get a cluster with pre-installed software and an enterprise-level support package from us. It's about price-neutral to getting a good-quality enterprise-grade server.

Let's say we got those servers, okay, and put five 14-terabyte hard drives in each one of them. So there's five times 14 terabytes (love my whiteboard, this is great, thanks for the markers), and I've got five in each of those, so that's 70 terabytes in each. Data protection scheme, I'll talk about it later, but I'm going to do something like a RAID array, it's called erasure coding, on that, so that basically I'll get 140 terabytes usable, okay? So this is all within that 20K.

Oh yeah: I can take my legacy servers, I can move my data off my legacy servers onto my Ceph cluster, and I can now tie these legacy machines into my cluster. Even though they're not symmetric, they're smaller sizes and whatever else, I can tie those into my cluster, okay? And from this point forward I end up with one storage space. I only put five drives in each one, okay? I can just plug in more drives until they're full, so I've got a smooth expansion path forward. And here's the nice thing: the Ceph cluster software manages these things, manages this hardware for you. It just ties it together into one virtual space.
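
The whiteboard arithmetic in the talk so far can be checked with a short Python sketch. The function names are hypothetical, and the erasure-coding split of 2 data chunks plus 1 coding chunk is an assumption: the talk only gives the 140-terabyte result, and 2+1 is one profile that matches it. Real usable capacity would come out somewhat lower once filesystem and cluster overhead are counted.

```python
def raw_tb(servers, drives_per_server, drive_tb):
    """Raw capacity in terabytes: every drive counted at face value."""
    return servers * drives_per_server * drive_tb


def raid6_usable_tb(drives, drive_tb):
    """RAID 6 gives up two drives' worth of space to parity."""
    return (drives - 2) * drive_tb


def erasure_coded_usable_tb(raw, k, m):
    """With k data chunks and m coding chunks, k/(k+m) of raw space holds data."""
    return raw * k / (k + m)


# The legacy eight-bay server with 8 TB drives in RAID 6.
legacy_raw = raw_tb(1, 8, 8)
legacy_usable = raid6_usable_tb(8, 8)

# The starter cluster: three servers, five 14 TB drives in each.
per_server = raw_tb(1, 5, 14)
cluster_raw = raw_tb(3, 5, 14)

# ASSUMPTION: a 2+1 erasure-coding profile (not stated in the talk).
cluster_usable = erasure_coded_usable_tb(cluster_raw, k=2, m=1)

print(f"legacy eight-bay: {legacy_raw} TB raw, {legacy_usable} TB usable")
print(f"cluster: {per_server} TB per server, {cluster_raw} TB raw, "
      f"{cluster_usable:.0f} TB usable")
```

Under those assumptions the numbers line up with the whiteboard: 64 raw and 48 usable for the RAID 6 box, 70 per cluster server, and 140 usable out of 210 raw for the cluster.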
Okay, so I never have to worry. When I want to add another server in, I just add that on the fly, live. I just put another server in, I go to my Ceph dashboard, I add that server into my cluster, and I've just upped my capacity. So I've got a continuous scaling path forward, forever. As long as the server and hard drive paradigm lasts, I'm good for it, okay?

Other things that I get out of this, and this is one of the ones to look at if you're thinking about whether it's time for you to move to a cluster or not. This is one of the ones that smaller organizations, let's call them small to medium organizations, look at, if it's important to your organization: high availability. In my next tech tip I'm going to talk about pools and data security, but by doing that, and by setting my failure domain to server, I get to a level where at any one point, even when we started with just three of them, I can lose any one of those servers and it's completely seamless. My people are still out there working. I don't get the little groundhog issue, heads popping up from the cubicles, that happens when a server goes down, okay?

So a server dies on you: no problem. Oh, other nice thing: you want to go do maintenance? No problem. Take your server down, no problem. Pull the plug on the thing, not an issue, okay? And when the thing comes right back up, it'll just catch up and it won't skip a beat. So it just frees you from that whole issue, you know, that nightmare of getting that call on the evening shift when a server went down, "we can't get the data," that emergency, and that ritual of night or weekend server maintenance.

So high availability can be such an asset. It can be mission-critical; actually, it becomes mission-critical once you have it. Our organization runs on a Ceph cluster and it's just beautiful. Since we've been on a Ceph cluster we have never had a storage outage. 45Drives and
our sister company, we have about 250 employees in the company at this point, and we've been running our storage cluster for, I'd say, about four years probably. It's a three-server cluster. When we started we had a very small amount of data, probably in the five to ten terabyte range, but we decided to go to clustering for high availability, and it's something we don't regret. Today we have about 25 terabytes of data, and then on top of that we have, like, 45Drives video data; you know, we have tens of terabytes of video data on top of that. And we just don't worry about storage. We just don't worry about it, nor do we worry about data security, nor do we worry about availability. So we got scalability in size, and we got lack of outages; we have high availability.

So one more thing that Ceph does: Ceph is constantly balancing the storage load over the servers. It happens in the background, completely transparent to you, okay? And because it's doing that, it is shifting everything out, spreading it as much as possible, so when your clients come in to retrieve storage they're going into different parts of the cluster all the time. So what happens as your cluster grows? Your performance scales linearly with it, okay? So you never have to worry about the performance bottlenecking that will happen if you're on a single-server strategy, right? And it's just amazing, actually: the performance of these things gets absolutely huge when that cluster gets huge. It goes way, way up there, and it's just adding stuff in parallel. So anyway, it's a really, really great solution.

And again, the context of this: we go into a cluster solution, we start off with three servers. And by the way, if you're building a cluster with just three servers, it's a starting point. There's even more magic that happens when you add in more servers; it just gets better as you add more servers. So you start off with that, and then you got a
continuous growth path, you got high availability, okay, you got scalability in performance, and you got scalability in storage capacity.

So anyway, thanks for listening to this tech tip. For anybody who's interested, please tune in next week. We're going to go through a whole bunch of things about what Ceph is next week and the week after, and maybe they'll let me do a few of these and talk about everything in Ceph. So anyway, thank you very much.
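
The high-availability point from the talk (failure domain set to server, any one of three servers can go down) can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the server names are hypothetical, the 2+1 chunk split is an assumed erasure-coding profile, and the simple rotation used for placement is a stand-in, not Ceph's actual CRUSH placement algorithm.

```python
# Hypothetical server names for a three-server starter cluster.
SERVERS = ["server-a", "server-b", "server-c"]

# ASSUMPTION: 2 data chunks + 1 coding chunk per object (not stated in the talk).
K, M = 2, 1


def place(object_id):
    """Put each of an object's K+M chunks on a different server.

    A plain rotation stands in for real placement logic; the one property
    that matters here is "one chunk per server", which is what a failure
    domain of "server" is meant to guarantee.
    """
    start = object_id % len(SERVERS)
    return [SERVERS[(start + i) % len(SERVERS)] for i in range(K + M)]


def readable(placement, down):
    """An object can be rebuilt while at least K of its chunks survive."""
    surviving = [server for server in placement if server not in down]
    return len(surviving) >= K


placements = {obj: place(obj) for obj in range(9)}

# Take each server down in turn: every object should stay readable.
survives_single = all(
    readable(p, {server}) for server in SERVERS for p in placements.values()
)

# Take two servers down at once: a 2+1 layout cannot cover that.
survives_double = all(
    readable(p, {"server-a", "server-b"}) for p in placements.values()
)

print("any single server down, data still readable:", survives_single)
print("two servers down, data still readable:", survives_double)
```

In this model a single server can be pulled for maintenance without losing access, which is the behavior described in the talk, while losing two of the three servers at the same time is more than the assumed 2+1 layout tolerates.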
Info
Channel: 45Drives
Views: 9,440
Rating: 4.980392 out of 5
Keywords: 45 drives, storinator, stornado, ceph cluster, storage cluster, storage nas, nas storage, storage server, big data, ceph storage
Id: yeAlzSp6yaE
Length: 13min 16sec (796 seconds)
Published: Tue May 26 2020