Cloud Computing - Server Clusters

Captions
Please go to EliTheComputerGuy.com in order to view schematics, code, and more for the projects that you are learning about.

Welcome back. Today I want to talk about server clustering in regards to cloud computing. Server clustering is one of those technologies that has been around for so long that it's very easy to overlook, and it's very easy to not grasp what a core component of cloud computing it really is. When we talk about server clustering, we're talking about taking numerous physical, or possibly even virtual, servers and essentially turning them into one logical unit as far as the network and the client computers are concerned.

It's kind of like a RAID, only for entire servers. Whenever you're dealing with a RAID, you take a number of different physical hard drives and you turn them into one logical hard drive for the computer to access. Then if any one drive in that RAID fails, you're able to simply swap it out, put a new one in, and everything keeps running. The main thing here is that no one hard drive will crash the entire system; you're looking at that entire RAID unit for storage. That's what we're looking for when we start talking about server clustering. We don't want any one physical, or even virtual, server to matter that much. If there is a fire and it burns up, if somebody steals it, if a CPU fan literally just shuts down on it, we don't want any one server to be so significant that the services for the network or for the clients are no longer provided. Basically, we want to take all of these different servers and turn them into a cluster so that if any one of them simply disappears, it doesn't really matter.

This is incredibly valuable in the modern world of IT administration, because it becomes much easier to do things such as upgrade hardware on your network. Think about it: if you have, let's say, one Active
Directory server. So you've got one Active Directory server and you get more clients on the network, therefore you need to upgrade the CPU, you need to upgrade the RAM, or often you have to just buy a whole new box and do a migration. Well, if you have one Active Directory server on the network and you have to do an upgrade, you have to turn that Active Directory server off, so it's no longer providing services for anybody on the network until it comes back online. Past that, you actually have to do the physical upgrade process while the clock is ticking and while the users are getting pissed off. Then once you finish all the upgrades that you're supposed to do, you plug it back into the network and you turn it on. If you've done IT for any more than about five minutes, you know that every time you turn a server off, you're not really sure that server's going to turn back on. So there may be a major issue, and what was supposed to be a pretty simple RAM upgrade turns into not having Active Directory for a week, or until Dell can send you a new server so you can get everything back up and running.

That's the old-fashioned scenario. The new scenario, and again, Active Directory clusters have been around for 20-plus years at this point, is that if you have an Active Directory cluster, you can have multiple Active Directory servers for your Microsoft network. Let's say you have three Active Directory servers for your network and you want to start doing upgrades. You can simply take one offline while the other two are still providing services for the network. You can then do your upgrade, do whatever it is you want, and you can sip a cup of coffee and go out and have a little pizza. You've got all the time in the world, because you have two other servers providing
the service that this one server is also providing. You can take two days to make sure you're doing everything perfectly. You boot it up and turn it off, boot it up and turn it off again, you do a little load testing, and okay, that works great. You put it back into that Active Directory server cluster, make sure everything is doing what it's supposed to do, and give it a couple of days to make sure users aren't having any problems. Then you take the second one offline, go through the same nice, leisurely migration or upgrade process, and plug it back in. Then you take the third one offline.

The nice part about the cluster is this. You're sitting there thinking, well, if I have one offline, what happens if one of the other Active Directory servers in the cluster fails? If you have a cluster of three servers, one is offline for migration, and one fails, then admittedly that final one probably has a hell of a load on it, but it most likely will be fine and it'll pan out. It's not going to be happy with you, but you can literally have one server that you're upgrading and one server that completely fails for some reason, and if you've got that cluster, the other server can stay up and running and everybody is getting their services.

This is one of the reasons why going to a server cluster environment can be so useful in so many ways: things like failover, things like fault tolerance, and just being able to do basic migration tasks and upgrade tasks. Your life can become a hell of a lot easier.

When you start thinking about server clustering, basically we're looking at three different possible services that a server cluster will be able to provide you. The first one is failover. Kind of like in the situation that I've talked about, if you have two or three servers
in a cluster and they're all running, and one server completely fails, then basically all the clients, all the users, are simply automatically routed to the other servers you have in the cluster, and they don't even notice it. It's an absolutely wonderful thing. Again, think about this as RAID, but for the entire server. If you don't see the value of that, well, you obviously haven't done tech for any amount of time.

The second reason for dealing with server clusters is that, since we are now using multiple physical servers in a cluster, one of the things you can do is something called load balancing. Let's imagine you have users going to your server cluster in order to perform tasks: maybe websites, Active Directory, all kinds of other infrastructure-type tasks. If all the users on the network get automatically routed to one server when, let's say, you have three or four other servers on the network, you can literally overload that one server while the other couple of servers are sitting there basically using one percent of their CPU. With load balancing, the servers are able to communicate and, depending on what system you have, new users will simply be routed to the server with the least amount of resource consumption on it. Let's say you have four servers in a cluster. The first person gets routed to the first one, the second person gets routed to the second one, the third person gets routed to the third one, the fourth person gets routed to the fourth one, the fifth person gets routed back to the first one, and so on and so forth. So if you've got four servers there, you can now have people routed to these other servers in order to do load balancing.

One of the reasons that this can be valuable in the real world is that instead of buying some
Xeon processor that's going to literally cost you $8,000, you can spread the work out. A lot of people don't quite grasp how much hardware can cost, but when you're dealing with servers, especially once you get into higher-end servers, even simple things like the CPU itself can be just mind-bogglingly expensive. So if you've got Active Directory, or a website, or something like that, you may not need an insane CPU. You're sitting there going, really, I'm going to have to buy this incredibly expensive CPU for my server? Well, if you're thinking about creating a cluster anyway, you might be able to get away with less. If you're going to have five servers, then instead of buying one CPU to deal with all the users, which may cost you four, five, or six thousand dollars just for that CPU (we're not talking about the server, just the CPU), you may be able to get away with spending five hundred dollars each on five different CPUs. You create five different physical machines and use load balancing, so not everybody is going to hit the one machine; they're going to get spread out amongst the different physical machines you have. That way you have both failover, so if one of these machines fails, the other four, or however many are in the cluster, continue running, and you have the load balancing, so that people get directed to whatever server is under the least amount of load.

This is something to think about when you start thinking about things like hardware costs. Especially in the modern world, sometimes it literally is not the wisest thing to buy the most expensive hardware out there. Why go out there and buy that insanely expensive Xeon processor when, honestly, a four or five hundred dollar CPU put on five different systems will actually provide you a much better infrastructure than simply having one box that is insanely powerful? And then
the final reason that you may use a cluster is for things such as high performance. High performance is big in the supercomputer-ish world. I don't know if you'd really call it a supercomputer, so call it a supercomputer-ish world. The concept is that when we start thinking about cloud computing, we're abstracting all of these different tasks, and one of the things we can abstract out is the functional task of compute. So one of the things you can have in a high-performance cluster is multiple servers in that cluster that can provide compute services.

In the consumer world you might see this with something like Apple's Compressor. Compressor is a piece of encoding software, and you can actually create a cluster of Compressor servers, essentially. You give your video file to the queue, and then the server will figure out which Compressor server is under the least load at the moment and simply farm that task out to that particular server. With high-performance cluster servers you might also think about things like VMware: if you're using virtualization and you're moving instances of operating systems around, that might be considered high performance. Or again, something like a Compressor server cluster, where you send encoding tasks to the cluster and the cluster simply figures out where that file should get encoded.

If you get into some more interesting things, you start hearing about things like SETI. Twenty years ago, and I don't know if it's still around, SETI had this thing that was pretty cool: there was a piece of software that you could install on your computer, and then the main SETI, the Search for Extraterrestrial Intelligence, would send little snippets of sound or audio or video to your computer, and your computer would try to process that and figure out if there
was anything interesting there, and then simply send the results back to their servers. That was a way you could have just an extremely massive cluster. I think there were a hundred thousand computers in that particular cluster, with all of these individual computers trying to perform individual compute tasks and then sending back their results.

So when you start thinking about why server clusters would make sense, right now in the modern world of IT, if you're not running server clusters, I don't know what the hell you're doing. The big thing is from a failover standpoint: you're now saying that any one physical server can simply be vaporized, disappear, and it doesn't matter; you still provide services. That's a huge thing. So you're looking for failover, you're looking in some circumstances for things like load balancing, and then finally you're looking at things like high performance, in order for things to be replicated out.

Now that you know what a server cluster is and why it's important, let's go over to the whiteboard so I can show you some examples in the real world. I'll talk about a high availability Synology NAS cluster. A NAS is a network attached storage device, and Synology does some cool things with their high availability clusters. I'll talk about an Active Directory cluster and try to explain how that works a little bit better. And then of course we'll talk about VMware hypervisor clusters and how you'll be looking at that in the real world. So with that, let's go over to the whiteboard so I can explain this stuff a little bit better.

The first type of server cluster I'd like to talk about at the whiteboard is something called a NAS high availability cluster. You have to check, with whatever NAS product you're using, whatever vendor you're going with, and especially the actual product itself, whether it has high
availability, but if you go with a company such as Synology, they have high availability built in. What high availability means is that if one of your physical NAS devices fails, the users will not notice it.

Think about the old way of doing a network with something like a file server. You have all your users, they of course connect to a switch, and connected to that switch is your file server. With that file server, you've gone out there and made sure it's as reliable as possible. It has RAID in there, and you even have a redundant power supply, so there are multiple power supplies within the one unit, and if one power supply fails, it stays up and running. So you've got your RAID, you've got your redundant power supply, and you're happy and excited because now you know that your file server is going to be reliable. But what happens if the motherboard fails? Seriously, sometimes motherboards just crap out for some reason. As soon as your motherboard fails, or your CPU fan fails, or any of the other idiotic things that can happen, the server goes offline and your users no longer have access to those files. That can become a really big problem, because they will not have access to the files until you can get the file server back up and running again. If it's a simple issue, like swapping out a network card, maybe it'll take you an hour or so. But something that you have to think about with these old file servers is that sometimes they run so long that when they fail, it's almost impossible to find parts. One of the issues I found in the real world was that you would have file servers that were technically fine, but they would fail after six or seven years, and then when we went to go find
the motherboards for the file server, they were impossible to find. It might take a week to get a motherboard shipped in for that file server. So until the file server is back up and running, your users are not going to be able to have access to the files, and obviously that is not a good thing.

What you can have now is a NAS (network attached storage) high availability cluster. With this, both of the NAS units are connected to the network, and interestingly enough, they're also both connected to each other. How this connection works depends on the particular product you're using, but this is a cable for what's called a heartbeat signal. So this is very curious. What's happening here is that these two NASes are operating, they're functioning, and the users basically get routed to one of the NASes. The NASes figure out amongst themselves which is the primary and which is the secondary. Let's say this is the primary and the secondary is down here. They figure out which is the primary, and then all the traffic gets routed to whichever one is the primary. The secondary is copying all the information that's put on the primary NAS, and it's also listening to this heartbeat signal. The heartbeat signal is a computer version of ba-dum, ba-dum, ba-dum; it's the heartbeat. If at any point in time the secondary NAS can no longer hear the heartbeat signal from the primary NAS, it will auto-promote itself to be the primary, assuming that the original primary NAS is offline for some reason, and then everybody will simply get routed to that secondary NAS.

Then all that has to happen is that you, as a happy IT person, can walk in and take a look at the old primary NAS and try to figure out what's going on. Oh look, the CPU
fan got gunked up, so just spray out the CPU fan with a little air. Or, oh crap, this thing really did die, and I'm going to have to get a new unit UPS'd in. Really, it doesn't matter that much, because as far as the users are concerned there's no problem. They have continuously been able to access the files that they need, so they're not having any issues. You can sit back with a nice cup of coffee, wait for your new NAS unit to be shipped in with UPS, install it at a leisurely rate, and you have no issues there.

So this is the basic concept of a high availability NAS. You see this in other types of storage too, like SAN units, but it is something to think about. What's nice is that if you go with a company like Synology, you can actually get this type of product relatively inexpensively. Now don't quote me on it, but I would argue that for probably about $500 per NAS unit you can get a NAS unit that has this high availability service built into it. So even with smaller networks, you can now have a NAS high availability cluster to make your life a lot easier.

The next example I want to talk about is an Active Directory cluster. Generally, at this point in time, you're going to be using Microsoft Server 2016, 2012, 2008, maybe 2003 or 2000, one of those things running. The important thing to understand about Active Directory is that all Active Directory actually is is a database. When you log into Active Directory, when you provide Active Directory your username and password, it checks that username and password against this database. It then gives your computer what's called an access control key so that you're able to use resources on the network. It also has additional information, such as your first name, your last name, your phone number, and whatever else you've plugged into Active Directory. So the important thing to understand about Active Directory is
that Active Directory really is just a simple database. A lot of the concepts that you'll deal with in Active Directory are the same concepts you'll deal with in MySQL or MariaDB or any of the other database systems.

So let's talk about how an Active Directory cluster would work in the real world. The first thing to be thinking about with these clusters is that the nodes, or the servers, in the cluster do not have to be physically close together. If we do a horrible little picture here of the United States, oh my golly, yeah, that's good. I was not a geography major. But let's imagine it's a picture of the United States. Go with me. Let's say you have a headquarters office in Denver, a satellite office in San Francisco, a satellite office out in Washington DC, and a satellite office in Texas.

The first thing that you want to do is create a cluster of Active Directory servers for your network, so that if one fails, all of your users aren't screwed. Let's say you have 5,000 people in the company. Literally one Active Directory server can provide the services for 5,000 users very easily in the modern world, but that would be one box, and if that one box physically fails, then nobody has access to Active Directory, and that would be a bad thing. So the first thing you're going to do is take your Denver office and create an Active Directory cluster there. This means that if any one of the Active Directory servers in this cluster fails, it's not a big deal, because people will simply be routed to one of the other servers in that Active Directory cluster. So the first thing you're going to be worried about is failover, but then something you
may be thinking is, well, that seems very inefficient from an infrastructure standpoint. If you've got offices in DC, Houston, and San Francisco, every user in each office has to get back over the network to an Active Directory server in Denver in order to authenticate, so that they can get their access control key and then use resources locally. This seems like there could be a lot of problems. One, you're going to have a lot of bandwidth usage just for doing the logins, the authentication. Two, if there's any problem with the network, if for some reason the internet connection fails between your office and Denver, where the cluster is, users in your office may lose access to the local printer, or to the local file share, or have any number of other problems. So having the Active Directory servers all sitting back at the Denver office probably isn't a great idea.

One of the things you can do in Active Directory is have sites, so you can put Active Directory servers at each individual site. Now not only are the servers at your headquarters office synchronizing information and doing load balancing amongst themselves, but they can also start communicating with the Active Directory servers at your sites. When a user in Houston wants to log on to the network, instead of having to get all the way back to the headquarters office, they can simply log in to the local Active Directory server. That saves bandwidth, because you don't have all the users going over the network. It's also a security concern, because all that authentication information doesn't go over the internet or whatever else. And if there's any problem with the internet connection, users are still able to log in to their local Active Directory server and, you know, they
don't see any problems with this. So one of the things to be thinking about with an Active Directory cluster is that not only do you have failover and load balancing locally, but you can also put servers in other physical geographic areas so that they are closer to the end user. This is an enterprise environment, but you can have almost the same concept with something like Netflix, where, at a consumer level, you put servers closer to where your customers are so that they're able to log in or get information locally, versus having to go all the way back to wherever your main servers are. This is a basic example of why you would be using something like an Active Directory cluster in the real world.

Finally, I want to talk about a VMware hypervisor cluster. If you're going to be dealing with virtualization, so you have instances of servers and you want them to be able to run in a cluster environment, you'll be using something like a VMware hypervisor. With this, we have the instances of your operating systems. Let's say you have a 2012 server, a 2016 server, and maybe some Linux server, and these are simply the instances of those server operating systems. One of the things you can do is create a VMware hypervisor cluster. You have a rack of physical servers, and these physical servers have the CPU, the RAM, a bit of storage, and a VMware hypervisor. Basically, you install VMware onto these physical boxes. These physical boxes are then able to connect to storage. Your storage will generally be something called a SAN, a storage area network, and that is actually a cluster of storage devices that are all
doing replication and failover and that type of thing, and the instances of your operating systems are actually stored within this SAN. If you then want to make one of those instances live, the hypervisor makes a call to the SAN and asks for whatever instance it wants, and then it takes that instance and literally starts running it on whatever hypervisor has asked for it.

With this particular situation, one of the things we have to be thinking about is what happens if a server physically fails for whatever reason. Again, the CPU fan dies. I keep bringing up this whole CPU fan thing, and I know people probably think I'm exaggerating at this point, but you don't know how many servers I've seen crash just because that stupid little 50-cent CPU fan fails. So let's say the CPU fan fails. With VMware, you have management software that allows all these different hypervisors to communicate amongst themselves, and you plug into the management software what you want to have happen in a certain event. You say that if a physical hypervisor dies and is no longer recognized by the cluster, then the instance of the operating system that you're running should simply be moved to a different hypervisor. When I say you're moving an instance, that's not really correct. It's more like this compute is connected to the instance on the SAN, and when this hypervisor dies, this connection is dropped and a new connection is created to the new hypervisor. That way you can have a cluster where, even if a physical machine completely fails, that instance is essentially transferred to the next hypervisor and everything keeps running.

The other thing that you can do with something like a VMware cluster, and something you have to be thinking
about from an IT perspective, especially a management perspective, is things like electricity cost. The more physical servers that are running, the more electricity you'll use, and the more cost there is. This can become very significant in larger enterprises. So one of the things you should be thinking about is what happens if you have servers that have low utilization most of the day but high utilization at some times during the day. This can be especially true with things like Active Directory servers. Active Directory servers most of the day aren't really doing a whole hell of a lot, but right in the morning, when everybody is showing up to work, that's when they get hammered, and that's when utilization goes through the roof. So do you want that Active Directory server running on very high-end hardware all the time, burning all of that electricity? Or wouldn't it be nice to move that Active Directory server to high-end hardware when it needs it, and then bring it back down to lower-end hardware when that's not necessary anymore?

That's where you can have multiple instances running on a single hypervisor. Let's say we have three instances of servers running on this particular hypervisor: a crappy little web server, maybe a VPN server, and an Active Directory server. These are running on this crappy little piece of hardware. Let's say it's got 16 gigs of RAM, well, it probably needs more than that, so let's say it's 32 gigs of RAM, and it has a decent CPU; it's good enough. Most of the time you're only getting a couple of users going to your website, you're only getting a couple of people using the VPN, and you're only getting a couple of people using Active Directory, so you're running all of these instances on this one
rather low-cost piece of hardware. Well, let's say it's 8 a.m., everybody starts showing up to work, and they all start trying to log into Active Directory. Obviously that pushes the Active Directory instance's usage up, and that may be too much for this particular hardware. So let's say that up here you have a piece of hardware with 64 gigs of RAM, a Xeon processor, and all the fancy things. In the morning, you can simply say that when this particular server gets to, let's say, 80% CPU utilization, you're going to move the Active Directory instance up to that higher-end server. You've now moved Active Directory up to this particular physical server, so now all the users are hammering the hell out of this instance on this physical server, and it's got more than enough resources to handle it. That happens while they're logging in in the morning, and then at nine or ten o'clock, when utilization goes down, because you put in thresholds, Active Directory will simply move back to that lower-end server. This is a way you can have the hardware utilized for what actually needs to happen at a given point in time.

Let's say you have a web server, just an internal web server, and Human Resources has decided that they want to start a new soccer team or softball team, I guess that's what corporations do. They're going to do a new softball team, so they want people to start logging into the web server, doing things on the web server, signing up for teams, doing this, that, and the other thing. Normally, for a year, the web server has sat down here with something like three users a day. It hasn't really required very much
in the way of resources. Now Human Resources has said, hey, we want everybody to go to the website, so now all of a sudden all these users are hammering the hell out of this web server. Again, you can have automatic rules so that the hypervisors, this whole cluster, goes: oh crap, the web server is now getting a lot of traffic; okay, this server up here isn't being utilized as much as this server down here, so we're going to move the web server up here so that all of the users get far better performance. And then what happens if this physical server crashes for some reason? Again, that stupid-ass CPU fan fails. Then it can automatically put the web server back where it was, and now the physical resources are going to get hammered to high hell and back because everybody's still going to it, but it will stay up and running, even if at this point it's a little bit slow. And so this is one of the things you have to be thinking about with these clusters: you really start thinking about things like rules, and again, if-then-else. If this happens, what should happen? If the CPU hits 80%, what should happen? Okay, well, that happened, so maybe it sent the instance up to this other physical server because the first physical server hit a peak 80% CPU load. But then what happens if that second hypervisor, that second physical server, completely fails? What are the rules there? This is one of the things you really have to be thinking about with this modern world of IT: it's not so much about simply swapping RAM and hard drives and that kind of thing anymore. It's a lot of this: okay, this instance is going to get hammered at 8 o'clock in the morning, so what should the rules for that be? Okay, Human Resources is going to hammer this, so what should the rules for that be? Okay, if this server fails, then what should happen to the other instance? You know, these are some
of the things that you really have to think about, because a lot of times once you set these rules, they are completely automatic. And the cool part about automatic is that when you set up rules to run automatically, and you were a good person and you really whiteboarded it and made sure everything was going to do what you wanted it to do, then it works amazingly. It works perfectly well, and you can sip your cup of coffee and very slowly fix things when there are problems. Here's the problem: if you create bad rules, if you create stupid rules, the cluster will do whatever those stupid rules tell it to do, literally. And that's one thing you have to worry about in these kinds of situations: you can have one failure that could possibly cascade into causing a much larger problem, because if you set rules and it follows those rules and those rules are really dumb, then you can have a much bigger issue. But basically this is the general concept of how those hypervisor clusters would work, and it gives you an idea of clusters in the real world. So now that I've shown you a high-availability NAS cluster, an Active Directory cluster, and something like a hypervisor cluster, I want to explain a couple of other things you need to think about with the configurations. If you have something such as a database or a storage cluster, something that you're going to have to think about is what is called your replication strategy. This is, again, one of those very simple things that can become very important. So let's say you have a server here, a server here, and a server here, and this is all a database cluster, whether it's Active Directory or MySQL or MariaDB or something like that. The idea is that users should be able to go to any one of these database servers and they should be able to get
whatever information they're trying to receive. Well, something to think about is that in order for that to work, you actually have to have something called a replication strategy. What a replication strategy means is somebody will write to one of the servers, so you're going to take a write action to one of the servers, and then what the replication means is that once something has been written to one of those database servers, it then has to get written to all of the other database servers in the cluster. This is important with something such as Active Directory. If you are using Active Directory, you log in and then you change the password. Changing the password is a write action. So if you write to one Active Directory server and there isn't a replication system going on, then if you get routed to a different Active Directory server, it will not have your new information and you will not be able to log in. It's that simple. It's simply a record in a database, and if that information isn't written to the rest of the database servers, you're going to run into an issue. That's where you have the replication strategies, and the replication strategy basically says how often changes to whatever database server, or possibly a storage server, will be written to the other servers in the cluster. Now, things to be thinking about with this: depending on how often things change on your servers, there might be a lot of data. Remember, your internet connection is a finite resource. So if you're making changes to your server and all those changes have to get replicated out to all of the other servers within the cluster, and that all happens at the exact same time, you may run into a problem. Basically, you may be using too much of your internet connection and start slowing everything down for other users on the network. So one of the things you have to be thinking about is, okay, as people make changes
to one server within the cluster, how are those changes replicated out to the other servers? This is something we'll talk about later, but it's just one of those things I do want to give you a heads-up on so you're thinking about it, especially with something like storage. So let's say you upload a video file. Let's say you're Netflix. If you're Netflix and you upload a video file to one storage server in a cluster, and you have all the other storage servers throughout the world, well, that video file might be 20 gigs or more in size. So if you upload something to your server, do you want that replicated the instant that it happens? If you did, then counting the servers here, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, that would be a lot: 10 servers times 20 gigs is 200 gigs of data that would have to be sent out immediately. So something you need to be thinking about is, okay, maybe I upload to the storage server, and then the replication strategy is that we copy to this server over here first, then to this server over here second, this server over here third, and this server over here fourth. Or maybe we write to the main server, and then at night, when we're at lower load, that's when it will replicate the data out. So basically all we're talking about with the replication strategy is: if you have multiple servers in your cluster and you write to a single one, because you're only ever going to write to a single server at a time, then the question is how that data gets to the other servers in the cluster. This is something that can become very complicated, especially if you have larger networks, so it's something to think about. Now, the next thing to think about with clusters is something you never want to have happen. It's something you just never want to have happen. If it happens to you, it's a bad day. Just quit. If this happens to you, just to be
clear, just quit. What that is is something called split brain. Just say no to split brain. If you get split brain, just quit. I'm just telling you, just walk the hell out, go home, quit. So what am I talking about with split brain, and why is it so bad? Well, let's say you have something like that NAS high-availability cluster that we were talking about before. With a NAS cluster, you've got all of your users, you have the heartbeat signal, you have the NAS devices connected into your switch or whatever else, all the users are connected in, and basically they all get routed to whatever is the primary, with the idea being that the secondary will listen to the heartbeat signal from the primary, and when it no longer receives that heartbeat signal, it will promote itself to being the primary and say, hey, start sending all the traffic to me. That's great, that's wonderful, except here's the problem with this: what happens if this heartbeat signal stops working because someone unplugged the network cable that's used for the heartbeat signal? What if somebody literally walks into the server room, sees a cable that goes from one storage box to another storage box, goes, oh no, that's horrible, and unplugs it? Well, that means this heartbeat signal will no longer be heard by the secondary, because you literally unplugged the cable that carries the heartbeat signal. Again, a lot of this stuff is automagical, so when the secondary stops hearing that heartbeat signal, it will automatically promote itself to be the primary. Well, here's the thing: the primary that's here doesn't really care that much about the heartbeat signal, because it doesn't really care what happens to the secondary. It's not really listening to the heartbeat signal, so if it stops hearing the heartbeat signal, it
doesn't really care. So basically the primary is now sitting there going, I'm still the primary NAS, and then the secondary goes, oh crap, the primary NAS failed, now I'm the primary NAS. So remember, up until this point in time, up until some desktop support administrator did something asininely stupid in your server room, these were identical. They were the same, they were equal. Well, now, through the power of automagical, what's going to start happening is some users are going to be routed to the first primary NAS, and other users are going to get routed to the secondary NAS that's been promoted to being the primary NAS. Some files are going to get deleted and some files are going to be created, and then this person initially got routed here, but they log in after lunch and get logged in over here, so the file is no longer here; it's not here because these two are no longer synchronizing with each other. Basically, what you get with split brain is: up until the moment that this heartbeat signal failed for a stupid reason, these were identical. After that point, you now have your users in the network willy-nilly getting routed to one or to the other based off of who the hell knows what, and now they are becoming unique. This is where you just quit. Once Humpty Dumpty falls off the wall, you can't put him back together again. Just walk the hell away, because at this point you have all of these users doing all the stuff that users do, these two NASes are no longer the same, and trying to figure out what the hell has happened can be a real pain in the butt. That's one reason why, even if you have a high-availability failover NAS, you still need a backup. A lot of people feel like, if I have high-availability failover for my NAS, I don't need a backup, or, like with SANs, if I have a high-availability NAS or SAN, I
don't have to have a backup because I have redundancy. Yeah, well, again: desktop administrators, stay out of my server room. So that's why you always need a backup, because if you have a backup, you can figure out when this idiocy happened, fix the problem so the two can go back to being the same, and then restore from the backup, and the users will only be slightly irritated. But again, split brain can be brutal. Oh god, it can just be awful. Now, the final thing to talk about when we're talking about configurations and things to be thinking about when you set up your clusters is basically just good old-fashioned notifications. Notification systems are built into almost all software at this point, but a lot of people ignore them, even administrators, because they don't really care. If your server fails, you don't need an email that the server has failed, because users are going to start screaming at you ASAP. Backup systems have notifications, servers have notifications, every system has a notification system built into it, but a lot of administrators kind of ignore it simply because if a system fails, they will know very quickly. Well, one of the things to be thinking about in this new clustering environment is that the whole point of the cluster is that if one device goes offline, you won't notice it. If your SAN is working properly, if your high-availability NAS is working properly, if your Active Directory is working properly, you can actually lose a couple of physical boxes, and as far as the users are concerned it might be a little bit slow, but it might not be enough that they'll even bring it to your attention. It's like, oh, ha, funny, it's taking five seconds to log into Active Directory today instead of two seconds. People aren't
going to come to you with that, but you may have a couple of physical machines that have died for whatever reason, and so that's why you need to have a notification system set up. Basically, notification systems, and this is the old-fashioned thing, can email you, send you an SMS, or do any number of things at this point, but you need to have some system so that if your physical servers, your nodes in the cluster, start failing, you will get a communication that that event has happened. Because in the real world, something to think about is server rooms: you build out your server room, you put your servers in it, and you don't necessarily need to babysit them that much, especially if you've automated a lot of the routines. So you may have a cluster of servers and not realize that one has failed. One may fail today, and then a week from now another may fail, and you're not really noticing any difference, and a few days after that another may fail. Basically, if you don't have a notification routine going on, you may have the problem that the final one fails and then you're kind of back to square one, because so many of the nodes, so many of the servers in the cluster, have failed that by the time you figure out that there is a problem, it becomes a much harder issue to deal with. So imagine something like this; it sounds stupid, but in real server rooms, imagine the condensation pipe for your air conditioning system is, for some reason, leaking onto your servers. You have a rack of servers, and for whatever reason water is literally just dripping down from the ceiling. It drips down from the ceiling, it gets into the first server, and it may cause a problem where you don't even really notice that it's happened. The server just shuts itself off. A little bit more water drips
down and that shuts the next one off, a little more water drips down and that shuts the next one off. You can have some weird problems with this kind of thing, so just making sure that you have a notification system set up, so that you're told when there is an issue, can be a real lifesaver at the end of the day. So now you know a bit about clustering. I would say in the modern world of IT, if you're not clustering your physical servers, I really don't know what the hell you're doing. Just the fact that you can have a physical box completely disappear, somebody can walk in and steal your physical server, and your infrastructure is still up and running, I just don't see how an IT person wouldn't want that, especially now that hardware costs have come down. Whether you're dealing with full-fledged servers, Active Directory on real servers, or simply Synology units, the price of hardware has come down to the degree that I really just can't grasp why you wouldn't want to be running clustering, just because it makes life so much easier. If you have a failure, if you have to do a migration or an upgrade or anything else, if you have a cluster your life is easy. Just sip that cup of coffee, you can work at your own nice slow pace, and users aren't screaming at you because frankly they don't see any problem. So that is a great thing. Again, when you think about clustering, do realize that this is a technology, not a product. A lot of people get confused in the tech world as tech professionals and forget about the difference between technologies and products. A technology is an overall thing; incremental backup is a technology, and then you have specific products that implement that technology. So how Active Directory replication works is specific to Microsoft Active Directory. If you are using Windows Server 2016 and you have an Active Directory server, you will configure the replication
and everything else based off of whatever the hell Microsoft tells you. If you're using a Synology high-availability NAS server, you'll configure that based off of what Synology tells you. If you're using VMware or Citrix, you will configure that based off of that product's requirements. So do realize that clustering is a technology, and so it's something that you should look for in the products that you're purchasing, and then with those products you'll see how they tell you to set it up, and you'll have to look to see if they have the features and functionality that you are looking for. That's an important thing. After that, just remember the replication strategy. If you're dealing with databases or any kind of storage cluster, you do have to think about: okay, when I write to one of the nodes, one of the servers in the cluster, what happens after that? At any point in time with a cluster, you're only actually dealing with one physical machine. It's a cluster, and then you get routed to one particular machine, and if you make any modifications, those changes, those writes, have to be replicated out to the other servers in the cluster. That's something you have to think about. Now, if it's something like a storage cluster, where you're saving things like video files and pictures, the replication strategy might not be that big a deal: every hour, replicate the data out amongst all the servers. You may have to think about link bandwidth considerations, but you're not really worried about crashing any systems if the replication isn't very fast. On the other hand, if you're dealing with Active Directory, remember that Active Directory has all the user account information for the network. So if somebody changes their password and they're connected to one
Active Directory server, and then they log off and later go to log back in but get routed to a different Active Directory server, if the password change hasn't been replicated out to the other servers, that other server will be using the old data, so when they go to log in it will fail, and you can run into issues. So thinking about the replication strategy is a really important thing, and again, it also depends on what products you're using. Are the nodes in your cluster all local? Are they all on the same LAN, so maybe bandwidth really doesn't matter? Are they spread throughout the country or throughout the world, so bandwidth is a significant thing? Those are some of the things you have to think about. And then finally, as I talked about, notifications. This sounds like a stupid little thing, because all products, I think, have notifications built into them at this point. Everything has notifications built into it, so it's very easy to ignore notifications or forget about them. But now, in this automagical world, if you can have a physical server fail and the users don't realize it happened, there's a good possibility that you'll overlook that it happened, and then you may have more significant issues if you're not notified that these problems are occurring. So those are some of the things to be thinking about with clustering. Again, this is a technology, not a product, so whatever products you're interested in, go out, do a lot of research on the product, make sure it does what you want it to do, and yeah, that's about all I'm going to say. So with that, as always, I enjoyed doing this video, and I look forward to seeing you on the next one.
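As a rough illustration of the notification point in the recap above, here is a minimal Python sketch. Everything in it is hypothetical: the `NodeStatus` class, the node names, and the `build_alert` helper are made up for this example and are not any product's API. It only shows the idea that an alert should fire when any node has failed, even while the cluster as a whole is still serving users.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeStatus:
    """Health of one cluster node (hypothetical structure for illustration)."""
    name: str
    healthy: bool


def build_alert(nodes: List[NodeStatus]) -> Optional[str]:
    """Return an alert message if any node is down, else None.

    The cluster may still be serving users with nodes missing, which is
    exactly when an administrator would otherwise not notice the failure.
    """
    failed = [n.name for n in nodes if not n.healthy]
    if not failed:
        return None
    alive = len(nodes) - len(failed)
    state = "still serving" if alive > 0 else "DOWN"
    return (
        f"Cluster {state}: {len(failed)} of {len(nodes)} nodes failed "
        f"({', '.join(failed)})"
    )


if __name__ == "__main__":
    # Made-up example: one of three nodes has quietly died.
    statuses = [
        NodeStatus("node-a", True),
        NodeStatus("node-b", False),
        NodeStatus("node-c", True),
    ]
    message = build_alert(statuses)
    if message:
        # A real setup would send an email or SMS here instead of printing.
        print(message)
```

How the health results are gathered, and how the message is delivered, would depend entirely on the product in use; the sketch only covers the decision to notify.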
Info
Channel: Eli the Computer Guy
Views: 13,312
Keywords: Eli, the, Computer, Guy, Repair, Networking, Tech, IT, Startup, Arduino, iot
Id: z26BDjjoD4A
Length: 52min 24sec (3144 seconds)
Published: Fri Oct 11 2019