Fault Domains and Update Domains with Azure IaaS

Video Statistics and Information

Captions
Fault and update domains, firmly as they apply to infrastructure as a service. Previously I've talked about things like high availability and how virtual machines work, so as a really quick recap: I create a cloud service, and inside the cloud service I can create virtual machines. In Azure, if I want a service level agreement around my virtual machines, I have to have at least two instances of my workload in something called an availability set. But what is that? Why is this thing important? It really boils down to the availability of my virtual machine in both unplanned events, some kind of disaster, and planned events like maintenance operations.

So what's really happening behind the scenes? Azure may seem magical in many ways. As technology gets sufficiently advanced it becomes indistinguishable from magic; someone intelligent said that once. But what is Azure? Azure is really, at a very fundamental level, racks and racks and racks of servers. There is a very intelligent fabric, there's management, there's resource grouping, lots of cool stuff there, but fundamentally I have racks and I have servers in the racks. There's a top-of-rack switch connecting these things, and there are other elements like power supplies. So in many ways I can think of a rack as a point of failure: the top-of-rack switch could fail, the power could fail. So simply think of a rack as a fault domain: this is something that could fail. So commonly, a rack equals a fault domain.

So when I put virtual machines into an availability set (this is something I can do during creation, or even post-creation, though that may make the VM restart), what does that do? If I put the VMs in an availability set... actually, let me take a step back from that. Let's say I don't use availability sets. Let's say I want to put a service up in Azure and I'm going to create two instances of it, two IIS web servers. OK, that's great. What's to stop Azure, when creating the VMs, from going
to put one VM here and one VM here, putting them both on the same rack? Now, the chance of that happening may be fairly low, just given the scale of Azure and the scale of a stamp, a scale unit, a cluster, but it could happen. That's kind of useless to me, because the reason I deploy two instances might be for scale, but it's also to protect me from some kind of unplanned failure. If Azure has a problem and this rack fails, I want my other VM on another, different rack.

So what an availability set does, very basically, is this: when I put VMs in an availability set, it tells Azure to split them over two fault domains. When Azure deploys VMs to an availability set, it will split them over two fault domains. Now, today it's two fault domains. If I created four virtual machines they would be split over the same two fault domains, so I'd have two VMs in each fault domain. If I deployed ten VMs, I'd have five in each fault domain. I don't get each one going to a different rack. The way it works today is that they are split over two fault domains no matter how many VMs there are.

What's important when I use availability sets is to never mix workloads, because remember, all it's doing is splitting them over two fault domains. If I mix my SQL boxes, my domain controllers and my IIS servers into one availability set, I can absolutely end up with all of the SQL on this fault domain and all of the IIS on that fault domain, as Azure can't tell what's inside the VM. So when I have different workloads, I create an availability set for each workload. If that one was my SQL and I wanted to deploy IIS servers, I would create a different availability set for my IIS servers. I would create a different availability set for my domain controllers, another availability set for some application, etc. So never mix workloads: one availability set for each unit of work.

So I want to make sure it is split over fault domains, and I can see that. I have an availability set over here and I have three instances in it, and
you can see I've got my three IIS instances, and they're split over two fault domains: fault domain 0 and fault domain 1. My third VM is back in fault domain 0; round robin is how it deploys them. So I get some assurance that in the event of some unplanned failure I would not lose all my VMs; my service would stay functioning. That's how Azure can do a service level agreement: as long as you've got these two instances of your service and they're in an availability set, I can give you 99.9 percent, or whatever the exact SLA is these days. I can give you that because I've got my own mechanisms in place to limit the scope of a failure to a particular rack, a particular fault domain, so I feel pretty good that we can meet that SLA. That's a fault domain.

And you see this other thing called an update domain, and you'll notice each VM is actually in a separate update domain. So what do these do? Now, I really think of update domains more around platform as a service, but they do have an impact on IaaS as well. Platform as a service is where I write my app a certain way, I deploy my application to Azure, and I say: Azure, I want 20 instances of that thing running, or I want five, or scale that thing depending on load. And it goes and creates the VMs with my application in them automatically. That's the great thing about PaaS: I don't worry about the OS or the middleware stack or the runtime. I just write my app and I deploy the thing, and Azure takes care of it all. Kind of magical.

So in that world, let's say I've got 20 instances of my app deployed and now I have a v2. When I deploy my v2, I don't want to shut down my whole service, so I deploy my v2 on a certain number of instances at a time, and that's what update domains are. If I go back to my racks of servers, in PaaS as well the instances are still split over two fault domains, but now what happens is I get these update domains as well. And so maybe let's say four update domains
below, shown as four colors, to keep it kind of simple. So my first instance would maybe be on that fault domain, my second instance would come over here, my third instance would be back on this fault domain, my fourth instance would come over here. Now my fifth instance, well, that would actually come back to this first update domain, my sixth one would be bounced over here, and so on. And actually I think in PaaS I can have up to 20 of these update domains, so it can be very scalable.

The way this works is, when I roll out my application, it says: OK, I'm going to update update domain one first. So it would basically shut down instances one and five in this case and update them to the new version. Once that's finished, it would go and shut down update domain two. But you notice I'm only losing a quarter of my VMs at any one time. Of course, if I had 20 update domains I'd only lose a twentieth at any one time. So it enables me to very granularly control the rollout as I push out new versions of my application.

OK, now for IaaS, it actually works by thinking about host updates. Behind the scenes, those servers hosting the VMs are running an operating system. Now, Azure doesn't really patch the way you or I would patch stuff; it really just reboots from a new image that has all the patches in it, but there's still some period of downtime there. So how this works for IaaS is that there are five update domains for each availability set for your workload, and that's why you can see 0, 1, 2: I've only got three VMs. If I had five I'd see 0, 1, 2, 3, 4; if I had ten I'd see 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, round robin. But there are only two fault domains, so 0, 1, 0, 1, 0, 1, etc.

What that means is, in the same way, when they do host updating in Azure, each update domain is updated one at a time. So as Azure is doing its host updates, all my VMs in the first update domain, so VM 0: those hosts would be shut down first, patched, rebooted and brought back up, and my VMs would then start. Then it would update update domain 2;
those VMs would shut down, the hosts would shut down and boot from the new image, and they would come back up. Then 3, then 4, then 5. That's the way it would work. Obviously it only goes through one cycle. All the VMs in that first update domain go unavailable while those hosts update. So that's how it applies to IaaS.

So think fault domains: unplanned failure. All of the VMs in that fault domain would disappear until it's fixed. Update domains are for planned maintenance operations: they shut down one update domain at a time. So if my IIS VMs in my availability set are split over five update domains, when routine maintenance happens they only shut down one update domain at a time, so I only lose a fifth of my VMs during those maintenance operations, rather than the entire fault domain. So again, it would update update domain 0 first: those VMs would shut down, the hosts would patch, boot from the new image and bring them back up. Once they're back up and running, it would go and shut down the VMs in update domain 1, etc., then 2, 3, 4. And that's that.

So that's really the big difference. Fault domains: split over two, for unplanned failures. Update domains: split over five, again assigned round robin, and that's for planned maintenance operations. For IaaS those are host updates. In PaaS, update domains are also used when I think about updating my application; in IaaS we don't have that concept, but that's how we're going to minimize the downtime of our workloads. I hope that kind of made sense and that it was useful. I appreciate your time.
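
The round-robin placement described in the captions can be sketched in a few lines of Python. This is only an illustration of the idea, not how Azure's fabric actually places VMs: the function names `place` and `survivors` and the VM names are hypothetical, and the counts (two fault domains, five update domains) are simply the figures quoted in the talk.

```python
# Illustrative model only: hypothetical names, counts taken from the talk.
FAULT_DOMAINS = 2    # availability sets are split over two fault domains
UPDATE_DOMAINS = 5   # IaaS availability sets get five update domains


def place(vm_names, fault_domains=FAULT_DOMAINS, update_domains=UPDATE_DOMAINS):
    """Assign each VM a (fault domain, update domain) pair, round robin."""
    return {
        name: (i % fault_domains, i % update_domains)
        for i, name in enumerate(vm_names)
    }


def survivors(placement, *, failed_fd=None, updating_ud=None):
    """Return the VMs still running after a fault domain fails
    or while one update domain is down for planned maintenance."""
    return sorted(
        name
        for name, (fd, ud) in placement.items()
        if fd != failed_fd and ud != updating_ud
    )


# One availability set for one workload: three IIS servers.
iis = place([f"iis{i}" for i in range(3)])
print(iis)                            # fault domains 0, 1, 0; update domains 0, 1, 2
print(survivors(iis, failed_fd=0))    # unplanned failure of fault domain 0
print(survivors(iis, updating_ud=0))  # planned maintenance of update domain 0
```

With ten VMs in the set, the same model puts five VMs in each fault domain and two in each update domain, so an unplanned rack failure takes out half of the workload while a planned maintenance pass takes out only a fifth at a time.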
Info
Channel: John Savill's Technical Training
Views: 41,995
Keywords: Azure, Fault Domains, Update Domains
Id: ilXx0cmmGz0
Length: 11min 17sec (677 seconds)
Published: Thu Apr 16 2015