VMware vCenter SRM: Concepts, Architecture

Video Statistics and Information

Captions
Welcome to the VMware Site Recovery Manager 5 video presentation. My name is Andrew L wood; I'm a senior technical instructor with VMware Education Services. This video is going to incorporate all the features of SRM 5, along with an installation tutorial, so that you can get the maximum out of your deployment of SRM.

The first module that we're working through today is SRM 5 concepts and architecture. We're going to be taking a look at the structural architecture and the pieces that make SRM what it is.

SRM itself, or Site Recovery Manager, is nothing more than an end-to-end disaster recovery orchestration product. That's an important concept, because there is no such thing as a disaster recovery product by itself; you can't buy disaster recovery as a single entity. It's really a tool that you get to leverage to help you in recovering from disasters, so realistically we're looking at an orchestration product as the primary tool behind this. It allows us to automate the full recovery of virtual machines that reside on replicated storage in your data center, and we'll talk a little bit more about the whole concept of replicated storage as we go through this presentation.

The four-step disaster recovery process with SRM 5 is based around these simple elements. First of all, at the protected site, when SRM is installed and configured properly, the first step is to shut down virtual machines, starting with the virtual machines that you, the administrator, have designated as the lowest priority. Some folks will ask the question, "Why? If there's been a disaster, why would I bother shutting down virtual machines? Aren't they already shut down?" The answer is: perhaps, and we can't necessarily take that risk. Because of the way virtual machines write to disk, and the ongoing replication that we could have in the background, we really don't want to take a chance when it comes to shutting off these virtual machines or not shutting them off. So SRM will
try to shut them off. If the shutdown fails, it's no problem; the recovery process continues as expected.

Step two is at the recovery site. That's your designated recovery environment, whether that's in a data center right next door or across a wide area network link. SRM will then prepare the datastores themselves, the replicated disk components, for the failover. That allows the ESXi servers in the recovery site to mount those replicated datastores and take the necessary actions to be able to start up those virtual machines.

The other idea behind SRM, and one of the fundamental concepts, is that SRM gives us the opportunity to run what's known as a shared recovery site. Our recovery site, the one which is going to take over in the event of a disaster, could in fact be running virtual machines for production purposes for another division of the company, or perhaps your test and dev environment. So if we're expecting those same ESXi servers to start up inbound virtual machines because of the failover, the idea is to shut off those virtual machines that are considered unnecessary, to make room for the virtual machines that are going to be required when we go through the failover process. In other words, mark those that are considered non-critical to be turned off first, in a safe fashion, so that we can free up the resources needed to start the virtual machines that are considered our mission-critical applications in the event of a failover.

And then SRM finally starts those virtual machines at the recovery site. The interesting part about this is that the SRM administrator has full control in the recovery plans over designating startup orders, dependency sequences, and all of the necessary pieces, so that your entire environment can come back essentially without interaction. That, in my books, is probably one of the biggest benefits. A failure occurs; well, that's terrible, but the reality is our systems can take effect in
such a way that they will then automatically restart, and the administrators themselves don't have to panic.

So, the SRM features. Full centralized management is the key to this process. We can create, test, update, and execute these recovery plans. A recovery plan is nothing more than that: it's a series of things that we want to do. That recovery plan, as we'll see later on in the series, could be to recover from a failed storage area network resource, or it could be to recover from an entire failed data center, and it's up to the administrators themselves to provide that particular recovery plan.

With these disaster automation processes that we build into SRM, you can build those recovery plans in advance, and you can test all of those recovery plans. That's actually a very interesting thing. There are a lot of people who think they have a good recovery plan on pencil and paper, or maybe even an automated one, but it's never gone through full testing. What most people are surprised to learn is that once they do go through testing and it fails, of course, now they've got more downtime than they ever planned on, and it could fail in such a way that they're not recoverable. That's generally considered a bad thing for most companies.

We can automate the execution of said recovery plans, and the integration is fully built into vCenter itself, such that we can recognize the virtual machines that we have in our environment. We can also integrate with our storage replication vendors themselves. So if our storage environment is providing the replication for us, we can leverage that and have full awareness of that process going on in the background.

One of the coolest things we brought out with SRM 5 was the additional feature called vSphere Replication. What this now does, for shops that are not trying to do storage area network based replication or have not invested in that technology, is let us deploy something very simple, which allows our ESXi servers to perform the replication on a
virtual machine by virtual machine basis. Given that, this makes full SRM implementations much more affordable for small and medium business organizations, and on top of that, even for large enterprise sites, in workload environments that are perhaps not as mission-critical as those they designate for their storage replication, which is the most expensive of the storage that they typically run in their environment. It's relatively efficient, it's managed on a virtual machine by virtual machine basis, and the way we deploy it, as you'll see in the later modules, is as a pair of virtual appliances, which for ESXi administrators or vSphere administrators is something that is becoming more and more the norm for deploying these utility-type environments.

The other thing that SRM 5 brings to the table is this feature called planned migration. If anyone has ever had to go through a data center migration, where they're relocating a data center or repurposing some systems from one data center to another, there are some challenges behind that, and SRM 5 gives us the ability to do what we call a planned migration. Behind this one, the concept is very simple: shut down the virtual machines on the site, or the components, that we're looking to retire; synchronize the disks that are under the covers as of the state of the shutdown; give the replication engine a chance to replicate that data across the wire to the new production environment; and then start up those virtual machines in the new production environment.

The interesting part, and the reason this one differs from a disaster recovery or a true failover, is that if there are any errors encountered along the way, we're going to stop that process. That's an important feature if you're looking to do a planned, reliable migration. If we have an error, we're going to stop, we're going to let the administrator fix that error, and then continue onwards, or restart the migration perhaps
next weekend, or at the next most convenient interval. So it's a very, very powerful tool for anyone who's looking to migrate systems between different environments, and that might also include doing things like changing the IP addresses of all of the virtual machine workloads that you're migrating across. Trust me, for anyone who's ever done that task manually, that's a feature not to be underestimated.

The other feature that we bring to the table with SRM 5 is the concept of an automated failback. If we assume that we've performed a failover operation, we have the opportunity with SRM 5 to use what we call a single-button reprotect, and then subsequently, potentially, even perform a failback. The problem with going through a failover operation is that now you're not protected. What ends up having to happen is that we need to rebuild our recovery plans such that they work in the opposite direction. We reprotect the virtual machines that failed over from the protected site to the recovery site originally, such that if there happens to be another failure on the new protected site, we can then have those VMs fail back. It's a very, very powerful set of tools. One-button reprotect is one of the features that our customers have long been asking for, ever since the release of SRM 1.0, so we're very proud to be able to deliver that particular feature set with this release.

The other thing is that we support various disaster recovery topologies. The idea is this: if you have a protected data center, your production data center, you can go and acquire a dedicated data center for failover purposes. You would need to have a number of ESXi servers running essentially in standby mode. They would be fully up and functional, but not doing anything else, with the intent being that if we have a failure in the protected data center, those ESXi servers in the recovery data center would be able to take over and run those virtual machines that have
been brought into the new recovery site. Great, but not a lot of customers have got the money to be able to deploy that.

The active-active failover process has become very, very popular with SRM. The idea is that we've got a protected and a recovery site. Maybe you've got a New York City and a Chicago office. Your New York City office is going to fail over to the Chicago office if there's a failure event, or vice versa: Chicago will fail over to New York City, with the assumption that both of them would not fail at the same time. I think we've got bigger problems if that type of situation ever occurs. The good news about this is that we can actively use resources in both sites, and then we can take steps through the SRM automation process to shut off unused virtual machines to make room for some of those inbound virtual machines as a failover happens.

So that's active-active failover. With full bidirectional failover, each site acts as a recovery site for the other, so you're running full production on both sites. You could argue that active-active and bidirectional are very, very similar. Most customers won't see a real difference between them, because realistically it's just a matter of choosing whether or not we're going to shut off some workloads to make room for the incoming ones.

And then, last but not least, there's the concept of a service provider for SRM, and that would be the shared recovery site process. It could be useful for branch to central office deployments, and it's also useful for service providers who are looking to provide SRM failover sites for customers who are going to protect their sites in the primary environment.

That pretty much covers the overview of the feature set of SRM 5. To get more information about this, you can always go to VMware Education Services. We run and maintain a website which has got all of these things: vmware.com/education, or, if you're looking for certification services, vmware.com/certification.
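
The recovery flow described in the captions above can be sketched as a toy model. This is only an illustration under stated assumptions, not SRM's implementation or API: the `VM` class, the `run_plan` function, the `MigrationHalted` exception, and the convention that priority 1 is the highest are all hypothetical names introduced here. The sketch shows the four steps in order, and the one difference the presenter calls out between a disaster recovery run (a failed shutdown is ignored) and a planned migration (any error stops the run).

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    priority: int             # hypothetical convention: 1 = highest priority
    shutdown_ok: bool = True  # simulates whether the shutdown succeeds


class MigrationHalted(Exception):
    """Raised when a planned migration hits an error and must stop."""


def run_plan(protected, noncritical_at_recovery, planned_migration=False):
    """Return the ordered steps a toy recovery plan would take."""
    log = []
    # Step 1: shut down protected-site VMs, lowest priority first.
    for vm in sorted(protected, key=lambda v: v.priority, reverse=True):
        if vm.shutdown_ok:
            log.append(f"shutdown {vm.name}")
        elif planned_migration:
            # Planned migration: stop so the administrator can fix the error.
            raise MigrationHalted(f"could not shut down {vm.name}")
        else:
            # Disaster recovery: a failed shutdown does not block recovery.
            log.append(f"shutdown-failed {vm.name} (continuing)")
    if planned_migration:
        # Planned migration also synchronizes disks after the shutdown.
        log.append("synchronize replicated disks")
    # Step 2: prepare the replicated datastores at the recovery site.
    log.append("prepare datastores")
    # Step 3: shut off non-critical recovery-site VMs to free resources.
    for name in noncritical_at_recovery:
        log.append(f"poweroff {name}")
    # Step 4: start the recovered VMs, highest priority first.
    for vm in sorted(protected, key=lambda v: v.priority):
        log.append(f"start {vm.name}")
    return log


if __name__ == "__main__":
    vms = [VM("db", 1), VM("web", 2, shutdown_ok=False)]
    for step in run_plan(vms, ["test-vm"]):
        print(step)
```

Calling `run_plan(vms, [], planned_migration=True)` with the same list raises `MigrationHalted` instead of continuing, which mirrors the stop-on-error behavior described for planned migrations.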
Info
Channel: VMware
Views: 125,301
Keywords: VMware, SRM, vCenter, vCenter Site Recovery Manager, Site Recovery Manager, Disaster Protection, SRM Concepts, Getting Started with SRM, SRM Architecture, SRM 5
Id: OBQNuT_z9ro
Length: 12min 23sec (743 seconds)
Published: Tue Oct 02 2012