Top 15 Veeam Backup & Replication Performance Optimizations

Captions
All right, we're ready. Hello everyone, my name is Irina, and first of all I would like to thank you all for joining the session today. There are so many interesting sessions running in parallel, and it is very hard to compete with Gostev and all the new version announcements, so thank you for joining us. While we were getting ready for this session, we thought about what we wanted to achieve, and we decided that if each of you comes out of this room with at least one useful tip, we will consider our goal achieved. You can also use this presentation afterwards as a quick reference, a checklist of what to verify before going and opening a support case about performance with our support team. So let's start. First of all, I'm going to hand it over to Rasmus to introduce himself.

Thank you so much for attending the session, everyone. My name is Rasmus Haslund and I work in Global Education Services at Veeam. If you're wondering what that means: you might be familiar with the Veeam Certified Engineer training program. I'm the technical manager of that program, so I create the course, write exam questions, and train the trainers, and I also often go out and teach these trainings myself. I've been working for Veeam since January last year.

I have a little bit more to say about myself. I joined Veeam in 2009. I used to work in the support team, where I was its fourth member, supporting the Veeam backup application at version 3. Then I switched roles: I joined the sales department, and now I'm a cloud systems engineer responsible for an entire region. I come from St. Petersburg, and it is a pleasure to be here, especially because last week it was still snowing in St.
Petersburg. That photo is my average view out of the window. It is a beautiful city, but we only have two seasons, winter and July, so it's a real pleasure to be here. I appreciate Veeam bringing me here, giving me a chance to visit more places and to share some useful tips.

OK, let's start with the topic. We're going to begin with some easy tips and then also dig down into some more complicated points.

First of all, I would like to talk about backup modes and choosing the right tool to protect your virtual infrastructure. As most of you who already run Veeam know, there are two modes specified in the user interface, reverse incremental and forward incremental, but in fact there are three of them; the hidden one is what we call forever forward incremental. Let me go through each.

Reverse incremental mode is shown in the user interface, but it is slower, because processing your virtual machines in this mode results in three times more I/O on the repository. Following our best practices, you should run it against a repository with an operating-system-based data mover running on Windows or Linux, rather than a plain NAS share with only a gateway server in front of it. We recommend reverse incremental for virtual machines that do not change much between your backup windows, because of how it works: every time an incremental cycle runs, the changed blocks are injected into the full backup file and the replaced blocks are written out into a reverse incremental file. That transformation is what triples the load on the repository, and it is also why your full backup file is always the latest restore point.

A good use case for reverse incremental would be, for example, if you would like to upload your full, latest backup to tape on a daily basis. Another use case is when you want a very long chain of backups. Obviously you could do that with forward incremental too, but in a case where you want, say, 100 restore points or more, you might be better off going down the reverse incremental path, because there is a very real risk that at some point something happens to a file in the middle of the chain. With forward incremental, if something in the middle is broken, you would only be able to restore up to that point. With reverse incremental the same risk exists, but at least the latest restore point is the full backup, so you would still be able to restore to the latest point in time.

Right, how does traditional forward incremental work? This is the very traditional story: one read operation and one write operation. After the full backup run, we take the changed blocks, compress and deduplicate them, and put them on the storage, so it's 1x I/O. Just choose the right mode for your workload, depending on whether it is a heavily transactional SQL server or an application that doesn't generate a lot of changes.

The third mode, the hidden one, forever forward incremental, is what you get when you do not enable any active or synthetic full backups in the job configuration. By the way, this is also how a backup copy job works by default. It's a bit tricky: with forever forward incremental you build a chain, and once you hit your retention policy, that is, once you reach the number of restore points you specified for the chain, your full backup file will be merged with the oldest incremental.
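As a rough illustration of the I/O math just described, here is a tiny model. The multipliers are the session's rules of thumb, not measured values, and the function name is my own:

```python
# Rules of thumb from the session for repository I/O per incremental run:
#   forward incremental:         one write of the changed blocks    -> 1x
#   forever forward incremental: write plus the merge transform     -> 2x
#   reverse incremental:         inject into the full + write .vrb  -> 3x
IO_AMPLIFICATION = {"forward": 1, "forever_forward": 2, "reverse": 3}

def repository_io_gb(changed_gb: float, mode: str) -> float:
    """Estimate GB of repository I/O caused by one incremental run."""
    return changed_gb * IO_AMPLIFICATION[mode]

print(repository_io_gb(10, "reverse"))  # 10 GB of changes -> 30 GB of I/O
```

The same 10 GB of changed data costs the repository 10, 20, or 30 GB of I/O depending on the mode, which is why the choice matters most on slow repositories.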
That merge is also a transformation operation, but it is the golden middle: roughly 2x I/O on the repository, in between reverse and forward incremental.

While we are on the subject of repositories, speeding things up is not only about having an operating-system-based repository. Even if you are backing up to a remote location, you can set up a gateway server, a specific role assigned to a Windows box in the Veeam Backup & Replication console, that is local to the repository. For network shares in particular, a gateway server close to the storage will perform all the transformation operations locally, which greatly helps the repository with the merges and transforms we just discussed.

Now I'm talking about active and synthetic full backups. An active full consumes your production resources and your network: it is basically just another full backup, read from the original virtual machines and sent in its entirety to the repository, so it results in much more traffic (we'll have a tip on this later on). A synthetic full, which also benefits from a data mover local to your repository, works differently: instead of going to the original virtual machine, it analyzes what you already have on your repository and constructs a new full backup file out of the existing files on the backup repository. One thing you need to know about this is that your new full backup file will date back to your latest incremental, because we do not read any new data from the original virtual machine. So if your last incremental ran on Friday and your synthetic full is created on Saturday or Sunday, it will still contain the data as of Friday. That is just something to keep in mind when you enable synthetic full backups.

Just a single thing I wanted to add on the previous slide about active and synthetic fulls, because sometimes there are other good reasons for deciding which one to go with. You might have a setup where you are backing up directly from the backup job to a deduplication appliance, and we typically break deduplication appliances into two groups: integrated and non-integrated. By integrated deduplication appliances we mean, for example, ExaGrid, or HPE StoreOnce if you have a Catalyst license, or Dell EMC Data Domain if you have a DD Boost license. If you don't have the Catalyst or DD Boost license, we consider them non-integrated instead. Why do we have this differentiation? If we take a non-integrated deduplication appliance and say, OK, let's do a synthetic full, then basically we will have to pull all of the data out of the box, which means it gets rehydrated, and this process takes a really long time, and then we shovel it all back in again. It would probably be significantly faster to use an active full: OK, you put some load on production, but the deduplication box can finish this process really fast. On the integrated boxes you don't have this problem; you simply run a synthetic full, and rather than actually reading anything out, we just tell the box, "please create these pointers to existing blocks." It still takes a little bit of time, but it will be many, many times faster than having to read everything out and put it back in again.
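The decision logic just described can be condensed into a few lines. This is a sketch; the category names are my own, and the integrated-appliance examples are the ones named in the talk:

```python
def recommended_full_backup(repo_type: str) -> str:
    """Pick a full-backup method per the session's guidance.

    repo_type: "integrated_dedup"    (e.g. ExaGrid, StoreOnce + Catalyst,
                                      Data Domain + DD Boost),
               "nonintegrated_dedup" (same boxes without those licenses),
               "generic"             (plain disk repository).
    """
    if repo_type == "nonintegrated_dedup":
        # A synthetic full would rehydrate every block out of the box
        # and write it all back -- usually far slower than an active full.
        return "active full"
    # Integrated appliances build the full from pointers to existing
    # blocks; generic repositories avoid re-reading production storage.
    return "synthetic full"
```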
For the next footnote, an oversight I often see: one thing is deciding what kind of source transport mode you are using. That would often be in a backup job of some sort, but it could of course also be a replication job, and with a replication job you have to keep in mind which transport mode you are going to use on the target side. We have the table right on the slide: on the source side you can use almost any transport mode. The limitations typically come into play on the target side of a replication job. For example, on replication targets you could use network mode or virtual appliance (hot-add) mode, but Direct SAN and Direct NFS would only work for the first run. That is simply because you cannot write an incremental with Direct SAN once a snapshot exists, and as soon as you have run the job more than once, you will have restore points on the replica; replica restore points equal snapshots on the hypervisor side, so effectively you can no longer use Direct SAN or Direct NFS on the target side. That leaves the two options at the far left: network mode or virtual appliance mode.

Now keep in mind that in network mode, when you connect to the ESXi management interface to extract data or to put data back in, VMware imposes a limitation of roughly 40 percent of the interface bandwidth. So if you have a one-gigabit link, you might still only get around 40 to 50 megabytes per second out of that interface, even when nothing else is happening on it. Maybe that's not necessarily a big problem, but just keep in mind that the limitation exists. Especially on the replication side, you might be in a situation where all of your backup proxies are big physical machines, and that's perfect for the source, but on the target side you should probably consider adding some additional virtual backup proxies. You can easily go in and exclude them from being used by the backup jobs if you don't want that, and have them used only for targeting the disaster recovery host in a replication job, meaning you could go up to the full size of the bandwidth.

I would like to add my five cents here, because our proxies have different transport modes, but you can also leave a proxy in automatic mode, which is the default: it will automatically select the best mode available and fail over to a secondary mode if needed. If you leave it automatic you will still get your backups, even if they are not done via Direct SAN. But when you are setting up the environment for the first time, or when you are testing, if you expect a proxy to work in Direct SAN access mode, make sure it is set to Direct SAN only, without any failover options and not set to automatic. Otherwise, Veeam Backup & Replication will just analyze the situation and silently choose whatever mode is available, even when Direct SAN is not, and you will see the performance degradation without understanding why.

I also had a recent case. A Hyper-V customer came to me saying that everything had been running fine, but at some point he saw performance degradation on his Hyper-V backups. So we started to dig into the real-time statistics and found the cause. He was using an off-host proxy, which obviously should be faster, and he had the proxy mode set to off-host with failover to on-host, just in case something went wrong with the proxy. And that is exactly what happened: his environment had failed over to processing in on-host mode, and that is why the performance was not as good as he had experienced before.
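To put a number on the network (NBD) mode limitation mentioned earlier, here is a quick back-of-the-envelope calculation. The ~40% cap is the figure quoted in the session; the real cap varies by ESXi version and configuration:

```python
def nbd_throughput_mb_s(link_mbit: float, cap: float = 0.40) -> float:
    """Approximate usable MB/s through the ESXi management interface
    in network (NBD) mode, given a fractional bandwidth cap."""
    return link_mbit * cap / 8.0  # megabits -> megabytes

print(nbd_throughput_mb_s(1_000))   # 1 GbE  -> ~50 MB/s at best
print(nbd_throughput_mb_s(10_000))  # 10 GbE -> ~500 MB/s at best
```

This is why a hot-add virtual proxy on the target side, which bypasses the management interface entirely, can make such a difference for replication.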
I have another, similar case, though it has nothing to do with performance: with Hyper-V off-host proxies, the required ports may not even be open on the production Hyper-V host, so instead of a quiet failover you can actually see backup failures, and with backup failures there is no performance to speak of. So that gives you an idea of what to check first on Hyper-V: off-host versus on-host, whether the job is failing over, maybe the ports are closed, or maybe it simply performs poorly because it automatically failed over to on-host mode. Our goal is always to give you a backup, to make sure you stay compliant with the restore point policy you have configured, and this is how we do it. I think it's important to recognize that, from Veeam's point of view, a slow backup always beats no backup at all. Once something hits the fan one day, you will probably agree that that was a good prioritization, because you need that restore point.

Among the different components we have, there is one relatively new role we introduced, called the mount server. [Question from the audience.] That's a good question. The question is whether there is a difference between using Direct SAN mode for backups versus for restores, and I think this is a really interesting topic. Let's say you have, for example, a ten-terabyte file server and you need to restore this machine. Depending on what happened, let's assume something broke in the machine, but technically it is still there: whatever happened was only inside the guest operating system, not some kind of catastrophic hardware failure. What options do we have? We could go ahead and perform a full VM restore, and that's fine, but then you have to restore basically all ten terabytes. During a full VM restore there is a tick box at the bottom that says incremental restore, or quick rollback, which restores only the changed blocks. The problem is that Direct SAN doesn't support incremental restore. So now you have choices. You could restore all ten terabytes using Direct SAN, and that might be quick, but it is still a huge amount of data, because in reality what you want is to get back to the previous restore point you have on file, and the difference might only be ten gigabytes, since the change rate inside a file server is probably not very big. You could of course use network mode, but in some cases we would suggest keeping some hot-add virtual proxies around just for running those restores, because with network mode you again hit that roughly 40 percent limitation, and a virtual proxy, even if its only purpose is to do restores, lets you bypass that limitation completely. On the other hand, you could also go with something as simple as Instant VM Recovery and just get the machine online quickly that way, but then it is running from the backup storage and you have to get it migrated afterwards. So I think it's important to understand all of the tools that are available and then decide, in the specific situation that you have, which one makes the most sense. For the file server we talked about, it would probably often just be an incremental restore, sending those few changed blocks back. Does that answer your question? Feel free to ask questions throughout the presentation. Are there any other questions? OK.
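The file-server example can be made concrete with a little arithmetic. All throughput figures below are assumptions for illustration, not measurements:

```python
def transfer_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to move data_gb at a sustained throughput_mb_s."""
    return data_gb * 1024 / throughput_mb_s / 3600

# Hypothetical 10 TB file server with ~10 GB changed since the last
# restore point:
full_restore   = transfer_hours(10_000, 300)  # full VM restore, Direct SAN
quick_rollback = transfer_hours(10, 100)      # changed blocks only, hot-add

print(f"full restore:   {full_restore:.1f} h")        # ~9.5 h
print(f"quick rollback: {quick_rollback*60:.1f} min")  # ~1.7 min
```

Even with the slower hot-add transport, moving only the changed blocks wins by orders of magnitude, which is the point of quick rollback.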
For the final footnote over here, sort of touching on the subject of restores: we have a lot of customers with multiple sites. Sometimes you have one main data-center-type location and multiple remote office/branch office locations, and there was a somewhat strange limitation in some of the older versions. In the scenario you have right on the slide, without a mount server role, backups being restored from a remote repository had to travel all the way back to the main data center to be processed by the backup server, and then be sent all the way out again. So I have this remote site, I might be doing local backups, and that's perfect, but the restore traffic crosses the WAN twice. That was obviously not the most optimal setup, so we introduced the notion of a mount server. When you edit or set up a repository, you need to select the mount server, and the processing that was previously handled by the backup server can then be handled by a different machine that we call the mount server; it could be the same machine as the backup repository. The catch is that sometimes people weren't paying too much attention when they set up the repository, and they accidentally selected a machine in a completely different site, causing cross-site traffic. Imagine you have maybe 10 megabits of bandwidth going to these sites and one gigabit locally: you have just lost a huge amount of performance on that restore.

On the backup jobs we have a setting called storage optimization, which in reality is what controls the block size. The default setting is Local target, which uses a one-megabyte block size. We typically assume compression and data reduction of 50 percent in total, as a rule of thumb, which means the block actually written to disk is roughly half a megabyte. That is probably a good setting for most people, but you might have reasons to go in either direction. For example, let's go one step up on the screenshot, to Local target (16 TB+ backup files): this raises the block size to four megabytes instead of one, which leaves you with basically almost no deduplication, but on the performance side, if you have really fast storage, you will most likely gain the ability to perform backups and restores a little bit faster.

Then you might be wondering: when should I use the WAN target setting, for example, what is a good use case? Basically, it gives you significantly better deduplication, because with WAN target the block size drops to 256 KB: we chop the same original data up into many more blocks, which in reality means better deduplication. This is probably not something you want enabled on all of your jobs, because the cost of doing this is quite high in the memory used on the backup repository, roughly four times more. So it is all about finding the good use cases, and a good use case here could be, for example, to have your Exchange servers in a separate job and then consider enabling WAN target storage optimization on that specific job. A good reason for that is that older versions of Exchange had single-instance storage: if somebody emailed a PDF company-wide, it would only get stored once. In newer versions of Exchange it is actually stored once in the mailbox databases for every single person who received it. Because of this new behavior, we can probably gain a lot of deduplication on those specific virtual machines, maybe around a 30 percent reduction. If you have a 50-terabyte Exchange server, that's a lot of storage you might be gaining here. Just keep in mind the memory overhead, and size your backup repositories and underlying storage accordingly; it is quite important.
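The storage-optimization choices above can be tabulated in code. The block sizes are the commonly documented per-setting values (the talk only names three of them explicitly), and the 50% reduction is the session's rule of thumb:

```python
# Block size (KB) chosen by each storage-optimization setting.
BLOCK_SIZE_KB = {
    "Local target (16 TB+)": 4096,  # least dedup, fastest on quick storage
    "Local target": 1024,           # the default
    "LAN target": 512,
    "WAN target": 256,              # best dedup, ~4x repository memory
}

def written_block_kb(setting: str, reduction: float = 0.5) -> float:
    """Approximate on-disk block size after the ~50% compression/
    data-reduction rule of thumb."""
    return BLOCK_SIZE_KB[setting] * reduction

print(written_block_kb("Local target"))  # 1 MB block -> ~512 KB on disk
```

Note the 4x ratio between Local target and WAN target blocks, which is where the "four times more memory" figure comes from.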
Yes, there's a question. [Question about the "enable inline data deduplication" setting.] OK, so the question is: if I'm sending my backups to a deduplication appliance, should I leave the "enable inline data deduplication" setting enabled? Yes. You might see some information in knowledge base articles, or in some deduplication appliance vendors' recommendations, saying to turn it off; typically that is somewhat outdated information. So pay attention, especially if it's a Veeam KB: does it say this is recommended for version 7 or 8, and for which version specifically? Because sometimes a setting works really well, and then over time we figure out that tweaking it, or something we changed on the back end, changes the guidance. The thing is, leaving it enabled here just means this: we obviously have changed block tracking from the hypervisor, so we only pull out the changed blocks, but sometimes we actually get more blocks than we really need, and those can be, for example, zeroed blocks. If we leave the setting enabled, we can just filter them out immediately instead of having to transfer them across the network to the deduplication box and have it process them. So just leave it enabled and you'll have at least slightly better performance.

OK, so for the underlying storage of the backup repository, you have to weigh the usual considerations, and you can't have everything: do you want fast storage, do you want it cheap, do you want it reliable? All of these are trade-off decisions you have to make, also on the business side. Sometimes you can try to think a little bit out of the box. Often I see people immediately say, "we need 10k or 15k RPM storage for this repository, because we need fast restores later on," and that's good planning, but just try taking a look at some of the online RAID sizing tools to get some performance numbers. Instead of taking a stack of 10k drives and running RAID 50 or RAID 6 on them, it might be that if you bought nearline SAS 6 or 8 TB drives, which are so much cheaper and give you so much more capacity, you could probably get away with running RAID 10 on them, and you have just eliminated the high write penalty you get from the parity calculation on RAID 5, or worst case RAID 6.

More importantly, don't skimp on the RAID controller in these machines. Sometimes we see people just using the most basic default RAID controller they can get from the server vendor, and that's not necessarily the best idea. Put in those few hundred dollars and make sure you get a good, proper RAID controller with a good queue depth, and secondly, make sure you add some write-back cache. Depending on the controller, you can sometimes also add a second-level cache, which could even be an SSD that you connect. This helps both on the intake of data and with pushing it out again really fast, especially when you restore from the most recent data. [Question from the audience about installations where a very high-end HPE controller with flash-backed, battery-backed cache running RAID 6 was faster than a low-end controller running RAID 10.] I think it all comes down to what working budget you have. If you take those high-end components and put RAID 10 underneath, it will probably be faster still; the reads should be just as fine on RAID 6, it is of course the writes that we usually have problems with on RAID 5 and 6. But I think the most important thing is always to make active decisions. We don't know your specific environment; that's something you need to figure out. The important thing is understanding, if I make this decision or another decision, what impact that will have, and if you don't understand that, it could come back to you a little bit later on.
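The RAID trade-off above follows from the classic write-penalty rule of thumb, which is easy to sketch. The per-disk IOPS figure is an assumption for 7.2k nearline drives:

```python
# Back-end disk operations consumed by one front-end random write.
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

def random_write_iops(disks: int, iops_per_disk: int, level: str) -> float:
    """Random-write IOPS an array can sustain (100% write workload)."""
    return disks * iops_per_disk / WRITE_PENALTY[level]

# Eight nearline drives at ~75 IOPS each:
print(random_write_iops(8, 75, "raid10"))  # 300.0
print(random_write_iops(8, 75, "raid6"))   # 100.0
```

Three times the usable write IOPS from the same spindles is exactly why cheap high-capacity drives in RAID 10 can beat faster drives in a parity RAID level for backup workloads.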
This next one arrived in version 9.5, and despite the name, proxy affinity, this setting is actually configured on the repository. It is quite easy: just right-click the repository and select which proxies are allowed to write to that particular repository. This is quite useful in dispersed environments where you protect a lot of remote and branch offices and you want to make sure the traffic is kept within each office. A side effect of proxy affinity: you probably know that before starting a job, Veeam Backup & Replication enumerates all the components that are configured and requests information from each and every individual proxy, things like how well it is connected to the virtual infrastructure, in which mode it can read data from the virtual environment, and which repositories it has access to. This takes some time, especially in a big environment, when the backup server tries to request all this information from every configured component. Once proxy affinity is configured, the product already knows which proxy is connected to which repository and basically doesn't waste time calculating this. But keep in mind, proxy affinity is not a strict rule: we will break it in case something goes wrong, for example if an affinity proxy is unable to write to the repository, we will still find you another proxy that is configured to write to it. So it is probably better to treat it as a priority list: we will try to do as you said, but again, because our priority is to deliver your backups, we will break this rule in case something goes wrong. The setting is set up on the repository, and you can of course have multiple proxies writing to the same repository, and the same proxy writing to multiple repositories.

Sometimes it's good to have a hands-on example. One thing is having these remote office locations where you can do some optimizations, but what if I just have one site?
Would it still make sense to use the setting with a single site? The answer will often be: absolutely. Sometimes we see customers with, let's say, four physical machines, each acting both as a backup proxy and as a backup repository. Now, imagine that your production storage system runs on 16-gigabit Fibre Channel: each of these backup proxies has one or more 16-gigabit connections down to the production storage, so it is pulling data really fast. But the job being processed by one proxy is actually set up to store its backups on the second server, not the same one that is processing it right now, and in some cases the network between them is much slower, worst case as low as one gigabit, because it's just a management network between the hosts. So we are pulling data up at 16 gigabit and then sending it across the network to the second host using only one gigabit, causing a massive performance bottleneck right there. Proxy affinity lets you pin it down on each of those four physical servers: for repository one I want to use only proxy number one, for repository two only proxy number two, and so on.

This one is very famous recently, I think we talk about it a lot: utilizing ReFS on the backup repository. ReFS is of course a Microsoft file system, not our invention, but we make good use of it, specifically of its block cloning technology. It is important to understand that ReFS did not first appear in Windows Server 2016, but the block cloning capability is in the ReFS 3.x version that ships with Server 2016 only, so this functionality only works if you use Server 2016 as your repository. And ReFS brings a lot. We talked previously about all the transformation operations, and we have plenty of them: synthetic full operations, where blocks have to be read and written on the same disks; reverse incremental, where the transform costs three times the I/O; forever forward incremental, where the merge happens; and, at the end, the backup copy job, which is also where this is very useful, because in a backup copy job you have the grandfather-father-son (GFS) policy you can enable. That is the only job where you can basically enable archiving for your backups, and GFS also uses synthetic operations by default, so we do not pull the data from production unless you tick the box. With ReFS, all of these operations become much, much faster.

I will show you the example of what is happening when you use ReFS. First, the traditional picture of how we construct a synthetic full on a regular repository: we read, block by block, all the latest blocks that we have, and we write a new full file, which will definitely fill up your storage, because it is a normal full backup that resides on your regular repository. Here is what happens in the case of ReFS: this is a screenshot from our lab, and you can see that it takes 40 seconds to create a synthetic full. That is because we don't really need the blocks physically; instead, ReFS allows us to create references to the existing blocks, and this does not fill up the storage either, because the used space stays at the same level. Something important to note: in case you need to migrate your existing backups off such a ReFS repository, moving them to another one for whatever reason will result in rehydration, so you will not be able to preserve these space savings when you move the backups from a ReFS repository to a regular one.
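Here is a toy model of why a block-cloned synthetic full is both fast and space-free. This only mimics the idea of block cloning; it is not how you would call it (on Windows the real mechanism is, to my understanding, the `FSCTL_DUPLICATE_EXTENTS_TO_FILE` file system control code):

```python
class ToyRefsVolume:
    """Toy model: files are lists of references into a shared block store,
    mimicking ReFS block cloning (not the real API)."""

    def __init__(self):
        self.block_store = []   # "physical" blocks, each stored once
        self.files = {}         # filename -> list of block indices

    def write(self, name, blocks):
        """A normal write: data physically lands in the store."""
        refs = []
        for block in blocks:
            self.block_store.append(block)
            refs.append(len(self.block_store) - 1)
        self.files[name] = refs

    def clone_full(self, new_name, sources):
        """A 'synthetic full' via block cloning: only references are
        created; no data is copied and no extra space is consumed."""
        self.files[new_name] = [r for src in sources for r in self.files[src]]

vol = ToyRefsVolume()
vol.write("full.vbk", ["A", "B", "C"])
vol.write("inc1.vib", ["D"])
vol.clone_full("synthetic-full.vbk", ["full.vbk", "inc1.vib"])
print(len(vol.block_store))  # still 4: the new full consumed no space
```

It also shows the rehydration caveat: copying `synthetic-full.vbk` to another volume would have to materialize all four blocks, because the references only mean something inside this volume.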
So how do we then unlock the capabilities of ReFS? Because you might already be thinking: wow, this sounds fantastic, but I'm running backup repositories on something different than Windows Server 2016 — or maybe I'm already on 2016 but I set up an NTFS backup repository, for example. Well, first of all, the repository needs to be on 2016 with a ReFS volume. You can easily migrate the existing backup files you have over to the new backup repository and change the job — just make sure to do a rescan in between, so that we understand the files are there now — but we still won't utilize the ReFS block cloning API, because it's an existing chain. So how do we fix that? You basically have two options. You could right-click on the job and run an active full; just be aware that you need enough space to keep that second full on disk for a while, until the retention policy kicks out the old original backup chain. The other option is to edit the existing job and enable compacting. Compacting will simply create a new, empty VBK file, copy all of the blocks into the new VBK file — removing, you could say, the fragmentation of the VBK file — and as soon as it's finished, it deletes the existing VBK and the new one takes its place. Either way, you'll then be able to enjoy the benefits of ReFS.

Now, about some of the new features we've added over time. We understand that a lot of customers have been very loyal over the years — maybe some of you started around version 3; I don't know how many of you did, I know I didn't — and have been upgrading over time, and that's amazing, of course. But what happens is that if we introduce a new feature and you make an upgrade, that new feature is not going to be enabled by default, because we want to make sure that upgrades are non-intrusive for you. So you might get some performance gains just because the back-end engine is more optimized in newer versions, but it won't suddenly turn on any new features, because that could have a negative impact on your environment. So pay attention to the release notes when new versions come out, because often there are some small check boxes somewhere — new features and minor improvements — that you might have to go and actually enable, especially for existing jobs. If you set up new jobs, you would typically get most of the new settings. One of those settings that came some time ago is per-VM backup files. This is not a backup job setting, it's a repository setting. Let's assume you have a backup repository and you create a new backup job with ten virtual machines in it. By default, once the job finishes and you look at the repository, there will be one single VBK file that contains all ten VMs inside — that's how it's been for years and years. If we had enabled this setting on the repository before running the job, we'd instead see ten VBK files, one per virtual machine being backed up. Now, why is this important? Well, the only caveat is that you lose some deduplication, because deduplication happens inside the same chain, and now you have one chain per file. But on the positive side, for performance: previously you had one write stream going down one connection to the storage, and you might have really fast storage for your backup repository, but one write stream might often not be enough to maximize the full performance potential of that storage. Simply by enabling this one setting, you now have multiple write streams.
Now, would it create ten threads all at once? Maybe. This is where we always have to play the consultant's card and say: it depends. So what does it depend on? There are two settings that influence how many connections you have concurrently: one is how many concurrent tasks you allow on the backup proxy, the other is how many tasks you allow on the backup repository. In the end, the limitation will mainly be how many you have on the repository, but obviously, if the number is lower on the proxy, then that's going to be the limiting factor. Just like with ReFS before, if you want to make this change on your repository, this is where the setting lives: you edit the repository, go to advanced settings, and tick the box right there. But even after you've set it up and clicked OK, if you want to utilize it for your existing jobs, you still need to either run an active full or enable compacting; as soon as that compacting has happened one time, you could just remove that setting again from the backup job if you don't want to keep it indefinitely. Yes, a question — can you have different settings on different repositories? You can: you can easily have this not enabled on the primary one and enabled on the secondary one, or the reverse; it doesn't make any difference, although for performance reasons it would probably be a good idea to have it enabled on both. That's also because backup copy jobs don't care how the source backup files are laid out: even if you have one big backup file, the backup copy job still reads the individual virtual machines out of that backup file to copy them over, so it doesn't matter. The second setting we want to highlight — and hopefully most of you already have parallel processing enabled, but it's one of those things where we still see customers from time to time who have been running Backup & Replication for a few versions and just didn't realize they could go ahead and enable it.
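The "it depends" answer above can be put in one line. A minimal sketch — the function name and this simplified model are mine, not Veeam's scheduler, which is more involved — but the effective number of concurrent write streams is bounded by both task limits:

```python
# Illustrative model: per-VM backup files give one stream per VM, capped by
# both the proxy task limit and the repository task limit.

def concurrent_streams(vms_in_job, proxy_tasks, repo_tasks, per_vm_files=True):
    if not per_vm_files:
        return 1                               # one VBK, one write stream
    return min(vms_in_job, proxy_tasks, repo_tasks)

print(concurrent_streams(10, proxy_tasks=8, repo_tasks=4))  # 4: repo is the limit
print(concurrent_streams(10, proxy_tasks=2, repo_tasks=8))  # 2: proxy is the limit
print(concurrent_streams(10, 8, 8, per_vm_files=False))     # 1: single-file chain
```

The point of the sketch: raising the repository limit does nothing if the proxy limit is lower, and vice versa.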
If you don't have parallel processing enabled, what happens is that one backup job will back up one virtual disk at a time. So if you have 100 VMs and each of them has one VMDK, we'll go to VM number one, drive number one; once it's finished, VM number two, drive number one, and so on. You could of course have multiple backup jobs running in parallel, but it's not very scalable, and you probably want to go ahead and enable this. Once you do, we'll basically just see: OK, I'm running this job now — how many proxy tasks do I have available, how many tasks do I have available on the repository — and we'll take it to the maximum based on the resources you have available. And it's just one tick box. So where do you find this setting? You have to go to the main menu — the one in the top-left corner of the user interface — go into Options, and it's right there. It's a global setting; there is no way to enable or disable parallel processing per job.

Once you enable this — perhaps especially if you had it disabled before — you might see a negative impact on your production storage, especially if you have really big physical servers, because today it might not be unusual to have one or more physical proxies with, say, two sockets with ten or even more cores each. You'll suddenly have this really big number of connections down to your production storage pulling backup data out. Now, if you have backups running during production hours, you might suddenly start getting calls from somebody saying: hey, the applications are behaving really, really slowly — because the latency on the production storage is increasing. Obviously we don't want you getting angry customers on the phone, and I'm sure inside an internal business you have the same feeling. What you can do — and this setting is not enabled by default, so you have to go and enable it, in the same place as parallel processing, sitting right next to it — is enable storage latency control. We will then monitor the latency on your production storage for you, and you see the two default values here: 20 and 30 milliseconds; you can adjust these as you need. Once we see that we're not holding that first threshold — in the example, the 20 milliseconds — we say: OK, maybe right now I'm processing ten virtual disks at the same time, but once one of them finishes, we check: is it still above the threshold? If yes, we'll only be processing nine, then eight, then seven, until we see that number going down a little bit; then we can start increasing the number of concurrent tasks again. If, even as we stop adding new tasks, the latency keeps going up, we might hit that second threshold, which is basically throttling: we will actively slow down the performance of the backups until we see the latency go back below the threshold you set. Now, these two settings are global, so they'll be applied to all of your datastores, or all of your volumes in the case of Hyper-V. But there's a small Configure button here: you can go in and say, well, maybe the numbers I put in globally are good for the majority of my datastores or volumes, but I have a few applications that are very latency-sensitive. If that's the case, click here, add them in, set up different values for those one or more datastores, and save it; we'll then make sure to abide by those special values when we're processing virtual machines residing on those datastores. And just because it comes from our pre-sales team, I need to mention here: the entire Backup I/O Control feature is only available in two editions, Enterprise and Enterprise Plus, and the latter setting — the advanced per-datastore latency control — will only be available in Enterprise Plus.
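The two-threshold behavior described above can be sketched as one scheduling step. This is an assumed, simplified model of what the session describes, not Veeam's actual algorithm: below the first threshold we may add tasks, between the thresholds we stop assigning new ones, above the second we actively throttle.

```python
# Simplified model of storage latency control with the two default thresholds.

STOP_ASSIGNING_MS = 20   # default "stop assigning new tasks" threshold
THROTTLE_MS = 30         # default "actively throttle" threshold

def schedule_step(latency_ms, active_tasks, max_tasks):
    """Return (new task count, throttled?) for one scheduling decision."""
    if latency_ms >= THROTTLE_MS:
        return max(active_tasks - 1, 1), True       # slow everything down
    if latency_ms >= STOP_ASSIGNING_MS:
        return active_tasks, False                  # hold; finished tasks drain
    return min(active_tasks + 1, max_tasks), False  # healthy: ramp back up

print(schedule_step(12, active_tasks=5, max_tasks=10))  # (6, False)
print(schedule_step(25, active_tasks=5, max_tasks=10))  # (5, False)
print(schedule_step(35, active_tasks=5, max_tasks=10))  # (4, True)
```

Per-datastore overrides would simply mean looking up `STOP_ASSIGNING_MS` and `THROTTLE_MS` per datastore instead of using the globals.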
Just so you know, some Hyper-V 2016 considerations. Microsoft released Hyper-V 2016, and we're very happy about it, because it also delivers native changed block tracking. Maybe not all of you know this, but Veeam just worked — it just worked with 2012 R2 and with 2008 R2 — even though Microsoft did not have changed block tracking back then. What we did was put a lot of effort into creating what we call the file system filter driver. This filter driver was installed on each and every Hyper-V host: when you present your Hyper-V hosts to your backup infrastructure, to your Backup & Replication server, we automatically install the filter driver so we can understand, by ourselves, which blocks have changed since the last backup run — so that next time we know exactly what to take and what to process. Now with 2016, this Veeam file system filter driver is gone, and we have what is called Resilient Change Tracking (RCT), native in Hyper-V. The downside is that you need to be careful with this: on 2016, when we detect that this is the operating system running on your hypervisor, we will not install any file system filter driver. What can happen is that we switch to the totally old-school method: we go and basically read the entire virtual disk, then go to the backup file, compare what we have on production with what we have in the backup file, identify the blocks that have changed, and only then take them. That can take even more time than, for example, a full backup would. That's why it's important to know that you need to upgrade — version 9.5 is what enables this native change tracking — otherwise we just go really old-school and you will spend a lot of time processing incremental cycles that you expected to be really fast. This is something to watch out for with Hyper-V 2016 if you want to save time on incremental processing with native RCT.

For those of you running Hyper-V in production, or working as consultants with customers running Hyper-V, this is really important to understand, because we can still use the old filter driver in mixed environments. Imagine you have a customer — or your own environment — running 2012 R2, and you're planning to upgrade to 2016. That's fine, and you might even be going through a rolling upgrade process where you have a cluster containing both 2012 R2 and 2016 hosts at the same time. We can still use the filter driver on the hosts that haven't been upgraded yet, but for the ones that have been upgraded in the same cluster, we would not be able to use it anymore; then we go to the legacy, old-school filtering mechanism. You'll still get an incremental file — that's not a problem — it will just be significantly slower. So you want to make sure you get this upgrade process finished; don't prolong it for months and months. Once you have everything cleanly on 2016, start getting the cluster functional level upgraded, start getting the virtual machine configuration versions upgraded, and then you can utilize the full benefits of Resilient Change Tracking and gain a lot of performance right there. We talked a little earlier about using network mode on VMware as your transport mode, and I mentioned that we had this — well, not Veeam per se, but VMware imposes this — roughly 40% limitation. If you're in a multi-tenant environment, it might be that you really want to keep using network mode instead of the other transport modes, or for whatever reason you just really want to use network mode. If that's the case, there are basically a few settings you can use, two of them on the VMware side.
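Before moving on, to put rough numbers on the change-tracking point above: the figures here are purely illustrative (the 2% daily change rate is my assumption, not from the session), but they show why the legacy fallback hurts so much.

```python
# With CBT/RCT we read only the changed blocks; with the legacy fallback we
# must read the whole virtual disk to find out what changed.

def blocks_read(disk_blocks, changed_blocks, have_cbt):
    if have_cbt:
        return changed_blocks      # read just what changed
    return disk_blocks             # scan the entire disk, then compare

disk = 1_000_000     # e.g. a 1 TB disk in 1 MB blocks (illustrative)
changed = 20_000     # assumed ~2% daily change rate

print(blocks_read(disk, changed, have_cbt=True))    # 20000
print(blocks_read(disk, changed, have_cbt=False))   # 1000000 — 50x the read I/O
```

That 50x factor is why an "incremental" without change tracking can take longer than an active full: the full at least only reads the disk once without also comparing against the backup file.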
On the ESXi host you enable the shell, or troubleshooting mode, to get to a command prompt, and then you can make these changes; there's a VMware knowledge base article that describes in a little more detail how to do it. What it tells you is to tweak the NFC buffer size and the cache flush interval. The second item is on the Veeam backup side: we have a registry setting to increase how many connections we create from the backup proxies down to the ESXi management interface, and just having a few extra connections will let us obtain a little bit better performance. We've put that registry key on the bottom of the slide for you; the default setting is 7. How high should you set it? Well, I would recommend: don't just pump it way up from the beginning. Increase it a little bit at a time, gather some good data, so that you know what kind of performance gains you're getting — because, as always, the problem is that you can overdo this. At some point you go beyond the tipping point, and now it's actually getting slower than it was with a smaller number. So pay a little bit of attention here.

The second item we have on the slide: if you're using virtual backup proxies, you would often be using the virtual appliance transport mode, also known as hot-add. Just keep in mind — especially if you're using VMXNET3, which would probably be recommended by both VMware and Veeam — these backup proxies would probably have a lot of CPU cores, maybe 6 or as many as 8. We typically don't recommend making those proxies larger than 6 to 8; you'd get more benefit from creating a second proxy and assigning those additional CPUs to the second one. But once you have these big virtual backup proxies, there's a lot of processing happening on the CPU, naturally, but also on the virtual NIC sending data back to the backup repository. And if you don't have RSS — Receive Side Scaling — enabled inside the operating system of this virtual proxy, then all of this network traffic will only be processed by the very first CPU core, and this could become the bottleneck. It should often be enabled by default, but there are various reasons why it's not: it could be you've done Windows upgrades, or maybe you have some kind of policy setting, whatever. Just make sure you have it set up — you can easily go to the command prompt and run some commands to check that it's actually enabled. Once it is, the load from the virtual NIC will get spread out across all of the CPU cores you have available.

Now let's talk about backup copy jobs. First of all, with a backup copy job you can of course create just one giant job, put all your virtual machines inside, and just transfer it to the off-site repository or to the cloud repository of your service provider. But maybe it's a better idea to chop it up into smaller chunks. First of all, this gives you an idea of how much time you will spend, depending on the bandwidth you have and the data you would like to transfer — each smaller job will definitely finish at some point in time. But splitting it into multiple jobs also puts you on the safer side, because, as you know, with direct data transfer we do not have resume on disconnect: say you have 20, 30, 50 virtual machines in this backup copy job transferring to the off-site repository, and the connection drops — we will not be able to resume from the point of disconnect. It would be a real pity if this happens on your 49th virtual machine, because we would need to go and transfer everything from the very beginning to the off-site location. So it's a good idea to chop it up into smaller chunks, just to minimize the risk.
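The chop-it-up argument can be put in rough numbers. This is an illustrative worst-case model of my own (direct transfer, no WAN accelerators, no resume): if the link drops mid-job, everything already sent in that job has to be re-sent from the beginning.

```python
# Worst-case data lost to a single disconnect: one whole job's worth,
# assuming VMs are spread evenly across the jobs.

def worst_case_resend_gb(total_gb, jobs):
    return total_gb / jobs

total = 500  # GB to copy off-site (illustrative)
print(worst_case_resend_gb(total, jobs=1))    # 500.0 — one giant job
print(worst_case_resend_gb(total, jobs=10))   # 50.0  — ten smaller jobs
```

Same total data, same bandwidth, but a flaky link now costs you at most one small job's worth of retransfer instead of the whole thing.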
That way you don't put all your eggs in one basket, and if something goes wrong over an unstable WAN link, you don't have to transfer everything all over again. The exception — again, I'm talking here about backup copy jobs with direct data transfer — is when you utilize WAN accelerators. As you know, we have them built in, and the WAN accelerators are actually able to resume the connection, so this is not an issue with WAN accelerators. Also, about the compression level: the compression level on your backup copy job can be different from the one on your regular backup job going to your local repository, and you can actually change it here. Depending on the CPU load you have on your proxy — if it's not at the maximum and you can afford, let's say, 10% additional CPU utilization — maybe it's wise to enable a higher level of compression and switch to high or extreme. That way we compress the data more, and you send less data to the target over the slow link.

This next one is based on a true story — I got it from a support chat with one of our support guys. A customer of ours started to complain — if I'm not mistaken, on the Spiceworks forums — about his service provider. He was copying his backups to a cloud repository, and the service provider started to demand twice the money for twice the capacity he had calculated beforehand. He thought he was on the safe side — he knew how much data he had — and he was really mad. People suggested he just go and check the GFS scheduling on his backup copy job, and from what I see day to day, people very often do not check their existing settings. This is the screenshot of the settings the guy had: he had enabled, for archival purposes, weekly, monthly and quarterly backups — and of course those were real full backup files that he actually had on his repository, so the service provider was correct in charging him more, simply because he had archival enabled on the backup copy job. To avoid a situation like this, please check the GFS settings on your backup copy job — they're easy to miss in the wizard, so go into the settings and check whether you have simple retention, or simple retention plus GFS. For your convenience, we've also added here the Restore Point Simulator. It covers simple retention — when you want to store a certain number of restore points — and also the grandfather-father-son schedules. It will calculate for you not only the capacity you need for your backups, but also the space required for the transformation operations and the creation of intermediate full backups. The link is right here; you can use it, and it gives you a very clear picture of how much space you will need. This is especially important when you rent space — when you use a cloud repository and the service provider charges you based on capacity.

We hope you really enjoyed the session today. Before you go, we have two really important tasks for you: the first is to open the mobile app right now and make sure you go and click five stars; the second is to please fill out the survey — whatever thoughts you have, we would love to hear them, and we appreciate all of the feedback that will help us improve both this and future presentations. I also just want to thank Irina for agreeing to do the presentation with me. Thank you, Rasmus — and thank you all for joining.
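As a footnote to the retention story above, here is a rough sketch of the capacity math — illustrative only, and much simpler than the actual Restore Point Simulator — showing why enabling GFS on a backup copy job multiplies what you consume at the service provider: every retained GFS point is a real full backup file.

```python
# Rough capacity estimate: one current full, plus incrementals for the simple
# retention chain, plus one extra full per retained GFS point. All numbers
# below are made up for illustration.

def copy_job_capacity_gb(full_gb, inc_gb, simple_points,
                         weekly=0, monthly=0, quarterly=0, yearly=0):
    gfs_fulls = weekly + monthly + quarterly + yearly
    return full_gb + inc_gb * (simple_points - 1) + full_gb * gfs_fulls

# What the customer expected: simple retention only.
print(copy_job_capacity_gb(1000, 50, simple_points=14))   # 1650 GB

# What he had actually enabled: 4 weekly + 12 monthly + 4 quarterly fulls.
print(copy_job_capacity_gb(1000, 50, 14,
                           weekly=4, monthly=12, quarterly=4))  # 21650 GB
```

With a 1 TB full, the GFS check boxes turn a ~1.6 TB footprint into over 21 TB — which is roughly the kind of surprise bill the story describes.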
Info
Channel: Veeam
Views: 10,750
Keywords: Veeam, Veeam Software, Virtualization, Backup, Disaster Recovery, Availability, Recovery, Replication, performance optimization, data protection, data recovery, Veeam Backup & Replication, best practices, performance tweaks, tech, technology, VeeamON, VeeamON 2017, Cloud, cloud computing, Veeam Availability Suite, data repository, data store, backup repository
Id: yRvb32706Cc
Length: 57min 24sec (3444 seconds)
Published: Tue Jun 27 2017