DBA Fundamentals: Backups 2: Restores

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
so now let's talk about restores which is really the whole reason that we do backups there are three common kinds of restores first off one-offs when someone comes running into your cubicle or calls you up on slack and so oh my god i dropped the customer's table or i updated too many rows this is by far the most common reason that we do restores is for emergencies that's what i'm going to focus on inside this module the other two things that we do a lot are testing your backups like restoring just to make sure that the backup worked and then recurring scheduled restores like when you constantly have to refresh a development server number two and number three are important but i don't cover them inside this class there's only so much that i can teach you inside the span of a day so in here i'm gonna focus on number one how to do a one-off restore the mechanics of it like which gui you use how you issue commands how you write them out in order it's going to depend dramatically on how you take the backups for example if you use vss snapshots if you're backing up your sql server with the same tools that you back up your file servers your domain controllers etc vss snapshots vss backup software vmap assure etc they make this really easy because they've got a gui and they just list out here are all the times when we took backups which one of these would you like to restore to you choose the one you want and you're out of there sql server native backups are more complex they take more time yes management studio has a gui built in to help you with that but that's still only really kind of the starting point they're probably going to be other things that you want to do with that and that gui only works for just one database at a time the big picture of what you do with native restores is that you decide what's the date and time that i want to aim for because it's up to you you can pick any date and time that you want then you restore the most recent full backup before that the most recent differential backup if you're using differentials and then all of the transaction logs up to that exact moment in time now i have one slide on here where one slide is like okay here you go if you're gonna do vss backups do this if you're gonna do full backups do that and you're like well brent i want you to show me all of those i don't bother doing that because here's the deal you're gonna want to go to the documentation every time that you do this every time you do this doesn't matter whether you did it before three weeks ago or three years ago features and backup products change even in sql server something that you would think would be fairly static by now sql server the way that we do backups and restores changes over versions uh great examples of transparent data encryption and how the compression has changed just over the last couple of years when you have to do it you have to get it right you know your users are standing outside your office or banging you on slack going hey and they're just you know my biological clock is ticking like this so you want to make sure that you get it right you want to go read the documentation every time i read the documentation every time because i don't have to do restores that very often full restores aren't that big of a deal but when i want to get to a point in time like during an emergency i still hit the documentation every time so i'm not going to talk about the mechanics inside this module which commands that you issue in which order what i'm going to talk about is if you step way back in the big picture what are things that you have to worry about during each of these scenarios because in each of these scenarios there are some pitfalls that will kind of backfire and catch you by surprise let's start with restoring just a few rows a table or an object when i say an object i mean things like a stored procedure a view a function this for me is the most common reason that people you know knock my door down and they're like oh my god we have a problem right now i just ran an update with a where club without a where clause or i forgot to highlight the where clause and they only want to get certain rows back or they want to get a specific table back i find it really strange that even as i record this in the year 2020 sql server still can't do this it's one of the top 10 most highly requested features if you go to feedback.azure.com which is where we track all that we where microsoft tracks all of the requests for sql server i wrote a request to do this years ago and it's up in the top 10 in terms of requests and sql server still can't do it third-party tools can back a decade ago when i worked for quest software this is one of the big selling points of third-party backups quest lightspeed idea of sql save all these tools i don't really care which one you use they're all good enough if you need to restore individual objects frequently these are fantastic tools but if you don't have those what you're going to have to do is you're going to have to restore an entire database and then pull out the specific objects that you want never ever ever ever ever restore over the top of an existing database because let's role play for a second i'm going to tell you exactly how this goes wrong so i'm going to be a user i'm going to be all oh hey uh uh dba could i i made an ax oh a terrible mistake i accidentally dropped a table could you could you do me a favor and restore the customers table i don't care which date and time you give me as long as it's the most recent one you can possibly give me okay thanks oh no you can wipe away everything else in the database i promise the customers table is the most important thing and you can see what's going to happen you know what's going to happen is if i try to restore over the exact same database they're going to be like oh we're missing the last few rows from the sales table there are a few sales records that weren't in there so never ever ever restore over an existing database instead restore over to a new database name with new file names you want to make sure that you don't run any risk whatsoever of trying to overwrite the production database because i've seen people try to restore to a new database name and then when they run into errors they start doing all kinds of forces and the next thing you know they trash the production database the people who asked for the restore usually know how to pull individual objects out so that person who said hey you know senor dba can you do this for me what you do is just say okay i restored the database to a new name sales underscore from yesterday go get the stuff out that you need you have one hour after one hour i'm going to go delete that restored copy and you always want to give them kind of a time deadline because otherwise next thing you know that database is sitting around for years and we're backing it up and doing check db on it every single day next up in our gotcha is restoring just one database if someone says oh you know what we totally screwed the pooch on on the development copy of the sales database it doesn't even matter if it's in development production disaster recovery if someone asks you to restore a database still never restore it over the existing database restore it under a new database name and ask them hey i just restored that sales database and it's on there a sales underscore from yesterday can you go check it out make sure it has what you want because sometimes it doesn't like sometimes they don't realize they deleted the stuff so like six months ago or the stuff was never there and they made a mistake after all this isn't their first mistake today they've already made one that's why they're coming to you hey can you just go check it sales underscore from yesterday and make sure that it has the data that you want when you're okay with it what i'm going to do is i'm going to rename the database that ordinarily you would have had me to restore over i'm going to rename that as like db underscore to be deleted and then i'm going to restore or rename the one that i restored with the right name this way they still have both copies online for a brief period of time so that they can work out which pieces they need to pull like some kind of frankenstein's monster that they're going to reassemble back into one piece now if that database that they're asking you to restore has any kind of high availability or disaster recovery if it has replication log shipping uh database mirroring always on availability groups any of those features when you restore a database you break that in some way and the way that you break each of those features is different failover cluster you don't necessarily break but the problem is if it's in a failover cluster it probably has some of these other pieces in here as well if you go to restore a database that's in any of those it gets more complicated because you have to figure out what you broke and you have to reset it from scratch and i'll give you an example because it's kind of popular always on availability groups you can't restore a database that's in an always-on availability group they can't be in the status of restoring unless you're like seeding a replica so what you have to do is you have to take the database out of the availability group blow away the original restore over on top of it add it back into the availability group and then seed all of the replicas that's why all that stuff tends to be out of the scope in terms of a training class i'll tell you it's not in any of my other training classes either it's kind of a big just red warning flag that if you have any of these you're going to be hitting the documentation in order to figure out what steps you need to do next and that brings up an interesting kind of sidebar when people think about making their databases more highly available when they think about putting in stuff like database mirroring always on availability groups replication the more available you want your database to be the more you try to turn up the five nines knob the more expensive it gets the harder it is to do this stuff there's no set it and forget it option inside a sql server for this kind of thing this is why one of the reasons why companies want to go towards the cloud as you go towards the cloud some of these things are taken out of your hands you don't do availability groups database mirroring log shipping in order to protect the sql server microsoft google amazon whoever your cloud vendor is manages this stuff for you on the flip side they also manage the restores for you and the speed with which you can do restores tends to be one of the things that keeps companies on premises there is no sla for house service level agreement for how long it's going to take microsoft google amazon whoever to restore your databases so in a truly mission-critical environment my clients who are like it's got to be up all the time it can't go down and when we have an oops query we have to be able to restore it as quickly as possible those are the kinds of clients who still tend to stay on premises so that they can control the restore processes themselves just as leaving that out there because i know sometimes people are like my skills are terrible they're useless i'm not going to be available in the cloud anymore this is if you're good at this kind of process this is one of the reasons why you do still have a job for the long foreseeable future so next up is restoring multiple databases if they tell you hey this application involves these four databases can you restore all four of them together oh does that sound easy it sounds really simple just like if you're used to restoring files go restore these same four files now if you're doing vss backups like if you're backing up vms with the snapshots it works fine because it's snapping all of the databases at almost the exact same point in time and you're going to be like what do you mean brent it is the same point in time there's only one snapshot ah there's a catch hold that thought with sql server backups they're done one at a time in serially in order and they're not all caught at that exact same time in order to demonstrate it what i'm going to do is i'm going to take the three databases on my desktop lab i run the stack overflow databases from 2010 2013 and current i have all three of these databases at the same time are on my server at the same time so i'm going to back up all three of them i'm using backup to null just because that goes extremely fast it reads the data but then it doesn't actually write anywhere so i'm measuring the best case scenario for backups just streaming the data out as quickly as i can i'm going to execute all three of these and then i'm going to look at how long they take to run see the red arrows down in the results section it tells you when the first backup finishes when the second backup finished and when the third backup finished they don't all finish at the exact same time now here i'm doing full backups i'm not doing transaction log backups but the same premise applies the larger your databases are the more active they are the more stuff is in the transaction log there will be a delay as your jobs because what your jobs are doing is they're going through all of the databases in order doing all of the log backups they're doing all of the full backups there will be a time delay in between databases now in my case because i'm doing full backups we're talking about a 20 second to two minute range in between these and you might be thinking well brent maybe that's because your desktop is slow no go look at your own backup jobs and go see how long they're taking if it's taking you 30 minutes to do your backups and you have 10 databases you better believe that when you go do a restore they're all going to be at ever so slightly different points in time the problem with that is that user transactions can cross databases i can run a transaction here that says go add a customer into database 1 and then over in database 2 go file an order for them go insert an order for that exact same customer well depending on the order of my backups and the way that i do my restores if i restore both database 1 and database 2 i can have a point where there's a sale without a matching customer or opposite i can have a customer without a matching sale you're probably looking at this and going who would ever write code like this oh the stories that i could tell you some multi-database applications like microsoft biz talk actually plan for this microsoft biz talk is designed to put a bunch of stuff across different databases and so they use this thing called marked transactions they put in marked transactions and they say whenever you're doing restores you want to restore to a specific mark transaction time and at that marked transaction we can guarantee that all of the databases are at the same point in time that also means that you're going to lose data if you restore to that marked transaction because other transactions may have happened afterwards they simply won't be in the restore at the time you go to do the restore and microsoft documents and talks through this extremely specifically nobody probably reads the documentation most applications don't most applications just aren't even aware that this is a problem so when i go to talk to my end users with a multi-database application that's usually the first time that they've ever heard of this problem so they are in for a big surprise when i say hey look here's the deal when i restore there's a possibility that your transactions are going to have data in one database and not in another i just want you to be aware of that before you ask me for a restore so that we can start to plan for this together because this isn't my fault so you can either change your application to work like biztalk does and that's on them it's not on you as a database administrator you don't have the power to change their application to work with marked transactions like biztalk does generally i'll just hand them the biz talk documentation and say would you like to implement something like this this would be you implementing something like this they're like i can't read and i'm like okay yeah we're gonna cross that one off number two i can do my best effort i can just restore the most recent backups for all databases but i'm gonna tell them this database is as of 9 15 am this database is as of 9 20 a.m this database is as of 9 25 because we do our backups in order just so you know this is why some of your databases may have transactions in one and not the other and i can't give you the list of which ones are in there like which ones are missing data because sql server doesn't work that way or number three i can use the stop at parameter which says when i'm doing restores stop all the databases at this specific date and time in order to do that what i would have to do is go through make an inventory of all my databases and i'd have to find the point in time that's the oldest restore for this is the oldest available restore for each of my databases and then find the weak link if i've got say a list of 20 databases and one of them's oldest backup is 9 am and the rest are all at like 903 and 904 then i got to restore everybody to 9 a.m and just stop there it doesn't matter if i have additional log backups that go to later times for other databases i can only go as far as my weakest link went then i'm going to restore all the databases with stop at and use that parameter for the specific date and time that it was my weakest link there it's a whole lot of work and there's a microsoft has a write write-up on how to go do it and it really is a whole lot of work when you read through it this is also one of those examples of why you want to practice your restores ahead of time if you go to read that knowledge base article when you're in an emergency and you're trying to do a restore you're going to be like oh there's a there's a lot of words here and a lot of commands that i've never really seen before i i don't think i'm comfortable doing that i don't blame you i'm not comfortable doing it either this is why i talk about going to the documentation every time you do stuff like this but if you're going to do that if you go in armed with the knowledge that my database has or my application goes across multiple databases then this is why when we talk about if you want to turn the nines up if you want more availability it's going to take more work i'm going to have to prep and plan those restores ahead of time this helps to inform management when we talk about why i have to practice restore so often so that i'll be ready when this kind of thing strikes always strikes when you're not looking for it next up in our challenges list restoring system databases so system databases are master model msdb and i always say tempdb but here it doesn't matter because you never restore tempdb it's just master model and msdb you probably shouldn't if for some reason someone put a user object inside a system database that's a big sign that they may probably don't know what they were doing but it's probably an accident you don't want to restore these and in fact you can't restore these while sql server is up and running without jumping through all kinds of hoops so if someone's coming in to me and saying hey brandt i accidentally put a table inside a system database and i didn't know why it was wrong can you restore it for me i'll say sure what i'm going to do is i'm going to restore my backup of the system database to a different name you can remember restore databases to a different name if i take mat the master database if i need to restore the master database for some reason i'll just restore it as master underscore restored or master underscore for alex so that that way alex can then go get the stuff that she needs out of there and then i'll go drop the master underscore restored database and i'll teach alex hey don't put things inside system databases again because i can't restore those quite as easily bonus points if you go through and patrol your system databases looking for things that users created time for another side discussion sometimes if you go out there and you google you'll see advice or bang whatever i support alternative lifestyles if you google for advice on what you put in system databases you'll see a lot of old crusty dba saying things like never put stuff inside user databases thing crusty dbas always sound like that never put stuff inside system databases it's bad and this is what they're talking about the reason why it's bad is because it's harder to do restores that's also why you'll see those crusty old dbas go be comfortable putting their own utility stored procedures stuff like sp blitz sp blitz cache sb who is active they're comfortable putting those things inside system databases the reason why is they don't ever restore those objects they simply whenever they build a new server they go recreate those stored procedures from scratch it's also why with sp blitz our system health check kind of stored procedure with sp blitz i warn you when you have objects inside system databases and you'll see it fire sometimes it'll point out things like uh uh red gate sql backup you know third-party monitoring tools will put stuff inside system databases they do this because they're like we don't uh need a database to do anything yeah you don't because you're just dumping everything in a system database i see how it is it's okay if third-party apps do that because you would never go restore their stuff anyway we're really only worried about users putting things in there like seriously one time i saw a web application that had done a deployment and put their user and password list inside the master database so every time a user was created it got a new line inside there every time a user went to log in it got a new line inside the system databases and they're like yeah we did this accidentally right at the beginning of our application and we've never been able to correct it so of course then they're backing up the master database every five minutes if you do have to restore a system database oh let me jump back uh yeah no if you do have to restore a system database like if something went horribly awry inside any of these system databases you probably can't even trust the server i wouldn't do a restore because something truly went to hell in a hand basket great example i've had sql server updates likes uh cumulative updates and service packs go terribly horribly awry and absolutely hose an entire sql server and host some of the system databases and i'm like you know what we should just really you know throw a match behind us on the way out the door we should just cover this thing in gasoline and throw a match on the way out the door i don't trust anything about this thing i'm going to build a brand new sql server from scratch i should say too so jumping back one on here so there are knowledge base articles about how you restore master model msdb and how you go about resuscitating resource db you don't some of you this probably the first time you've ever heard of resource db it's a hidden database that contains some of your system objects so there are knowledge base articles from microsoft about how you go restoring these and it involves doing terrible ugly things like putting the sql server into single user mode i'm not a fan of that because by the time you're at that point the server's down no one else is able to use it i would rather have you going in building a more reliable sql server if you want to do these things you can i know some of those old crusty dbas are like every senior dba should be able to restore the master database i'm not a fan of that level of trivial pursuit if you think it's rare to do user database restores wait until it's time to do a system database restore that's really rare and you want to hit the documentation every time i always want to give people a rough uh idea of how many times i have to do something because i don't want you to think that i'm talking down to you like this is something that i do all the time i restored a master database once 15 years ago and i'm like oh no no no i am never doing that again i am lighting a match and walking away from that server every time i ever have to do that so now speaking of which whenever we want to throw a match behind us and walk out restoring a whole new c or restoring a whole sql server to the same place like when i say the same place i mean the same geographical place like the same data center the data center around us is fine but that one sql server caught fire and all hell broke loose if you're using vss backups it's easy because this is really what they're good at vss backups are taking a snapshot of the entire sql server they don't have to it's just that that's how they're usually configured as you take a snap of the entire sql server and then you go into their gui you just pick the exact date and time on a timeline that you want to use as your source and then boom within a matter of seconds or minutes you've got a whole brand new sql server exactly the way it was when at that moment in time but if you're using and that i should say too that that's also why i'm such a fan of vss backups i know a lot of database administrators don't like these because they take control away from sql server in this class here we're talking about fundamentals of database administration you're going to have role models out there in the community and in your peer group like local user groups who are going to say i don't like reasons backups they take the control away from me yeah it's because they're good and you suck so these things they're able to just restore the whole vm to a point in time that's what i want for someone who doesn't have a lot of time whereas if you're doing native backups like also to be transparent this is what i usually do because usually when customers bring me in they want mission critical 24 7 they want to be able to restore to an exact moment in time they want really high throughput some things that vss backups have a little bit of a challenge with in some cases just to be clear i do still use vss backups most of the time i end up doing this rather than restoring a whole sql server by restoring its full backups its differentials and its log backups and then restoring the whole sql server's os while restoring the os first what i'd rather do is just light the match throw it behind me and walk out i'd rather build a brand new sql server from scratch because let's be honest if the whole sql server was in such bad shape that you had to restore the whole thing there's probably a reason for that and i probably don't trust it plus to be honest i've probably been looking for a reason to build a brand new sql server anyway i probably inherited this thing from somebody who built it before me and they didn't have any clue of what the hell they were doing now this here is much harder work this is way harder than the previous slide if i go back to previous slides as a piece of cake and just oh point in time boom we're done with the restores this is way more work because what i have to do is i have to build that whole new sql server i have to patch it i have to restore the user databases if there is anything in the system databases like my own utility stored procedures i have to recreate them from scratch i also have to recreate things like logins permissions agent jobs schedules trace flags link servers the whole list of things that i have to go and recreate and if i don't have a copy of the original server i'm going to have a bad day because most of us don't remember every login every agent job every trace flag every permission we just don't have those written down so that's where you kind of have a little bit of a takeaway and you go you want to start running sp blitz today and building out the process that you would have to do to recreate this server from scratch you want to have an awareness of what's going on on the server and how it's configured sp blitz is part of the work that helps you get there the reason why i encourage folks to start thinking about that is that you're taking this training class today this is really just the starting line of your official dba career this isn't the finish line you're going to be doing restores for the rest of your life you're not happy about that but later on in your life you're also going to be building clustering always on availability groups log shipping replication database mirroring all those things that rely on the transaction log and you need to be able to reconfigure that from scratch whenever you do restores when you're doing this you don't want to wing it you don't want to next next next your way through the gui and then just try to remember from scratch what your sql servers have because this is the starting line in your career you're going to be working with more and more sql servers larger and larger sql servers with more and more complexity over time so that's why you want to start using an install checklist you want to start working off a piece of paper or obviously something in excel a powershell list whatever it is here are all the things that i need to do to bring a server back from the dead as i talk about all this stuff obviously doing this starts to sound a lot more attractive this is why when i talk about someone who's got one to two sql servers a hundred to 150 gigabytes of data on each sql server that's why i'm such a raving fan of this with this you don't have to know any of that stuff it's just that with this you also don't have an awareness of any of this stuff you don't know what the logins are you don't know what the agent jobs are you don't know what the trace flags are which is kind of a mixed bag if you just only have so many hours in the day maybe you don't want to learn what those things are so next up restoring a whole server to a different place instead of restoring it to the same data center let's think about restoring it to a different data center inside my first responder kit if you go and download my first responder kit from brennozar.com one of the things that i have inside there is a multi-page pdf and i like to print this out and give it to executives and it says for each of these situations each of these four situations high availability disaster recovery corruption and oops queries for each of these how much data are you willing to lose and how long are you willing to be down for i'm not going to cover the details of that inside this presentation i have a whole separate session just around rpo and rto recovery point objective and recovery time objective and we dig way deeper into that in the mastering classes but you pick out the numbers differently for each of the columns and one of the columns is disaster recovery we lose our network switches we lose our shared storage we have a fire a comet hits the data center we lose all our domain controllers in one data center when this hits when we're doing disaster recovery we're usually restoring our sql server to a different place like a different availability zone a different region a failover data center if you're doing vss backups this is really easy because what you can do is you can sync your vss backups from one place to another all those third-party tools va veeam zerto they're all able to sync their databases in near real time or sync their backups in near real time from one place to another so if you want to light up your production data or your production sql server over in a totally different data center it's relatively easy you go through the gui you check a couple boxes and off you go with sql server though that's different because with sql servers backups i know what i mean by if you're using sql servers native backups you can totally do the top one with the sql server and i highly recommend it it's fantastic but on the bottom side if you're using sql native backups to product your server you have to go build a brand new sql server from scratch you have to patch it all those things that we just talked about a couple of slides ago for restoring a totally cratered server in exactly the same place so now you can kind of start to see that oh brent i'm gonna have to be doing this multiple times under duress sometimes i have to do it to restore a sql server to the same place sometimes i have to do it when i do disaster recovery and when you're doing disaster recovery it's never just one server when you're doing disaster recovery it's lots of servers at the same time i know some people sometimes i'll talk to clients and their database administrators and i'm like hey have you ever tested restoring over to a different data center and they're like oh yeah we did it once it was about six months ago and i'm like okay so how long did it take you well it took us i think it took us maybe about an afternoon about three four hours and i'm like great now imagine all of your stuff needing to fail over to that other data center at the same time imagine we lose your primary availability zone imagine we lose your production san how quickly are you going to be able to get all of the servers lit up in a disaster recovery data center they're like oh that could take you know hours or days or weeks exactly that's where automation comes in this is why i'm a huge fan of if you're a production database administrator and you've decided that you want to do this for the rest of your life go learn product go learn powershell and use the dba tools framework dba tools is an open source powershell tool kit that gives you all kind of commandlets to do things like syncing agent jobs between servers syncing login jobs or logins between servers is originally started by a couple of community members who needed to move a sql server from one place to another and they're like oh sweet potato this sucks rocks and it does so they wanted to build powershell stuff in order to make your life easier for that i don't teach that here because in my weirdo consulting job i only work on one sql server at a time but if i had to go back and be a production database administrator for a data center full of stuff like half a dozen sql servers or more i would learn this in order to make my life easier especially during disaster recoveries where i need to fail over lots of sql servers quickly so to recap what we talked about inside this module when you're doing one-off restores hit the documentation every time because you're probably having to do something that you haven't had to do before maybe someone wants just a few rows this time maybe someone wants a whole database maybe someone wants groups of databases maybe they want to use the stop act command so i need to be aware of how this works every single time it runs the more often you do it you still need to hit the documentation because you're going to fall into the trap of thinking you know exactly what you're doing that's when things get bad and people start restoring over the exact same database that they actually needed to keep online for a while if you're going to take the journey of being a production database administrator full-time and you don't want to focus on performance tuning servers you want to focus on keeping them up and running which is what production database administration is then start preparing for server level rebuilds by relearning powershell and dba tools dot io
Info
Channel: Brent Ozar Unlimited
Views: 3,116
Rating: 5 out of 5
Keywords: sqlserver, brentozarunlimited, brent ozar, mssql
Id: c9tjmhNAaP8
Channel Id: undefined
Length: 38min 0sec (2280 seconds)
Published: Thu Mar 11 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.