PostgreSQL Backup & Point-In-Time Recovery

Video Statistics and Information

Captions
Hi, I'm Creston, and in this tutorial we're going to go over PostgreSQL backup and point-in-time recovery. We're going to take a look at a few slides just to explain some concepts, and then we're going to jump into the command line to actually execute backups as well as do restores using point-in-time recovery.

All right, let's talk about PostgreSQL backup and point-in-time recovery. Point-in-time recovery allows you to restore your database to a specific moment in time; it's also known by the acronym PITR. It requires backing up the live database files, and you also need to archive the write-ahead log (WAL). This is the log of all of the modifications that are happening to the database in terms of updates, inserts, and deletes, as well as creating and deleting objects. Now, a point-in-time recovery backup cannot be used to upgrade a PostgreSQL database. You can use a SQL dump to restore to a higher version, or you can use the pg_upgrade utility, but you can't use a point-in-time recovery backup to upgrade the database to the next version. You can only back up and restore a whole database cluster: multiple databases can be on a PostgreSQL cluster, and you can't back up individual databases or individual tables. If you need to do that, you can use a SQL dump, because point-in-time recovery covers the entire database cluster. And it can be done while the database is online.

So what are the steps for a point-in-time recovery in terms of doing the backup? First, for your PostgreSQL database, you need to enable archiving of the write-ahead log, and then you need to develop a process for retaining these WAL files, essentially managing them. Once you enable archiving of the write-ahead log, you're going to constantly be generating these log files, and you need a process in place for deleting them on a periodic basis, copying them to somewhere like S3 or another volume, and retaining them up to the point you want to do a restore. Once you have that in place, you can back up your live database files using pg_basebackup. There are other ways to do it; however, this is, I would say, the easiest way, using pg_basebackup. Then you retain both the database file backup as well as the WAL files so you can replay the database to a specific point in time.

All right, what are the restore steps for a point-in-time recovery? First you restore the cluster files to a new directory, so what was backed up with pg_basebackup you restore to a new directory, and then you restore the WAL files into the archive log directory. Then you create a recovery.conf file in the cluster directory that instructs it how to do the restore. You start the DB cluster, and it will be restored to the point in time you specified in the recovery.conf file, or to the latest WAL file. All right, so those are the steps; let's actually take a look at this in action.

OK, we are in Ubuntu now, and I have installed PostgreSQL 10, the latest version at the time this video is being made. The first thing I'm going to want to do, because I just relied on the default Ubuntu install, is enable archiving of the WAL files, since the default install of PostgreSQL does not have that enabled. But before I do that, I want to create a directory in which I'm going to keep my WAL archive. Now, on Ubuntu the installation creates a home directory for the postgres user at /var/lib/postgresql, so I'm just going to create a pg_log_archive directory in there as the postgres user, so that the postgres user has permission to write to it.
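For reference, here's a minimal sketch of that step, assuming the directory name pg_log_archive under /var/lib/postgresql (both the name and the path are illustrative; use whatever location the user running PostgreSQL can write to):

    # create the WAL archive directory as the postgres user so it can write archived segments there
    sudo -u postgres mkdir -p /var/lib/postgresql/pg_log_archive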
Whatever user you're running PostgreSQL as, you're going to want it to have write access to that directory. So I'm just going to run this command that will create the directory, and since I'm using sudo I have to go ahead and put in my password. All right, so that directory should be created.

Now, again, on Ubuntu, when you install PostgreSQL it installs the configuration files at this path, where 10 is the version number. So I'm going to modify the configuration to enable WAL archiving. I'm going to run this command to open the postgresql.conf file, and you can use whichever editor you would like. Now, by default wal_level is equal to replica; we can check that really quickly, and we can see it is commented out, but the default is replica, so I don't need to change anything there. Basically, it needs to be at least replica to do a point-in-time backup and recovery. Next we'll look for archive_mode, and by default it is off, so I'm going to uncomment that and turn it on, as well as specify the archive_command, in other words how those files should be archived. I'm going to use the basic example that is included in the PostgreSQL documentation: it just does a quick test, it's going to store the files in /var/lib/postgresql in the directory I created, and it uses variable replacement for the different file names that are going to be written. Once that archive command is in place, I'm going to go ahead and write that out and exit the editor.

OK, so at this point we have enabled archive mode. Now what I'm going to want to do is restart the database cluster, because when you turn this on it says the change requires a restart. So I'm going to go over here and use systemctl to restart the PostgreSQL service.

OK, now I want to create a database and populate it with some data. The first thing I'm going to do is go to this new terminal tab and assume the role of the postgres user, so I'm now connected as postgres to my local system. I'm going to go ahead and create a database, as well as run this command that's going to create a posts table and insert two posts. All right, the database has been created and two rows were inserted into this new table. Now I want to go ahead and do a manual log switch, so I'm going to run this command just to make sure that the logs have been archived. In Postgres 10 the command to do this is pg_switch_wal(); however, in version 9 or earlier the correct function is pg_switch_xlog(), so keep that in mind if you're attempting this using Postgres version 9 as opposed to 10. OK, so it did switch the logs; I just wanted to write something out to that archive log directory.

OK, so let's go ahead and do a backup now. We're going to be using pg_basebackup here, and I've set the format to be a tar and specified the directory where it's going to go, so this is going to store it in that directory. Go ahead and run that. OK, so it's returned, and if I look at the directory I see there's a db_file_backup directory, and if I look in that directory I see two files were created: a base.tar, which is all the database files, plus it also grabbed the WAL and put it in a pg_wal.tar file. So it generated two tar files, and when we do a restore we have to restore the base.tar first and then the pg_wal.tar. Now, depending on the version you're using: version 10 actually streams these WAL files and creates that file by default, whereas in versions before 10, like version 9, you actually have to specify that you want to do the streaming; on version 10 it happens by default. So just keep that in mind if you're following along. I will actually look at a different backup command a little bit later.
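As a rough sketch of those configuration and data-setup steps, assuming the Ubuntu paths mentioned above (/etc/postgresql/10/main for the config, /var/lib/postgresql/pg_log_archive for the archive); the database name, table columns, and backup directory below are illustrative, not taken from the video:

    # settings to enable in /etc/postgresql/10/main/postgresql.conf
    wal_level = replica        # the default on 10; must be at least replica for PITR
    archive_mode = on          # requires a restart to take effect
    archive_command = 'test ! -f /var/lib/postgresql/pg_log_archive/%f && cp %p /var/lib/postgresql/pg_log_archive/%f'

Then restart the cluster, create some data, force a WAL switch, and take the base backup:

    sudo systemctl restart postgresql

    # create a sample database and table as the postgres user (names are hypothetical)
    sudo -u postgres psql -c "CREATE DATABASE sample;"
    sudo -u postgres psql -d sample -c "CREATE TABLE posts (id serial PRIMARY KEY, title text);"
    sudo -u postgres psql -d sample -c "INSERT INTO posts (title) VALUES ('first post'), ('second post');"

    # force a WAL segment switch so something lands in the archive (pg_switch_xlog() on 9.x and earlier)
    sudo -u postgres psql -c "SELECT pg_switch_wal();"

    # base backup in tar format; on version 10 the WAL is streamed into pg_wal.tar by default
    sudo -u postgres pg_basebackup -Ft -D /var/lib/postgresql/db_file_backup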
So now that we have a backup, I want to go ahead and restore it. The first thing I'm going to do is stop the database and then destroy all the data in it. I'm going to jump back to a user that can do sudo and stop the database cluster; I can check that it stopped by doing a status, and it says it's inactive. All right, now be very careful with this, because you're removing every file in this directory, so make sure you're on a test system if you're doing it. OK, so those files were removed, and if I do an ls I see that the directory is now empty. All right, so that's been cleared out.

Now let's restore. First, using this tar command, we're going to extract the base.tar into the directory where our database is stored, and again this location was chosen by the default Ubuntu PostgreSQL installation for the cluster that gets created. OK, those files copied in. Now I'm also going to restore the pg_wal.tar into the pg_wal directory, but again, if you're using a version earlier than Postgres 10 you're going to need to use pg_xlog instead of pg_wal. All right, those have been extracted.

The next thing you need to do is add a recovery.conf file that specifies how the data is going to be restored, and this goes in, essentially, your cluster directory. OK, so I've got an empty file here, and I'm just going to put in this information, the restore_command: because it wants to know where the WAL files are, I'm telling it to copy them from this location. You can also specify recovery_target_time, which I'm going to do a little bit later, and that's where you can specify a specific point in time to restore up to, but for now I'm just going to restore to the latest. So I place that command in the file, go ahead and save it and exit out, and now let's start the database. It's going to go ahead and start the database... the database is started. Now let's verify the restore is successful, so let's run SELECT * FROM posts, and there it is, our data has been restored.
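Here's a hedged sketch of that restore sequence, assuming the default Ubuntu data directory /var/lib/postgresql/10/main and the illustrative backup and archive paths from the earlier sketches (adjust to your layout; the rm step is destructive, so only run it on a throwaway test system):

    # stop the cluster and wipe the existing data directory (test systems only!)
    sudo systemctl stop postgresql
    sudo rm -rf /var/lib/postgresql/10/main/*

    # restore the base backup first, then the WAL tar into pg_wal (pg_xlog on 9.x and earlier)
    sudo -u postgres tar -xf /var/lib/postgresql/db_file_backup/base.tar -C /var/lib/postgresql/10/main
    sudo -u postgres tar -xf /var/lib/postgresql/db_file_backup/pg_wal.tar -C /var/lib/postgresql/10/main/pg_wal

    # create /var/lib/postgresql/10/main/recovery.conf containing:
    #   restore_command = 'cp /var/lib/postgresql/pg_log_archive/%f %p'

    # start the cluster; it replays the archived WAL up to the latest point
    sudo systemctl start postgresql
    sudo -u postgres psql -d sample -c "SELECT * FROM posts;"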
All right, let's do an actual point-in-time recovery. The first thing I'm going to do is another database backup, and this time I'm going to show you a different way of doing it, using compression. So I'm going to use pg_basebackup, the format is going to be a tar file, and I'm not going to stream the WAL files into it; what that means is I'm turning off WAL streaming, so you're going to need to keep track of where your WAL files are. For the destination I'm going to specify a dash, which means it writes to standard output and allows you to pipe it to a compression program; I'm using gzip. If you have a lot of cores on your machine and a large database, to make this faster you could use something like pigz, which lets you specify how many cores to spread the compression across. At the end we're going to have one db_file_backup.tar.gz, a single file that contains all the live database files. All right, I'm going to go ahead and run that backup. OK, so that file has been created.

Now I'm going to wait a few minutes from this point in time, and then I'm going to insert these rows, because I want to restore the backup to the point prior to when these rows get inserted into the database. OK, we're going to go ahead and insert these two rows. Then I'm going to stop the database as before, delete all the data, and do the restore of the new backup file. Here we just have to do one extraction, since we're restoring this single file; the WAL files are still located in the archive directory that we created, but normally you would need to copy those back over from wherever you were storing them. I'm going to modify the recovery.conf, specifying the restore_command as well as a recovery_target_time: we're targeting to recover to right before these two rows were inserted. Paste those in, save it, start the database, and then see what posts are there. You can see we got just two posts; the two posts that we inserted later are not there.

So let's take a look at what the logs look like. Looking at the logs, we see that on the most recent start the database is starting a point-in-time recovery to the point in time we specified: it restores log files, a consistent recovery state is reached, and the database system is ready to accept read-only connections, so you can only do selects. Then it says recovery has paused and that you need to execute pg_wal_replay_resume to continue. So you can log in, do selects, and make sure the database is in the state you want, and then execute the psql command SELECT pg_wal_replay_resume(); that will put the database into a writable mode. After you execute that, we can do another query and confirm that we still only have two posts, and looking at the log files again, we see that archive recovery is complete and the database system is ready to accept connections. So we've successfully done a point-in-time recovery where only those two rows are present.

I hope that was helpful. If you want the commands used in this tutorial, be sure to visit the link in the description below, and if you want to receive additional PostgreSQL content and tutorials, be sure to visit scalingpostgres.com and sign up. Thanks.
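To recap that last walkthrough, here is a hedged sketch of the compressed backup and the point-in-time restore, using the same illustrative paths as above; the recovery_target_time value is made up and should be a timestamp just before the changes you want to discard:

    # compressed base backup: tar format to stdout, no WAL streaming, piped through gzip
    # (swap gzip for pigz to spread compression across several cores)
    sudo -u postgres pg_basebackup -Ft -X none -D - | gzip > /tmp/db_file_backup.tar.gz

    # stop the cluster and wipe the data directory (test systems only!), then restore the single archive
    sudo systemctl stop postgresql
    sudo rm -rf /var/lib/postgresql/10/main/*
    sudo -u postgres tar -xzf /tmp/db_file_backup.tar.gz -C /var/lib/postgresql/10/main

    # /var/lib/postgresql/10/main/recovery.conf -- the timestamp below is hypothetical
    #   restore_command = 'cp /var/lib/postgresql/pg_log_archive/%f %p'
    #   recovery_target_time = '2018-03-06 12:00:00'

    # start the cluster; recovery replays WAL to the target time, then pauses in read-only mode
    sudo systemctl start postgresql

    # after checking the data with read-only queries, resume replay to open the database for writes
    sudo -u postgres psql -c "SELECT pg_wal_replay_resume();"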
Info
Channel: Scaling Postgres
Views: 40,773
Rating: 4.8648109 out of 5
Keywords: PostgreSQL, PostgreSQL Backup, PostgreSQL Recovery, PostgreSQL PITR
Id: Jvdtx-Smffo
Length: 19min 30sec (1170 seconds)
Published: Tue Mar 06 2018