Migrate Postgres to AWS RDS

Video Statistics and Information

Captions
Hello and welcome. In this session we are going to learn how to migrate a PostgreSQL server running on premises to AWS RDS. These are the steps we will follow: we migrate an on-premises PostgreSQL server to PostgreSQL on AWS RDS. I already have a setup established. On the source side we will log in with an admin tool to a PostgreSQL instance that is running on a Linux server; on the target side we will create an RDS instance and connect to it using the pgAdmin tool.

A few things have already been done on the source side. I installed PostgreSQL 12 using yum: first the PGDG repository RPM from its download link, then the postgresql12 and postgresql12-server packages; after that I initialized the database, enabled the postgresql-12 service, and started it. We will continue from that point.

For the source environment I am already logged in through MobaXterm. Running systemctl status postgresql-12 shows that the service is running. Since this is a demo environment, the firewall and SELinux need to be in an appropriate state: the firewall should be stopped (you can see it is in the stopped state) and SELinux should be disabled, otherwise we will run into issues (you can see it is disabled). That is my source setup. We also give the default postgres user a password; I have already changed it by running ALTER USER postgres PASSWORD followed by the password. Before we do anything else, let's create the target environment; afterwards we will pick up from the step that enables remote access to PostgreSQL 12, because by default PostgreSQL does not allow remote connections and a couple of changes are needed (I have already made them and will explain what was done).
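For reference, a minimal sketch of the source-side preparation described above, assuming a RHEL/CentOS 7 host; the PGDG repository URL and package names vary by OS and PostgreSQL version, and the password shown is a placeholder:

    # Install PostgreSQL 12 from the PGDG repository, initialize, enable, start
    sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
    sudo yum install -y postgresql12 postgresql12-server
    sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
    sudo systemctl enable --now postgresql-12
    sudo systemctl status postgresql-12

    # Demo only: stop the firewall and switch SELinux to permissive so nothing
    # blocks remote access (set SELINUX=disabled in /etc/selinux/config to persist)
    sudo systemctl stop firewalld
    sudo setenforce 0

    # Give the default postgres database user a password (placeholder shown)
    sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'ChangeMe123';"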
Now let's go to AWS RDS. I am already connected to the AWS console, and we are going to create a database, so we choose Create database (you need valid login credentials to do this). I am creating a PostgreSQL database, and this session will also show you how to connect to a PostgreSQL RDS instance remotely. Looking at the available engine versions, they range from 9.6 up to PostgreSQL 13, the latest publicly released version, so that is what we pick. We want a lightweight machine, so we choose the Dev/Test template. In the settings we enter the DB identifier, rdspostgresdemo, and the master username, postgres, then fill in the password twice (I have kept the password simple). For the instance we choose the lightest class available, with 2 vCPUs and 8 GB of RAM, and 20 GB of maximum storage; I am not enabling storage autoscaling, and we are not creating a standby instance. On the connectivity side we go with the default VPC, which is not recommended for a production environment, but since this is a demo we use it and enable public access, so the endpoint will be reachable from outside. That is not a good thing, but we are doing it just for this demo. In the additional information there is nothing to change, so we keep the default port, 5432. There are several kinds of database authentication; we go with password authentication, but if you want one more layer you can also enable IAM authentication, or Kerberos if you have it. We leave the rest of the additional configuration alone, since it can be changed later, and click Create database. This will take two to three minutes. Once it is ready we have to check remote connectivity: on RDS, SSH is disabled by default, so you connect through the endpoint and port with an appropriate client, in our case psql or the pgAdmin tool.

While the instance is being created, let's come back to our step-by-step migration. On the source side we had changed the postgres user's password; next, still on the source side, we have to enable remote connections so that clients other than localhost, including the application, can connect. I am logged in as the postgres user. Note that "postgres" is three things here: the OS user, the database user, and a database itself; be sure you keep all three apart. First we check the postgresql.conf file and look at the listen_addresses parameter. By default it is localhost; you need to set it to the IP address of the host where the PostgreSQL server is running, otherwise the server cannot be reached from outside. Second, there is pg_hba.conf, the PostgreSQL host-based authentication file, where we add a line that allows the application IPs, or a range of IP addresses, to connect to the source database with password authentication. These two changes are mandatory. Which IP addresses go in there? The range belonging to your organization or to the environment you are working in. Once both changes are made, restart the PostgreSQL server so all the changes take effect; checking the status afterwards shows it running together with its background processes. Now we verify the connection: the default way is to simply run psql, which logs you in locally, but to exercise password authentication you run psql -U <username> -h <hostname>, press Enter, and type in the password. That is how you log in remotely.
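A sketch of the two configuration changes and the connection test, assuming the default PGDG data directory /var/lib/pgsql/12/data; the CIDR range and host IP are placeholders for your own network:

    # postgresql.conf: listen beyond localhost
    listen_addresses = '*'            # or the server's own IP address

    # pg_hba.conf: allow an application subnet to connect with password auth
    host    all    all    192.168.1.0/24    md5

    # Restart so the changes take effect, then test a password-authenticated login
    sudo systemctl restart postgresql-12
    psql -U postgres -h 192.168.1.50 -W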
The next step is to take a backup of all the databases on the source side; a directory for it has already been created. Before we go on from here, let's talk about migration strategy for relational database systems such as PostgreSQL. There are two kinds of migration strategy: online and offline.

In an online migration the source database keeps running while you transfer the data to the target environment, RDS, and because the application is still live on the source, changes keep happening there; you have to capture those changes and keep shipping them to the target side. This is one of the most difficult situations, because the application cannot afford downtime, so you need a mechanism in place that creates a replica on the RDS side and keeps replicating every change happening on the source. AWS provides such a service: DMS, Database Migration Service. DMS connects to your source database (you supply the source server's IP and the credentials of an admin user) and also to the target, the RDS instance being created; once both connections are made, it takes a backup of your source environment and restores it over there. There are a couple of migration types built into DMS: a one-time migration, which just takes a backup and restores it on the target; change data capture (CDC), which keeps transferring ongoing changes from the source to RDS; and a combination of both, which does the one-time copy and then keeps replicating the changes. The combined mode is the right choice when you have an online application that cannot afford downtime for the migration.

In our case the database is small, so we will take a backup instead. What the cloud provider itself suggests is: if you can afford downtime, use the native tools, pg_dump or pg_dumpall, to take a backup, and restore it with the corresponding tools on the target side. The endpoints are exposed, so you can import directly into the target environment. The only challenge with the target is that you cannot access its operating system, so either you import over the network or you use an intermediate host: create an EC2 instance on the AWS side, copy the backups there, and run psql from it. The most time-consuming activities are taking the backup, transferring it to the EC2 instance, and restoring it to the target, so plan the work, after a proof of concept, to fit within whatever downtime has been approved. So much for migration strategies; in our case we will take a backup and restore it.

Let's find the size of the databases first. You can see I have almost 100 databases, each a copy of the dvdrental sample database at about 15 MB, which works out to roughly 1,500 MB, so just about 1.5 GB of data in total across the 100 databases.
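A quick way to capture the sizing numbers above, assuming you run it as the postgres user on the source; the catalog queries are standard PostgreSQL:

    # Per-database sizes, largest first, plus a total for the whole cluster
    psql -U postgres -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size FROM pg_database ORDER BY pg_database_size(datname) DESC;"
    psql -U postgres -c "SELECT pg_size_pretty(sum(pg_database_size(datname))) AS total FROM pg_database;"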
You can also list the database names using the catalog tables, and additionally you can list the schemas, i.e. the namespaces. These are things to capture in advance: note down the different types of objects that exist, which helps when you perform the migration and again during post-migration validation. Also list the users on the source side, to ensure the same users exist on the target once the migration completes; in our case there is just one user, but you can have many, and then you check whether each one works. Check the list of installed extensions too, and verify that every one of them is available on the target, because RDS for PostgreSQL carries quite a few restrictions; you need to be sure your application will keep working once you migrate to RDS.

For the migration phase we are going with offline migration, not online migration with DMS. So I am going to take a backup of all the databases on the source side. Let me check whether the target has been created; it has, so let's initiate the backup. While it runs, we check the security of the instance, making sure its security group has proper inbound and outbound rules. Looking at the details, I have enabled public connectivity just for this demo, and the inbound rule (I am only viewing it, not saving anything) allows PostgreSQL on port 5432 from the public IP I will temporarily be connecting from.

The backup is done; it is quite small, roughly 350 MB or so. Now let's connect to the target environment. For that we pick up the details of the RDS instance: we need the correct endpoint, the user ID, and the password, since we enabled password authentication only (there are three types, but we went with just the one). The console is sometimes slow; go to the dashboard, pick the database by its DB identifier, and there is the endpoint. We will connect through it, so I note it down so we can test how it works. The source host has connectivity to AWS RDS, that is, it has internet access as well, so we test whether we can connect from the source side to RDS. You can see that we can connect to the target from the source. How do we know it is the target? You can see the postgres user and the rdsadmin database, which exists only on RDS. Next we connect with pgAdmin as well: create a server, enter the endpoint on the connection tab, fill in the password, and if everything is fine it connects, as you can see. Let me delete an unnecessary old connection; done. Here is my newly created RDS instance, the default postgres database, and the rdsadmin database created by RDS, which, as you can see, you cannot access; these are the restrictions RDS imposes.
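A sketch of the pre-migration inventory, the backup, and the connectivity test described above; /pgbackup and the RDS endpoint are placeholders for your own path and endpoint:

    # Inventory on the source, to compare against the target after the migration
    psql -U postgres -c "SELECT nspname FROM pg_namespace;"              # schemas/namespaces
    psql -U postgres -c "SELECT usename FROM pg_user;"                   # users
    psql -U postgres -c "SELECT extname, extversion FROM pg_extension;"  # installed extensions

    # Offline backup of every database, plus roles, in one SQL script
    mkdir -p /pgbackup
    pg_dumpall -U postgres -f /pgbackup/all_databases.sql

    # Test connectivity from the source host to the RDS endpoint (placeholder)
    RDS_ENDPOINT=rdspostgresdemo.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com
    psql -U postgres -h "$RDS_ENDPOINT" -d postgres -W -c "SELECT version();"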
With everything in place, the next step is to import all the databases. We could create blank databases first if required, or simply fire the import. How do we fire it? With this command: psql -U <username> (the RDS admin user) -h <target endpoint> against the target database, with -W to prompt for the password, then press Enter; I will have to find a way to avoid supplying the password each time. The first statement errors out for the postgres database, which already exists and will not be recreated, and then the import starts. I keep feeding in the password, because it prompts for every database; I will find a workaround. I am not going to do this for all 100 databases. Each time I have to supply the password; alternatively I could put it in the password file. Let's quickly check how many databases have been imported: you can see dvdrental is imported, then dvdrental_1, dvdrental_10, and so on; this way it imports all the databases one by one.

What we check at the end is the list of things we captured earlier: which namespaces are required, the size of each imported database, and the list of users on the source side and the target side; compare all of them. In a nutshell, what we have done is this: we had an on-premises source database on Linux running PostgreSQL 12; we created the target, RDS PostgreSQL 13, and enabled public access, which would not be appropriate in a real environment; we took a backup of the source databases with pg_dumpall; and once connectivity was established, we imported it. I could let the whole import run to completion, but it would take a long time, so I will skip ahead, pasting the password when prompted. I should have dropped the databases I am importing again; otherwise it raises this kind of error and checks each row one by one, which is not a good way to go. You have a similar situation in DMS: when your replication breaks, you can drop all the target databases and initiate it once again. This run may also be slower because the data is going over the internet; had I used an EC2 instance in the same zone, the import would have been faster, since everything would stay inside the same environment, whereas here the source-to-target connectivity runs over the internet. We will import some 10 or 12 databases and then stop.

The main challenge in this kind of migration is the downtime: how much downtime is approved, how large the database is, and whether the application is so critical that it cannot accept downtime at all. Everything has to be considered before you migrate a database to the cloud; that is one of the limitations of moving there. The size of the database always plays a big role, and you have to shape your strategy so that the database is migrated within the established window. The dvdrental_20 database is migrated, then dvdrental_22, 23, 24, and 25; I will close it there.
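A sketch of the import and of the password-file workaround mentioned above, with the same placeholder endpoint and password; note that ~/.pgpass must be mode 0600 or libpq ignores it:

    RDS_ENDPOINT=rdspostgresdemo.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com   # placeholder

    # Replay the pg_dumpall script against RDS; -W prompts for the password,
    # and the script reconnects (\connect) per database, hence repeated prompts
    psql -U postgres -h "$RDS_ENDPOINT" -d postgres -W -f /pgbackup/all_databases.sql

    # Avoid the prompts with a ~/.pgpass entry: host:port:database:user:password
    echo "$RDS_ENDPOINT:5432:*:postgres:ChangeMe123" >> ~/.pgpass
    chmod 600 ~/.pgpass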
Let us cross-check how many databases we have imported so far: there are 25. Not all of them are user databases; a few are system databases: the two templates, template0 and template1; rdsadmin, which belongs to RDS; and the default postgres database, which is created during PostgreSQL installation and is present on RDS by default as well. The remaining thing to check is whether the users were imported; if they were not, take a script of the users from the source side and run it on the target side. I hope this has been helpful. What AWS itself also suggests is to use the native PostgreSQL tools: pg_dump and pg_dumpall are good utilities; use them according to your environment, and the migration should be successful.
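Finally, a sketch of the post-migration comparison, running the same inventory queries against the RDS endpoint (placeholder as before) so the output can be diffed against the source-side lists:

    RDS_ENDPOINT=rdspostgresdemo.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com   # placeholder

    psql -U postgres -h "$RDS_ENDPOINT" -d postgres -c "\l"   # imported databases
    psql -U postgres -h "$RDS_ENDPOINT" -d postgres -c "SELECT usename FROM pg_user;"
    psql -U postgres -h "$RDS_ENDPOINT" -d postgres -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database;"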
Info
Channel: E-MultiSkills Database Tutorials
Views: 325
Keywords: Migrate Postgres to AWS RDS, migration PostgreSQL to cloud
Id: 8GyrjHe2DS8
Length: 28min 6sec (1686 seconds)
Published: Tue Aug 24 2021