Beginners Guide To AWS DMS

Captions
Hi guys, I'm Johnny Chivers. Welcome to my YouTube channel. I'm a data engineer with over 10 years' experience, working Monday to Friday primarily in the financial services sector. I'm five times AWS certified, and I like nothing more in my free time than making videos for this very YouTube channel.

Carrying on with the beginner's guide series, we're going to look at DMS. This is actually a topic that a couple of you guys have requested in the comments section, so I'll reach out to you and get your feedback on it as well.

So what is DMS? DMS stands for the Database Migration Service. It's an AWS offering that helps us move data from point A to point B through a migration. For example, let's say we're new to the cloud, we have an on-premise data center, and we want to take advantage of AWS Aurora. We need to get the data from the on-premise data center to AWS, and that's where DMS comes in.

Let's say our on-premise data center is Oracle and we want to go to Aurora MySQL. Normally we'd have to plan a migration, get a project in gear, write all the SQL (all the statements, all the schemas, all the data), and then plan how we're going to map A to B. That can take a very long time and be very costly. AWS DMS removes this. It uses a replication instance that sits in the middle, and all you have to do is set up your source and your destination and map between them. If that doesn't work out of the box, you also have the option of the Schema Conversion Tool, or SCT. This is a tool that helps us convert from one schema type to another, for example from Oracle to MySQL. It's not flawless, but it definitely lowers the overhead compared to having to do everything manually.

DMS can run as a one-off migration, where we're moving from on-premise to the cloud, or indeed you can move cloud to cloud if you really want to. So we're moving on-premise to cloud in my example. Alternatively, you might have a data warehouse running out in the cloud, and you might have an
on-premise transactional database for latency reasons, and you want to constantly push the updates from on-premise up to the cloud through DMS. That's also possible, where you have ongoing change. The replication instance actually scans the logs for changes, so the one thing you have to have enabled is the logs. Detailed logging has to be turned on in your source database, and there are different configurations depending on what you're using. The second thing is that every table must have a primary key. Because DMS needs to perform updates, inserts, and deletes, it needs a reference point between the two tables, and this is the primary key. So all your source tables must have a primary key attached before you start DMS, or else it won't work.

Okay, but why use DMS? Well, I already hinted that it used to be a massive pain to do a data migration. It can cost millions of pounds, get very complex, and ultimately something always goes wrong. DMS lowers this barrier to entry slightly and makes things a bit less complex. If you're doing a really straightforward migration, it's painless. More complex things like stored procedures can take a bit of time, but at least this way you're up and running quicker in the cloud, rather than planning a year or two years in advance to try and move your data from on-premise up to AWS.

But when do you use DMS? Well, we've already hinted at the first one: when you're doing a migration. Moving to the cloud, DMS is the obvious option: lower barrier to entry, lower cost, a simpler project, and quicker cloud adoption. The second use case I've already touched upon, where you want to replicate between on-premise and the cloud on an ongoing basis. So for example, again, you might have an on-premise database for low-latency reasons, and you want to just get that up to the cloud to take advantage of some of the BI and analytical processes. DMS is great for that, where you can just do a replication that's ongoing after the one-time load.

DMS will cost some money, because I
need to set up a couple of databases and do things in between. It's going to cost me about 10 to 20 dollars, so you can follow along or you can just watch; it's up to you. But join me in the console and we'll get going.

Okay guys, that's me logged into the console. The first thing I'm going to do is configure an RDS instance to act as my source. If you have your source already configured, then jump ahead in the video to where I start to work with DMS itself. If not, you can follow along and set up a source with me.

So the first thing I'm going to do is navigate to RDS by typing in RDS and clicking once on RDS. We're going to click Create database. We're going to do a Standard create. We're going to create an Aurora instance, and we want it to be the PostgreSQL compatibility. We're going to use Provisioned, and we're going to do Dev/Test. I want to call this music, with postgres as the username. Then you need to remember an admin password, so pick one of your choice, but it has to be at least eight printable ASCII characters. For DB instance class, go to Burstable classes and then pick the t3.medium, so we can keep this reasonably cheap. We don't want a replica. Pick your default VPC. Make it publicly accessible and then, instead of the default security group, create a new VPC security group and give it a name, test dms. It's really important to give it a new security group, and also to make sure it's publicly accessible, or else we won't be able to log in without a whole lot of bastion hosts going through an EC2. So we're just going to keep it simple. It's slightly less secure, but for the purposes of this demo it's absolutely fine. After that, let's create the database.

Okay, that's the database off and creating. This will take about 10 minutes usually. While that's happening, if you look in the description below, there's a link to the pgAdmin download. This is the client that we're going to use to log into the database. If you're not really familiar with databases, don't worry. All you need to do is install, and I'm going to
show you the steps as we go. If you are familiar, then this is just another client that's going to let us interact with the database that we're setting up in the RDS console as we speak. Find the version that suits your machine. I'm on macOS and I already have it installed, but if not I would click this; if you're on Windows, click this. After that, just click through, accept everything, and then we can pick it back up once this instance is created.

Okay, that took about 15 minutes in total, so it's a bit of a slow process, but eventually both of the statuses will go Available. The top one for music, the regional role, will go Available first, but you need to wait until the writer role is also green and Available, and you only see that by constantly clicking the refresh button.

So let's click into the music instance, and down here you will see Security groups. Let's click on the test dms security group; we want to edit its inbound rule. So let's have a look at Inbound rules and edit that inbound rule. Currently it's Custom, but what we want is Anywhere. I won't go into too much detail on what we're allowing here, because it's not part of this actual course; today we just want to do DMS. But it's interesting to know that basically I'm saying: just let anyone on the internet into the PostgreSQL database that I've set up, provided they have the username and password. It's not the most secure use of a security group, but it saves the hassle, because we're all on different machines. Just delete the instance once done if you're following along. Save the rules, and that's it saved.

You should also now have pgAdmin downloaded. Go ahead and start that up; it will come up in your browser. Okay, that's pgAdmin up in my browser. As you can see, if we go over to the left-hand side here, on Servers, we hit Create and we want to create a new server. I want to give it the name music. Then you want to go to Connection. Next
thing we want is the address, so let's go back to the RDS instance. I'm just going to hit back a couple of times, not changing that security group as we go, and we're back on the RDS instance. You want to click on the writer (it's really important to click on the writer), and you want to copy this endpoint down here. Back in pgAdmin, paste it in. Everything else is already default, and that's fine, so we don't change anything, but you need to type in the password that you gave your user. That's everything. Then if we save, as you can see, we've logged in successfully on the left-hand side.

The next thing we want to do is right-click Databases and hit Create database, and we want to call this, you guessed it, music. So we've now created a music database on that RDS instance. Inside music we don't have anything yet inside Tables, as you can see. So click on music and click on this Query Tool up here on the right-hand side; it's the database icon with the play button. You can see that we're inside music: it says music/postgres@music. So we're inside music, and that's really important.

Then, at the bottom in the description, I've also left a link to my GitHub. Inside the GitHub there's a create tables SQL file. Just copy and paste that; it's not that long, it's only 16 lines. Copy and paste it into that window of the query editor, and then run that query by hitting the play button up on the top-right bar. Here you can see it ran successfully. Now if we refresh the schemas, so go to Schemas, right-click, and click Refresh, we should have a table, and that table is called artist. If we click on Tables, click on that Query Tool button again, type in select star from artist, and again hit that play button, you can see that we've ended up with three artists in it. So that's the table we're going to use as our source.

That's the last we'll have to look at PostgreSQL for today. It was just a simple way to get a source set up. So join me back
on the console, where we'll actually start using DMS.

Now, we get to DMS by typing DMS at the top and then just clicking on the link that appears, and you arrive on DMS. The first thing we want to do is create a replication instance. This is usually free if it's your first time. If not, I'm going to choose a pretty small instance, so it won't have that much of an effect. I'm going to call this music demo so that, hopefully, I remember to delete it later. We don't need a friendly ARN. Description I'm going to leave blank for now. We just want to use the built-in one. For VPC, put it inside the default VPC. Publicly accessible: make sure that it's ticked, because we're going to need that. Apart from that, if we click on Advanced network and security, just make sure it's all okay here. I'm going to use the default security group, so just leave it on default and click Create.

That's the replication instance off and creating. While that's being created, click on Endpoints. We need a source endpoint and a target endpoint, and if you click into Endpoints you'll see that there's a source endpoint and a target endpoint. I'm going to use the RDS instance that I set up at the start of this video. If you're using a different source endpoint, then feel free to enter your own credentials. So we're going to click Select RDS instance, and we want the music instance. We don't want a friendly ARN; we're good. We want to provide the access information manually, and then we need to type in the password that you gave your instance. Database name is the database name that you gave it, so as we kept everything simple and just called it music, type in music. We don't need to do anything else here. You can test the connection, but the replication instance needs to be up, and it's still spinning up in the background. So for now let's just create that endpoint, and once our replication instance is ready we will actually use that endpoint to test. However, as you can still see, it's creating at the moment, so we'll go
create our destination, or our target endpoint. So we'll click Create endpoint, and this time we're going to go Target endpoint. We're going to call this music dynamo, because I'm going to use DynamoDB as my target. Target engine is DynamoDB, so type dynamo, and it's Amazon DynamoDB. Fantastic.

Next is a role that can access the target. This is really important: we need to give DMS a role that can get to the target. So at the top let's go to IAM; I'm just going to open this up in a separate tab. Let's create a role. Let's go to DMS, which is right there. Click on DMS. Next: Permissions. I'm going to cheat, as always in this video, and give it full admin permissions. You would need to give it just the DynamoDB and VPC permissions if you want to make it more secure, but for quickness, full AdministratorAccess. Next: Tags; we don't need to do anything there. Name: I'm going to call this dms delete dynamo. Create that role. Perfect. Let's just check it's there by typing dms delete dynamo. Right there. So click onto it, then click the ARN at the top to copy it, and then paste the ARN into the box. So: IAM, create the role, paste the ARN in, and create that endpoint. That's that endpoint created.

If we go back up to the replication instances, we can now see it's Available. That means in Endpoints we can actually check that it's ready to go. So you have your replication instance, which sits in the middle, and then we have a source endpoint, which is the music instance one, and a target endpoint, which is the DynamoDB. Let's go into this one and go to Connections, and you can see that nothing's working yet. We can go to Run test. This takes about five to ten minutes, so I'm going to pause the video here and then we can pick it up once we're back. Don't be frightened if it takes a long time. As I said, when you run these connection tests the first time, it's five to ten minutes every time.

Okay, having said it was five to ten minutes, that one took literally two minutes. That was the quickest that it has
ever taken me to test an endpoint. Thankfully it went to Successful, which means we could connect into the RDS instance that I set up earlier as my source. And if I just go back into Endpoints, you can see that I have my target. So I have my source, my target, and a replication instance: the compute power that's going to do the replication.

Now I need to tie this all together, which is done through a database migration task. So let's click on that and click on Create task. Give your task an identifier; I'm just going to call this demo because it's easy for me. Choose your replication instance; it's going to be the VPC one. Choose your source database; I've only got one, it's the music instance. Choose your target, which is the DynamoDB. We want to migrate existing data only. This option is how you would do existing data plus ongoing changes, and this one is how you would do changes only, so a one-time load is what I'm doing. We're going to use the wizard. We're going to drop the target tables if there were any; there aren't. We're going to leave it on the wizard for the table mappings. Here you need at least one selection rule, so I'm going to allow everything. On Schema, click on Enter a schema and leave it on default; everything else is a wildcard. Just leave all the wildcards in, because we want to select everything, so we're just going to lift that one table that we created. After that, everything else is fine. Automatically start on create, and create that task.

That task is now creating. Once it's created, it will start to run automatically. This can take about five to ten minutes, so just sit back and make yourself a cup of tea. I'm going to pause the video here and then we can pick it up once it's done its thing.

Okay, so I quickly unpaused the video there, and you can see that's gone from "demo created successfully" to start in progress. I didn't have to click any buttons there; it just starts off automatically. So I'm going to pause the video again and then we can pick it up once it's done. It took about
three or four minutes in total, and as you can see, the load's complete: the progress is 100% and the load is full. So let's navigate to DynamoDB and make sure that our rows and tables have actually arrived. So DynamoDB, then Tables, and you can see that our artist table has arrived. Click on Items, and you can see that the three rows that were present in our database to begin with have arrived.

So that's it for today, guys. As you can see, actually setting up your source and targets for replication at the beginning is more complex than DMS itself. It's a really good service for those migration patterns that were a real headache before such a thing existed. So, that being said, I've been Johnny Chivers. I've made everything available for free, as usual, on my website, www.johnnychivers.co.uk, and until next time, guys, thanks for watching.
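
For anyone who would rather script the task than click through the wizard, the selection rule and the migration type chosen in the walkthrough can be written down as data. What follows is a minimal sketch rather than the setup used in the video: the helper names `selection_rule` and `table_mappings` are made up for illustration, and the `public` schema is an assumption (the video only says the schema was left on its default). The JSON keys and the three migration-type strings follow the table-mapping and task formats that DMS documents, with `%` as the wildcard.

```python
import json

# Console choice -> value of the MigrationType task parameter in DMS.
MIGRATION_TYPES = {
    "existing data only": "full-load",
    "existing data and ongoing changes": "full-load-and-cdc",
    "changes only": "cdc",
}


def selection_rule(rule_id, schema="%", table="%", action="include"):
    """Build one selection rule; "%" matches any schema or table name."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


def table_mappings(*rules):
    """Wrap the rules in the top-level "rules" list and serialise to JSON."""
    return json.dumps({"rules": list(rules)}, indent=2)


if __name__ == "__main__":
    # Assumption: the artist table lives in PostgreSQL's default "public" schema.
    print(table_mappings(selection_rule(1, schema="public")))
    print(MIGRATION_TYPES["existing data only"])
```

The JSON string printed here is the kind of document a task accepts as its table mappings, and the one-time load from the demo corresponds to the `full-load` migration type.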
Info
Channel: Johnny Chivers
Views: 2,261
Keywords: aws, aws dms, data migration
Id: jlSfjk7yHEU
Length: 16min 12sec (972 seconds)
Published: Tue May 04 2021