Deep Dive: Amazon Relational Database Service (RDS)

Video Statistics and Information

Captions
Leaving the best to the last, please welcome Julian Lau, Head of Solutions Architecture, ASEAN, Worldwide Public Sector, AWS.

Good afternoon. My section is a deep dive on Amazon Relational Database Service. Before I start, let me ask a couple of questions: how many of you have actually used the RDS service before? Quite a few. And how many of you come from a DBA background? Some. In my section we will talk a lot more about the RDS service and some of its internal implementation — why RDS works so well for our customers in terms of performance, scalability, availability, and security. If you haven't used RDS before, you will learn more about the details; if you already use the service, you will understand a bit more about the internal implementation and why the scalability and performance are so good.

Database services are not something new for Amazon Web Services; we provide many different databases to our customers. That includes relational databases — Amazon Aurora and the Relational Database Service, where we offer several engines: MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server. We also support non-relational database services: Amazon DynamoDB, which is a NoSQL database; Amazon ElastiCache, which is our Redis and Memcached implementation; and Amazon Neptune, which is a graph database. For those of you already running databases on premises who want to move them to the cloud, we have a service called AWS Database Migration Service, which collects your data and replicates it to a database running on our platform. It supports a full load and also change data capture, so you can migrate your database while it stays in use.

Now let's move on to relational databases. In the past, relational databases were complex and complicated to manage: you had to install the engine, keep the database and the operating system up to date, and take care of security. The job of the system administrator or the DBA was quite tough. That is why we implemented a managed database service: we want to provide a lower TCO to our customers, along with high availability and disaster recovery across multiple data centers. So what is so special about Amazon RDS? A few things: number one, it is easy to administer; number two, it is available and durable; number three, it is highly scalable; and it is a fast and secure database for our customers. Let's talk about ease of administration. In the past, setting up a database meant a lot of tasks — OS hardening, configuring the database, ongoing performance tuning, and so on. With RDS, you use a single management console and can set up a database in a very short period of time.
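As an illustration of that single-console (or single-API-call) experience, here is a minimal boto3 sketch; the identifier, instance class, storage size, and credentials below are placeholder values I am assuming, not values from the talk.

```python
import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

# One API call replaces installation, OS hardening, and initial configuration.
# All names and sizes below are illustrative placeholders.
rds.create_db_instance(
    DBInstanceIdentifier="demo-mysql",      # hypothetical identifier
    Engine="mysql",
    DBInstanceClass="db.r4.large",
    AllocatedStorage=100,                   # GB
    MasterUsername="admin",
    MasterUserPassword="change-me-please",  # use a secrets store in practice
)
```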
Multi-AZ deployment: customers really like our design of multiple Availability Zones within a region, because when you build a highly available system, you are building it across data centers. RDS supports Multi-AZ out of the box, which means you can set up a highly available database across multiple data centers. If one database node goes down, we fail over to another node in another data center; if the whole data center goes down, we can still fail over to a database running in another data center. That gives you data-center-level high availability.

Read replicas: a lot of transactional applications have read-heavy performance problems, and the usual way to solve them is to offload reads to read-only replicas. This is out of the box in RDS: you can set up read replicas for MySQL, PostgreSQL, and the other engines with a few clicks in the management console (see the sketch at the end of this part).

Automated backups: data is very important, and for every DBA here, one of your day-to-day jobs is making sure the backups actually happen — maybe you write scripts to automate them yourself, or you use backup software to back up your database regularly. With RDS, all of that is already automated: you just tell us in the management console what your retention period is, and we back up your database every day.

Snapshots: when people talk about backups, they distinguish full backups from snapshots. Out of the box, RDS backups are incremental: we capture only the daily differences, which greatly reduces the amount of storage you need to keep your backups. This is another feature customers really like.

Scalability: traditionally, setting up a scalable web tier is quite easy, especially on Amazon Web Services, where Auto Scaling can scale your web server farm out when demand rises and back in when demand drops, so the amount you pay stays just a little above what the incoming demand actually requires. For a database set up on premises, scalability was never that easy. With RDS it is there out of the box: you can scale the compute up or down on demand. If you see more demand coming into your application, you go to the management console, ask for a more powerful database, and we change the configuration on the fly.
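The push-button compute scaling and the read replica setup just described each map to a single API call; here is a minimal boto3 sketch, reusing the placeholder identifiers from the previous example.

```python
import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

# Scale the instance class on demand -- the API equivalent of changing
# the configuration in the console. Values are illustrative.
rds.modify_db_instance(
    DBInstanceIdentifier="demo-mysql",
    DBInstanceClass="db.r4.xlarge",   # scale compute up
    ApplyImmediately=True,
)

# Offload reads by creating a read-only replica of the same instance.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="demo-mysql-replica-1",   # hypothetical name
    SourceDBInstanceIdentifier="demo-mysql",
)
```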
Storage is another key headache for DBAs. The whole purpose of a database is to store data, and traditionally you had to dictate the amount of storage up front when you set up the database. With RDS, storage can be scaled on demand. Say today you are running 100 GB of data, so you create a 150 GB volume; the next day you realize your database is growing so fast that 150 GB is not enough. All you need to do is go to the management console, say you want more storage — maybe change it to 300 GB — click a button, and we scale the storage on demand. That scalability is good, but we were still not able to automate the scaling of the compute of the RDS instance itself: as I mentioned, if you want a more powerful database, you have to go to the management console and change the configuration yourself. What if Amazon Web Services could provide auto scaling on the database itself? I will share a new RDS feature on exactly that a little later.

Now let's deep dive into Aurora. As I mentioned, we provide many database services, and one of the coolest is Amazon Aurora. What is Aurora about? Aurora is our own implementation of MySQL and PostgreSQL: we take the MySQL and PostgreSQL source code and tune it to fully utilize the Amazon Web Services infrastructure, providing a better database service to our customers. The good thing is that even after we change the code to optimize it for our platform, it stays fully compatible with the vanilla versions of MySQL and PostgreSQL. What do you get from that? A few things: number one, scale-out capability out of the box, thanks to a service-oriented architecture — I will give you a lot more detail about the internal architecture later; and number two, it is a managed service, so you don't need to manage the security and everything else yourself.

So what is the Aurora database underneath? Again, we took the source code from MySQL and PostgreSQL and changed the implementation to optimize it for the AWS cloud infrastructure. You can see from the diagram that an Aurora cluster is split into multiple nodes: a master and some replicas. The key point is that we split the database engine's processing into two parts. One part is the usual transaction-processing engine: when you run a SQL statement — insert, update, delete — the MySQL engine is responsible for processing that statement. The other part is what we call the storage engine: we move the database I/O into a separate layer of processing. Why do we do that? Because this architecture lets us optimize the performance of the MySQL and PostgreSQL engines. Another key feature is the storage engine itself — how we store your data files across multiple Availability Zones.
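The master-plus-replicas cluster shape described above can also be sketched as two boto3 calls — create the cluster, then attach instances to it. Identifiers are hypothetical placeholders.

```python
import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

# An Aurora cluster separates the SQL/transaction tier (the instances)
# from the shared, multi-AZ storage tier managed by the service.
rds.create_db_cluster(
    DBClusterIdentifier="demo-aurora",     # hypothetical identifier
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me-please",
)

# Instances attach to the cluster; the first becomes the writer (master),
# additional ones become Aurora replicas over the same storage volume.
rds.create_db_instance(
    DBInstanceIdentifier="demo-aurora-node-1",
    DBClusterIdentifier="demo-aurora",
    Engine="aurora-mysql",
    DBInstanceClass="db.r4.large",
)
```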
As I mentioned earlier, multiple Availability Zones are very important for customers, because they let you build a highly available system across multiple data centers. When we built the Aurora storage engine, we leveraged three Availability Zones, so your scalability and availability span three data centers. With this architecture, the replication performance is further enhanced: we split the I/O into multiple copies, and in total, across the three Availability Zones, we keep six copies of your data for a single database.

So again, what is the benefit of running Aurora? It automates a lot of administrative tasks. You focus on your role as a DBA — schema design, query construction, query optimization — and you offload all the tedious jobs to Amazon Web Services: automated failover, recovery, push-button scaling, automated patching, and so on.

The key gain is performance. As I mentioned, we take the source code and optimize it for the AWS infrastructure. What is the outcome? Number one, we get up to 5x the performance of the regular MySQL database. And when we say up to 5x, it is not just a one-off boost: the performance also scales with your situation. If you are a DBA, you know that database scalability is hard, whether you measure it by the total number of user connections, the total number of tables in a single database, or the size of a single database. But as you can see from the table, whichever dimension you look at, Aurora achieves really good performance against the vanilla MySQL implementation:

- User connections: up to 8x faster than vanilla MySQL
- Number of tables: more than 11x better
- Database size: depending on the benchmarking tool, up to 21x faster with SysBench, and more than 136x faster with DBT-2

So the performance is really, really good for our customers. Why are we able to do that? Yes, a lot of people know we have really good infrastructure at AWS — Multi-AZ, the region concept, a really good storage system called Elastic Block Store (EBS). But why, all of a sudden, after AWS takes the source code, makes some changes, and puts it back on that infrastructure, do we get such a dramatic performance gain? There are a few reasons. Number one, we do less work: fewer I/Os and minimized network packets.
Traditionally, when you run a database system, the I/O and the core processing of the database are quite chatty: you have to write the log and write the data page before you can commit the transaction. We use algorithms to minimize the number of I/Os required. The same is true from the network packet perspective: when you set up a cluster and want to replicate data from the primary node to a secondary node, that is also quite chatty — especially if you set up synchronous-commit replication — and we use optimization techniques and algorithms to minimize the number of packets as well. On the other hand, we make the host more efficient: as I mentioned, we split the processing engine and the storage engine into different tiers, and we further split up the required tasks across those tiers so that each one runs efficiently. Remember: databases are all about I/O, network-attached storage is all about packets per second, and high-throughput processing is all about context switches. By making these kinds of changes, we achieve the performance I have just described.

Now let's do the next level of comparison, to see why the algorithms I mentioned can minimize the I/O and the network packets. On the left-hand side is a traditional MySQL setup with a replica on our platform: a primary instance in AZ1 and a replica instance in AZ2. To run this kind of database replication you have to deal with several types of traffic — number one, between the two instances, and number two, between each instance and its storage system, which is Elastic Block Store. In the diagram, those are the green, yellow, pink, blue, and purple lines: the redo log, the binary log, the data pages, the double-write buffer, and the metadata (the FRM files). You can see on the left-hand side that because the vanilla MySQL implementation has to coordinate both with the storage systems and between the primary and replica instances, every operation is very busy: it always has to go through the storage system on the primary node, over to the replica node, and then into the storage system on the other side.

Now let's look at how Aurora deals with all these operations, on the right-hand side. We mainly leverage the redo log for the communication — the green line is the redo log. If you are a database expert, you know that every time the database saves a record, it writes to the redo log first, and a separate process later applies the redo log to the data pages. In the past that was quite complicated; Aurora instead uses the redo log alone, in what we call log-structured data storage. Whenever nodes need to communicate, we replicate only the redo log — only the transaction records — across the nodes (the sketch below illustrates the difference in write volume).
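To make the "do less work" point concrete, here is a toy back-of-the-envelope sketch. This is not Aurora's actual internals, and every size here is an assumed illustrative value; it only shows why shipping redo records alone moves far fewer bytes than persisting all the traditional structures.

```python
# Toy illustration of write volume per commit (assumed sizes, not measured
# Aurora values). A traditional MySQL commit persists several structures;
# a log-structured design ships only redo records.

PAGE = 16 * 1024          # InnoDB data page size in bytes
REDO_RECORD = 256         # assumed size of one redo log record

def traditional_bytes(dirty_pages: int, redo_records: int) -> int:
    redo = redo_records * REDO_RECORD
    binlog = redo_records * REDO_RECORD   # binary log, roughly comparable
    data = dirty_pages * PAGE             # data pages written back
    doublewrite = dirty_pages * PAGE      # pages written twice for safety
    return redo + binlog + data + doublewrite

def log_structured_bytes(redo_records: int, copies: int = 6) -> int:
    # Only redo records travel; pages are materialized in the storage tier.
    return redo_records * REDO_RECORD * copies

if __name__ == "__main__":
    # e.g. a commit touching 4 pages, generating 10 redo records
    print("traditional   :", traditional_bytes(dirty_pages=4, redo_records=10))
    print("redo-only x 6 :", log_structured_bytes(redo_records=10))
```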
Number two, as I mentioned, we split the processing engine and the storage engine into two different tiers. In the diagram, the three nodes at the top are one tier — the processing tier — and the second tier is the storage nodes you see below, which have compute of their own to apply the redo log within the storage engine. When we do that, we greatly reduce the amount of traffic and communication between all the nodes — even though we now communicate across nodes in three Availability Zones, compared with the two nodes on the left-hand side. Within the storage system, we likewise greatly reduce the communication across the nodes in the different Availability Zones. The one thing the diagram doesn't spell out is why we can replicate only the redo log instead of everything to all the players in the picture: it is because the storage layer has its own processing engines that do a lot of work before saving to disk. In the past, a traditional database system relied on the storage engine writing data pages, which is quite I/O intensive; now we leverage the redo log directly, turning the two-step approach — redo log write plus data page write — into a one-step approach, which greatly improves performance.

OK, we have talked a lot about performance. What about availability? A lot of people run mission-critical systems on our platform — cloud is the new normal — and the database is one of the key components that must be highly available so that you can offer the best SLA and the best uptime to your customers. How are we able to do that? Again, because Aurora fully utilizes the three Availability Zones, storing six copies of the data. On the left-hand side is write availability: if a whole data center goes down, your database is still up and running and can serve both read and write queries. For read availability we support an even higher standard: with six copies across three Availability Zones, in the worst case, even with three copies of the data already gone, we can still serve read operations for our customers (a toy sketch of this quorum arithmetic follows at the end of this part). So on top of the performance gain, the availability of an Aurora database is better than the usual way of setting up a database in the cloud or on premises.

Then what about read replicas? Read replicas are really good because you can offload some of the queries to them, so the master node focuses only on write operations. With Aurora we support up to 15 read replicas — 15 — and you can promote any of those replicas to become the master node at any time you want.
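Here is a toy sketch of the quorum arithmetic behind those availability claims. The 4-of-6 write and 3-of-6 read quorums are Aurora's published design; the code itself is only an illustration, not service internals.

```python
# Six copies of the data, two per Availability Zone across three AZs.
COPIES = 6
WRITE_QUORUM = 4   # a write must reach 4 of the 6 copies
READ_QUORUM = 3    # a read needs 3 of the 6 copies

def can_write(copies_lost: int) -> bool:
    return COPIES - copies_lost >= WRITE_QUORUM

def can_read(copies_lost: int) -> bool:
    return COPIES - copies_lost >= READ_QUORUM

# Losing a whole AZ costs 2 copies: both writes and reads survive.
assert can_write(2) and can_read(2)
# Worst case in the talk -- three copies gone: reads still work, writes pause.
assert can_read(3) and not can_write(3)
```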
So we have 15 replicas — what's so special? Do I need to talk to each individual replica, and determine which replica is actually up and running, before I can issue a query? No. That is why we introduced a new feature called the reader endpoint, with load balancing and auto scaling. From your database application's point of view, you use just one connection string; when the connection reaches our platform, an auto-scaling layer scales the replicas out and, based on replica availability, redirects the request to the right read replica to handle it. From the application's perspective, this greatly reduces the complexity of your program code.

What about ease of use? Again, the RDS database already solves a lot of problems for our customers and minimizes the overhead of day-to-day operations. Here is the diagram showing how. On the left-hand side you can see all the tasks you have to do yourself if you set up the database on our platform by yourself, or on premises: app optimization, scaling, taking care of availability, backups, OS patching, installation, and everything else. If you set up the database on EC2 — our virtual machine platform — then OS installation, server maintenance, racks, power, and so on are already taken care of by Amazon Web Services. With the RDS databases, what's special is that we help you do even more: we take care of scaling, availability, database backups, software patching, and so on, and the only thing left for you as the DBA is app optimization. For example, you may design a query and find that one of the indexes is not right; index tuning on the database is still your job, but from the database-operations perspective, there is a lot you no longer have to do.

What about storage? Again, the storage for an Aurora database auto-scales up to 64 TB, with no performance impact — you don't need to shut down your database before adding more data. This is a really good setup: when you want to change the configuration, you just change it, and we apply the change on the fly without affecting database performance. We also do continuous incremental backups to Amazon S3 and create user snapshots, and none of these operations has any performance impact on your database cluster.

With Aurora we also introduced some advanced features that are not very common if you run your database on premises. One of them is fast database cloning. Remember that we split the processing engine and the storage engine into two different tiers, and we use copy-on-write snapshots to replicate the data. So let's say you suddenly want to clone your database onto another instance — for testing or whatever reason. You can do it very quickly in our management console: just a few clicks, and we replicate and clone the database onto a new instance in a short period of time.
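Cloning is exposed through the point-in-time-restore API with a copy-on-write restore type; here is a minimal boto3 sketch, reusing the hypothetical identifiers from the earlier examples.

```python
import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

# Fast, copy-on-write clone of an Aurora cluster: the clone shares storage
# pages with the source and only diverges as either side writes.
rds.restore_db_cluster_to_point_in_time(
    DBClusterIdentifier="demo-aurora-clone",   # hypothetical name
    SourceDBClusterIdentifier="demo-aurora",
    RestoreType="copy-on-write",
    UseLatestRestorableTime=True,
)
```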
What about security and monitoring? When people talk about security, they talk about a few things: number one, securing data at rest; number two, securing data in use; and number three, securing data in transit. So how do RDS and the Aurora databases take care of all these requirements? From the network perspective, we let you launch your database into your own VPC so that everything can be private: you don't need to put an internet gateway in the VPC at all, and you can still access your databases. If you set up the database this way, without an internet connection, what does that mean? No one from the internet can reach your database — a really, really secure setup for a customer. RDS is also fully integrated with IAM for resource-level permission control; if you are the person responsible for database security, you will love this feature. For security at rest, all the RDS databases support encryption of data at rest with what is essentially checkbox encryption: you go to our management console, say you want to encrypt the database, tick a checkbox, and we encrypt the database automatically. Of course, the keys are stored in our Key Management Service (KMS). We also support cross-region replication of snapshots and more, so if your expectations on security are really high, all of this is available out of the box.

What about database activity monitoring? A lot of customers ask for DAM, and for RDS Aurora we support that too: we ship all the logs to CloudWatch, CloudWatch forwards the logs to S3, and you can use Athena or QuickSight to do the necessary analysis over all the database activity inside your database. All of this is available to our customers today.

But in the past six months we introduced even more exciting features that I want to share with you today. First of all, MySQL compatibility: in the past we supported MySQL 5.6, and now we have also introduced MySQL 5.7 compatibility, including advanced features such as JSON support, generated columns, and spatial indexes. If you want to run MySQL 5.7, you can use Aurora to do so immediately, without any compatibility problems — that's number one — and number two, you get to enjoy the performance gain immediately.

Now, remember how I said the storage system is built on the redo log? That is why we were able to introduce a new feature called Backtrack. What is Backtrack all about? If you are a DBA, one of the most hectic things you encounter in your day-to-day job is when a user comes to you and says, "Oh, I accidentally dropped a table," or "I ran a query that deleted something from the database." In that situation, what do you have to do? You restore from backup. If you are lucky and do daily backups, you can restore the backup from last midnight — but you will still have some data loss, right? Backtrack is a feature that lets you go back to any point in time within a window. You enable the feature on the Aurora database and specify how long you want the backtrack window to be — for example 24 hours, anywhere from an hour up to 72 hours.
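Here is a minimal boto3 sketch of both halves of Backtrack — enabling the window at cluster-creation time and then rewinding. The identifiers and timestamps are placeholders I am assuming for illustration.

```python
import datetime

import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

# 1) Enable Backtrack when the Aurora MySQL cluster is created
#    (window in seconds; 24 hours here, up to the 72-hour maximum).
rds.create_db_cluster(
    DBClusterIdentifier="demo-aurora-bt",   # hypothetical name
    Engine="aurora-mysql",
    MasterUsername="admin",
    MasterUserPassword="change-me-please",
    BacktrackWindow=24 * 3600,
)

# 2) Someone drops a table at 06:00 -- rewind the cluster to 05:59.
rds.backtrack_db_cluster(
    DBClusterIdentifier="demo-aurora-bt",
    BacktrackTo=datetime.datetime(
        2018, 10, 11, 5, 59, tzinfo=datetime.timezone.utc
    ),
)
```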
For example, say at six o'clock someone accidentally deletes a table. What you can do is go to the RDS console and take the database back to the point in time that is 5:59 — you can do that down to the minute within the window. That is why this feature is really good for our customers: first, you specify in the form how long you want the backtrack window to be, from an hour up to 72 hours; second, once the feature is enabled, you can backtrack to any point in time you want. Very, very convenient.

Now for the super cool thing for our customers. Remember what I mentioned about the scalability of the database: our RDS database can scale up and down — you just go to the management console and change the configuration. For example, in the past you might be running 2 CPUs and 16 GB of memory, and then you find that is not enough, so you change to a more powerful configuration — say 4 CPUs and 32 GB of memory — click a button, and we scale up your database. That is what we call push-button scalability. A cool feature, but from my perspective not cool enough: why don't we make the whole scaling automatic? That is what we introduced two months ago: Aurora Serverless. What is Aurora Serverless? Basically, you don't define the number of CPUs or the amount of memory you want to run your database on; you only state the minimum configuration, in terms of CPU and memory, and the maximum, and then in real time we scale your database up or down based on the actual demand (a minimal API sketch follows below). This is a really good feature: on one hand, we perform better when the incoming demand suddenly spikes to a really high value, and at the same time we help our customers save on the cost of running the database, because we scale it back down to the optimal configuration based on actual demand.

Performance Insights: we also have a new feature that shows more detail about the performance characteristics of your database on our platform, called Performance Insights.
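Before the demo, here is a minimal boto3 sketch of creating such a serverless cluster; the identifier and credentials are placeholders, and the capacity numbers mirror the demo that follows.

```python
import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

# Aurora Serverless: no instance type -- only a capacity range. The service
# scales within [MinCapacity, MaxCapacity] (Aurora capacity units) on demand.
rds.create_db_cluster(
    DBClusterIdentifier="demo-aurora-serverless",  # hypothetical name
    Engine="aurora",            # serverless launched with MySQL 5.6 compat
    EngineMode="serverless",
    MasterUsername="admin",
    MasterUserPassword="change-me-please",
    ScalingConfiguration={
        "MinCapacity": 2,       # 2 ACUs ~ 4 GB of memory, per the demo
        "MaxCapacity": 256,
        "AutoPause": True,      # optionally pause when idle
    },
)
```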
Before I end, let me show you a quick demo of the serverless Aurora database, because it is really useful. This is our management console — I'm sure you are all familiar with this user interface. I choose RDS and click Create database, and I select Aurora; you can see that I could select some other databases too, but because I want to show serverless Aurora, I select Aurora. You can also see the different compatibility options: MySQL 5.6, MySQL 5.7, and also PostgreSQL. Now comes the most important part. In the past, you specified the instance type for your RDS database, in terms of CPU and memory — for example, if I chose r4.xlarge, I would get 4 CPUs and about 30 GB of memory. Here is the new option, called Serverless: I don't need to specify an instance type anymore. I type in the database identifier, the username, and the password, and I click Next.

Now here is the most important configuration: you can see a new option called capacity settings, where you specify the minimum Aurora capacity unit — for example 2, which is about 4 GB of memory — and the maximum, up to 256 capacity units; we scale the CPU on demand along with it. So again, with serverless, once I click Create database, we provision the cluster automatically, and from then on the resources are provisioned based on the actual incoming demand on the database: we can scale from 4 GB of memory upward, together, of course, with the associated CPU required.

So that's all of the presentation and the demo I wanted to share with you today. I hope you enjoyed everything I shared; my one call to action is for you to go and try these new things yourself. Thank you very much, thank you everyone, and please join us for the networking.
Info
Channel: Amazon Web Services
Views: 2,689
Rating: 4.8571429 out of 5
Keywords: AWS, Amazon Web Services, Cloud, cloud computing, AWS Cloud
Id: S5EeyTQVU_4
Length: 38min 35sec (2315 seconds)
Published: Thu Oct 11 2018