AWS re:Invent 2018 - AWS Databases

Captions
Freedom. We've talked about freedom for builders a lot over the last few years, and if you think about it, freedom for builders is not just about having all the tools you need to build whatever you want at your fingertips; it's also about being free of abusive and constraining relationships. I can assure you that enterprises are singing in the dead of night, and in the afternoon and the morning too, because the world of the old-guard, commercial-grade databases has been a miserable one for most enterprises for the last couple of decades. These old-guard databases, like Oracle and SQL Server, are expensive, they have high lock-in, they're proprietary, and they're not customer-focused. Forget the fact that both of those companies will constantly be auditing you and fining you for whatever license violation they're able to find; they also make capricious decisions overnight that are good for them but not so good for you. Overnight, Oracle decides it wants to double the cost of running Oracle software on AWS or Azure. Or Microsoft: you've bought and paid for your SQL Server licenses and you're running them on RDS, and they decide they don't want to let you run those licenses you've paid for on RDS anymore; they want you to run on Microsoft. It's good for them, not so good for you, and people are sick of it. Now they have a choice.

This is why companies have been moving as fast as possible to open engines like MySQL, PostgreSQL, and MariaDB. You can get the performance of the commercial-grade databases out of these open engines, but it's hard; it takes tuning. We have a lot of experience doing this at Amazon, and what our customers kept asking us was: could you figure out a way to give us the best of both worlds? We want the open engines, with their customer-friendly policies and portability, combined with the performance of the commercial-grade, old-guard databases. That's why we built Amazon Aurora, which continues to be the fastest-growing service, at this point of its evolution, in the history of AWS. Aurora has both MySQL- and PostgreSQL-compatible editions. It's about five times faster than the highest-end implementations of those open source engines, it's at least as available, durable, and fault tolerant as the commercial-grade databases, and it's one-tenth the cost. That's why you see so many thousands of customers using Aurora; at this point we have tens of thousands of customers, and this is the third year in a row I've been able to show this slide and say the number of customers has more than doubled. You can see it across lots of examples: Verizon is making a huge shift to Aurora from the Oracle, SQL Server, and DB2 databases they have, and you can see it with Expedia, Capital One, AstraZeneca, Dow Jones, Bristol-Myers Squibb, Samsung, and Ticketmaster. Tens of thousands of customers are moving.

There are a lot of reasons, as I mentioned earlier, why people are excited about moving to Aurora, but one of them is that the team continues to innovate and iterate on behalf of customers really quickly. They've launched about 35 significant features in the last year alone. They're too many to mention, but I'll call out a few that people are excited about. When we launched Aurora Serverless, you no longer had to provision Aurora at all: you just say you want Aurora Serverless, it scales you up seamlessly when you need it and scales you back down so you don't waste money, and you pay per request. We had customers who were really excited about parallel query, which speeds up your queries by up to two orders of magnitude, and customers who said they really wanted Backtrack, which is almost like an undo button in Aurora that rewinds the database, in seconds, to a previous point in time. Just a couple of days ago we launched Aurora Global Database, which gives you a multi-region Aurora database: when you write in one region, it replicates that data across multiple regions with a lag typically of less than a second, which gives you even better disaster recovery and lower-latency reads all over the world. So Aurora keeps iterating quickly, keeps innovating on your behalf, and keeps growing really fast.
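To make the Aurora Serverless idea concrete, here is a minimal sketch of creating a serverless Aurora cluster with boto3. This is an illustration rather than anything shown in the talk: it assumes boto3 is installed and AWS credentials are configured, and the cluster identifier, password, and scaling bounds are hypothetical placeholders.

    import boto3

    # Minimal sketch: create an Aurora Serverless (v1) cluster that scales
    # between capacity bounds and pauses when idle. Names are hypothetical.
    rds = boto3.client("rds", region_name="us-east-1")

    rds.create_db_cluster(
        DBClusterIdentifier="demo-serverless-cluster",  # hypothetical name
        Engine="aurora",                 # MySQL-compatible edition
        EngineMode="serverless",
        MasterUsername="admin",
        MasterUserPassword="change-me-please",          # placeholder secret
        ScalingConfiguration={
            "MinCapacity": 1,            # Aurora capacity units (ACUs)
            "MaxCapacity": 8,
            "AutoPause": True,           # pause when idle to stop paying for compute
            "SecondsUntilAutoPause": 300,
        },
    )

The scaling configuration is what delivers the "scales up when you need it, scales back down so you don't waste money" behavior described above.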
But I want to talk about a different database trend we're seeing that is becoming more and more significant and pervasive. If you look at the last 20 to 30 years, most companies ran most of their workloads on relational databases, and that made sense back when applications typically held gigabytes, occasionally terabytes, of data and you needed complex joins and ad hoc queries. A number of things have changed over the last few years that affect people's receptivity to that idea. First, people have woken up to how useful data is at the same time that the cost of compute and storage has come way down, in large part because of the cloud, which means most applications today store many terabytes and petabytes of data instead of gigabytes. Second, the expectations of builders, and of the end users of those applications, are really different: the latency requirements are much lower, and they expect to handle millions of transactions per second with many millions of people using the app simultaneously. And then, over the last few years, more and more companies have been hiring technical talent in-house to take advantage of this huge wave of technology innovation the cloud is providing, and those builders are building not in the monolithic ways of the past but with microservices, where they compose the different elements together using the best tool for each job. All of this has led people away from using relational databases for all of their workloads.

Let me give you a few examples. Take a company like Lyft, or take Fortnite; those of you who have kids probably know what Fortnite is. Lyft has millions and millions of passengers and drivers, and lots of GPS coordinates for where they all are; Fortnite has millions of gamers and millions of gamer profiles. This is really simple data that can be stored as key-value pairs, where the key is the user and the value is the GPS coordinates or the gamer profile. So we built a really scalable database optimized for key-value workloads at single-digit-millisecond latency and at very large scale. That's DynamoDB, and that's why so many companies, like Epic with Fortnite, and Lyft, use it.
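As an illustration of the key-value pattern just described, here is a minimal boto3 sketch that writes and reads a gamer-profile-style item. The table name and attribute names are hypothetical; it assumes a DynamoDB table with a string partition key named user_id already exists.

    import boto3

    # Minimal key-value sketch: the key is the user, the value is their profile.
    # Table and attribute names are hypothetical.
    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
    table = dynamodb.Table("gamer-profiles")  # assumes partition key "user_id"

    # Write a profile (single-digit-millisecond writes at scale).
    table.put_item(Item={
        "user_id": "player-42",
        "display_name": "BuilderOne",
        "wins": 17,
    })

    # Read it back by key.
    response = table.get_item(Key={"user_id": "player-42"})
    print(response.get("Item"))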
Now let's say you can't even stand single-digit-millisecond latency; you want something even shorter, like microsecond latency. Airbnb is an example: for the single sign-on for their guests and hosts, they want all of their applications to respond with microsecond latency, and for that you want an in-memory database or a cache. That's why we built ElastiCache, which is managed Redis and managed Memcached, and that's what Airbnb uses.
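To show what that in-memory access pattern looks like, here is a small sketch using the redis-py client against an ElastiCache Redis endpoint. The endpoint hostname and key names are hypothetical placeholders, not anything from the talk.

    import redis

    # Minimal in-memory cache sketch: session lookups served from RAM rather
    # than a database. The endpoint below is a hypothetical ElastiCache
    # Redis address.
    r = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

    # Cache a session token with a 30-minute expiry.
    r.setex("session:user-42", 1800, "signed-token-value")

    # Later lookups hit memory instead of a database.
    token = r.get("session:user-42")
    print(token)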
Now let's say you have data sets that are really large and have a lot of interconnectedness. Take Nike as an example: they built an app on top of AWS that looks at their athletes, connects them with their followers, and compares all of their relative interests. Those are big data sets, if you think about all the athletes, followers, and interests, and they're highly interconnected. If you tried to build that application on a relational database, it would slow to a crawl. That's why people are excited about graph databases, and why we built Amazon Neptune, which we launched here a year ago and which is off to a really roaring start. People want the right tool for the right job, and they want the right database for whatever their workload is.

Let me come back to DynamoDB for a second. As I mentioned earlier, we have many thousands of customers running DynamoDB, which is a very scalable, low-latency key-value database; you see companies like Samsung, Snapchat, Lyft, Epic, Nike, Capital One, and GE using it. And as you saw with Aurora, that team keeps iterating at a really fast clip: another 30 significant features added in just the last year or so. Again, too many to mention, but among the ones people are excited about: last year here we launched Global Tables, the world's first multi-master, multi-region database; online backups let you back up hundreds of terabytes while the application is running and the database is taking writes, without disrupting the database; and point-in-time recovery was also very exciting for people.

But when we talk to DynamoDB customers, the thing they probably still struggle with most is how to think about provisioning read and write capacity. If you're a business that has been using DynamoDB for a while and you have a large table, you roughly know how much read and write capacity you need: you use our provisioned functionality, and you often attach an auto scaling policy so that if you get an unexpected spike you can scale, without having to provision for a peak you don't consistently live at. But those same customers, as well as many others, have lots of tables where they can't predict how much capacity they need, either because the workload is seasonal or spiky, or because the tables are new or small. So they have to guess how much to provision, and what do you think they do? They provision at the peak, to make sure the application will function no matter what, and they usually don't attach an auto scaling policy. That's a waste of money. We decided a long time ago at AWS that, whenever we can, we're going to try to do the right thing for our customers over the long term, even if it cannibalizes revenue for us, so we thought about how to build a capability that eliminates that waste.

I'm excited to announce the launch of DynamoDB read/write capacity on demand. This means you no longer have to guess what capacity you need for read/write throughput: you just set up a table in DynamoDB, say you want to run it on demand, and we automatically scale it for you. If you need more, we'll scale up; if it turns out you don't need as much, we'll stop charging you; you pay by the request. When you know the capacity you need and you've been running at scale, provisioned capacity will still be the most cost-effective option, but for all those other tables, and for customers who don't know, you'll be able to let DynamoDB manage it for you and save a significant amount of money.
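Here is what that looks like in practice: a minimal boto3 sketch creating a table in on-demand mode, so no read/write capacity is provisioned up front. The table and key names are hypothetical.

    import boto3

    # Minimal sketch of DynamoDB on-demand capacity: PAY_PER_REQUEST billing
    # means there is no provisioned throughput to guess at; you pay per
    # read/write request. Table and key names are hypothetical.
    dynamodb = boto3.client("dynamodb", region_name="us-east-1")

    dynamodb.create_table(
        TableName="orders",
        AttributeDefinitions=[{"AttributeName": "order_id", "AttributeType": "S"}],
        KeySchema=[{"AttributeName": "order_id", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST",  # on-demand; no ProvisionedThroughput
    )

Note that ProvisionedThroughput is simply omitted; with provisioned tables it would be required, which is exactly the guessing game described above.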
So we've talked about these purpose-built databases: key-value, in-memory, and graph. One of the things we've seen is that a new pattern of database need is emerging, driven by the millions and millions of sensors and edge devices that are everywhere: in our homes, in offices and factories, in planes, ships, and cars, in oil fields and agricultural fields. They're collecting large amounts of data, and people have become very interested in understanding what's happening with those assets and how things are changing over time. In other words, people are interested in what we call time series data. With time series data, each data point consists of a timestamp and one or more attributes; it measures how things change over time, and it helps drive real-time decisions. Imagine an asset whose temperature suddenly changes significantly; you might want to take some action on that asset. And you see this across lots of domains: clickstream data, all kinds of IoT sensor readings, even DevOps data.

The problem is that as more and more companies want to collect and analyze time series data, there aren't good solutions for handling it in a database. If you try to do it with a relational database, it's quite difficult: you have to build indexes that are really large and clunky and slow to query; the schemas are rigid and aren't flexible enough as you keep adding more and more sensors; and relational databases don't have the analytics pieces you want for time series, like smoothing, interpolation, and approximation. And if you look at the existing time series options, either the open source pieces or the limited number of commercial services, they're either really hard to manage or they just don't perform and scale well. They have all kinds of limits; some of those commercial services, when you reach their data storage limits, just start purging data, and who knows whether it's data you need or not. It's just not a good solution for people who need to deal with time series. Because we have a really large and fast-growing IoT and edge business, our customers have asked us many times to help here. So I'm excited to announce the launch of a new database called Amazon Timestream, a fast, scalable, fully managed time series database. [Applause]

Timestream is going to change the performance of your time series database by several orders of magnitude. It's a very different equation, and the reason is that we built it from the ground up to be a time series database. What keeps happening is that people take general-purpose stores and try to retrofit them to serve whatever the emerging need is, but as you saw with FSx for Windows and FSx for Lustre, people want the right tool for the right job. So we built Timestream from the ground up, with an architecture that organizes data by time intervals and enables time-series-specific data compression, which means less scanning and faster performance. It has a separate data processing layer for data ingestion, storage, and queries, and an adaptive query engine that understands the location, resolution, and format of the time series data. Timestream will be a thousand times faster, at a tenth of the cost, compared with using a relational database to handle time series data. It handles trillions of events daily, so it's highly scalable; it has the analytics capabilities you want, interpolation, approximation, and smoothing, right in the service; and it's serverless, so you don't have to worry about capacity: we scale it up and down for you. Pretty exciting.
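To give a feel for the data model described above (a timestamp plus one or more attributes), here is a minimal sketch of writing a sensor reading with boto3's Timestream write client. The database, table, and dimension names are hypothetical, and this reflects the service's later generally available API, not anything demonstrated in the keynote.

    import time
    import boto3

    # Minimal time series sketch: each record is a timestamp plus attributes
    # (dimensions) and a measured value. Names below are hypothetical.
    tsw = boto3.client("timestream-write", region_name="us-east-1")

    tsw.write_records(
        DatabaseName="factory",
        TableName="sensor-readings",
        Records=[{
            "Dimensions": [{"Name": "machine_id", "Value": "press-7"}],
            "MeasureName": "temperature_c",
            "MeasureValue": "81.4",
            "MeasureValueType": "DOUBLE",
            "Time": str(int(time.time() * 1000)),  # epoch milliseconds
            "TimeUnit": "MILLISECONDS",
        }],
    )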
Now I'm going to take a semi-rare detour, if you'll indulge me, and give you an idea of how we're thinking about blockchain. This was interesting: a year ago, a lot of us got asked why AWS didn't announce a blockchain service at re:Invent, even though we have a lot of customers who run blockchain services on top of AWS and we have lots of tools for it. People were curious why, and what we shared was that, in talking to customers, we just hadn't seen that many blockchain examples in production that couldn't pretty easily be solved by a database. I think people assumed that meant we didn't think blockchain was important, or that we weren't going to build a blockchain service, which wasn't true; we just genuinely didn't understand what the real customer need was. Unlike maybe some other folks, the culture inside AWS is that we don't build things for optics; we only spend the resources to build something when we understand what problem you're really trying to solve and believe we're going to solve it for you. So we spent the last part of 2017 and the first half of 2018 talking to hundreds of customers about what it is they really want when they say they like the idea of blockchain, and we found there were two jobs they were trying to do, each a little bit different.

The first was that a significant number of customers effectively wanted a ledger with a centralized, trusted entity, where that ledger serves as a transparent, immutable, cryptographically verifiable transaction log for all the parties they need to deal with. A lot of companies need this; think about supply chains and wanting all your supply chain partners to be aware, and you can see it in almost every industry: on the slide you can see healthcare, manufacturing, government with the DMV, and HR, and think about how many of these types of use cases there are. The problem is that solving this really well, and really scalably, is not so easy today. If you try to solve it with a relational database, it's not built to be immutable, so you'd have to do a bunch of wonky things to make that happen and then maintain it, and there's no way to cryptographically verify the changes. The other way people think about doing it is to use the ledger in one of the blockchain frameworks, but then you have to wade through so much muck and so much functionality you don't need for this use case just to use the ledger: you have to set up a network and multiple nodes, and configure all the certificates and all the members, and so on. And the reality is that that ledger isn't very performant, because it's built for a use case where it needs to get consensus across all the parties for every transaction. That was the first problem we heard, and those were the challenges people were having in solving it.

The second problem customers wanted to solve was a little different. These were typically peer organizations that wanted to do business together without any centralized, trusted entity; they wanted completely decentralized trust, where everybody sees every transaction and interaction, and everybody gets to approve it by consensus before it happens. Most of them are trying to solve this with the blockchain frameworks. But have you tried using these frameworks? It's not easy; it's a lot of muck. We had some of our very best developers and builders inside AWS spend several days trying to get something real done, and it was awfully difficult for them, because you have to wade through all that functionality: you have to set up the networks, provision hardware and software, set up the certificates, and each member has to do their own part. So these are two distinct problems, and if you think about the way we operate, and the way we've provided building blocks and capabilities to you over the last 12 years, we're always going to give you what we think is the best tool for each job, and these are two pretty different problems people are looking to solve.

On the first one, as we were thinking about what we could do for people, we had an epiphany, which in retrospect was fairly obvious, but at the time it was an epiphany: we had actually had to build something like this ourselves inside AWS a few years ago. For services like EC2 and S3 that have giant scale, what we really wanted was a transactional log of every single data-plane change being made, because it makes things like operations and billing much easier. We initially thought about building that on a relational database, which of course doesn't scale, for all the reasons I mentioned. So we built a service, which we called QLDB inside Amazon, to be an append-only, immutable, transparent ledger, and we realized we could externalize it. And that's what we've done: I'm excited to announce the launch of the Amazon Quantum Ledger Database, or QLDB, a fully managed ledger database with a central trusted authority. QLDB gives you that ledger with a central trust authority, like in a supply chain: all the entries are immutable and cryptographically verifiable, and the ledger is transparent to everybody you grant permissions to. It's much more performant than the ledgers in the blockchain frameworks, because we don't have to wait for consensus; it'll be two to three times faster. It's really scalable, it has much more flexible and robust APIs for making any kinds of changes or adjustments to the ledger, and it's easy to use, with SQL-like properties that make it easy to operate. So that's the solution to the first problem.
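For a sense of how lightweight the ledger setup is compared with standing up a blockchain network, here is a minimal boto3 sketch that creates a QLDB ledger. The ledger name is a hypothetical placeholder, and this uses the service's later generally available API rather than anything shown on stage.

    import boto3

    # Minimal ledger sketch: one API call stands up an append-only, immutable,
    # cryptographically verifiable journal. The ledger name is hypothetical.
    qldb = boto3.client("qldb", region_name="us-east-1")

    qldb.create_ledger(
        Name="supply-chain-ledger",
        PermissionsMode="ALLOW_ALL",   # access to the ledger governed by IAM
        DeletionProtection=True,       # guard against accidental deletion
    )

    # Tables and documents are then created and queried with PartiQL
    # (SQL-like) statements via the QLDB driver or the console.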
The second problem, where you want decentralized trust across a group of parties, does need to be solved with blockchain. So I'm excited to announce the launch of Amazon Managed Blockchain, a fully managed blockchain service supporting both Hyperledger Fabric and Ethereum. This service is going to make it much easier for you to use the two most popular blockchain frameworks. Companies that know the number of members they want in their blockchain network, and want robust private operations and capabilities, typically choose Hyperledger Fabric; those that don't know the number of members, or want to allow any number of members to join in a largely public network, usually choose Ethereum. Hyperledger Fabric is available for you to start using today; Ethereum will be available in a couple of months. Managed Blockchain scales to support thousands of applications running millions of transactions. Really, the most exciting part to me is just how much easier it is to get started and get a blockchain operating, with a few clicks: in the console you choose your preferred open source framework, you add the network members, you configure the nodes of the members, and you deploy applications to the member nodes. It just saves a lot of time and is much more efficient. When we heard people saying "blockchain," we felt there was this weird convoluting and conflating of what they really wanted, and as we spent time working with customers and figuring out the jobs you were actually trying to do, this is what we think people are trying to accomplish with blockchain. We're really excited to give you both QLDB and the Managed Blockchain service.

When you look at this collection of database services, this is what we consider database freedom. It's not just the ability to use a performant relational database that's free from abusive and constraining relationships; it's also easy access to the right database tool for the right job. Modern technology companies are not vanilla. Their workloads are diverse, and they vary depending on how much data they're holding, what the latency requirements are, how complicated the joins across data sets are, whether they're working with time series, or whether they want a ledger. You can no more use a single, all-singing, all-dancing relational database that somebody tells you will solve everything than you can use a hammer to build every room of your house, and I would be very skeptical of that rhetoric, very suspicious. The reality is that having the right tool for the right job saves you time and money. This is now your right, and nobody gives it to you the way AWS does, with far more selection of databases and the right tools for the right job. And I think you're going to be excited by the new services we've launched as well.
Info
Channel: Amazon Web Services
Views: 4,212
Keywords: AWS, Amazon Web Services, Cloud, cloud computing, AWS Cloud, AWS Database, Relational Database, Non-relational Database, Amazon DynamoDB, Amazon Aurora, Amazon Timestream, Amazon Quantum Ledger Database, Amazon Managed Blockchain, Amazon Quantum Ledger Database (QLDB)
Id: _K_VHKPeX-M
Length: 26min 44sec (1604 seconds)
Published: Fri Aug 30 2019