AWS re:Invent 2019: Using containers & serverless to accelerate application development (CON213-L)

Video Statistics and Information

Captions
Good morning, everyone. Thank you for coming here bright and early on a Wednesday morning at re:Invent; based on past experience, everybody spends Tuesday night working hard on technical content, so it's difficult to get up early. Just kidding. My name is Deepak. I have been at AWS for over eleven years, and I was here last year in this same theater. Among other things, I am responsible for the containers and Linux organizations at AWS, as well as the open source program office. Having been part of EC2 for over a decade, it's been great to have a bird's-eye view of how customers started building applications on the cloud: they started off with simple things a long time ago, and their applications have gotten more and more interesting, taking more advantage of the core capabilities that AWS gives you and helping us build newer and newer technologies to meet their needs.

My colleague David Richardson is going to come in about halfway through. He's been at AWS even longer than I have; in his 13 years at Amazon he's built some of the largest distributed systems on the planet, things like Route 53 and EBS, and he currently leads our serverless portfolio, which includes Lambda, API Gateway, and EventBridge.

Last year we were here, and David spent some time talking to you about how Amazon made the journey from a monolithic code base to what people call a microservices architecture these days; we just call them services. It wasn't just about decoupling a monolithic application into smaller components, but also about how we organized as a company to make it happen. In our conversations with customers, it's pretty clear that it's not one or the other; they both have to go together. This idea of two-pizza teams, distributed applications, and team autonomy is built into our culture. It drives a lot of how we think about building services and how we operate, and you'll hear me come back to that concept often today. The reason you do it is to give teams more autonomy, the ability to move quickly, and more ownership, but there are trade-offs that come with it. One thing we realized, having many teams running really quickly to build the services all of you use, is that sometimes you end up reinventing a lot of wheels. So it became very obvious to us that we needed to start building "building blocks" to help these teams be more effective, and not have every team reinvent the same wheel again and again.

These building blocks essentially fall into two categories. The first reason you build them is to reduce the work needed to solve common problems. A great example of that is Amazon S3: almost every team requires an object store, somewhere they can put data and know it's going to be reliable, available, and durable. Message queues are a similar concept. You build blocks like S3 and SQS to solve those problems, and any team can pick them up and be off to the races. Sometimes you have really hard problems, and you need a team dedicated to solving them so the rest of the organization can leverage the solution. Networking is one of those; how many people here love building networks? Maybe a couple; I can't see anybody past these bright lights. DynamoDB is another: building large-scale distributed key-value stores is a hard problem,
and operating them is perhaps an even harder problem. You want teams dedicated to doing that, and then the rest of the service teams inside AWS benefit from them. The good news is that these days, in many cases, many folks in this audience do as well. So whether you're trying to address common problems like storage and networking, or hard problems like distributed key-value stores, finding the right building blocks is critical. But there's another part that's also critical: how do you expose those building blocks to people? As you have more and more building blocks, what are the right abstractions that teams can benefit from? Otherwise, yes, they get to choose from a menu of building blocks, but things get harder and harder. That's become an increasing part of our thinking: what's the right level of abstraction, and how do we expose these so that our customers, and we ourselves internally, can use them? This mission has become very important to us. David, I, and many of our colleagues and product teams talk to hundreds, even thousands, of customers every year, and that's really helped us think this through: we've been good at building building blocks, and we've been shipping them for years (a bunch were released on stage just yesterday), but what is the right level of abstraction we can expose, and how can we bring our best practices to our customers in ways that don't require them to think so much? We'll talk about that a little bit today.

Today's talk is divided into three parts. I'll start off by talking about software delivery: how can teams deliver software more quickly and more safely, and what are some of the best practices and patterns we have seen, both inside Amazon and with our customers? I'll talk a little bit about operational models and what those mean, especially in a serverless world. And then David is going to come out and talk about architectures, event-driven architectures in particular. Hopefully this is somewhat similar to last year but covers new ground.

So let's talk about software delivery. Last year, Ken Exner, who runs our developer tools teams, was up here talking about some of the core principles of software delivery at Amazon. One of the things we learned a long time ago is that having a single deployment pipeline for a whole company doesn't make sense. It makes sense if you have one application to deploy, but we have many, many autonomous two-pizza teams; how do you allow each team to move independently, work independently, and deploy quickly? A very simple solution, and it works reasonably well, is having a deployment pipeline per service. Say I have a team operating three services: every service gets its own deployment pipeline, and you have best practices that you can apply across all of them, whether they're changing config, applying patches, deploying software, or updating infrastructure. Because everything is code, and hopefully you're putting it in a repo somewhere with the right checks and balances, you can apply the same good practices across the board: code reviews, unit tests, and so on. The key is allowing each service to have its own pipeline so that it can deploy independently and quickly, and that works really well.
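The talk doesn't show code for this, but a minimal sketch of a per-service pipeline defined in the AWS CDK (TypeScript) might look like the following. The repository, project, and stage names are hypothetical; the point is that each service owns its own pipeline as code.

```typescript
// A minimal per-service deployment pipeline, sketched with the AWS CDK (v1).
// Names (MyServiceRepo, MyServicePipeline, Build) are hypothetical.
import * as cdk from '@aws-cdk/core';
import * as codecommit from '@aws-cdk/aws-codecommit';
import * as codebuild from '@aws-cdk/aws-codebuild';
import * as codepipeline from '@aws-cdk/aws-codepipeline';
import * as actions from '@aws-cdk/aws-codepipeline-actions';

export class ServicePipelineStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    // Each service owns its source repo and its own pipeline.
    const repo = new codecommit.Repository(this, 'MyServiceRepo', {
      repositoryName: 'my-service',
    });

    const sourceOutput = new codepipeline.Artifact();
    const buildProject = new codebuild.PipelineProject(this, 'Build');

    new codepipeline.Pipeline(this, 'MyServicePipeline', {
      stages: [
        {
          stageName: 'Source',
          actions: [new actions.CodeCommitSourceAction({
            actionName: 'Source',
            repository: repo,
            output: sourceOutput,
          })],
        },
        {
          stageName: 'BuildAndTest',
          // Code reviews gate the repo; unit tests run here before any deploy.
          actions: [new actions.CodeBuildAction({
            actionName: 'Build',
            project: buildProject,
            input: sourceOutput,
          })],
        },
        // Deploy stages (one per zone/region) would follow.
      ],
    });
  }
}
```

Because the pipeline itself is code, the same review and test discipline applies to the pipeline definition as to the service it delivers.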
There are many ways to do this. The classic way is: I have a new software system, I deploy the whole new system, there's a big restart, and you just switch over. As far as I can tell, not too many people do that anymore. The most popular approach is what people call blue/green deployments (Netflix calls them red/black deployments): you spin up a new version of the application, and then you switch a load balancer over to start serving traffic to it. What we've realized over time, as applications get more and more complicated and are built from many services, is that you need to think a little differently about what safe, large-scale deployment means. We call that fractional delivery, and it's become a big part of how we deploy software at AWS; I think the community calls it progressive delivery. Both terms work. The idea, best shown by this diagram, is that you build your application, you run a bunch of tests, and you start deploying in your pipeline, but you have a system that's constantly evaluating how the deployment is doing. That could be a combination of a controller that's looking for drift from what you expect, and a set of what we call canaries, which are essentially a replicated set of repeatable tasks that we think mimic customer behavior, traffic patterns, and so on. We deploy to a small number of hosts in one availability zone in one region; that's the starting point. You let it bake for a while, you look at the canaries, you look at traffic (this is in production), and once that passes, you move to the next step. As you're building up traffic to the new application, you're slowly, fractionally taking traffic away from the old application, and as all your tests pass, you end up deploying globally. Different regions have different traffic patterns, so you want to go zone by zone and region by region, sometimes with different sets of tests. The idea is that any time something fails, you just roll back and bring traffic back to the original state. This has worked really well for us, and it's a big part of how we're thinking about the software delivery pipelines our customers use.
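To make the fractional-delivery loop concrete, here is a minimal sketch of the control loop just described: shift a small fraction of traffic, bake, check the canaries, and either expand or roll back. This is not AWS's actual deployment system; the helpers (setTrafficFraction, canariesHealthy) are hypothetical stand-ins for your load-balancer weighting and canary-monitoring APIs.

```typescript
// A sketch of a fractional (progressive) rollout loop. setTrafficFraction
// and canariesHealthy are hypothetical stand-ins for real traffic-shifting
// and monitoring APIs.

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function setTrafficFraction(version: string, fraction: number): Promise<void> {
  // e.g. update weighted target groups or DNS weights here.
  console.log(`routing ${(fraction * 100).toFixed(0)}% of traffic to ${version}`);
}

async function canariesHealthy(): Promise<boolean> {
  // e.g. query canary metrics and alarms here; true means "no drift detected".
  return true;
}

async function fractionalRollout(newVersion: string, oldVersion: string) {
  const steps = [0.01, 0.1, 0.25, 0.5, 1.0]; // fraction of traffic on the new version
  const bakeTimeMs = 15 * 60 * 1000;          // let each step bake before expanding

  for (const fraction of steps) {
    await setTrafficFraction(newVersion, fraction);
    await setTrafficFraction(oldVersion, 1 - fraction);
    await sleep(bakeTimeMs);

    if (!(await canariesHealthy())) {
      // Any failure at any step: return all traffic to the old version.
      await setTrafficFraction(newVersion, 0);
      await setTrafficFraction(oldVersion, 1);
      throw new Error(`rollout of ${newVersion} failed at ${fraction}; rolled back`);
    }
  }
}
```

In practice the steps, bake times, and health checks are exactly the criteria you set, and the machines make the go/no-go decisions.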
But it's not just us who've discovered this. A great example of a company taking advantage of it is iRobot. Before I talk about how iRobot does it: they use Lambda a lot, and one of the things Lambda allows customers to do is take your infrastructure and your application and mush them into one thing. Historically, you had a team that would patch your servers for you, making sure the environment was ready to accept new software, and then you deployed the new software. But what happens when you roll back? If you've patched all your servers and put new libraries on them, and you roll back to a previous version of your application, you have almost no guarantee that the new libraries will work with the old version, because it might have had different dependencies. That's actually one reason containers got really popular. Lambda takes it one step further, because there is no infrastructure separate from your application; your application effectively is your infrastructure, which allows you to move forward very cleanly.

So let's look at how iRobot does it. I'll try to squint: how many people here have heard of iRobot and Roombas? Everybody, or most people. So for the two people who didn't raise your hands: iRobot is to robotic vacuum cleaners what Xerox was to photocopying, for those who remember what photocopying is. There are millions of robots all over the world, and a bad software deployment rolled out to the whole world at the same time would stop everyone's robots at once; somebody would think an alien invasion had happened. They don't want that. They want to be very careful and very safe about how they deploy software, and they take advantage of fractional delivery to do it. At iRobot, everything is configured in code, and they deploy fractionally. They have a set of unit tests written as Lambda functions, a Step Functions-based system they use to scale testing out, and a Jenkins-based deployment system. They roll out slowly, continuously checking, both on the cloud side and on the edge, whether things are reconciling and all the tests are passing, and over time, as they get more confident, they roll out to the whole world, and the robot uprising doesn't happen. This system of fractional delivery may seem like more work, but it can actually move really quickly, because everything is automated: your machines are making the decisions, you set the criteria for what's important, and in the end it's much safer, because your rollbacks happen automatically and you're less likely to have outages. This is a great example, and if Ben Kehoe is anywhere in this audience, you can always find time with him to understand how this works.

We call this application-first deployment. The key is that you focus on the needs of your application. If you have infrastructure the application depends on, you either push it with your application, where it's just the same thing, as Lambda does, or you take advantage of containers, which abstract away the low-level hardware and package the application and its dependencies together. That works really well, but people want to make this easier. They keep coming to us and saying: great, you're giving us all these patterns; how about giving us tooling that makes these patterns more natural? We don't want to have to teach every engineer in our company what it means to deploy a service, what it means to deploy it safely, what the right methods and the right scaffolding are. iRobot is good at this; they built all the scaffolding. Other people come to us and say: please help us do it ourselves. One area we have invested in a lot is CLIs, using command-line interfaces as a way to help developers get there; the CLI is a good starting point. And one thing we discovered was ways to take patterns and represent them in code. A great example of that is the Cloud Development Kit, or CDK, which went GA earlier this year. In CDK you can take patterns that embody our best practices and actually publish them as CDK constructs, and then you can vend those to the rest of your company and say: if you do it this way, it meets our best practices. For example, if you want to run a load-balanced service with ECS, it's a few lines of code in CDK and things will just work.
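The few lines in question look roughly like this: a sketch using the CDK's ecs-patterns construct library, with placeholder stack and image names.

```typescript
// A few lines of CDK that stand up a load-balanced ECS service on Fargate.
// Stack and image names are placeholders; the construct comes from the
// @aws-cdk/aws-ecs-patterns library of curated best-practice patterns.
import * as cdk from '@aws-cdk/core';
import * as ecs from '@aws-cdk/aws-ecs';
import * as ecsPatterns from '@aws-cdk/aws-ecs-patterns';

class LoadBalancedServiceStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    // The pattern creates the VPC, subnets, cluster, task definition,
    // service, and application load balancer with sensible defaults.
    new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'Service', {
      taskImageOptions: {
        image: ecs.ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
      },
      desiredCount: 2,
      publicLoadBalancer: true,
    });
  }
}

const app = new cdk.App();
new LoadBalancedServiceStack(app, 'load-balanced-service');
```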
But that's imperative code, and it's not for everybody. If you're an operator, you probably like CLIs; so what about them? What we ended up doing was building a new generation of the ECS CLI. For those who have used gen 1: it's essentially a mimic of the ECS API; every command is a command-line way of calling the API. The ECS CLI 2.0 takes these CDK patterns and builds a domain-specific language on top of them. With a simple command-line call you can run a load-balanced service: it will spin up services across availability zones, put a load balancer on top, create the subnets, all of that for you, and it will actually build the deployment pipeline as well, to deliver that software. So we are automatically building software delivery and application-first thinking into the CLI. The idea is that things aren't static: you aren't describing all the infrastructure up front in a static file; you have something that's constantly looking at what the pipeline is, what kind of application you want to deliver, and what kind of behavior it should have. Some people call this GitOps, a term that's become popular in the last few years, and it's something we think is great. As more and more people adopt Lambda and containers, I think these kinds of practices will just get more and more common, because it's a little bit easier.

This is an example of the kinds of calls you have with the CLI 2.0: you can init, you can create projects, you can build an application; you have commands at the level of an application, but you also have a whole software release process built into the CLI, and that's the part we just didn't have in the past. You want to make it a lot easier for people to build and deploy applications in ways we think are safe and good hygiene. We have done something similar with Lambda: there are starter templates that are part of the Serverless Application Model, and SAM is very analogous to CDK. You can have patterns that combine things like packaged functions, APIs, databases, and event sources into one template, with best practices described in those templates, and then you can use the SAM CLI, the command-line interface, to do local testing, simulate those templates, build a deployment pipeline, and then go deploy those patterns. I think you'll see us do a lot more of this, on the Lambda side and on the containers side, as we did with the CDK. The key is that all of these projects are open source, so it's not just us building these patterns and deploying them. If there are things you'd like to do that you think are more generally usable, not just within your company, submit a pull request or an issue; we're more than happy to work with you and try to get that done.

So we've talked about software delivery, which is what allows teams to move quickly and deploy safely. As more and more developers think about the end-to-end aspects of what it means to build and run software, the second part is how you operate it, and how people get the right tools in their hands to operate safely: not just your ops teams, but also your DevOps teams and your developers. I mentioned earlier that one of the challenges of building blocks is that you have a lot more pieces to operate. AWS now has 175 services (I might be completely out of date by now), and trying to figure out how to corral all of this together is a bit of a challenge, so we've started thinking increasingly about how we can help remove that complexity. The question we ask ourselves, all the time, is: what do developers need to build and run their applications?
Things that come up again and again as we talk to them: people want to run applications; they don't want to spend time on everything underneath them. It's true today that there are customers with entire teams whose job is corralling the infrastructure on which their applications run, whether they're building Terraform modules, building CloudFormation templates, or trying to build a whole platform that they then vend to their application teams. That's a lot of work that many customers would rather spend building things with business value. They want scaling to be quick and seamless, without having to think about it too much. And the last thing, which has become increasingly important as customers in more security-conscious, regulated industries pick up these modern practices, is: how do we build security and isolation into the designs, rather than making it something you have to engage a separate security team for? Because you can't move quickly if you always have to engage a team outside yours to figure out whether you're secure or not. These are some of the core concerns that keep coming up.

So what do we do about it? The first thing to start from is the AWS shared responsibility model; you've probably heard us talk about it ad nauseam over the last decade. I like thinking about it in terms of EC2: if you have a box, and there's a line in the middle of the box, anything below that line, VM down, is AWS's responsibility; anything VM up is the customer's responsibility; it's yours. What customers want is for AWS to move that line further up, but they don't want us to move it to a point where things become a toy, or where your flexibility goes away so much that it's no longer useful. Finding the right balance is hard. At AWS, the way we do it is to draw the line in five different places, so there are five lines and you can figure out which one you want. That's been our approach so far, and it's probably not going to change, though over time you might figure out once and for all that one of them is the right line and we'll just focus on that one. You are making trade-offs on knobs: there are customers who want every single knob in the world, and there are others who say, I don't want to look at a knob; all I want to do is run my software.

I had a slide up last year that said: cluster huggers are the new server huggers. If anything, the last year has made that conviction even stronger, and here's what I mean by it; this is a particular issue with container orchestration in general. A few weeks ago I went to New York and met a lot of banks, and inevitably every discussion was about: should we run one cluster with a lot of applications, or many clusters, each with its own application? Which instances should be used? How do we use namespaces? All this muck about cluster operations. If you think about what EC2 did, nobody was thinking about which racks they were deploying to, or which particular server their VM was running on; clusters brought all of that back, and actually added more complexity, because there were fifteen other things to think about. The cluster management that many people love, myself included (I ran clusters for years and years), is an artifact of how things ran in a physical world, and I think a lot of what we've done in container orchestration in the last five or six years is almost a regression from what we had done in the cloud before that: we made customers start thinking about capacity again, and that seems to be a problem.
We've added what we call significant amounts of accidental complexity. What is accidental complexity? It's a term I'd never heard until two or three weeks ago, when one of our senior engineers, Jacob Gabrielson, and I were writing a doc on simplifying the customer experience, and he used these two terms: essential complexity and accidental complexity. Essential complexity is something you cannot avoid; it is inherent to the problem you're trying to solve. Accidental complexity is the stuff you don't care about, but it's there because somebody made it your problem. A lot of what we are focused on, and should be focused on going forward, is this: you should spend all your time on the essential complexity, and we should make the accidental complexity go as close to zero as possible.

The area I'm going to talk about is compute capacity, and surprise, surprise, for us that means serverless. If clusters brought back a lot of the infrastructure challenges that we thought things like EC2 had taken away from people, we believe serverless takes them back away and lets customers think less about them. With Fargate and Lambda, there are no hosts: there's no machine you have to go babysit, no figuring out how to bin-pack a machine or which applications should run together. Those are things you don't have to think about. You are deploying a task, a pod, a Lambda function, an application; those become your first-order units. That's where you get billed, that's what you attach networks to, that's what you attach storage to. It's a much nicer model, but we also have to build tools to help you work that way much more easily. It's super important that we start thinking about how to remove the complexity we've added over the last few years; everything it takes to do cluster management, or even multi-cluster management, seems like a waste of time now.

The reality is that not everybody is ready for that. People have to start simple, with something they understand, so they are very likely to start by running containers on EC2 instances. That's fine, but over time they may want to move to Fargate, and the constant challenge we hear from customers is: should I start there? Should I move there over time? They challenged us to help make that decision easier, or at least a late-binding decision. So yesterday we launched a concept called capacity providers for ECS. Capacity providers return us to the idea that capacity is just capacity, and the ECS schedulers and APIs decide how and where your application should run, based on what you tell us. Today ECS supports two types of capacity providers: auto scaling groups, which are essentially EC2 instances, and Fargate. The fun part is that you can basically say, in a config file, that I want 80% of this service to run on on-demand capacity and 20% on Spot. If the Spot capacity goes away, the traffic gets weighted over to the on-demand side; if the Spot capacity comes back, you'll start running 20% on Spot again. You are not doing anything; the scheduler is doing it for you, because you've declared, in a declarative config file, that you want 20% of your application running on Spot.
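As a sketch, that declaration is expressed as weights on a capacity provider strategy when you create or update a service. The snippet below uses the AWS SDK for JavaScript and the built-in FARGATE and FARGATE_SPOT providers; cluster, service, and subnet names are placeholders, and an 80/20 split becomes a 4:1 weight ratio.

```typescript
// Declaring a weighted capacity provider strategy on an ECS service.
// Cluster, service, task-definition, and subnet names are placeholders.
import { ECS } from 'aws-sdk';

const ecs = new ECS({ region: 'us-east-1' });

async function createWeightedService() {
  await ecs.createService({
    cluster: 'my-cluster',
    serviceName: 'my-service',
    taskDefinition: 'my-task-def:1',
    desiredCount: 10,
    // 4:1 weights, roughly 80% on-demand Fargate and 20% Fargate Spot.
    // The ECS scheduler maintains the split as Spot capacity comes and goes.
    capacityProviderStrategy: [
      { capacityProvider: 'FARGATE', weight: 4 },
      { capacityProvider: 'FARGATE_SPOT', weight: 1 },
    ],
    networkConfiguration: {
      awsvpcConfiguration: {
        subnets: ['subnet-0123456789abcdef0'], // placeholder subnet
      },
    },
  }).promise();
}

createWeightedService().catch(console.error);
```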
You can do the same thing with Fargate, and in case you missed it, we also launched Fargate Spot yesterday. Fargate also now works with Savings Plans for long-term discounts, so you have on-demand and Spot there as well. So today you can have a capacity provider that's just EC2 instances, and you can have a capacity provider that's Fargate. But here is where we are going with this: mixing and matching capacity providers. You can't do this today; it's coming, hopefully early next year. You can start off by running everything on EC2 instances, and slowly, over time, just by changing a config file, say: now I want more of my application to run on Fargate. You can go one step further and say: of the portion running on Fargate, run 90% of that on Spot. Or you can run on EC2 instances and, any time Fargate Spot capacity is opportunistically available, burst onto it. Essentially, you've made things declarative, and you've made figuring out what the right capacity is, and how to run on it, our problem; you're not thinking about it explicitly. And if you squint hard, you can take this concept a little further. We also launched ECS for Outposts yesterday, and there's nothing that says an Outpost can't be a capacity provider for ECS, or Wavelength, or a Local Zone. In theory, your scheduler should be able to decide: there's capacity in this Outpost, so I'll run this service over there, and run it inside the nearest AWS Region when there's capacity available there. That's our idea of abstracting away capacity management, which clusters as we know them brought back and made your problem; we don't want that to be the case. We're super excited about this kind of capability and what it enables our customers to do.

A great example of a customer that has adopted serverless containers, for a couple of reasons, is Vanguard. Vanguard, for those of you who don't know, is one of the largest investment companies on the planet, with five trillion dollars under management, and they have some specific needs. They are a regulated company with a lot of high-security web applications, and they want hardening of the container boundary. In a pre-Fargate world, they would have had to figure out, across all their clusters, which applications could run next to each other, and whether they needed soft multi-tenancy or hard multi-tenancy. With Fargate they don't have to think about it: everything, by definition, gets hard isolation. That decision has been taken away from them; it's accidental complexity they don't have to deal with. Every task gets its own network interface, so they don't have to figure out whether tasks on the same host are sharing the network fairly; that's gone away as well. It's an example of continuing to abstract away this accidental complexity, and we believe very strongly that serverless, whether containers or Lambda, does a great job of taking that away from you. Vanguard has pretty much moved everything over to Fargate for a bunch of applications that have regulatory requirements, and that's kind of amazing to see.

There was one other thing we needed to do. Fargate's great, but why should ECS customers have all the fun? What about all the Kubernetes folks?
A lot of what I say about cluster complexity comes from the fact that Kubernetes is single-tenant: every cluster has its own database and its own control plane, which adds a lot of complexity from an operational standpoint. Yesterday we launched Fargate for EKS. What that allows you to do is take Kubernetes pods and run them on Fargate. You get the same hard isolation at the pod level; your pod is effectively a node, so the operational complexity goes away significantly and allows you to focus just on your pods and your services. We think that model will evolve over time, and if you attend some of the other deep dives and Fargate talks at re:Invent, you'll see we're thinking further about how to make it easy for you to build your own orchestration on top of Fargate over time, not just depend on AWS orchestrators.

So let's get back to the operational model we have at AWS. I wasn't kidding when I said we draw about five lines. There's EC2, right at the bottom of the stack, with the line halfway through the box. With ECS and EKS with capacity providers, and managed node groups for EKS, you take away some of the accidental complexity by having us manage the capacity pools your applications run on, so you spend less time thinking about them. With Fargate you take that one step further, by making capacity not your problem, period; it's our problem, and running it efficiently is our problem. Lambda is the extreme case, where you're deploying little functions really quickly, with high concurrency, and not having to worry about the control plane or management plane of a container orchestrator that can handle that complexity. It's great to have multiple lines. So far I've just talked about Fargate; I'm a container guy, I like it; but let's talk about Lambda as well.

"Deepak, actually, could I interrupt and jump in here?" "Do you want to? Yeah." "Operations is kind of messy, so why don't I just jump in. Thank you, Deepak." [Applause]

Okay, so we gave you this very clean theory of operations, and it's what we strive for, but of course sometimes things are a little messy and we can't quite live up to that, so we're constantly trying to improve operations. One place where we know we've let you down in the past has been the network performance, the cold-start performance, of Lambda functions when you were using them with VPCs. For a little over a year we've been working on radically changing how Lambda functions use VPCs, to really just eliminate all of that, and we were excited to finish it up just last week, right before re:Invent. Working with our partners in EC2 networking and their Hyperplane capability, we've basically made a VPC-to-VPC NAT, so that now, rather than needing to scale up your VPC connections as your Lambda concurrency scales up, we just do it one time, when you create or update a function. That process can take a little longer now, but we've not only eliminated the latency that comes with connecting new functions to your VPC, we've also eliminated the need to scale the number of IP addresses with your Lambda concurrency. So hopefully we've taken a lot of operational muck away, and you just don't have to worry about that at all.

On the other hand, there are some times when you know more than we do, and giving you just a few operational controls can let you keep an overall serverless experience, handing off most operational responsibility to us while using your knowledge to improve your own operations. Over the last few weeks we've added a lot of new capabilities to Lambda to make it easier to work with asynchronous and streaming integrations. If you're working with Kinesis streams or DynamoDB streams, you now have greater control for both low-volume, infrequent cases and super-high-volume cases. We launched something called batch window: instead of waiting for a literally full batch, you can set a time limit, so if you have a rather infrequent Kinesis stream, you don't have to worry that setting too big a batch size might mean going hours without an invoke. On the other hand, if you have a super-high-volume Kinesis or DynamoDB stream but you don't want to have to re-shard, you can now set a parallelization factor with Lambda, and we'll actually scale out and run anywhere from one to ten batches in parallel. Along with that, we've added new controls to enhance retries and error handling: you can set a maximum record age, and you can bisect a batch on error. We really just want to give you more capabilities so you can continue to operate in a serverless fashion across a range of data streams.
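A sketch of these controls as they appear on an event source mapping, using the AWS SDK for JavaScript; the stream and function names are placeholders.

```typescript
// Tuning a Kinesis-to-Lambda event source mapping with the newer controls
// described above. Stream and function names are placeholders.
import { Lambda } from 'aws-sdk';

const lambda = new Lambda({ region: 'us-east-1' });

async function createTunedMapping() {
  await lambda.createEventSourceMapping({
    EventSourceArn: 'arn:aws:kinesis:us-east-1:123456789012:stream/my-stream',
    FunctionName: 'my-stream-processor',
    StartingPosition: 'LATEST',
    BatchSize: 1000,
    // Batch window: invoke after 60s even if the batch isn't full, so
    // infrequent streams don't wait hours for an invoke.
    MaximumBatchingWindowInSeconds: 60,
    // Scale out: process up to 10 batches per shard in parallel,
    // without re-sharding the stream.
    ParallelizationFactor: 10,
    // Error handling: drop records older than an hour, split a failing
    // batch in half to isolate the bad record, and cap retries.
    MaximumRecordAgeInSeconds: 3600,
    BisectBatchOnFunctionError: true,
    MaximumRetryAttempts: 2,
  }).promise();
}

createTunedMapping().catch(console.error);
```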
And then, finally, sometimes what we do to improve operations is just replace a bunch of muck you've had to deal with in the past. With Lambda, asynchronous function invocations just kind of end; we didn't really give you much help with that. You could write logic to publish the results somewhere, but that was code you had to write. The good news is that now, with Lambda destinations, you can just eliminate that: when you configure your function, you can configure it to deliver the result to one of many destinations. It could be another Lambda function, an SNS topic, or an SQS queue. That way, you can move into configuration what you previously had to own in code.

In that same spirit of giving you just a few knobs, we were really excited last night to launch Lambda provisioned concurrency. This is a capability that addresses super-latency-sensitive applications: those where all the work we're doing to reduce cold starts hasn't been enough, maybe because, even after we've eliminated that network latency, and after the work we did last year with Firecracker, the cold start times were still too much for your latency-sensitive application, or maybe because your own code takes too long in its initialization phase. Now you can have consistent, double-digit-millisecond start times on your functions by provisioning a set of concurrency ahead of time. And it's very dynamic; this is not something you can only change once a day or once a month; you can change it every few minutes. It comes with a set of metrics that tell you about your concurrency, so you can even hook it up to auto scaling: if you have a nicely time-of-day-predictable workload, you can connect it to auto scaling, and if you know you have an event coming, you can provision it yourself, turn it up, and turn it down. You can even combine it with burstability: if you only ever want to deliver a consistent performance experience, and would rather deliver errors beyond that, you can set a cap; or, if you want to be able to take on more traffic than expected, it can burst in the normal way.
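As a sketch, turning provisioned concurrency up before an event and releasing it afterwards is two API calls; the function and alias names below are placeholders.

```typescript
// Provisioning concurrency ahead of an expected traffic event, then
// releasing it afterwards. Function and alias names are placeholders.
import { Lambda } from 'aws-sdk';

const lambda = new Lambda({ region: 'us-east-1' });

async function preWarm() {
  // Keep 100 execution environments initialized on the 'live' alias,
  // giving consistent double-digit-millisecond start times.
  await lambda.putProvisionedConcurrencyConfig({
    FunctionName: 'checkout-handler',
    Qualifier: 'live', // applies to a published version or alias
    ProvisionedConcurrentExecutions: 100,
  }).promise();
}

async function coolDown() {
  // Remove the provisioned concurrency once the event has passed;
  // the function keeps running purely on demand.
  await lambda.deleteProvisionedConcurrencyConfig({
    FunctionName: 'checkout-handler',
    Qualifier: 'live',
  }).promise();
}

preWarm().catch(console.error);
```

The same setting can instead be managed by Application Auto Scaling for time-of-day-predictable workloads, as mentioned above.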
Really, our whole thought process on this was that we don't want you to have to leave Lambda if you have an application where cold starts interfered with what you wanted to accomplish. Instead, we wanted you to be able to just add in this new capability, keeping everything else the same; your function code doesn't change. Lots of customers will have the same function running purely on demand most of the time, and only for particular events will they provision concurrency. We didn't want you to have to leave the realm of Lambda. We're looking forward to seeing how customers use this, and we know that to use it you need an ecosystem of support, so we were also very happy to have many of our partners in the provisioning and monitoring ecosystem build in support for provisioned concurrency at launch. Thank you to all of these partners for doing that.

Another thing we know is that your architectures are a long-term bet, and recently AWS announced a new way to commit to purchasing EC2 instances and Fargate, called Savings Plans. As a preview announcement: coming early next year, you will also be able to have those commitments apply to Lambda. We want to make sure that as your architecture evolves, you don't feel that your financial commitment has been stranded, and that that's a reason you can't evolve and adjust your architecture over time. Lambda will be fully functional within your commitment in Savings Plans.

So overall, as Deepak said, our goal at the end of the day is really to help you innovate, to help you deliver as much business value as possible, as rapidly as possible. We know that operations is hard; it's a critical part of working in the cloud. So wherever we can, we want to provide some strong leverage, some gears that you can use at whatever level is appropriate for running your business, and hopefully, over time, we can eliminate more and more of that so you can focus just on business logic.

Okay, so this is where I was supposed to come in. Sorry, I can't go more than a few hours without talking about operations or I get jittery. Hi, I'm David. As Deepak mentioned, I lead our serverless portfolio, and I wanted to talk about application patterns, architecture patterns, which really, at the end of the day, come down to communication patterns. It's why Deepak started out by talking about two-pizza teams: a two-pizza team is just there to limit the number of people who have to talk with each other to make a decision, because decision-making is hard, and the more people involved, the harder it is. So that's about communication management. For our architecture patterns, I think back to the earliest days of the internet and the web, and one of the architecture patterns that has led to so much of the innovation we've seen over the last twenty or thirty years can be called "small pieces, loosely joined": the idea that you want to be able to innovate in little incremental areas, sometimes to do something big and huge, sometimes small and incremental, and the more you can do that independently, with loose communication, leveraging what's already out there, the greater your ability to innovate, especially on a lot of fronts at the same time. So we think about there being three core architecture patterns that support this: API-driven (request/reply) systems, event-driven systems, and data-stream-driven systems.
We'll talk about each of those. I tend to think of APIs as the front door of microservices. They really are that little bit of a guarantee the engineering team makes to everybody else: this is what I'm promising I'm going to deliver to you. Behind that front door, they can do whatever they want; as long as they don't change that contract, they're free to innovate, whether that's changing the implementation, adding new capabilities, or optimizing performance. That front door is critical, and inside Amazon we even talk about it as the relationship between two-pizza teams: we talk about APIs as hardened contracts. The API is the thing that requires high judgment to change, but if it's not in the API, you're free to innovate.

Even something as simple as that still requires work. There are lots of properties of a well-mannered API: usually you want metrics generation, you want throttling, you want some type of access control, you want the ability to evolve your API, maybe to generate SDKs. API Gateway is there to help, so that you don't have to spend your time innovating on what it means to have a normal, well-factored API; you can innovate on what that API provides. API Gateway can act as that front door, and it's capable of integrating with a wide array of systems behind the scenes, whether those are Lambda functions, containers, or instances; it can even talk back to legacy systems on premises using VPCs and Direct Connect. We have customers like realtor.com who use API Gateway as a large part of how they serve their customers. Anybody who's ever bought an apartment or a house knows it's a very visual experience, and realtor.com serves hundreds of millions of images every day, mostly through API Gateway, using the caching built into API Gateway to support that load.

But we know that sometimes API Gateway has been perceived as too expensive or too slow, so later this afternoon we are going to launch, in preview, a new version called HTTP APIs, designed to be up to 70% lower cost and up to 50% lower latency, with built-in standard support for OAuth and OIDC, and a simpler getting-started experience: more of a one-click flow, rather than the multiple steps you need today to define your APIs. We're hopeful this can help you embrace API management in places where it may have been cost-prohibitive, or didn't have the latency or simplicity you needed before.
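As a rough sketch, creating an HTTP API can be a single call. The snippet below uses the ApiGatewayV2 client from the AWS SDK for JavaScript and assumes the quick-create path that wires the API straight to a Lambda function; the function ARN is a placeholder.

```typescript
// Quick-creating an HTTP API fronting a Lambda function. The function ARN
// is a placeholder; this assumes the ApiGatewayV2 quick-create path that
// sets up a default route and integration for you.
import { ApiGatewayV2 } from 'aws-sdk';

const apigw = new ApiGatewayV2({ region: 'us-east-1' });

async function createHttpApi() {
  const api = await apigw.createApi({
    Name: 'my-http-api',
    ProtocolType: 'HTTP',
    Target: 'arn:aws:lambda:us-east-1:123456789012:function:my-handler',
  }).promise();

  // The returned endpoint serves traffic once Lambda grants the API
  // permission to invoke the function.
  console.log(`invoke URL: ${api.ApiEndpoint}`);
}

createHttpApi().catch(console.error);
```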
So APIs are present no matter how you're building things; they're definitely present in a request/reply architecture. But one of the other common architecture patterns is event-driven, and that's one of the most common ways people build real-world serverless architectures. It gives you this nice property of decoupling services so they can scale independently. In a way, it's a back-to-the-future architecture: enterprise message buses became quite popular in the late 80s and early 90s, in many ways for the same sort of reason. Enterprises are complex, evolving organizations, whether through mergers and acquisitions, entering new business initiatives, or recognizing that some places need to be optimized. The idea of loose coupling, where you have a common place, a message bus, a pub/sub system, where one set of systems can publish interesting information that can be consumed and reacted to in other parts, is a long-lived architecture pattern. We try to provide a large number of services you can build on in this architectural style, whether it's an event-sourcing pattern, as with Kinesis or DynamoDB streams, or some of our core event-oriented offerings that manage topics, queues, and events. Our goal is a set of capabilities that remove a lot of the muck of building an architecture this way, by operating the infrastructure on your behalf, and it can span from old, traditional enterprise applications all the way up to very modern IoT systems, which also tend to publish facts, and to modern messaging systems.

Customers are able to use the Simple Queue Service, and we've recently added support for ordering: we launched SQS FIFO queues, and Lambda recently added native integration to process messages in order. With SNS, our notification service, we recently added dead-letter queue capabilities, so you can control what happens to a message if it isn't processed within the time that you'd like. And within events, we've recently taken our existing CloudWatch Events system and doubled down on it to create the Amazon EventBridge service, and I wanted to talk a little bit about that.

We really see events, which can be just very simple facts about things that have happened in systems, as one of the core ways of building cloud-native applications, and a vast majority of AWS customers already program with the events that almost every AWS service generates through CloudWatch Events. Recognizing that, we thought: we're happy that customers are using the AWS events and publishing their own custom events, but we wanted to make sure that as you all use more software-as-a-service providers, you're able to program in the same way. So over the summer, partnering with about ten SaaS providers such as PagerDuty and Zendesk, we added the ability to very easily have your SaaS provider publish events into a bus that you can then program with in the same way you program with AWS-native events. We're happy to have a couple of new partners who just recently announced support, including MongoDB with their managed MongoDB service.

There are a lot of things you have to do, though, when you develop with events. You have to have a way to publish events, and we're working on making that easier. You also have to have a way to consume events and program with them, and to be honest, this is a place where traditional request/reply APIs have had better developer support: you could really work with them at the level of a type, rather than just a set of strings or JSON that you had to figure out how to parse. We wanted to simplify that, so on Monday night at Midnight Madness we launched a new ability to create a schema registry and automatically discover the schemas flowing through your event systems. There are APIs to program and create a new schema if that's what you want to do, but for an even easier experience you can just turn on discovery on your CloudWatch event bus, and it will automatically detect the keys and the types of those keys and publish them into a registry. And then we have a set of language bindings.
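For context, publishing a custom event onto a bus looks roughly like this with the AWS SDK for JavaScript; the bus name, source, and detail payload here are hypothetical.

```typescript
// Publishing a simple fact onto an event bus. The bus name, source,
// and detail payload are hypothetical.
import { EventBridge } from 'aws-sdk';

const events = new EventBridge({ region: 'us-east-1' });

async function publishOrderShipped() {
  await events.putEvents({
    Entries: [{
      EventBusName: 'my-app-bus',
      Source: 'com.example.orders',
      DetailType: 'OrderShipped',
      // Events are just facts; consumers subscribe with rules that
      // pattern-match on fields like source and detail-type.
      Detail: JSON.stringify({ orderId: '1234', carrier: 'UPS' }),
    }],
  }).promise();
}

publishOrderShipped().catch(console.error);
```

Schema discovery works by watching events like this one flow through the bus and registering the detected keys and types, which is what the language bindings are then generated from.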
So I thought I'd show you a short demo, because the goal of all this is really to enhance the developer experience: we wanted you, in your IDE, whatever that may be, to have the same strong typing and auto-completion support with event APIs that you already have with traditional APIs. As I mentioned, we recently had MongoDB join as an event source, and this is the sort of experience you have activating your SaaS account to start publishing into EventBridge; it's about as simple as that. Once you've completed that on the SaaS side, you can go into EventBridge and start to interact with this new stream of events, such as by looking at the discovered schemas the system has registered from that bus. That's nice on the console, but where it really gets interesting is moving into your IDE, where you can also browse the schemas and take a look at them. Each one just shows up as JSON, but when you go to actually program with these, you'll see that inside a project you create, you'll have the normal sorts of experiences: browsing, tab expansion, getters and setters, the usual things that make it easy to work with events. As I mentioned, this is in preview; we'd love to get your feedback as you use it, and we plan to roll it out early next year.

So: request/reply APIs as the front door; event-driven, whether that's simple events on a bus, topics, or queues; and the third major architectural pattern is data streams. This can span a pretty broad range. It can be data streams as events, such as when you need strong ordering and want to use an event-sourcing type of pattern, which works really well with DynamoDB streams, where you're probably already interacting with the database and it gives you almost a transaction journal that other systems can be driven off of, or with Kinesis streams. It's also quite common for customers to use Kinesis streams in particular as a very-large-volume data processing system; that can be great connective tissue between different portions of your system at high data volumes, and that's why we launched those recent capabilities.

But we know that sometimes what you have to interact with is not a brand-new, cloud-native-style architecture like data streams; instead, you have to work with existing relational databases, and to be honest, that's been a challenge with Lambda, because they just have different design centers. With Lambda, compute is very ephemeral: it scales up near-instantly to as many instances of concurrency as you want and then scales back down, whereas databases, with their connection-pooling approach, tend to want a not-quite-as-dynamic range. Customers who've tried to drive high-scale or high-burst workloads against relational databases have had to go through more operational muck than we would have liked. So, partnering with our RDS colleagues, we've built basically a proxy that can help with that connection pooling, and we announced it in preview earlier this week. You can think of it as two sides of the relationship between your compute and your database. The side between the proxy and the database can be owned by a database administrator and can scale at the normal connection rates of a typical database client, while the Lambda side can scale to the sort of big peak-to-average ratios you're used to with Lambda functions, with the proxy being smart about reusing and multiplexing those connections so you don't have to worry about it; that's its main goal. But we also know that managing security with databases can be challenging. Nobody really likes having to distribute database credentials out to every function, let alone the hundreds of different versions of functions that might need credentials; you've been able to use Secrets Manager to do that, but it's still kind of unsettling. So the other thing we've done is make it possible to split the security responsibilities: now only the proxy has to know the actual database credentials, which can be controlled by the database administrator, and you can use IAM between the Lambda functions, with their role, and the proxy, so you can use cloud-native auth to control that side of the experience.
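A sketch of the Lambda side of that split, in TypeScript: the function asks IAM for a short-lived auth token instead of holding a database password. The proxy endpoint, user, and database names are placeholders, and the 'pg' PostgreSQL client is an assumed dependency.

```typescript
// Connecting to a database through a proxy endpoint using IAM auth
// instead of a stored password. Endpoint, user, and database names are
// placeholders; the 'pg' PostgreSQL client is assumed.
import { RDS } from 'aws-sdk';
import { Client } from 'pg';

const PROXY_HOST = 'my-proxy.proxy-abc123.us-east-1.rds.amazonaws.com';

const signer = new RDS.Signer({
  region: 'us-east-1',
  hostname: PROXY_HOST,
  port: 5432,
  username: 'app_user',
});

export async function handler(): Promise<number> {
  // The short-lived token is derived from the function's IAM role;
  // no database credential lives in the function's code or config.
  const token = signer.getAuthToken({});

  const client = new Client({
    host: PROXY_HOST,
    port: 5432,
    user: 'app_user',
    database: 'orders',
    password: token,
    ssl: true, // IAM auth requires TLS
  });

  await client.connect();
  const res = await client.query('SELECT count(*) FROM orders');
  await client.end();
  return Number(res.rows[0].count);
}
```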
So, along with these three key architectural patterns of APIs, events, and data streams, we know that most real-world applications need some amount of coordination, especially in a serverless world where you're often working with multiple services together to leverage each of their capabilities. You can do that coordination in code, but sometimes it's nice to move it into a dedicated service that's focused on it, where you can observe what's happening in a workflow, and many customers use Step Functions for that. But just like with API Gateway, we know it's sometimes been too expensive, and not low enough latency or high enough transaction rate, for some of your workloads. The original design center of Step Functions, our standard workflows, is incredible durability and exactly-once execution: you can run a workflow that takes up to a year to complete, with very strong exactly-once guarantees. That's over-engineered for some use cases. So yesterday we announced the GA of Express Workflows, which are designed instead to pair with Lambda-type workloads that can be very ephemeral and very fast: you can now run up to a hundred thousand executions per second, so much higher volume at a substantially lower price point, but shorter in duration, and with the trade-off that these are at-least-once rather than exactly-once. It's possible to use the two workflow types in combination, though. If you have things that are okay to restart, with a little ephemerality, that's a great use case for Express Workflows; if you have things where you really don't want to have to restart, and you don't want to manage a lot of checkpointing yourself, you can use standard workflows. They'll even compose together: you could have a very long-lived workflow, running at relatively slow rates, driven by standard workflows, that in turn drives high rates of Express Workflows.
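As a sketch, the workflow type is just a parameter at creation time; the state machine name, role, and definition below are placeholders.

```typescript
// Creating an Express workflow: the same Amazon States Language, with
// type 'EXPRESS' selected at creation. Name, role, and definition are
// placeholders.
import { StepFunctions } from 'aws-sdk';

const sfn = new StepFunctions({ region: 'us-east-1' });

// A trivial one-state definition in the Amazon States Language.
const definition = JSON.stringify({
  StartAt: 'Process',
  States: {
    Process: {
      Type: 'Task',
      Resource: 'arn:aws:lambda:us-east-1:123456789012:function:process',
      End: true,
    },
  },
});

async function createExpressWorkflow() {
  await sfn.createStateMachine({
    name: 'high-volume-processing',
    roleArn: 'arn:aws:iam::123456789012:role/sfn-exec-role',
    definition,
    // 'EXPRESS' trades exactly-once for high volume and low cost;
    // use 'STANDARD' for long-lived, exactly-once workflows.
    type: 'EXPRESS',
  }).promise();
}

createExpressWorkflow().catch(console.error);
```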
So overall, our goal is to help you deliver as much business innovation as possible, as rapidly as possible, and we've shared a set of our common practices, whether those are software deployment, our operational model, or application development. Different customers go through different journeys in starting to adopt these. Sometimes it's rather organic: it's pretty common, especially looking at the use of Lambda, for people to start out with IT automation; it's really easy to just glue a few things together through all of the integrations. They may then move into data-flow processing, because of the deep integration between Lambda and things like Kinesis streams: if you have a lot of data, you usually want to do some type of computation on it, and it's quite easy to just activate some Lambda functions, especially with the enhanced scaling controls we've launched. Then, as customers start to figure out their microservices strategy and how they want to build applications, they'll often move into building serverless microservices. That's a very common evolution. But sometimes what customers decide is what we call a serverless-first strategy: I would like to challenge my organization, when we're building new applications, to see if we can do it serverless first, and only convince ourselves that we can't before we take on some other technique. Again, the idea is to have very rapid development and fast time to market. That's really what this is about: we want you to be able to deliver value to your business without having to manage a lot of undifferentiated stuff. I think that's especially important in changing business conditions, where business units that traditionally may not have been very technical need digital support, computational support, and help from development teams. Being able to respond rapidly and iterate, so that it's not a giant project but something you can experiment with along the way, fits really well with a serverless-first approach.

No matter what approach you take, we really encourage you, as you're deciding how to go about building cloud-native applications, to think about which approaches give you the most agility, so that you can experiment and iterate and don't get locked into something you decided several years ago with no way out. Take on practices and development techniques that help you with agility, and let us try to deliver as much elasticity for you as possible, so that you're not in the midst of having to control and provision everything. Think about only the parts of provisioning that you really need, where you know more than we do, where that lower gear is going to be more helpful to your business, and where possible, let us drive elasticity for you. I can tell you it's a huge area where we continue to invest: every place we add a capability that lets you provision behind the scenes, we're also doing as much as we can to innovate so that you never have to use that capability, so we can just figure it out for you. And then finally, look at total cost efficiency, which is hard, and we keep trying to give you more guidance so that you can predict those costs. That's a whole range, from the literal infrastructure and how well you utilize it (can you use things like auto scaling, or the scaling built into a lot of these services, so you don't have to figure that out?) to your development costs.

I know that's a whole bunch of stuff, a lot of practices, and it can feel overwhelming. If I can leave you with one thing, it's: just go build. Inside Amazon we have a leadership principle we call bias for action, and that's because there's nothing that teaches you more than doing things.
So we hope you'll take advantage of these offerings, just start building something, and give us feedback as you do. We really appreciate you coming out and spending the week with us at re:Invent. We'll be available off to the side if anybody has any questions. Thank you very much. [Applause]
Info
Channel: AWS Events
Views: 3,390
Rating: 4.88 out of 5
Keywords: re:Invent 2019, Amazon, AWS re:Invent, CON213-L, Containers, AWS Fargate, Amazon EKS, AWS Lambda
Id: IcXjZMRSCcU
Length: 58min 5sec (3485 seconds)
Published: Thu Dec 05 2019