Lambda low latency runtime | Serverless Office Hours

Captions
Well, hello everybody — good morning, good afternoon, good evening, wherever you may be in our lovely world. Thank you so much for joining us live for another installment of Serverless Office Hours. We're streaming on the AWS Twitch channel, on YouTube via Serverless Land, and on LinkedIn Live. I'm Julian Wood, a Developer Advocate for serverless here at AWS, and I'm joined by Speedy Gonzales himself — we'll find out why — Richard Davidson. Richard, welcome back to Serverless Office Hours, how are you?

I'm good, thank you — thanks for having me again, Julian. It's been a blast since last time; I think we did SnapStart, right? Yes, SnapStart, I believe.

For people who don't know you: tell us about your journey. I know you're based in Sweden and you've been at AWS just over two years — what have you been working on?

Yes, it's actually two years already, so time flies. I currently work in a role where I help partners navigate the complexities and intricacies of modern applications and how best to build and deploy them on AWS. By modern applications we mean serverless and container applications, so mainly helping partners excel with that type of technology on the AWS platform.

Sounds fantastic. We'll get to what Richard is talking about today, but first a quick look back at the past week of serverless in the world of AWS. Last week's stream was great fun: Andy Booth from Human Traffic Graphics joined us talking about Java, AI and ML on Lambda — a combination of words you'd think would make things really slow: Java, machine learning models, huge amounts of storage and resources. But Andy has hacked everything to do with everything and came up with a really cool solution doing demographics work with machine learning. That's all available on our Serverless Land YouTube channel, along with all the other Serverless Office Hours episodes — a great back catalogue of super interesting things.

In terms of what's new in the world of serverless, there were two interesting things from DynamoDB this past week. The first is resource-based policies, which means you can use DynamoDB cross-account. There's been some debate in the industry about whether that's a good or a bad idea, because you often want to maintain your bounded contexts within a microservice; but if it unlocks doing migrations, connecting to something remotely, or microservices that need to talk to a shared system, it makes that so much easier. The second is DynamoDB support for AWS PrivateLink, which is great for people who want to connect to DynamoDB from on-premises workloads or from their VPCs without needing a NAT gateway or similar. Super useful for DynamoDB customers.
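To make the resource-based policies announcement a little more concrete, here is a rough sketch of what granting cross-account access to a table might look like. The account IDs, role name, table name and the exact shape of the PutResourcePolicy call are illustrative assumptions, not something shown on the stream:

```js
// Hypothetical sketch: attach a resource-based policy to a DynamoDB table so a
// role in another account can read and write it. All ARNs and names are made up.
import { DynamoDBClient, PutResourcePolicyCommand } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({});

const tableArn = "arn:aws:dynamodb:eu-west-1:111111111111:table/Orders";

const policy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      // The role in the *other* account that is allowed in.
      Principal: { AWS: "arn:aws:iam::222222222222:role/OrderServiceRole" },
      Action: ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
      Resource: tableArn,
    },
  ],
};

await client.send(
  new PutResourcePolicyCommand({
    ResourceArn: tableArn,
    Policy: JSON.stringify(policy),
  })
);
```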
In terms of the compute blog posts that have come out: Dave Boyne published one a week or so ago about CloudEvents, an open industry standard, and using it with Amazon EventBridge, which has great transformation capabilities — essentially reshaping the envelope format of the event, which is really easy. And Andre Stall, who has been on Serverless Office Hours before talking about chaos with Lambda, has a post on chaos experiments with Lambda using the AWS Fault Injection Service: super useful for adding latency, injecting errors and so on, so if you want to test how your Lambda-based functions behave when strange things happen, that's a great read.

We are live, so please send us your questions and comments and say hi. I also need to bring up the third post: design approaches for building serverless microservices. I've spotted that my 50 Cent, otherwise known as Luca Mezzalira, is on the stream — welcome, Luca, thanks for joining us. Luca and Matt Diamond have written a great blog post on design approaches for building serverless microservices. If you think your Lambda functions are too big or too small, right or wrong, these are great considerations to think about — my phrase for it is the pragmatic Lambda function. There are some wrong ways to do it, some right ways, and trade-offs between them, and Luca and Matt's post is super useful. And Robert Tables, thanks for joining us again — Robert's a regular listener — who says the Fault Injection Service is really good for feeling confident in production. Love it.

Some events coming up: it's AWS Summit season, so the Amsterdam Summit is coming up soonish, with more Summits being added to the list. I'll be speaking at the London Summit — I was working on my presentation today — together with Michelle Schisel, so if you're coming to London please come and say hello; we'd love to meet you. AWS is also partnering up for an EDA Day event, an event-driven architectures day in London on May the 14th. We ran it about 18 months ago, it was ridiculously sold out and fantastic, so this is installment number two — lots of talks are being worked on and some have been publicly announced, so it's going to be really useful for people wanting to build event-driven architectures. Also, our friends at Fauna have a webinar on building serverless event-driven architectures at scale — Fauna is a serverless database in the cloud — and how they're using it with Step Functions. The link takes you to the registration page, and Tanbu, our top product manager for Step Functions, and Ben Smith, our top expert on all things Step Functions, will be joining that webinar, so certainly take a look.

So that's the past; pivoting to the future: Richard, a Lambda low latency runtime. Tell us about this — I know it's blazingly fast, but what was the genesis of it, and why?

Okay — sure, that's why we're here, that's the meat of it. But first I want to ask you: what are you talking about at the Summit? That sounded interesting.

At the London Summit I'm doing a slightly modified version of my serverless best practices talk. It's 45 minutes and we're going to jam-pack in as many serverless best practices as we can,
so you can get tips and tricks. It covers Lambda, Step Functions and EventBridge, how you can orchestrate and choreograph, your software development lifecycle, and some of the bits of Lambda you can optimize — and yes, some of it is definitely optimizing cold starts and looking at your code. You're our guru on that.

And on that segue: what is the low latency runtime, or LLRT? First of all, if I asked you for one thing that people quote-unquote find challenging, or see as a drawback, with serverless applications, you might say it's cold starts, right?

Definitely. I've been at AWS for what, four and a half years, and I feel as though cold starts just take up all the oxygen in the room. They are a consideration, but in my opinion the concern is way overblown. Cold starts affect less than 1% of invocations; for anything asynchronous you shouldn't really care; and there are plenty of options, from SnapStart to Provisioned Concurrency, to deal with them. That said, there's always room to optimize cold starts, and specifically when you're running synchronous APIs it's certainly something to look at — we all want to reduce it where we can.

Exactly, and to your point, the whole conversation about cold starts and that quote-unquote problem — it's not really that much of a problem, because when you look at production metrics you see exactly what you said: about 1%, maybe 2%, in many cases even less. Cold starts typically occur during development of your serverless functions: you do a new deployment, you invoke it, and you can see it takes some time. However, optimizing for cold starts is still a good thing, because the faster any individual cold start is, the fewer cold starts you'll have overall. The reason is that whenever your Lambda function is busy and gets two invokes at the same time, and hasn't had two at the same time before, it has to provision an additional execution environment; once it has two ready to go, that won't happen again. So every optimization we can make is beneficial from a user experience perspective, a developer experience perspective and a cost perspective.

In the current landscape we have an execution model supporting a whole heap of languages, from Java to Python to Node.js; you can run custom runtimes, we have .NET — basically, you name it, we can support it, especially since you can build your own. My approach to tackling this problem was, rather than taking an existing general-purpose programming language and execution model typically built for serverful or traditional server environments, to look at it from the other way around: start from the resource-constrained environment we have inside AWS Lambda, where we want to keep resources — particularly memory — as low as possible from a cost perspective. With those premises:
how can I build something purpose-built for that — say 128 megabytes of memory and the vCPU allocation that corresponds to it? With fewer resources available, what execution model and programming language would fit best? The best answer is really to go native: build something in C, C++ or Rust. But in reality customers aren't going to rebuild everything in Rust, because it's rather complex and has a steep learning curve — I'm pretty well versed in Rust myself and still sometimes find it less productive than other environments — and there's also a benefit to an interpreted approach, where you execute straight from source code: think Python, think Node.js running JavaScript.

With those premises, JavaScript is a really good fit. It has a very active community, a lot of support, and it's one of the most popular ways of running Lambda functions — Node.js is one of the most popular runtimes. The good thing about JavaScript is that there has been a lot of innovation in recent years: there's not only Node, there's also Deno and Bun and a whole heap of other runtimes and engines that can execute JavaScript — you can even run JavaScript inside Java, which is quite funny, since people sometimes confuse the two. And there are some really lightweight engines out there. There's a difference between an engine and a runtime: what I did with this project is take a JavaScript engine — something that can execute JavaScript but has basically no APIs: no console, no file system, no networking — that is very, very lightweight. That project is QuickJS; it's essentially one huge C file, around 50,000 lines of C, and the whole engine comes in at under one megabyte. Compare that with what's running inside Node.js, Deno or Bun, which is more like 20, 30, 40 megabytes — imagine being 40 times less complex, in a sense. That obviously has some drawbacks, which we'll talk about later, but it means a very lightweight package.

So: can I take that and build APIs on top of it, with a different approach? I use Rust, because Rust compiles to native code, meaning it's very fast with minimal cold start. Combining the two — Rust for the APIs, a lightweight JavaScript engine exposing them — while staying compatible, or at least partially compatible, with the Node.js specification means it should be as low-effort as possible to port your existing Node.js applications and have them run on Lambda with minimal cold starts. Let me share my screen in a moment.

Certainly. Just to pick up on some of the threads in there: I like the focus on cost, because a lot of the buzz has been about this being super blazingly fast, but as we know, with Lambda, cost is directly correlated to speed. You pay in gigabyte-seconds, based on the memory you configure, so less memory and less execution time means lower cost.
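To make that cost relationship concrete, here's a back-of-the-envelope calculation. The per-GB-second price is an approximate published arm64 on-demand figure, the durations are invented, and the per-request charge is ignored — treat it as illustration only:

```js
// Rough Lambda cost model: you pay for GB-seconds (configured memory × billed
// duration), so shaving duration at the same memory setting scales the bill
// down roughly linearly. Numbers are illustrative, not a quote.
const PRICE_PER_GB_SECOND = 0.0000133334; // approx. arm64 on-demand price
const memoryGb = 128 / 1024;              // a 128 MB function

function costPerMillionInvokes(billedMs) {
  const gbSeconds = memoryGb * (billedMs / 1000);
  return gbSeconds * PRICE_PER_GB_SECOND * 1e6; // ignores the per-request charge
}

console.log(costPerMillionInvokes(100)); // ~ $0.17 per million invokes
console.log(costPerMillionInvokes(35));  // ~ $0.06 — faster code, lower bill
```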
And I know we have customers running Lambda at absolutely ridiculous scale. They're not necessarily complaining about cost, but if you can shave milliseconds off their runtimes at that scale, that's significant money going back into the company to do other things.

Exactly — and it's not just the saving on cold starts, which can be up to an order of magnitude, ten times faster or more, as we'll see in the demo; it's also the warm starts. The reason, again, is this simpler approach: a simpler engine actually works better when you have low resources. If you think about it, it's kind of obvious: Node.js, Bun, Deno — their fundamental components, the engines, come from web browsers, and that technology was not purpose-built to run in an ephemeral environment, something that starts quickly and gets discarded quickly. It doesn't really matter whether your Chrome browser starts in 500 milliseconds or 200 milliseconds; nobody cares, because you spend hours in it and it optimizes over time. Those engines all rely on JIT, just-in-time compilation: as JavaScript functions are called inside the engine, a profiler analyzes them and generates machine code that then gets executed. That's great for long-running tasks and sustained workloads, when you do hundreds of thousands or millions of iterations, or browse the web for hours. But inside Lambda, with a low memory footprint, we can't rely too much on that fantastic piece of technology, because it requires a lot of memory — and we can spend that precious memory, and the compute the just-in-time compiler would use, on actual execution instead. Rather than profiling and searching for an optimized way to execute your JavaScript, we can start executing it right away. It doesn't really matter if that would be slower over a very long run, because we're not using it that way — we're using it for short-lived work, with the runtime refreshed very regularly.

Exactly — and the environment itself: first you have a cold start when a new execution environment is provisioned, then it lives for some amount of time — anywhere from five minutes up to maybe an hour; it's not defined and depends on a lot of factors. With few resources, that's not enough to give you the same level of benefit that a sustained, constantly running application on a traditional server gets.
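To picture the kind of code being described — the hot, compute-heavy loop a JIT shines on and a lightweight interpreter doesn't — here is a deliberately contrived sketch; the numbers and the workload are invented purely for illustration:

```js
// A hot numeric loop like this is exactly what a JIT (V8 in Node.js, JavaScriptCore
// in Bun) optimizes well after profiling a few thousand iterations. In a bytecode
// interpreter such as QuickJS it still runs, just noticeably slower — which is why
// hashing or simulation-style workloads are a poor fit for LLRT.
function simulate(iterations) {
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    // Lots of arithmetic per iteration, standing in for "real" number crunching.
    acc += Math.sqrt(i) * Math.sin(i) + Math.cos(i * 0.5);
  }
  return acc;
}

const start = Date.now();
simulate(5e6);
console.log(`took ${Date.now() - start} ms`);
```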
So again, there are drawbacks. If you do that kind of work for millions or hundreds of thousands of iterations — compute-heavy things in JavaScript, hash calculations, a Monte Carlo simulation — your performance will suffer with LLRT and you'll have a better experience with Node. But most of the time Lambda probably isn't the best place for that type of work anyway. If instead you're doing integrations with databases via the SDK, data transformation and enrichment, the glue code that communicates between services or talks to downstream APIs, you'll have a positive experience: an order-of-magnitude faster cold start for most use cases and roughly two times lower cost for warm executions as well. We'll see this shortly in the demo.

I like that focus on warm-start performance, because as I said, cold starts seem to take the oxygen out of the room. What I do like about people investigating cold starts is that it makes them wake up and make all of their code more efficient, which also improves their warm starts — and that's probably the better story about writing efficient code. LLRT takes that to the extreme by optimizing as much as it can, and I like the bigger concept of starting from scratch: serverless is all about building on top of primitives, and this starts from a primitive built for serverless, adding only what you need rather than being general-purpose and carrying everything.

Yeah — and then, of course, what's the catch? There has to be some caveat to this approach. As I mentioned, one catch is compute-heavy work in JavaScript: if you calculate a hash value manually in JavaScript, rather than using the native APIs I provide, that's exactly the kind of code that benefits hugely from a just-in-time compiler, with its optimization passes getting better and better over all those iterations. If you're not doing that, you're fine. The other caveat is that, since this is a lighter engine and a lighter approach, I can't support every Node.js API. Node.js has been around for about 15 years; it has a huge API surface, a strong focus on backwards compatibility, and it runs on I don't know how many operating systems and devices — Linux, Windows, macOS, browsers, whatever. It's fantastic, and LLRT is not meant to compete with that platform, nor with Bun or Deno. It's meant as a complement, an alternative for when you're doing API integration or talking to databases with the AWS SDK — which is included and optimized specifically for LLRT, as we'll also see later in the demo — taking a different approach.
So obviously the trade-off is: can I keep it lighter? And that is by design. The tricky part is deciding which APIs to support and which not, because every API the team implements comes with a cost — it adds slightly to the cold start — and we want to keep it as light as possible. Right now the entire LLRT binary deployed on Lambda, the whole runtime with all the APIs and SDKs we support, is less than 3 megabytes; Node.js by itself, without the AWS SDK, is around 80 megabytes. Just by looking at the size of the binary you can see it's a huge difference, and what you get in return is that really fast startup time, that really low cold start, plus sustained warm performance that is better in resource-constrained environments.

Someone might ask: okay, but what if I crank the memory up? If I give Node.js, Deno or Bun the memory they deserve — two gigabytes — is LLRT still faster? The answer is that it's about the same speed, because LLRT also gets more resources to execute with. But if you're making API calls, you're now spending those precious gigabytes and vCPUs just waiting on the latency of the call, which can be 10 or 20 milliseconds depending on what you're talking to — you're wasting CPU time, so you won't see any cost benefit. I've tried going back and forth on this, and I haven't found a single case where Node is faster than LLRT, even with all the tricks in the book. There are a lot of tricks, and I know most of them — I didn't know about the environment-variable one until recently, but that's also something I tried. And since I'm building for Lambda, an environment I'm very familiar with, I can incorporate all those tricks inside the runtime, so users of LLRT deploying on Lambda don't have to know about them.

A good example: when you import an AWS SDK client in LLRT, in the background it initializes the connection to the service you're about to call the moment it's imported. Because we initialize the connection outside your handler, we get better performance — Lambda gives you a bit of a CPU boost during initialization, and we use that boost, so the customer doesn't have to do anything. That saves something like 20 to 50 milliseconds, because establishing a TLS connection — a secure SSL connection — takes a lot of CPU. I can also take the AWS SDK and convert it into bytecode for the engine at compile time — the parsed format the engine normally produces when it first reads through your source code — which saves additional milliseconds. There are lots of small tricks you don't see, and none of them change how you write your JavaScript or how your source code is structured. In fact, as we'll see in the demo, you can take the exact same code, just switch the runtime to LLRT, and get a 10-15x cold start improvement and a 2x warm start improvement without changing a single line of code — and that's with a DynamoDB call, which I think really showcases the potential here.
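In hand-written Node.js code you would approximate part of what LLRT automates like this: construct the SDK client (and optionally issue a warm-up call) at module scope, outside the handler, so the TLS handshake happens during the boosted init phase rather than on the first real request. A hedged sketch — the table name and the warm-up call are illustrative, and the warm-up needs the matching IAM permission:

```js
import { DynamoDBClient, DescribeTableCommand } from "@aws-sdk/client-dynamodb";

// Created at module scope: runs once per execution environment, during init,
// where Lambda gives a short CPU boost — not on every invoke.
const client = new DynamoDBClient({});

// Optional: a cheap call here forces the TLS connection to be established
// before the first real request reaches the handler.
const warmup = client
  .send(new DescribeTableCommand({ TableName: "demo-table" })) // assumed table name
  .catch(() => {}); // ignore failures; this is only a warm-up

export const handler = async (event) => {
  await warmup; // make sure the init-time work has finished
  // ... real work using the already-connected client ...
  return { statusCode: 200, body: "OK" };
};
```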
Excellent. Well, we've been teasing a demo — maybe we should show something.

Why not? I'm getting itchy. Show me the money.

Exactly, enough talking — show us the stuff. While you bring up your screen, one interesting aside: I think it was B Ruse, who used to work at AWS and who I met at re:Invent, who discovered that environment variables add a little bit of latency, because KMS is used to decrypt them and that takes some time. I love how the community finds these little things about how Lambda works. And one thing that does slightly grate me is when people in the industry say, "of course Lambda is never going to be as fast as possible, because they charge by the millisecond, so making it faster loses them money." I can tell you the opposite is entirely true. The amount of time and effort that goes into shaving milliseconds and nanoseconds off SDK calls, internal processes and services to make Lambda literally as fast as possible is a massive driver for what Lambda does. Internally we see emails where a team shaves even 1% of CPU usage off something like an observability process — 1% may not sound like much, but across our trillions of invokes a month that's a massive resource and performance win for everybody. So please never think Lambda is deliberately made slow to make money.

And that's a really good observation, Julian, before we dive into the feature highlights and the demo: we're incentivized the other way around. We're not incentivized to make it slower because we charge for Lambda; we're incentivized to make it faster because customers save money on Lambda. That might sound counterintuitive, but it makes a lot of sense: if we provide a great user experience and customers can run really cost-efficient code, they save money by moving to Lambda and they're likely to invest in other AWS products or move more workloads over, because the experience is good. We care about our customers, the user experience and the developer experience, so making it faster is a win-win: customers are happier, costs are lower, and they're likely to reinvest and innovate elsewhere in their stack.

Robert Tables hits the nail on the head — I don't think we're hiding anything by saying that if you offer a good customer experience, people are going to use more of what you're doing.

As Robert says in the comments: smart spending. If you save money on compute with one service, you can invest in other products, which also need somewhere to run; and if you've had a great experience with serverless and Lambda, why not continue with that? So again, it's a win-win.
Cool — let's dive into some of the feature highlights. I've addressed most of these already, but to summarize: with LLRT we see negligible cold starts — most will complete in less than 100 milliseconds inside AWS Lambda. LLRT supports JavaScript up to ECMAScript 2020; that's not the latest standard, and I'm working on an update to ES2023, the latest — technically it's possible, so it will come. It supports a lot of Node.js APIs — obviously not all of them, but most of what you need: file system, sockets, fetch and so on — with the goal of keeping the migration effort from Node.js as low as possible. It comes with the most popular AWS SDK clients included — not all of them, but the most popular — and you can always bundle your own if the ones embedded in the runtime don't fit your needs. And, most interestingly, it comes with cost and performance benefits: lower cold starts, lower sustained warm starts, and lower cost on both.

Under the hood, we've touched on this: it uses QuickJS, it's built in Rust using the asynchronous Rust runtime Tokio, and the APIs that conform to the Node specification are all implemented in Rust. In Node, by contrast, most of the APIs are actually implemented in JavaScript, whereas LLRT tries to do as much as possible in Rust for the added performance benefit and to stay lightweight — about three megabytes including the AWS SDK.

One question before the demo: when you rewrite a whole bunch of this in Rust, is that a complex job? Are you literally going through the Node.js API spec and reimplementing it in Rust? I'm sure it wasn't as easy as asking a generative AI model to kindly convert the JavaScript over to Rust.

I wish — but no, and it's also not extremely complicated. The most complicated parts are the low-level pieces: the file system, networking, and the event loop. I also rely on dependencies that make this much simpler. There's an excellent Rust crate — that's the Rust name for a dependency — called rquickjs, Rust bindings for the QuickJS engine, which abstracts away a lot of the complexity; shout-out to the team behind it. There's a maintainer from SurrealDB involved; I've had conversations with him, and a few PRs have gone into that project as part of this work. So yes, it's complicated, but manageable: I look at the Node specification, see how an API behaves, implement the same thing in LLRT, and pick the APIs I think are most relevant and most wanted by customers.

I think that's enough teasing — time for the demo. Let me share my screen; hopefully you can see my browser. Let me zoom in a bit — one more for the people with glasses; I have glasses myself, I just hide them behind contact lenses. Okay, let's start by going to the runtime settings. I'm changing the runtime to Node.js 20, the latest one, just to see how this code would perform using the regular Node.js runtime. Walking through the code quickly: it's a very simple example, but close to a real-world one. I put some data into DynamoDB and return a status of 200 with a body of "OK". The data I write to DynamoDB is actually the event that gets passed into the Lambda function: I take the event, JSON.stringify it, and put it inside an attribute called content, together with a randomized ID. That's all it does.
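The captions don't show the code itself, so here is a minimal sketch of a handler along the lines Richard describes. The table name, attribute names and the use of crypto.randomUUID are assumptions — any random-ID generator would do, and you should check LLRT's list of supported APIs for the crypto piece:

```js
import { randomUUID } from "crypto"; // assumption: any random id generator works here
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";

// Client created outside the handler so connection setup happens during init.
const client = new DynamoDBClient({});

export const handler = async (event) => {
  // Store the raw invocation event, stringified, under a random id.
  await client.send(
    new PutItemCommand({
      TableName: process.env.TABLE_NAME ?? "demo-table", // assumed table name
      Item: {
        id: { S: randomUUID() },
        content: { S: JSON.stringify(event) },
      },
    })
  );

  return { statusCode: 200, body: "OK" };
};
```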
It might sound trivial, but there's actually a lot happening there. The test event is just a simple "hello" JSON demo event, so let's test it on the Node.js runtime. Because we changed the runtime configuration, this is effectively a brand-new Lambda function — we're starting from scratch — and you can tell, because we get an init duration, which only shows up on a cold start. The init duration is 354 milliseconds and the duration is 970 milliseconds; the total is the two combined — the duration measures what happens inside your handler, the init duration what happens outside it while the environment is initialized — so roughly 1,200 milliseconds. We have 128 megabytes of memory configured, the lowest possible value for Lambda, and even this simple example already used 88 megabytes, so there isn't loads of headroom.

Now let's switch to LLRT. I prepared this demo by downloading the LLRT binary — you can download it from our GitHub; we'll have links in the show notes and at the end of the presentation. This is an arm64 function, so I downloaded the arm build of LLRT, which is already named bootstrap, and I just put it alongside my source code. That's all I did. Then I go to runtime settings, edit them, and select Amazon Linux 2023 — the OS-only runtime used for Rust, Go, custom runtimes and so on. I keep it on arm, since I get slightly better performance and slightly lower cost there, and hit save. I didn't touch the code at all; it stays exactly as it is. Back on the test tab, this will be a cold start again since we changed the runtime settings. Testing — you can see that was really fast: an init duration of 35 milliseconds and a duration of 35 milliseconds, 71 in total. Remember, it was almost 1,200 in total before, versus 71 now — quite a dramatic improvement — with exactly the same code, putting the same event into DynamoDB, and a maximum memory usage of only 21 megabytes, versus about 88 megabytes for Node.js if memory serves. So it's orders of magnitude faster.
Let me see: 1,200 divided by 71 is almost 17 — so not a 1.7x but a 17x improvement running the same code. And continuing with warm starts, we see 18, 17, 20, 8, 11 milliseconds — really, really low.

And this isn't just a hello world turned into an API: it's actually writing the event to DynamoDB, so there's always going to be some network latency to the table — fast as it is, it's a proper Lambda function.

A proper Lambda function, a proper use case. We're seeing a maximum of about 20-25 milliseconds — I think I saw 50 once, but that was before the connection was cached; with the connection established it sits consistently around 8 to 25. Now let's go back to Node.js 20 and look at its warm-start performance for comparison — this will be a cold start again first: there we go, about 1,200 again and 88 megabytes for the cold start. (Thanks, Fragzore, for joining us via Twitch — yes, we're all talking the really fast stuff today.) Now the warm starts on Node: where the first warm invoke on LLRT was around 50 milliseconds, here I get 400, then 61, 26, 62, 50, 37, 45 — settling around 25 or so. The reason it jumps up and down so much is the just-in-time compiler.

And that first 400-millisecond warm start is creating the initial connection to DynamoDB — the thing your runtime optimizes by transparently doing it during init.

Exactly, so it's less consistent. And what if you run this for a long time? Let me go back to the presentation: I ran the same code for 32,000 invocations. For LLRT, the P99 — meaning 99% of all invokes are below this number — was 84 milliseconds for a cold start and 34 milliseconds for a warm start; the fastest warm start was 5 milliseconds and the fastest cold start 50 milliseconds. Compare that with Node.js: its fastest warm start was 18 milliseconds — so only slightly slower than LLRT at the very best case, not dramatically — but for cold starts it was about 1,100 milliseconds at best and 1,300 at the P99. That's roughly a 23x difference for the best-case cold start, and at the worst case, the P99, a 15x improvement. Also notice the number of cold starts: because LLRT is so fast, you're far less likely to hit one — in fact, running the same code on Node.js was about five times more likely to hit a cold start. And we saw that fluctuation: the span between the fastest and the slowest warm start was 158 milliseconds on Node.js, but only 29 milliseconds on LLRT, so you get much more consistent performance, which translates into savings and, of course, user experience.

Summing up the total duration of all 32,000 invocations: Node.js took 27 minutes and 40 seconds, while LLRT took just seven minutes. That's a cost saving of 2.9x and a time saving of 3.7x. The reason those two numbers differ is that you're billed slightly differently depending on whether you use a custom runtime or a provided runtime — it's essentially the difference between billed duration and total duration — but regardless, it's almost a 3x cost saving for the exact same code.

And with the OS-only custom runtime you're paying for init as well, so you pay that little bit extra, but because it's so much faster you still save a bunch of money.

Exactly — and there are no surprises: you get consistent performance, with only about a 29-millisecond spread.
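For anyone puzzling over why the cost saving (2.9x) and the time saving (3.7x) aren't the same number, a toy calculation shows the mechanics. The per-batch figures below are invented, not the benchmark's, and the billing rule reflects how things worked at the time of the stream (managed runtimes didn't bill the init phase; OS-only custom runtimes did):

```js
// Invented totals (seconds) over a batch of invokes, purely to show the mechanics.
const node = { initTotal: 120, handlerTotal: 1540 }; // managed runtime
const llrt = { initTotal: 15,  handlerTotal: 430  }; // OS-only custom runtime

const wallClock = (r) => r.initTotal + r.handlerTotal; // what callers experience

// Managed runtime: only handler duration was billed. Custom runtime: init too.
const billedNode = node.handlerTotal;
const billedLlrt = llrt.initTotal + llrt.handlerTotal;

console.log("wall-clock saving:", (wallClock(node) / wallClock(llrt)).toFixed(1) + "x");
console.log("billed-cost saving:", (billedNode / billedLlrt).toFixed(1) + "x");
// The cost ratio comes out lower than the time ratio because init counts toward
// LLRT's bill but not Node's — the same effect behind the 3.7x vs 2.9x figures.
```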
So I think we can end the screen share there.

Fantastic. What I like about the demo is that it's so simple: you're not redeploying or changing any of your code, you're just switching runtimes in the console, and that's it. And the bootstrap file is all you need to add to your function — nothing else?

Exactly, that's it. Of course, this was a prepared demo, so I knew the APIs in it are supported — I'm using the AWS SDK. Check out the LLRT repo, which will be in the show notes, for the list of supported SDK clients; the ones that are supported are supported fully, so you can use all of their operations. We saw DynamoDB; there's S3, KMS and quite a few other SDK clients embedded, you can always ship your own, and the primitives they need are there.

So if people are scratching their heads thinking "this is fantastic — save money, be faster", but they don't know whether their API is supported, what should their thought process be? What can they check, and what would be an immediate sign that they're barking up the wrong tree? Obviously you can just test your functions and see if they work, but how should people think about it?

That's a very good question, glad you asked it. Uploading your code and testing it against LLRT in Lambda is not the smoothest experience, because you need a new deployment every time, and when you hit an error you have to figure out where it crashed. Instead, you can run LLRT locally on your machine: it supports Apple Silicon and Intel Macs, and Linux on both arm64 and x86. There's no Windows support unless you use WSL, the Windows Subsystem for Linux, in which case it runs there too. That's the simplest path.
How do you actually run it locally? You download the version for your OS — there's a Mac build on the releases page of the GitHub repo — and it's just a binary.

Just a binary — so you don't even need a Docker image with some pretend Lambda, or AWS SAM's local invoke, useful as those are. You're literally just running a binary.

Exactly. And LLRT detects whether or not it's running inside a Lambda environment, so the Lambda API handling is built in and kicks in automatically when it is; there's actually a special build for Lambda that's a bit more optimized, while the local-development build is optimized for that environment. It also has a built-in test runner: it supports unit tests with expect and a Jest-like API — not everything from Jest, of course, but check out the test folder in the repository to see how our own tests are written and do the same thing. So you can import the APIs you use, run it locally, point it at an index file, and see what works.
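As a sketch of what that local smoke test might look like — the file name and the exact CLI invocations in the comments are assumptions, so check the repo's README for the real commands:

```js
// smoke.mjs — exercise a few of the Node-compatible APIs LLRT implements.
import { readFile, writeFile } from "fs/promises";

await writeFile("/tmp/llrt-smoke.txt", "hello from llrt");
console.log(await readFile("/tmp/llrt-smoke.txt", "utf8"));

// fetch is built in, no import needed.
const res = await fetch("https://checkip.amazonaws.com/");
console.log("my ip:", (await res.text()).trim());

// Run it with the downloaded binary, something like:
//   ./llrt smoke.mjs
// and the Jest-style test runner mentioned above, roughly:
//   ./llrt test
```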
What won't work right now is streams: web streams and Node.js streams aren't natively supported yet. There is a fallback, so you can use streams implemented in JavaScript, but remember that streams are an API that gets called thousands or hundreds of thousands of times — essentially for everything that flows through them — so there's a real optimization opportunity there, and we're working on converting that to native code, building the streams API in Rust. It's a huge endeavour, but that's the ambition.

Speaking of compatibility, we also aim for WinterCG compliance. WinterCG is a working group trying to define a common set of APIs supported across runtimes, so that regardless of whether you run on Lambda or anywhere else — on Deno, Bun, Node or LLRT — you can execute the same code with the same set of expectations. It's a work in progress, with a working group behind it, and it's a target for LLRT.

The Web-interoperable Runtimes Community Group — that's a mouthful right there; WinterCG sounds much better.

Exactly, that's why people just use that. Check it out — I think someone even published a comparison of which runtimes implement which WinterCG APIs; we'll see if we can find a link. But the simple answer to getting started is to test on your local machine first — it's way, way easier. We're also working on providing typings, so if you use TypeScript it will be easier to see what's supported inside the engine. And check out the demos. I know people will say, "okay, that demo favours LLRT because it uses the AWS SDK and DynamoDB — I don't use that; can you show something cooler?" We have a React to-do list — everybody does to-do lists in React — which gives you a good comparison: we store the to-dos in DynamoDB, it's server-side rendered, it runs on LLRT, and we compare it with Node. It's still massively more performant. React is a lot of JavaScript — thousands of lines — and you still get much more consistent performance and lower latency than with Node, so even a fairly heavy JavaScript application benefits. Of course there's a limit to how much JavaScript you can shove in before something else has the edge, so the simplest answer is: start with LLRT, and if you hit a corner you can always switch back to Node, because the APIs follow the same specification. The whole goal is that no effort is wasted — I don't want people to invest a ton of time making something work with LLRT, hit an unsupported corner and have to throw that work away; you just switch back to Node. And you can mix and match: one Lambda function on LLRT, another on Node, even across versions; do A/B testing and see whether it works for you, whether it crashes, how it performs over time.

I also want to highlight that this is still an experimental runtime. It's not an official AWS product; it's quite early, effectively a beta release, and we're still finding bugs and issues. I really appreciate everybody taking the time to test it out. The best thing you can do is open an issue — try to make it easily reproducible, that helps us a lot — and raise feature requests: "I really want this API", "I really need this AWS SDK client, Richard and team, please help." We'll do our best to support it. We're working on streams, and we're working on cryptography, partly to allow LLRT to be used as an API Gateway authorizer, which is a fantastic use case since an authorizer sits at the edge of most APIs.

Yes — and it needs to be fast, because that latency is added to everything.
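For reference, an authorizer is just another handler, so porting one to LLRT looks much like any other function. Here is a hedged sketch using API Gateway HTTP API's "simple response" format, with a placeholder token check — the header name, environment variable and logic are all illustrative:

```js
// Hypothetical API Gateway (HTTP API) Lambda authorizer using the simple
// response format. The token comparison is a placeholder, not real validation.
export const handler = async (event) => {
  const token = event.headers?.authorization ?? "";
  const isAuthorized = token === `Bearer ${process.env.EXPECTED_TOKEN}`;

  return {
    isAuthorized,
    context: {
      // Extra fields the backend integration can read.
      reason: isAuthorized ? "token matched" : "missing or invalid token",
    },
  };
};
```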
So the best thing you can do is test it out and provide feedback. I appreciate all the feedback that's been coming through — it's been fantastic so far and really encouraging; keep it up.

Actually, based on what you said — Graham Campbell, thanks for joining us — Graham asks: when will this be available for Lambda@Edge? That goes back to the earlier point: this isn't an official runtime, we're in beta testing at the moment, but the Lambda team is certainly looking at this, and usage, uptake and interest will drive product enhancements within Lambda as well.

And specifically for Lambda@Edge: as this is a custom runtime — Lambda@Edge only supports Node.js —

— and Python, right?

And Python, yes. So unfortunately, as to when: we don't know yet; we'll see what happens. Again, we're trying to make it as good as possible, and the more feedback we get and the more customers are using it, the more that helps push it forward. There would also be a slight benefit to having it as a provided runtime: we could cache it better and get even lower latency, because it could live closer to the workers than a custom runtime can. That's also one of the reasons we make it really lightweight — it has to be shipped together with your code. So it's important to look not only at the time you see inside Lambda, as in my demo, but at the end-to-end latency, flowing through API Gateway or invoking from your local machine, because that's where code size and the size of your runtime matter. Measure and see is the best advice I have — and with LLRT, try tweaking the memory settings down.

I can see the feature request already: "128 megabytes is too much for Lambda, we need less memory please, because all of my LLRT runs are so short." I also wanted to quickly cover some of the deployment options. It runs as a custom runtime, and as I understand it there are two ways to add the bootstrap. One is to include the bootstrap binary alongside your JavaScript code, as you showed in the demo. The other is to add it as a Lambda layer — so if you don't want to upload the bootstrap with every single function and would rather maintain it centrally, you can add the layer to your Lambda function and the LLRT runtime is just there.

Yes. Go to the repository, open the releases page, click the latest release and download the llrt-lambda arm64 (or x86) build. You can upload that zip straight to a layer, because it already contains the file named bootstrap; with the custom runtime selected, Lambda automatically detects that your layer has a file named bootstrap, uses it as the entry point, and you're good to go. That's probably the simplest approach if you want to test it right now, in five minutes.
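Since the repo also has a CDK demo, here is a hedged sketch of what that layer approach might look like wired up in CDK — the release zip path, handler string and asset layout are assumptions, not taken from the repo:

```js
// CDK sketch (aws-cdk-lib v2): a function on the OS-only runtime with the LLRT
// binary supplied as a layer containing "bootstrap". Paths and names are illustrative.
import { App, Stack } from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";

class LlrtDemoStack extends Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    const llrtLayer = new lambda.LayerVersion(this, "LlrtLayer", {
      // e.g. the llrt-lambda arm64 release zip, which already contains "bootstrap"
      code: lambda.Code.fromAsset("layers/llrt-lambda-arm64.zip"),
      compatibleArchitectures: [lambda.Architecture.ARM_64],
    });

    new lambda.Function(this, "DemoFn", {
      runtime: lambda.Runtime.PROVIDED_AL2023, // OS-only runtime
      architecture: lambda.Architecture.ARM_64,
      handler: "index.handler",                // LLRT resolves this like Node would
      code: lambda.Code.fromAsset("src"),      // your plain JavaScript
      layers: [llrtLayer],
    });
  }
}

new LlrtDemoStack(new App(), "LlrtDemoStack");
```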
And of course you can also package LLRT in a container image, if you build your Lambda functions that way.

Some people may think that's a bit strange, because container images are all about large package sizes, but I know many customers who have huge files to process or want huge amounts of memory — maybe a big machine-learning model to load or copy around — and those things aren't necessarily at odds: you can have a big container image and still get the performance of the LLRT runtime inside it. And looking at the GitHub repo, it's super easy to add — just a few lines in your Dockerfile and you're ready to go.

Exactly, you just add it to your Dockerfile — and I'd encourage a very lightweight base image, something like Alpine or even BusyBox, since LLRT contains everything you need. There's a SAM demo in the repo, a CDK demo as well, a SAM demo with a container image, and with SAM you can run a local invoke too, so you should be good to go.

One more thing for people using node_modules for their dependencies: one recommendation I see is not to deploy your node_modules without bundling, minification and tree shaking, because you're just adding bloat to your package. There are different bundlers — esbuild, webpack, Rollup — so you do need to do some work if you're adding more dependencies, because the point of this is to be lean and mean; the more bloat you add to your Lambda function, the more you're heading in the wrong direction.

Exactly — and another reason is that LLRT doesn't support the Node.js module-resolution algorithm, which is a whole beast in itself. It can import node modules, but you have to be explicit about the path; it doesn't use the same mechanism as Node, so you can't just drop all your node_modules in there and expect them to work — not all APIs are supported, and it adds bloat, which is counterproductive. I also get a common question: why doesn't it run TypeScript, or could it? Technically it could — there are already TypeScript transpilers built in Rust, so I could add a dependency, transpile the code and run TypeScript — but I'd rather not, because that adds time to the precious cold start and init. We'd much rather you transpile once when you deploy your code and then get the benefit of executing already-transpiled JavaScript; otherwise you pay a performance penalty on every cold start. It follows the same best practice of moving as much work as possible out of init and out of the invoke and doing it early — like building a cache ahead of time: if you can do it ahead of time, at compile time, all the better.
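A hedged example of that ahead-of-time step using esbuild's JavaScript API — the entry point, output path and the choice to externalize the SDK packages LLRT already embeds are assumptions to adapt to your project:

```js
// build.mjs — bundle, minify and tree-shake before deployment, so neither module
// resolution nor transpilation is paid for at cold start.
import { build } from "esbuild";

await build({
  entryPoints: ["src/index.ts"], // TypeScript is fine here: transpiled at build time
  outfile: "dist/index.mjs",
  bundle: true,
  minify: true,
  format: "esm",
  platform: "node",
  target: "es2020",              // matches the ECMAScript level LLRT supports
  // The AWS SDK clients embedded in LLRT don't need to be shipped again.
  external: ["@aws-sdk/*"],
});
```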
Quickly looking back at some of your comments — I really appreciate them all coming in. Velson said, earlier when we were talking about cold starts: SnapStart with Java for the business logic and LLRT JavaScript for queries seems really good — yes, just different ways you can mix these approaches. Tech Hunter Kale asked something slightly different, but I wanted to cover it — and welcome, thanks for coming via Twitch: "I have a Lambda function in TypeScript; I need to start a Step Functions execution from an API Gateway call, and the Lambda needs to respond in under three seconds. It seems like the whole execution times out due to cold start — any advice?" I'd say that if you're seeing more than three seconds of cold start from a TypeScript function, you've got a lot of stuff in that function, and there are certainly ways to respond well under three seconds. Look at what your TypeScript is doing; use the modular AWS SDK for JavaScript v3 and remove everything you don't need; and look at minifying, bundling and tree shaking to get your package size down. It's absolutely possible — even in Richard's demo we saw about 1.2 seconds total for a Node.js function writing to DynamoDB, so that's proof it can be done.

My three hot takes: like Julian said, be mindful of your dependencies; do as much work as possible early — minify, tree-shake, bundle, upload a single JavaScript file; and finally, do as much work as possible outside the handler. Is there anything you can do before the event comes in — anything that doesn't depend on what's inside the event? Creating instances of your API clients, initializing connections, even doing a fake call with some mock data just to establish the connection.

And, as Robert says, avoid table scans if you're talking to DynamoDB. With literally ten seconds to go — Ryan Kack asks: can LLRT and the Lambda Web Adapter work together yet?

Sure, but it wouldn't make much sense: the web adapter itself would be heavier than LLRT, so you'd just be adding an extra binary. The web adapter is really meant for server applications and server frameworks that don't talk directly to Lambda.

Cool — and with seconds to go: Building Serverless Applications with Terraform is on next week, same place. In fact it's a slightly different time, because American clocks have changed again, so if you're in Europe or outside the US it will be an hour later. Richard, thanks so much for joining us on Serverless Office Hours — amazing work; we love fast stuff in Lambda. Thank you very much, and everybody, see you next week.
Info
Channel: Serverless Land
Views: 702
Id: RhnofoWc9y8
Length: 59min 33sec (3573 seconds)
Published: Wed Mar 27 2024