[sound effects] It's for you. Werner. Hello, Werner? - Who are you?
- I am the architect. I've brought you here to see this, - my system.
- What, that? Is that a Perl script? Listen, Werner, there's a problem with the system, ergo, I've brought you here. It's glorious
in almost every single way. Every single one of them is one of my babies. I've even named them. However,
every time I add a new service, I have to build a new rack. Ergo, it's becoming very expensive. Also, it's very hot in here. Have you considered cloud migration? Cloud migration? I think we need a montage. [music playing] No servers. More! Do it again. One more time. Come on. Wait. So, you're saying
I can scan my container images straight from my CI/CD pipeline? [music playing] Yes! Look, you need to build
with cost in mind from the outset. You need to be The Frugal Architect. The Frugal Architect. [phone ringing] - Werner, wait.
- What? - Can I get some free credits?
- No. [sound effects] Please welcome the Vice President
and CTO of amazon.com, Dr. Werner Vogels. [music playing] Wow! I'm absolutely blown away. This is day four. You're supposed to be in bed, yeah,
and I really, really absolutely, I'm totally humbled by the fact that,
I mean, I arrived here at 6:00 AM this morning, and you guys
were already standing in line. That is, I don't deserve that, absolutely, but who does deserve applause, by the way,
is the quartet that just played. So, can I get your hands again? [applause] Someone else, some other people that actually need you
to give some applause, can I get the heroes
to stand up on the front line? Yeah. [cheering] These are invaluable members of our community. They are what makes this whole community special, and hopefully, also,
why you're all in this room today, and especially,
I want to give a shout out to Luke, who got a Now Go Build award
this time around. [cheering] Okay. Thank you. I know you probably all cringed
a little bit when I said cloud migration. Now, after all, I actually noticed that, for quite a few of the people in the room, you've grown your career in the cloud. For you, there was no pre-cloud,
no hardware, no constraints like that, but, you know,
the great thing about sort of moving out of that whole
hardware environment into the cloud was that we suddenly could build
these architectures that we always wanted to build. Now, we no longer
were constrained by, you know, the physicality
of all those servers, and also, I could actually, as a distributed systems guy, actually start talking to my customers about how to really build reliable, big scale systems, but, you know, there was something about all of that throughout. If you get as old as me,
then you start seeing the past with a little bit
of rose-colored glasses. There were things in the past,
actually, in this world, that we had these
hardware constraints that actually drove
a lot of creativity. Now, I'm going to talk a bit to you about the cost of all of those systems that we're building, and I'm going to drive that
by actually my experiences as the CTO of Amazon
for the past 20 years almost, yeah,
and if I think back about, being in the pre-cloud days
of Amazon, the retailer, we were really good
at predicting sort of how much capacity we needed. We could sort of maintain the discipline to make sure we had 15% hardware over the expected peak for that year, but still, unexpected things could happen to us. Now, I remember that one day, in the days that Wiis and PS4s
were very scarce, someone posted a message
somewhere saying, tomorrow, Amazon will have 1,000 Wiis for sale
at 11 o'clock in the morning. Well, you know what happens
at five minutes to 11:00, eh? F5, F5, F5, assuming you're
a Windows user, yeah, and so, traffic explodes,
and absolutely, we really had to… we worked our way around it, and we were very creative
in trying to solve the problem to make sure that all the other
customers still could be served, but still, you know,
it'd be quite a lot of work and a lot of handholding,
but more importantly, there were also restrictions
on business innovation because of that. A few weeks before Black Friday, a team would come to me and say, oh, we have this brilliant idea to do X, Y, or Z, and it will give us, what is it,
additional revenue at this much. You would scratch your head
and think, like, how are we going to do this,
but we always made it work. There was sort of an art in building these systems and living within
the constraints that you had. Now, cloud, of course,
removed all of those constraints. Now, suddenly, you were
no longer constrained. You could do all these things. I didn't have to have long
conversations with the business about sort of
reducing the footprint of things. Instead, you could do everything,
and as always, when constraints get removed,
when we throw off the shackles of something that keeps us down, we have a tendency
to swing this pendulum all the way to the other side. Suddenly, what is
the most important thing is actually to move fast,
to get new products out, to start thinking about all
the things you could do now that you couldn't do before,
yeah, and that's amazing, and we have seen amazing innovations
in the past 15 years happening on top of AWS, but as speed of execution
becomes more important, we kind of lost this art, this art of architecting for cost
and keeping cost in mind, and if you've,
this is your 12th re:Invent, just like it is for me, you may go back to this first
re:Invent in 2011, and I put up these sort of,
I think there were 12 sort of tenets that I thought you should think about
when you were building for the cloud, yeah, and one of them
was to architect with cost in mind, because suddenly, you could, yeah? Remember, we were making cost explicit for all the resources that were being used, and so, it was very, very easy
for you to start thinking about how, what is actually
the cost of this system that I'm building right now, compared to the other one
that I built yesterday, and so, as you no longer
have these constraints, it drove amazing innovation, but the macroeconomic climate
sometimes changes, and more noticeable
in the last few years is that companies, more and more, have become interested in sort of
what is this all costing me, and so, I hope that today, you know, you're going to listen
a little bit to me about my experiences
of the past 20 years of building cost aware architectures, yeah, and by the way, you've already seen
amazing innovations and announcements this week. This is not going to be
one of those keynotes. So, sit back, take out your notepad,
and start making notes today. And so, many of us don't have to live within these constraints anymore, but there's quite a few
companies that actually do, and a great example of that
is the Public Broadcasting Service. You all know that, you know, they make all these programs
for their affiliates, and their famous tagline
is, of course, supported by viewers like you and me, but they have to live
within a strict budget, and so, it's not only that
they provide all these programs for the affiliates,
they also stream all content, and at the 40th anniversary
of Sesame Street, they completely broke down, because they were streaming out
of their own datacenters. It was 2009, and they knew that they
couldn't continue to do this, because they just couldn't
afford the hardware to do this at massive scale. So, they migrated over to AWS,
and, you know, as always, if you just do lift and shift of something that wasn't scalable and efficient in your own datacenter, it certainly isn't scalable and efficient in the cloud either. But they wanted to start off, at least, with the existing software, and so, they made use of OpsWorks and time-based scaling. That basically meant that they were
still not very resource efficient, which is crucial for them,
because the money they can save, they can do a lot of other things
with, yeah, and so, while they actually moved
to the cloud, they also started to realize
that they had to re-architect, and they re-architected making every, using every possible AWS
service that they could, and so, they were streaming directly
out of S3 and out of CloudFront. They moved over to ECS and Fargate, really absolutely driving
all of that cost down. They actually reduced
their streaming cost by 80%, and if you were a fan
of the Ken Burns documentaries, just like I am, they recently had this documentary
called The American Buffalo, extremely popular,
and a great documentary, and it went off without a hitch, with 80% cost savings at the same time, not only because
they moved to the cloud, but because they rearchitected
for the cloud with cost in mind. Now, next to cost is something else
that is really on my mind these days, and it should be
on your mind as well. This is a freight train
that is coming your way, and you cannot escape it. I think, in the absence of us providing you with sort of the information about the milligrams of CO2
used by your services, cost is a pretty good
approximation for sustainability, and we've got quite a few companies
that are asking us to really help them
build more sustainable architectures, and I think we as a society, as a tech society, as technologists, have a major role to play in making sure that our systems are as sustainable as they can be, and remember, in AWS, you know, you pay for each
individual resource used, which means that cost
is a pretty good approximation for the resources that you've used, and as such, your contribution
to sustainability. Now, throughout this talk,
when I say cost, I hope you also keep your mind
sustainability at the same time. Now, a company that is actually…
it is actually the lighting house. This is one of our oldest
European customers: WeTransfer, and I don't know
if you know them. They actually support having
these very large files that you can upload
and then distribute. They reorganized themselves
as what's called a certified B-corp, a company that has the highest
standards in environmental, social, and fiscal transparency,
and they are still able to innovate while actually lowering their
emissions at the same time, and their server usage was
their biggest energy dependency, and they re-architected
in such a way, so that they could
start to forecast, track, and measure carbon emissions, while serving 80 million
people a month, and they've done this with some very unique strategies, and you'll see
some of these strategies coming back into
my experiences as well. Now, if you look at the video
before that, I presented the architect
with The Frugal Architect, and basically,
this is sort of a book, where I've sort of encoded
my experiences of the past 20 years
of building cost aware, sustainable architectures,
and I think, as builders, we really need to start
thinking about this, not only because we want to be frugal
in the way that we use our resources, but also, as sustainable as possible. Now, these are not hard rules,
and I call them laws, but they're not like legal laws. They're more like biological and physical laws, where you have lots of observations, and then you codify those in a framework. Of course, you know, nature doesn't care about those laws. The laws are for us, so that we have a framework to think in, and they're not hard rules, yeah, but to actually give it
a little bit of structure, I've put them
in three different categories: design, measure, and optimize. And I want to start off with probably
what's the most important thing, yeah,
and the most important thing here is that cost needs to be
a non-functional requirement, and if you think about
non-functional requirements, now, there's all these
sort of classical ones. Security, compliance,
performance, availability, all of these are actually things
that are not the specific features and functions of the application
that you're building, but the ones that you have
to keep in mind at all times. Now, I think security, compliance
and accessibility are non-negotiable, and the other ones you can make all sorts of trade-offs on. There are actually two other ones that I believe should be in this list
as well: cost and sustainability. Both of them should be treated
at equal weight when it comes to non-functional
requirements for your business. Now, it is easier these days
to measure cost. Now, if I go back to my early days
as CTO of Amazon, you know, I basically had
to write a big check upfront to this database company before I could start
using their architecture, and it wasn't just a little check. It was a big check, because I needed to think five years ahead: how much capacity do I think I will need five years ahead? Because it was the only way
to drive costs down, and so, it was very hard on day one, or on the second year
or the third year, to think about how much of that cost is actually going into the systems
that I'm building at this moment. In AWS, of course,
that's radically different, yeah? You pay as you go for the resources you use, and actually,
when we started building S3, as being the first really big AWS service, we had to think about
what kind of resources are we using, what is our cost that we need
to expose to our customers, because we want our pricing model
to be cost following. Cost following means
that we expose our costs to you. Now, we sat around the table,
and we were thinking, well, what are the two biggest costs that we are going to have in this that we need to put into the pricing model? Transfer, of course, bytes on the wire, and storage. But when we started onboarding our very first customers, we started to realize that there were more resources being used by them, and we actually needed to add a third dimension to how we would price this service, and the third dimension was the number of requests. Yeah.
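To make that concrete, here is a minimal sketch of what a cost-following price with those three dimensions looks like; the rates and workload numbers are invented placeholders, not actual S3 pricing.

```python
# A minimal sketch of "cost following" pricing with three dimensions.
# The rates below are purely illustrative placeholders, not real S3 prices.

ILLUSTRATIVE_RATES = {
    "storage_per_gb_month": 0.023,  # $ per GB stored per month (placeholder)
    "transfer_per_gb": 0.09,        # $ per GB transferred out (placeholder)
    "per_1000_requests": 0.005,     # $ per 1,000 requests (placeholder)
}

def monthly_cost(storage_gb: float, transfer_gb: float, requests: int) -> float:
    """Estimate a month's bill across the three cost dimensions."""
    return (
        storage_gb * ILLUSTRATIVE_RATES["storage_per_gb_month"]
        + transfer_gb * ILLUSTRATIVE_RATES["transfer_per_gb"]
        + (requests / 1_000) * ILLUSTRATIVE_RATES["per_1000_requests"]
    )

# A workload that stores little but makes many small requests is dominated
# by the request dimension -- the usage pattern the very first customers exposed.
print(f"${monthly_cost(storage_gb=100, transfer_gb=50, requests=50_000_000):,.2f}")
```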
What I want you to take away from this is that, you know, especially if you build something radically new, you may not have an idea about exactly how your customers are going to use your system, and how many resources they're going
and immediately react to it, so that you understand
exactly the kind of resources that you are using
to serve your customers, but by the time we built DynamoDB, we had this completely
down to an art, and if you remember,
when we launched DynamoDB, we launched with two types of reads. One read was eventually consistent. Basically, DynamoDB runs a quorum underneath; let's say there are three nodes in the quorum. An eventually consistent read does one read to a single node, and you may get the latest update back or not. We also launched with a strongly consistent read. The strongly consistent read basically did two reads under the covers, to reach the quorum and make sure that you get the latest update back. You have to do two reads for that. So, we made sure that eventually consistent was half the price of strongly consistent, because we had to do twice as much work for a strongly consistent read as for the single read of an eventually consistent one.
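As a minimal sketch of that idea, assuming a hypothetical three-replica group rather than DynamoDB's actual internals:

```python
# Illustrative only: a toy three-replica group, not DynamoDB's real implementation.
import random

REPLICAS = ["node-a", "node-b", "node-c"]  # hypothetical replica group

def read_from(node: str, key: str) -> dict:
    """Placeholder for a network read against one replica."""
    return {"node": node, "key": key, "version": random.randint(1, 10)}

def eventually_consistent_read(key: str) -> dict:
    # One read against a single replica: half the work, so half the price,
    # but you may see a stale version.
    return read_from(random.choice(REPLICAS), key)

def strongly_consistent_read(key: str) -> dict:
    # Read a majority (two of three) and return the newest version seen:
    # roughly twice the work, hence roughly twice the price.
    results = [read_from(n, key) for n in random.sample(REPLICAS, 2)]
    return max(results, key=lambda r: r["version"])
```

The resource cost of each operation maps directly onto its price, which is the same cost-following idea.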
So, what I want you to take away from that is that you have to consider cost at every step of your design, to really keep that in mind, and then if you think
about sort of your business, because after all, we are not
just building technology for technology's sake. We're building technology
to support our business, and I hope that all of you
are in an organization that probably has some sort
of agile development strategy, where you are close
with your business partners. We are continuously talking about
the functionality of things. How reliable does it need to be? How scalable does it need to be, and most importantly,
how much will this cost? And that's a conversation
we haven't always had, but we need to have
this conversation continuously with our business partners. Now, at one moment, especially when I started advising
more and more startups, I tried to hammer
this down on day one when a startup is thinking
about that product. What is the revenue model
that you think you're going to have? How are you going to make your money? And then make sure that you build
architectures that follow this money. That's important,
because if your costs rise over a completely different
dimension, you're going to be toast, eventually. Yeah? So, align cost with revenue. Now, if you think about a company
like amazon.com, probably the best measurement for the success of our operations is sort of orders per minute. Right? That's basically sort of
the dimension where we are making revenue over, but we then need to make sure
that our infrastructure scales in such a way that, actually, cost doesn't grow in a completely
different dimension, but also, that we can use
economies of scale, eventually, to drive that cost down further,
and as you can see, sort of, if the difference between cost and revenue is profit, you know, profit should increase over time, if your cost rises along the same dimension as your revenue does.
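Here is a hedged, back-of-the-envelope sketch of that check, with invented monthly figures: as long as cost per order stays flat or falls while orders grow, cost rises along the same dimension as revenue and margin improves.

```python
# Invented figures, purely to illustrate tracking cost along the revenue dimension.
months = [
    # (orders, infrastructure cost in $, revenue in $)
    (1_000_000,  80_000, 2_000_000),
    (1_500_000, 110_000, 3_000_000),
    (2_200_000, 140_000, 4_400_000),
]

for orders, cost, revenue in months:
    cost_per_order = cost / orders
    margin = (revenue - cost) / revenue
    # Healthy: cost per order flat or falling (economies of scale),
    # so profit grows along the same dimension as revenue.
    print(f"orders={orders:>9,}  cost/order=${cost_per_order:.3f}  margin={margin:.1%}")
```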
I've also worked with younger businesses in which it didn't go that well. This particular one was actually
one of the first ones that were building
these MiFi devices. This was before ubiquitous
mobile communication, and so, yeah, basically,
you had to buy a ten-gig data package every time you ran out,
and it's a good model. Now, they basically were running over
one of the bigger telco providers, who they had to pay. So, every time, basically,
if you look at the one on the left, basically, the green boxes are actually unused capacity that they just had as part of their revenue, and then customers came to them
and said, you know what, all this buying
every time ten gig, so bothersome. Can't we have an unlimited plan? And without really thinking, probably, the company said, yes, of course, we can, and then, you know, it put a decently high price against it, and what happens then, if you remove constraints, is that customers start behaving in ways that you didn't anticipate. They started watching Netflix over their mobile devices, yeah, and pretty quickly, actually, their usage ramped up tremendously, in ways that customers were no longer paying for. The company went out of business
because of this, yeah? Be really smart and make sure that the dimensions over which you make revenue are aligned with where your costs are coming from. Now, it's always good to think
about flywheels, and you probably have seen
this napkin drawn by Jeff Bezos many times. Now, flywheels are things that you sort of continue to put energy into. The more energy you put into it, the better it works. So, it starts off with selection, and selection means the number of products in the catalog. The higher the number of products in the catalog, the higher the likelihood that customers can find what they're looking for; that gives a great customer experience, drives more traffic to the site, and makes more sellers want to sell on the site, because there's more traffic, which means that the catalog gets bigger, more products are in the catalog, and selection grows. So, you get this continued cycle
that suddenly starts to accelerate and drive growth,
and then if you make use of that, the economies of scale of that,
to lower your cost structure, and then lowering pricing
for your customers, you have another flywheel
that drives into that. Suddenly, two things happen when you go into
that customer experience, and it really accelerates
the way your business grows. So, really, make sure that
your business decisions and your technology decisions
are in harmony with each other. Now, you know, sometimes, it's easy to think upfront, you know, these are the resources you're going to use. This is sort of what I need
in my pricing model. When we started Lambda, that was
a whole different ball game, yeah? Again, we knew that customers
wanted to have serverless compute, just like serverless storage
and serverless databases. They didn't want to think
about scale and reliability and things like that. They just needed it to work,
and many customers wanted that. We also knew that we wanted
to make the principal decision that we should be charging
over two dimensions, which would be, you know,
milliseconds of CPU used and amount of memory
used over a certain period. Now, as always, you know, if you build something radically new like Lambda, you have no idea how your customers are actually really going to use it. That was what we learned
from S3, yeah? So, we also knew we needed to get insight into this before we could build the right architecture. We didn't have the right architecture underneath that we could build this on ourselves, with fine-grained isolation and hotspot management and things like that. So, there was this tension between
these three different things, yeah, these three different requirements: security and strong isolation, yeah, cost, and getting insight into your customers. So, we made a decision upfront, given that we didn't have the technology to make this very cost effective, to actually sort of sacrifice cost immediately. So, it became two projects. One was basically starting Lambda, and the second was a completely greenfield project underneath, to start to figure out what kind of infrastructure we would need to support it. So, we were willing to take on technical and economic debt on day one. We also knew, immediately on day one, that we had to pay
that off eventually, because just like any other debt, you know, the interest
keeps compounding, and at some moment,
it becomes untenable. So, to build Lambda, we started off with the smallest building blocks we had that could give isolation, and those were the T2s, and you probably all know about virtualization by now. Basically, T2s went on top of
a hypervisor, went on top of real hardware, and these are actually
pretty coarse-grained. If you think about Lambda functions being really small, these T2s, even though they're the smallest instance type that we have, are much bigger than what we needed to actually execute that function, but we needed the isolation,
we needed the security isolation. So, we went ahead and actually
implemented using T2s, and so, basically, we had a whole compute pool full of T2s where we were executing these Lambdas, and we made sure that each Lambda from each account would be executed in isolation, so, inside its own T2. Now, you can already see what is
happening there, is that some of these T2s
are tremendously underutilized. Why? Because there's only two or three
of these Lambda functions executing every second, but also, we saw exactly
the other side happening, because quite a few
of these lambda functions actually happened in synchrony. So, you would get not one execution
of this Lambda function. You would get 1,000 or 10,000
at the same time, and so, where, on the one hand, some were barely utilized, quite a few others were completely overloaded, and we couldn't do
fine-grained resource management, because we didn't have these fine-
grained capabilities underneath them. So, we knew we had
to repay this debt, and we did that by doing
massive innovation, and the innovation became
what we now know as Firecracker, yeah, the idea of building
micro-VMs based on KVM, and we could launch a fully isolated virtual machine in a fraction of the time that it would take to spin up a T2, and KVM exploits hardware virtualization. So that makes it extremely efficient for these very small VMs that we're doing,
and so, also, it allowed us, given that, now, we have
a very small boundary in isolation, to make sure that we can use
multi-tenancy, that it's very easy
to do hotspot management now, because you have these very fine
grained isolation boundaries, and we got to make sure that we could
fully optimize memory and compute by basically hotspot managing over
different types of physical hardware, and so, all of this drove
tremendous innovation. It's not only that we were able to move our customers' Lambdas from the T2 environment, without them noticing,
over to Firecracker. Well, to be honest, they did notice. Everything became a lot faster, and their performance
became a lot more predictable, but we didn't tell them that. They were just happy with that, yeah, but it also gave us an environment
for other innovation. Without Firecracker, we would not
have been able to build Fargate. Fargate allows you to run
serverless containers, no longer having to think
about infrastructure management. So, yeah, and also,
what came with that, and something I want to repeat
from what I said last year, you have to build evolvable architectures, because your architectures
will change over time, and you need to make sure
you can evolve them without impacting your customers, and also, as we saw here with Lambda as well, we were very successful, initially, in getting feedback from our customers about how they were using it when we were running on the T2s, and then building an environment
underneath there that really matches our costs
with the pricing model that we gave our customers. Now, actually, let me
take a step back. This is a fun story that one of
the distinguished engineers of S3, that someone told me, about sort of
the evolution of S3 over time. Yeah? S3 started off
as a single-engine Cessna, yeah, and then it was upgraded
to a small jet, and then to a group of jets,
and then eventually, to a whole fleet of 380s
that are refueling in midair and actually continuously
have our customers moving from one plane
to another plane without them ever noticing it, and that's the power
of an evolvable architecture, but what I want you
to walk away with… this is a fun story, but what I really want you
to walk away with is that when you are creating
technical and economic debt, because you're not taking cost
into account, you have to pay it off. My next observation is
that architecting is always a series of tradeoffs, and it's a series of tradeoffs
between non-functional requirements and the functional requirements
that you have as a designer. Yeah? So, you can look at that sort of cost versus resilience versus security,
all of this, and so, I can tell you stories
about this at Amazon, but I'd rather have someone else
with similar experiences tell you this story as well, and so, my next guest
has a great story to tell how they aligned
their business and technical priorities
to achieve remarkable growth. Please welcome on stage, Cat Swetel, the senior director
of engineering at Nubank. [applause/music playing] Thank you, Werner. I'm so honored to be here
with you all today. With 90 million customers, Nubank is the fourth largest
financial institution in Brazil and the fifth largest
in Latin America. But only ten years ago, we were just a few people
in a little house in Sao Paulo. Back then, the majority
of Brazilian banking institutions were managing mainframes
and legacy systems, but with cloud technology, Nubank was able
to disrupt the market, making banking more
accessible for customers who never had access before. Nubank's journey all started
in this casinha, the little house that I
just told you about, where Nubankers worked on products that were built to be so efficient that we could charge
much more reasonable fees. How did Nubank achieve
such rapid growth in only ten years? We were born on AWS utilizing the new region
that had just opened in Sao Paulo about a year and a half
before our founding, and AWS is still Nubank's
preferred cloud provider. Our first product was
a credit card with no annual fee and an unparalleled
customer experience, but that disruption
was only the beginning. Soon, we had a bank account,
insurance, investments, loans, an in-app marketplace,
the list just keeps growing. In under ten years, our technical environment consisted
of over 40 different AWS services underlying over 1,000
Clojure microservices. We were focused on growth,
and we were succeeding. Then in 2020, the Brazilian
Central Bank approached financial institutions
with a radical new idea for how to transfer money. Before 2020, transfers between
accounts in different Brazilian banks
were slow and expensive. They took up to a full
business day to complete and cost up to $5 U.S. Then, to incentivize financial inclusion, the Brazilian Central Bank proposed
a new protocol called Pix. For those of us in the U.S.,
Pix might be a strange concept. It's truly instant,
real-time liquidation, zero cost to customers,
available 24/7, 365, and all backed by
the Brazilian Central Bank, meaning when someone
transfers you money, it's instantly available
in your account, your regular account, so that you can make
a purchase or pay a bill. So, we spent five months
developing Nubank's Pix flows to meet
the ten-second latency requirement dictated to us by the central bank. When it hit the market,
Pix was a huge success, far outpacing the usage
that Nubank had anticipated. In about a year,
Pix transactions per month had exceeded the combined total
of credit and debit transactions. The scale was massive, and it significantly increased
load on our mobile app and our customer facing flows. Our whole technical environment was under an unprecedented
level of stress. At this point, we were in a bind. We were facing instability
in multiple flows driven by that increased Pix traffic, and we were also facing
increased cost scrutiny as we transitioned as a company
out of startup hypergrowth mode. How would we deal with the tradeoff
between cost and stability? For us, the answer
was to choose both. We suspected that a lot
of our exploding cost was just due to the misguided ways
we were trying to achieve stability. In many cases, we were just
throwing more machines, more memory, whatever, at the problem instead of actually
solving the problem. Our hypothesis was that,
if we stabilized our systems, cost would also stabilize. With AWS, Nubank's Pix team
spearheaded a multi-team effort to test that hypothesis. Of course, we initially addressed
urgent architectural challenges, but we also made three less obvious
but very impactful changes. For the first change, we noticed
that some of Nubank' microservices were experiencing instability as a result of long
garbage collector pauses. So, in our quest for stable
efficiency, we started to experiment with the Z Garbage Collector
for those microservices that were experiencing the long,
stop-the-world GC pauses. Now, ZGC cost us more in RAM
than the G1 garbage collector, and it really made no difference
during steady state operations, but it dramatically decreased
the maximum GC pause length, which saved time and money for
some of our most critical services. After garbage collection
was addressed, we started to look towards
our database's caching strategy. Our canonical database,
Datomic is an append-only database that's backed by Amazon DynamoDB. Datomic makes use of an in-memory cache as well as Amazon ElastiCache
as an external cache. As the amount of data grew for
some of our most critical services, data locality became a challenge, and more and more transactions
had to hit that external cache. At first, we tried to just add more
memory to beef up the local cache, but that proved pretty inefficient. So, instead, we decided to start
experimenting with a new caching strategy
using NVME discs, where we could cache a lot of data
and query with pretty low latency. As just one example of the great
results for one of our critical
microservices, for every $1
that we invested in NVMEs, we avoided spending $3,500
across those flows. So, the stable option ended up
being a net cost savings. Our culture also changed, and that
was a big part of Nubank's success. In order to make important decisions
and good trade-offs in context, leaders at Nubank need to have basic
technical understanding of their products
and our infrastructure, and that movement kind of started
with the Pix leadership team, but the change quickly became
a standard across the company, and today, business units at Nubank are
expected to have an AWS cost champion to help leadership
make informed decisions that balance competing concerns. In the case of Pix,
our hypothesis had been proven true. Stable systems
were efficient systems. Cost stabilized and became
more predictable. Meanwhile, the time we spent
in high-severity incidents decreased by an order of magnitude, and the P99 on our latency SLA decreased by 92%. In fact, with a remarkable 35% efficiency ratio, we stand as one of the most efficient
companies in our sector, and that transformative impact
has saved our 90 million customers $8 billion in fees in 2022. [applause] Nubank's growth is fueled by
our low-cost operating platform and our efficiency, which allows us to charge less
and invest more in our customers. Now, for every two adults in Brazil,
one is a Nubank customer, and we hope to continue
closing the gap and make banking accessible to all.
Thank you. [applause/music playing] Thanks, Cat. One thing that really struck me was those words. The business needs to
understand AWS costs, and I think someone wrote on Twitter, every engineering decision
is a buying decision. Keep that in mind, and also,
I like, actually, the way that they made their metrics available for everyone to see. I know that, when you start making your metrics visible, it can change behavior, right? You have to really figure that out when you think about sort of measurements and observability and things like that, and just like Nubank, I want you to work with your business
to align your priorities, and your only way to do that
is to really understand them. Now, next to those three laws that are considered to be
in the design phase, you will continuously need
to sort of understand where your costs over time
are actually going. Yeah? Unobserved systems lead
to unknown costs, and I have a really great story here. My hometown of Amsterdam,
yeah, beautiful houses, old houses out of the 1600s,
and things like that. In the 70's when I grew up,
there was this oil crisis. I don't know if you remember that. We had carless Sundays,
and of course, at that moment,
everybody started to understand, started to become concerned
about the cost of energy, and there was this great
investigation at that time, because it turned out
that there were houses that were almost identical, but some of those houses
used one third less energy. Why was that? It was mind blowing, because
these houses were the same. By the way, there was no
double glazing and things like that
in those days yet. Yeah? So, these houses are just
radiating heat the whole time, but some of them
are radiating less heat. So, what was the difference
between those houses? The houses that used more energy
had their meter in the basement. It was basically hidden. The houses that used less energy
had their meter in the hallway. The fact that, every time
when you entered your house, you could see how much energy
you have been using completely changed behavior,
and as such, you need to make sure, first of all, that you understand
what you're measuring, of course, and how that measurement
can change behavior. Now, if you're in retail
for like amazon.com, there's a number of costs that you
actually always have to keep in mind. Yeah? On one hand, remember, Amazon is a massively
microservices-driven environment, yeah, where each of the costs
to a particular service, each of the requests to a service
will have a certain cost. Now, of course, it's often hard
to measure that, but you need to. You also need to, if you have
actually one top-level request that goes out
to all these microservices, you need to be able to get
the aggregate of that, and then you need to figure out
what is actually my conversion for each of those requests,
yeah, and actually, so, there's literally dozens of
features on an Amazon homepage, and each of them may go out,
actually, to hundreds of backend services. Yeah? So, you need to actually
sort of decompose this, and all of these features
that you can decompose it come at a certain cost. What's the total cost
of this experience, yeah, and you can actually measure
those individual costs. You actually have to measure at the microservice level; you have to isolate this one piece. For example, the service that can give you an estimate of delivery speed: how much does that cost me to do, yeah? And of course, the easiest way would actually be to just take the total cost over a certain period of time, take the number of requests, and divide the two. That's a little bit simplistic, but it's a good approximation for you to think about, yeah; if your cost follows a normal distribution, then that will probably work.
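As a minimal sketch of that approximation, with invented figures for a hypothetical delivery-estimate service:

```python
# Back-of-the-envelope: average cost per request for one microservice over a
# billing window. The figures are invented for illustration.

def cost_per_request(total_cost_usd: float, request_count: int) -> float:
    """Total spend on the service for the period divided by requests served."""
    return total_cost_usd / request_count

# e.g. a hypothetical delivery-estimate service that cost $4,200 last month
# and served 300 million requests:
print(f"${cost_per_request(4_200, 300_000_000):.8f} per request")
```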
your costs should be going down. If you don't make any changes
to this, and not even maybe
some economies of scale, but also, you should be able
to re-architect to do profiling, to start looking at sort of moving maybe from one architecture
to Gravitons, all these different things
that you can do to actually drive your cost down. So, over time, the cost per request
to this microservice should be getting down,
and then on top of that, you have to figure out what
your transitive costs are, yeah. What's the cost of serving
this application or this webpage for you? What's the total cost, yeah? Can you figure that one out? Now, I'm going to show you
now a slide, and just like when I saw it
for the first time, you're going to scratch your head. This is the number of microservices
in the back end of amazon.com. Yeah? My request to the homepage
originates over there, goes out to all
the other microservices to construct this page for you, and you can dive deep
into them, in each of them. You can figure out what the cost
is of the individual apps, the ones that are actually, again,
making calls to other microservices, but you need to understand
the complete picture, the complete cost picture
that this one page, this homepage actually costs,
and you can, because remember, in AWS, each of the resources
that you've been using comes with a dollar tag associated with it. So, you know exactly the cost of
every single one of these services, and we know the cost
of the whole system, and then of course,
you need to figure out, you know, well, there's actually
the contribution of each one of these features
to my conversion rates, yeah, you need to also understand
the value of new features. Now, if you start actually
spending more money on actually creating
this page for you, you should see
your revenue coming up. You shouldn't see it
actually flattening out, because that means
you're making investments that have no return on investment, and there are, indeed,
diminishing returns at some moment. Now, one of the things that we are very strong at, and probably everyone else that has a web application understands this too, is the common knowledge that improving the latency of your webpages will improve conversion. So, if you build an evolvable
architecture, you probably also make it easy
to actually experiment. So, imagine that the 99th percentile of your webpage latency is 1.7 seconds. If you can engineer that down to 1.6, right, you know how much that's going to cost you, how many more resources, and you can see what the impact is on conversion, and at some moment, bringing down the latency no longer has a return on it. Measure that. Think about how to measure that, and make it explicit upfront. Make sure that everybody understands that. Yeah? You have to know your cost,
and I know it's often complex; as you saw, the application that I just showed you, the backend for amazon.com, is a pretty complex environment, but we have really made this our own, because we need to understand it. Retail margins are razor-thin, and we need to have total control
over our cost at any time. Now, I also know that
quite a few of you are literally running
hundreds of applications, and it's sometimes really difficult
to really understand sort of what are the metrics
that belong to this particular one application, and you've been asking this
for quite a while, and I'm happy to announce, today, you know, myApplications. It basically gives you a new experience in the AWS Console. It gives you visibility into cost, health, security, and performance per application, yeah? What you can do there is basically you have a new application tag. You assign that to the resources that make up your application, and then you get a single view of this observability into many of the standard functional requirements, non-functional requirements, and cost, and with cost, also a proxy for sustainability.
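For teams that want that same cost-per-application view programmatically, here is a minimal sketch using the Cost Explorer API, grouping spend by a cost-allocation tag; the tag key "application" is an assumption for illustration, so substitute whatever tag you actually assign to the resources that make up your application.

```python
# Minimal sketch: monthly cost grouped by an assumed "application" tag.
import boto3

ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-11-01", "End": "2023-12-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "application"}],
)

for result in response["ResultsByTime"]:
    for group in result["Groups"]:
        tag_value = group["Keys"][0]  # e.g. "application$checkout"
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{tag_value}: ${amount:,.2f}")
```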
Now, sometimes, it's hard to instrument your applications, yeah, especially these days, if you start off with Kubernetes,
for example, with EKS, and you're building
this distributed application in many containers and container
types and things like that, and instrumenting them in a way, so that you get a good holistic view
of them is not always easy, and you need to do it
in a consistent way. So, and I know that's a lot of work. So, I'm happy that today
we make available for you what's called CloudWatch
Application Signals, such that it will
automatically instrument the EKS applications
that you're building, yeah, so that you can have
one single dashboard, immediately looking
at all the metrics that are relevant
for your EKS application. With all of that, I want you
to walk away with this one, yeah? Define your meter, because if you can continue
to look at this meter, it will change the behavior and make sure that your meter
includes cost and sustainability. Now, another observation I had is that, you know, if you build cost aware architectures, you need to implement cost controls. Now, you can't just rely
on good intentions, you need to put mechanisms in place, and as such, you need to build
and have at your fingertips, and you have it in the cloud,
tuneable architectures, and remember, in AWS, the knobs always go to 11. Ah, come on, you must
have watched Spinal Tap. [applause] I don't care if you clap for launches or not, whatever, but if I make a joke, I would really like it
if you would laugh, yeah? [laughter] Now, okay, go back. What was I doing? So, software choices, like, you know, database types, APIs, languages, and well-designed systems with NFRs in mind, they allow this tuning, yeah,
and if you look, bringing it back to amazon.com again,
you imagine that, on this homepage, there are all these
different components, yeah, and you need to have controls
to manipulate those components. Imagine what you're seeing
is that either cost or performance or one of the other metrics that you're following is going out of bounds. You need to be able to switch some of those components off, and that's really important, but you need to build the switch, and the switch should be in the hands of the business. It should not only be something where you as a technologist make a decision. Yeah? It's a decision that you make
in concert with the business. Yeah? Because after all,
that's who we are serving. That's who we are working with, yeah, but you need to have these switches
and dials available, yeah, so that you can make these decisions. Crucial in all of that
is to be able to do decomposition of your application that you have. Start to figure out which
are the things in your application that are really, truly important,
medium important, maybe not that much. Yeah? If you think about Amazon retail
again, what is important? What always needs to work
to have the application work? Search, browse, shopping cart,
checkout, yeah? Without that,
we're dead in the water. The system, the application, doesn't work. We call that tier one, and then there's tier two. Tier two are maybe features
personalization, similarities. They are things that really
are important for the customer to actually discover the products
that they're looking for, but they're not part of the true
core of the application. One of the things
that actually moved from tier two into tier one is reviews. It turns out, if reviews are offline,
customers are not buying, because they trust
the opinion of their peers, and so, you have to make decisions then together with the business
to make these trade-offs. How much am I willing to spend
on fault tolerance of tier one? Probably as much as you can,
because that always needs to be on, because without that,
you don't have a business. Replicate over three AZs at minimum. Maybe for tier two, you're willing to actually
dial it down a little bit. Maybe for tier two, replicating over two AZs is sufficient, and for tier three,
best seller list, yeah? Who cares? Yeah? If they're offline for five minutes, it doesn't really have impact
on the customer experience. So, you may have a different type
of resilience there in mind, but make sure that all of
these pieces are controllable, and whether you switch them off,
or whether you throttle them, or whether you maybe
turn off prefetching. You know what we do: when you actually search for something, we look at which products you're most likely looking for and then start prefetching them, to make sure you're faster. Maybe you turn that off,
maybe fewer details, but all of these knobs, all of these controls
are for the business. You have to give control
to your customers. So, before all of that,
my advice there is, you know: establish your tiers. Start thinking about which are the pieces of my system that absolutely need to be up and running with predictable performance all the time.
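As a minimal sketch of what such tier-based, business-owned switches can look like, with illustrative tiers and feature names rather than Amazon's actual configuration:

```python
# Illustrative tiers and features only; not a real production configuration.
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    tier: int          # 1 = must always run, 2 = degradable, 3 = best effort
    enabled: bool = True

FEATURES = [
    Feature("search", tier=1),
    Feature("checkout", tier=1),
    Feature("recommendations", tier=2),
    Feature("best-seller-list", tier=3),
]

def shed_load(max_tier_to_keep: int) -> None:
    """Switch off everything below the given tier when cost or latency
    goes out of bounds -- a knob the business, not just engineering, owns."""
    for f in FEATURES:
        f.enabled = f.tier <= max_tier_to_keep

# e.g. during a cost or latency spike, keep only tier 1 and tier 2 running:
shed_load(max_tier_to_keep=2)
print([f.name for f in FEATURES if f.enabled])
```

The design point is that the knob is explicit and coarse enough for the business to reason about, rather than being buried in engineering-only configuration.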
Now, if you think about optimization, there is another,
what I consider lost art. You know, given that we have been
able to focus on really fast innovation, yeah, and we are actually,
we've been moving really fast, and something that we did when we actually living
within the constraints was really tinkering
at a smaller level, but that tinkering at a smaller level
is becoming more and more important, because it turns out quite a few of
your costs are actually going there. You need to start thinking about
what is sort of the digital waste that is laying around in my system. Yeah? What are the things
that I can just stop? Maybe the business
doesn't like it anymore. PBS said they had one
particular series that maybe someone watched twice
a month or something like that, and they still had it running,
and they managed to turn it off. When you go home at night, do you turn off
your development environment? You should. There's no reason to keep
that running at night, yeah? Or maybe right sizing. Move to a smaller instance
or maybe to a bigger instance, or more importantly,
move over to Graviton, so you really can drive
your costs down. Or maybe, you know,
start thinking about how to reduce the kind of capabilities that you
are presenting to your customers. Is it really necessary
to stream in 8K? Is it really necessary to send
this five-megabyte image over, that the browser then scales down to 600x100 pixels? Is that really necessary? Start becoming smart
with that, you know? Reduce it to actually
the amount of resources that you really need
for your application. Now, when I think about this
lost art, I think about profiling. I don't know how many of you
grew up with this, but this was in my toolbox
when I was in school. Now, you really need to be able
to dive deep to be able to understand exactly, at a functional level,
where your time is going. CodeGuru Profiler actually gives you this as well, next to the language profilers
that you just saw. Yeah, a profiler will, in general, generate something
like this flame graph. This is actually of a real
Amazon service. I'm not going to tell you which one,
but I think you can actually dive deep into this and figure out
where your cost is going. You see 8% is going
into garbage collection. That's your choice of programming
language over there, that impacts that, but there's
a large part left over here, and that's turned out
to be network communication, and that's kind of out of balance. That's not what we would expect
in that particular case, and then diving into the code, you suddenly start to understand
what happened there. Whenever they designed this, they had a common case in mind: maybe 99.9% would be the common case, and for the remaining 0.1% they would throw an exception. It turned out that they didn't really expect it to be the inverse of that: 99% of the packets that came in hit the exception, and so, by simply changing this exception handling into an "if then else" statement, we basically completely removed all the exception processing that was necessary there, and then there, this is the "if then else", and what happened then is that we actually went from 42% down to 27%.
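Here is a minimal, illustrative sketch of that effect in Python, not the actual service code: exception-driven control flow on the hot path gets expensive once the supposedly exceptional case turns out to be the common one.

```python
# Illustrative micro-benchmark; the payload shape and numbers are invented.
import timeit

payloads = [{"checksum": None}] * 1_000  # the "rare" case is actually every payload here

def handle_with_exception(p):
    try:
        return p["checksum"].upper()   # raises AttributeError almost every time
    except AttributeError:
        return "MISSING"

def handle_with_if_else(p):
    if p["checksum"] is None:          # a plain branch: no exception machinery
        return "MISSING"
    return p["checksum"].upper()

print("exceptions:", timeit.timeit(lambda: [handle_with_exception(p) for p in payloads], number=1_000))
print("if/else:   ", timeit.timeit(lambda: [handle_with_if_else(p) for p in payloads], number=1_000))
```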
So, this is actually a process that you continuously need to do. Even if you don't find,
let's say, these big disparities, you still have to understand
exactly where your cycles are going, and this is a continuous process. It doesn't stop after day one. You need to completely understand it, and it wouldn't be the first time that I've heard customers ask: why is the backend service for our iOS app so much more expensive than the one for our Android app? Well, maybe you should
start looking at how they're implemented, right? The power of profiling allows you
to be curious and dive deep. Now, the last one, my observation
may be a little bit more controversial, yeah, and so, please hold onto your egos
at this particular moment. Yeah? The most dangerous phrase
in the English language is: we have always done it this way. Yeah? [applause] The admirable Grace Hopper, you know, the grandmother
of all those developers here was a very wise woman, and, you know,
it wouldn't be the first time that a customer who says, yeah,
but we are a Java shop. Oh, we're really great at Rails,
yeah, we've always done it like that, and, you know, we've done this before
in my previous company. We're going to do exactly the same. You have to keep in mind that,
you know, development is quite often expensive, but the cost to build is dwarfed by the cost of operating your applications. That's something you have to keep in mind, and the way that you build your applications, the platforms you use, the programming language that you're in, should be continuously under scrutiny: are you picking the right one? Now, I mentioned earlier
that I thought that cost was a good approximation
for sustainability. Maybe the other way
around as well. There isn't that terribly
much research into how much certain
programming languages will cost you. However, there is brilliant research
by Rui Pereira at INESC in Portugal about the energy usage
of programming languages, and so, he first launched this paper in 2017, and it shocked the development world. He later released this paper with much deeper insight, showing different types of applications that were being built. It turns out Ruby and Python are more than 50 times as expensive, in energy terms, as C++ and Rust. Now, I know the reasons
why you wouldn't want to use C++ with all security risks
that there are, but there is no reason why you
should not be programming in Rust, if you are considering
cost and sustainability to be high priorities. [applause] We implemented Firecracker in Rust. Large parts of S3 are
really implemented in Rust, and not only because
the energy usage is lower, which we aimed for, but also because the security,
the strong typing, the memory safety that you get
in a fast, efficient language like Rust is very important. Now, with all of this, now,
I don't want you to immediately start sort of throwing away everything
that you know and starting over, but we as technologists live
in a world that is moving so fast that we always need
to continue to learn. We always need to disconfirm our own beliefs. Yeah? Put your ego aside as being
the master Java programmer but start thinking
about the cost, actually, and the complexity of how to deal
with garbage collection. Start thinking about that maybe
this massive platform underneath there, yeah,
maybe that's costing you a lot, even though it allows you
to do very fast prototyping. So, disconfirm your beliefs. Now, so, these have been sort of
my observations around cost and sustainability that I have learned over time
at amazon.com. Yeah? Cost awareness is a lost art. We have to regain this art,
mostly also, because sustainability is a freight
train that is coming your way, that you cannot escape
and should not escape, and cost is a pretty
good approximation for the amount of resources
that you've used. The constraints from the past,
I don't want to go back to them, but we may actually be willing
to have self-imposed constraints. Yeah? Put some constraints
around the systems that you're building in terms
of cost and sustainability. Yeah? That's why I believe
that constraints, even self-imposed,
can breed creativity. Now, you know, with that, I think The Frugal Architect
is live at this moment, not that there is that
terribly much information, but we would love to work
with you, actually, to incorporate your cost awareness
learnings you've gained over time as well. So, the site
is thefrugalarchitect.com. [phone ringing] Sorry, give me one moment. [phone ringing] [music playing] Not what you were expecting? Probably more private islands
and sailboats. Ah, not that Oracle. I am sometimes right about
the future, literally. That's why you're here. I need your insight. Everything fails all the time,
even the simplest of hardware. You see, I have been asked
to use my gift to make some tech predictions
for a La Predicta magazine, and I've heard that you are
something of a tech soothsayer. So, what are your predictions
up until now? Envisage, if you will, a world
where artificial intelligence is represented by an omnipresent,
benevolent attendant. It'll revolutionize industries
like healthcare, freeing medical maestros
from administrative burden. That's not really the future. You know, healthcare is already
deeply ingrained in very advanced analytics
and machine learning. Okay. How about this? Dare to dream of a future where developers are no longer
solitary mavens coding within the confines
of individual experience. No, the builders of the future
will dance side by side with AI in a celestial ballet
of organic/digital pair programming. I'm not feeling that either. Now, CodeWhisperer is already here. You know, that future is now,
no dance necessary. All right, all right. You are going to love this one. La Piece de Resistance, hoverboards. No, no, no, no. No, let me stop you there. Either way,
that's not going to McFly. Making tech predictions is tough. Well, the future
is not science fiction. To be able to make good predictions,
you have to think about the present, because the future is now. [phone ringing] Before you go, here. Have a cookie. Oh, no.
I only accept essential cookies. [music playing] So, I'm not an oracle, but… [applause] But observing the present
actually helps kind of predict the future,
and that is especially true at AWS, yeah? The kind of things that we are doing at AWS often define the technical future, and now, that is actually
really important, but I also think that
historical context is important. Look at the bigger picture. Look a bit back. You know, and I know that
we've all seen amazing innovations being presented to you this week
in the area of GenAI and LLMs and how we're going
to change development and how businesses
are going to change, but where did this
actually come from? What's the history of this, yeah? It goes back to two of my favorite
early Greek philosophers, Plato and Aristotle. Both of them were thinking about
whether machines could actually do
the tasks of humans, and they were thinking about
sort of what is actually… what is it that actually controls humans? Aristotle thought it was the heart,
the soul that actually drove humans, but Plato actually thought
it was the symbolics in your head, and actually, Plato went so far, if you read The Republic, as to create
a city state in that book, where actually, machines,
robots were doing the chores. Now, that was about 20 to 25 centuries ago? Not much happened for about 25 centuries in that sense, until the first computers arrived, and computers could do much
more than just calculations. They were capable
of more complex tasks, and as such, everybody
started thinking that, oh, maybe if the human is indeed,
you know, driven by sort of this symbolic
complexity in their head, maybe we can use computers
for that as well, and of course, one of our more important
philosophers of the last century, Alan Turing,
spent a lot of time on that. He really started to think about,
can machines, computers, think? Yeah, and his famous paper in 1950, Computing, Machinery,
and Intelligence is really sort of the groundbreaking
work that we still live by. We still talk about the Turing Test. Now, unfortunately,
Turing tragically died before he could join this
1956 workshop at Dartmouth. In this workshop, the term
artificial intelligence was coined for the first time, but still, most of
the researchers available there were from
the symbolic AI field. They're really thinking about
sort of, can we implement reasoning. Can we implement
the symbolic reasoning? Can we use mathematics
for those kind of things? Didn't really go anywhere,
not immediately, at least, and these automated reasoning
and things like that have become tools
that are incredibly important, but not necessarily in the field
of AI as we know it now. One of the things that we did start
to build in those days was called expert systems. I built a few of those
using Prolog, and I still don't like
the curtains behind it. You know? So, expert systems actually sort
of incorporated knowledge in rules, and they could execute
queries against it and sort of get answers back,
but they were very laborious, and they weren't, to be honest,
they weren't that terribly smart. The big breakthrough came
when we could see the shift happening from symbolic AI
to embodied AI, and what did that mean. Basically, the groups of researchers
who started to think, if we want, maybe, if we start to have
these basic building blocks that humans have to perform tasks, maybe out of that, we can build
artificial intelligence. This is mostly driven by the idea
of that you have robots. What are the kind of capabilities
that robots need? What are the kind of sensors
that we have that we need
to give robots, you know? Speech recognition,
image recognition, and even maybe sensors
that we, as humans, don't even have, like LIDAR. Now, can we build that ourselves? And that thinking actually has
driven us for the past 10-15 years, and we saw new algorithms arriving. Deep learning became important,
reinforcement learning, all those different types, and what we saw was software
improved, algorithms improved, hardware started to improve,
software improved again, and we saw a really
quick acceleration of all these different
algorithms happening, so that we could do better learning, better build these big models that
they could actually help us do tasks. Now, of course, the next step, the most recent one that has created
this sort of earthquake in the world of AI
has been transformers. The ability to use transformers
to build foundational models and to build these
large language models are actually a revolution
in all of this, yeah, but I'm not going
to talk about that, really. I really want to go back one step
and talk about what I would call
good, old-fashioned AI. Yeah? The cool thing is that now
we have good old-fashioned AI, and there is new AI. The new AI doesn't invalidate
the old AI that we were having, yeah, and so,
if I look at so many of my customers who have built amazing systems
using good old-fashioned AI, that I think you should keep in mind
that not everything needs to be done with these massive
large language models, yeah, and I'll pick a particular area. I've, over the years,
become extremely interested in those businesses that actually
are combining two things. They are trying to solve
really hard human problems and use technology to do that. Now, I could give you lots of
examples of companies that have done amazing things
with AI for now, but I will pick a few
out of my box of these companies that have built things
for AI for good. And so, one of them is one of my most
favorite organizations to work with. It's the International Rice
Research Institute. They sit just outside of Manila, and their mission is
to abolish poverty and hunger among rice dependent communities. You have to remember
that the prediction is that, by 2050, the population
has grown by another 25%. How are we going to feed them? How are we going to make sure
that that happens? So, they're an amazing organization. They have a massive seed bank,
a big freezer, actually, with 200,000 strains of rice in them. They can regrow any type of rice. They can also make improvements
of rice. For example, Golden Rice is
a good example that has much higher degrees
of vitamin A, which is very important
for certain communities. But they had a big backlog, because all these seeds are being sent to them, and humans need to sort them and try to figure out which of those seeds are actually useful and which they should be storing, and they got a backlog, and everything in the backlog starts to deteriorate. So, they basically make use of machine learning and computer vision, yeah, to automate this process. Not that the machine is now actually making the decisions about which seeds should go into the bank. Still, humans do that,
but the automation before that, the efficiency before that,
using vision, allows them to actually
remove the complete backlog. They improve their backlog,
that sorting productivity by something like 30-40%. Cergenx, another very
interesting company that I recently met
in Ireland, apparently, and this was not a problem
I was aware of, that most infants that may be born
with brain injuries that are not immediately visible,
often, those injuries don't actually aren't visible
until months or even years later. They have a very simple test
with a small cap that basically takes
an EEG of the baby and immediately can determine whether a baby actually may have
that particular injury. And as such, you know, you can
immediately start treating them, which actually improved
the quality of life for these babies
for a very long time. So, what they did, they actually
have all these scans. They put it in a sleeve,
ran it through SageMaker, created this model,
which is very unique, because baby EEGs are radically
different from that of adults. So, their goal is to make this brain
testing for infants as commonplace as the hearing test that each baby
is getting now as well. Another company, precision.ai,
you know, we've all seen drones. Drones are basically a box
full of AI capabilities. After all, we are not steering them. We are giving them a task. They need to go away. They need to go somewhere. They need to follow
a particular pattern. They need to avoid birds. They need to avoid washing lines. They need to do all these things
by themselves autonomously. So, they're a big AI box
to start off with. Precision.ai's task is to avoid spraying complete plots of land with chemicals to remove weeds. They fly over this patch of land,
create a map of it, and are then able to
attack individual weed plants, significantly reducing the runoff
of these dangerous chemicals in the creeks and rivers
surrounding the plot of land. Digital Earth Africa is one of my favorite organizations to work with. They make use of an open dataset
of satellite imagery of Africa, and this data is being used
by governments all over the place. You know, in Zanzibar for example, they're monitoring coastal erosion. Yeah? In Ghana, they identify
the impact of illegal mining. Do these illegal roads
being built go to these mines? In South Africa, Kenya, they understand the impact
of forest fires. All of this is driven
by this open data set of Digital Earth Africa,
and many of these organizations are using this to improve the life
of Africans across the continent. Now, all of this, and I think Swami
hammered down on this point as well, you know, without good data,
there is no good AI, and so, in the past, indeed,
we had all the structured data. If you think about patient records,
that was a structured record. These days, patient records
have all sorts of unstructured information
scribbled all over them, and so, here, suddenly, you have
a mountain of unstructured data, where you think this may be a
haystack that has a needle in it. So, how do you find a needle
in a haystack? You use a magnet, and the magnet
is machine learning, yeah? Basically, you use machine learning
to create meaning out of this mayhem, and this AI for now actually
gives you practical solutions for real problems, and it's incredible to see
what customers can build with these services
and the tools that we provide, and how, leveraging AWS,
it solves some really hard problems. A prime example of that is
an organization called Thorn. Now, our next speaker
is going to talk to you about child sexual abuse, and I know, I recognize
this can be difficult… this is a difficult topic. I encourage you to do what you need
to do, take care of your wellbeing. I'd like to welcome Dr. Rebecca
Portnoff on stage to discuss how Thorn
utilizes AI to protect children. [music playing] I am going to tell you a story,
a true story. A child is being sexually abused. We'll call her Maria. Maria's abuser is taking pictures
while he abuses her and then sharing these images and videos onto a content
hosting platform, hiding in plain sight among hundreds of millions
of other images and videos, but this particular platform
doesn't accept this. This platform uses
Thorn's Safer products. It uses Safer's child
sexual abuse material, or CSAM, classifier to find images and videos that could show a child
in an active abuse situation. On one day,
the classifier alerts them. They've got a hit. So, they go to work,
and here's what they find. A user has shared over
2,000 new abuse files. It's clear a child is being abused. So, they flag the case
to law enforcement who launch an investigation, and the child in that content,
she is found. Maria is found. An arrest is made. A recovery is complete, and for
Maria, a brighter future emerges. This isn't just a story. This is reality. This is the gravity of our work. We, as technologists,
we have the power to end a child's real-life nightmare. Our challenge: how do we find them
and stop the cycle of trauma? The answer to these questions
is buried in haystacks of data. So, then, what's that right magnet
to find that needle in a haystack? The answer is complex, and an essential part of it
lies with machine learning. According to the National Center
for Missing and Exploited Children, in 2022, they received over 88 million files
of suspected child sexual abuse reported to them by online platforms. These aren't just files. These are kids, kids
who desperately need help. Think about it. With 88 million files and even just
one second of review per file, that's going to be almost
three years of nonstop review. I don't want any kid to have to wait
that long to get help. We are a nonprofit that
builds technology to combat child
sexual abuse at scale. AWS is our preferred
cloud provider in this work. We leverage AWS services to power
our machine learning tools. I have dedicated my career to
defending children from sexual abuse. So, I can tell you
with confidence here today, machine learning, AI,
it does make a difference. We built Safer. [applause] So, we built Safer,
our all-in-one tool to detect, review, and report
child sexual abuse material. Safer uses hashing and matching
to find known abuse material, content that's already been verified by an analyst, and a classifier
to find new abuse material. My team at Thorn,
we built the classifier to act as that powerful magnet to find new child
sexual abuse material at scale. Now, when we first started building
the classifier back in 2019, there was already active research on the use of convolutional
neural networks for detecting child sexual abuse, but we needed a classifier
that went beyond research, something that worked
at a production scale. When we build a classifier,
we follow the CRISP-DM process. As you can see, it's really
all about the data, but within this broader framework,
we had to overcome a key hurdle. This data is illegal. Child sexual abuse material
is illegal. You can't store it in the same places
or the same way as other content. So, the solution was to collaborate. We invested in hardware
installed on site at organizations with the legal
right to house this data, training the classifier on-prem, and then using Amazon's ECR
to distribute that trained model to end-users. So, we've got this critical
training infrastructure in place, and now, we can build,
starting with data prep. We use techniques like perceptual hashing to de-duplicate the dataset, ensuring that there's no overlap between the training, testing, and validation sets.
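A minimal sketch of the perceptual-hash de-duplication idea, using the open-source imagehash library; this is my own illustration, not necessarily what Safer's pipeline uses:

```python
# Illustrative perceptual-hash de-duplication using the open-source imagehash
# library (a sketch, not necessarily what Safer's pipeline uses).
from pathlib import Path

import imagehash
from PIL import Image

def dedupe(image_dir: str, max_distance: int = 4) -> list[Path]:
    """Keep one representative per group of near-duplicate images."""
    kept_hashes: list[imagehash.ImageHash] = []
    kept_paths: list[Path] = []
    for path in sorted(Path(image_dir).glob("*.jpg")):
        h = imagehash.phash(Image.open(path))   # 64-bit perceptual hash
        # Hamming distance between hashes; a small distance means a near-duplicate.
        if all(h - existing > max_distance for existing in kept_hashes):
            kept_hashes.append(h)
            kept_paths.append(path)
    return kept_paths
```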
We use Amazon S3 to store our non-abuse material, and this data is just as important as the abuse material for training our classifier. Now, model training via remote access to an on-prem solution,
it's got its challenges. It can be slow and opaque. So, we use Amazon EC2 and EKS
to do R&D with benign data first, debugging and fixing any issues
we may find in our training pipeline before training on-prem. In the machine learning
AI lifecycle of develop, deploy, maintain,
that maintain part is key, because models get stale,
and models have bias. So, what this means is monitoring performance and regular retrains. What's our performance holistically? Then, drilling down: is our model better at classifying abuse material of lighter-skin or darker-skin children? What kind of trends are we hearing from users about false positives and false negatives? Because to fix a problem, we have to first know what that problem is. Then we balance our datasets, adjust the model weights, refresh our training dataset with new images, and retrain.
false positives to incorporate
back into our training. These false positives are often
the most valuable negative examples that we have, because they allow us
to do targeted retrains and improve performance
on data in the wild. Now, none of this hard work
that I talked about matters at all if you can't get the classifier
into the hands of the users. So, effective model deployment
is key, working collaboratively
with engineering and product to find the right solution
for the user's needs, and when I say effective deployment, I'm specifically thinking privacy-forward and human-in-the-loop. I am proud and thankful for how
Thorn's engineers deploy the classifier in a way that
the customer has full control over when and how the content
is reviewed and reported, because the classifier,
it acts as that powerful magnet, but it should be a human making
the final call on what gets reported. Now, we are all here today because we want to build
something that has impact. So, what's that impact look like
for Thorn and our partners? It looks like the outcome
of the true story I told you earlier: a child gets her life back. It looks like over 2.8 million
potential files of child sexual abuse
found via Safer. It looks like constant innovation. This month, we launched
Safer Essential, an API-based solution
for quick detection of known child sexual abuse material. This is a future where every child
is free to simply be a kid, and everything we build at Thorn,
it's to get to that future, but that technology by itself
is never going to be enough. Having impact requires all of us
to work together, content hosting platforms, law enforcement,
government, survivor services, and the community,
including you, you in this room, and I want to do
that together with you. I want everyone here to join Thorn
in our mission. I want you to show up
at the nonprofit impact launch today and ask my amazing
colleagues at Thorn how you can use our technology,
and I want you to go to thorn.org and learn, learn about
what these kids are going through, and what technology
can do to help them, and then I want you to pick up
your laptop and build, build what these kids need. There are still countless victims
who are suffering, but we have the power
to help them, if we work together. Thank you. [applause/cheering/music playing] This is the power of technology
and the power that we have, yeah? Make sure that, you know,
the call to action that Rebecca just gave us
doesn't go to waste. We have that power to absolutely
make a difference as technologists. With the right technology
and the right access to data, we can have a really
positive impact on the world, and technology can be
a force for good, yeah? That's actually a lesson
I learned way before I ended up
in computer science, yeah? That's me on the right there. Yes, I had hair at some moment. Before I started studying computer science, I worked in radiology, in both radiodiagnostics and radiotherapy, at the Dutch
National Cancer Research Hospital, and a lot of technology
was already part of that, not just the fundamental
x-ray technology, but, you know, CAT scans, MRI scans, nuclear imagery,
all was sort of starting out. By the way, that is 40 years ago. This is not, by the way,
a fancy gen AI picture, yeah? The person sitting in front of me
is Frank, Frank Delleo, to whom I owe
my computer science career. Frank is a radiologist, and Frank was probably the most
passionate medical doctor I've ever met in my life. Frank wouldn't go home at night
until he had all his work done. That meant, at 10:00,
11:00 o'clock at night, he would still be sitting in front,
looking at his imagery, because he knew that if he wouldn't
finish it that day, the next day would
start with a backlog, and he would have that amount
of work yet again, and so, he worked
really long hours, yeah, and sort of, I really admired him, but he kept hammering on me
that I should leave the profession, because he felt that, given that I had a sort of
an affinity for technology, he felt that my skills
were much better used in actually building technology
that could help people, yeah, and I really
took that to heart. I basically went back to school
again, and it's his fault. It's also his fault
that I'm here now. Yeah? So, started thinking, you know,
and I have a real affinity still for all the work
that we were doing in those days, and it got me thinking, you know,
can I build something useful with ML? Can I do that myself? You know, after all, I've been
telling all of you for years now how easy it is to integrate ML
into your applications, yeah? However, I needed to make sure
and really wanted to do this myself. So, what do you do? Actually, I tried to catch up
on 40 years of radiology work, innovations, talk to doctors,
go to hospitals, understand what the current
problem space is, and try and find
one particular problem, a small problem that maybe
I could try to see how much work it would be to build
a solution for that, yeah. And I actually went and took some courses at the ML University, spoke to AWS ML experts and things
like that just to get me up to speed, not just about the principles,
but about the practicality of it, yeah. And then, in this hospital in Dublin, I talked to one of the stroke specialists, and he hammered into me that every second counts when a person has a stroke. A stroke patient loses
1.9 million neurons a minute if they're not being treated. So, quick treatment is crucial. Okay, that sounds like a reasonable, simple enough problem that I can attack: looking at brain imagery in X-rays. If it looks like this, in case you can't read it, the big white spots there are basically blood hemorrhages in the brain. If they were this big, it would be easy, but often they're not, you know, they're microlesions or really sort of continuous micro strokes and things like that, that you can still detect using CAT and MRI scans, but it's much harder,
and so, I started off thinking, like, you know, can I build an ML pipeline that can decide whether, on an image, there is actually a brain hemorrhage, and then, if you find a positive, prioritize that image in a radiologist's worklist so that it can be immediately evaluated? As always, if you don't have good data, you don't have good AI. So, that was immediately my first stumbling block. How would I get access
to data and imagery that I actually could
train my model on? Well, it actually turned out
that was not that hard. Kaggle has a dataset with 700,000
pre-labeled CT brain scans. That allowed me to actually
immediately get off to a start. So, I downloaded that set and put it in S3, and as always when you do this, you should split the dataset up front into the set you're going to train on, the set you're going to validate with, and the set you later use to test whether the model actually worked, yeah. 70/15/15 is a common practice there.
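As a rough illustration of that 70/15/15 split (my own sketch, not the code from the talk; the CSV name and label column are assumptions):

```python
# Illustrative 70/15/15 split (not the talk's actual code; the CSV name and
# label column "any_hemorrhage" are assumptions about the dataset layout).
import pandas as pd
from sklearn.model_selection import train_test_split

labels = pd.read_csv("labels.csv")  # hypothetical: one row per scan, with a binary label

# 70% for training, then split the remaining 30% evenly into validation and test.
train_df, holdout_df = train_test_split(
    labels, train_size=0.70, random_state=42, stratify=labels["any_hemorrhage"]
)
val_df, test_df = train_test_split(
    holdout_df, test_size=0.50, random_state=42, stratify=holdout_df["any_hemorrhage"]
)

print(len(train_df), len(val_df), len(test_df))  # roughly 70% / 15% / 15%
```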
Next thing to do: fire up SageMaker, push this button, and... now I have to write some code. Pretty common there is that SageMaker uses Python, and as you may have realized, I'm not necessarily that proficient in Python, but the SageMaker Python SDK actually has a whole bunch of built-in algorithms and pre-trained models from popular open-source model hubs that you can immediately use. So, you know, I selected an architecture that I'd researched, and I can fine-tune it with my dataset. So, the only thing to do now is actually to add the code to launch the training job, push a button, the training runs through, and this model gets created. Wow! That's easy.
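To give a sense of what that button push looks like as code, here is a rough sketch with the SageMaker Python SDK; the entry point script, instance type, versions, and S3 paths are assumptions for illustration, not the exact setup from the talk:

```python
# Rough sketch of launching the SageMaker training job from the Python SDK.
# entry_point, instance type, framework/Python versions, and S3 paths are
# assumptions for illustration, not the exact setup from the talk.
import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

estimator = PyTorch(
    entry_point="train.py",           # hypothetical script that fine-tunes the chosen architecture
    role=role,
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    framework_version="2.0.0",
    py_version="py310",
    hyperparameters={"epochs": 10, "lr": 1e-4},
)

# Point the job at the 70/15/15 splits in S3 and push the button.
estimator.fit({
    "train": "s3://my-bucket/ct-scans/train/",       # hypothetical bucket/prefixes
    "validation": "s3://my-bucket/ct-scans/val/",
})
```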
However, in that button push that I did, there was a lot of work happening, yeah, and I don't know if you know much about sort of deep learning and how these models are built with multiple layers and things like that. Basically, you know, you push an image through, it goes forward through the layers, and a loss score is calculated. The loss score needs to be as low as possible, right? The lower the loss score, the higher the accuracy of the model that you've built. Okay, if the number is high, you basically go back, you backtrack, and you adjust the weights and the biases in the model, and then you go forward again to see what the outcome is, and you go backwards again, and you do that for 700,000 images, or, no, 70% of the 700,000 images, yeah. So you end up with forward pass, backward pass, forward pass again, and all of this happens under the covers when I push that one particular button. Pretty amazing.
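Written out, that forward/backward loop looks roughly like this; a generic PyTorch sketch of the idea, not SageMaker's internals:

```python
# Generic PyTorch sketch of the forward/backward loop described above
# (illustrative; not SageMaker's internal implementation).
from torch import nn

def train_one_epoch(model, loader, optimizer, device="cuda"):
    criterion = nn.BCEWithLogitsLoss()             # binary label: hemorrhage or not
    model.train()
    for images, labels in loader:                  # roughly 70% of the 700,000 images
        images, labels = images.to(device), labels.to(device)

        logits = model(images)                                     # forward pass
        loss = criterion(logits, labels.float().view_as(logits))   # loss score: lower is better

        optimizer.zero_grad()
        loss.backward()                            # backward pass: gradients for weights and biases
        optimizer.step()                           # adjust the weights and biases
```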
And so, out of that, you get a model, and again, with one click of a button, I actually get an API endpoint for this model, where I can start putting my imagery through and get a prediction score: the likelihood that this CT scan brain image actually has a hemorrhage in it, and it actually turns out, it works pretty well.
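That "one click" deployment, continuing the earlier estimator sketch, would look roughly like this (the instance type and payload handling are assumptions):

```python
# Continuing the estimator sketch above: deploy the trained model behind an
# endpoint and score one image (instance type and payload handling are assumptions).
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",     # small, CPU-only endpoint
)

score = predictor.predict(image_array)  # image_array: a pre-processed CT slice (hypothetical)
print("hemorrhage likelihood:", score)
```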
I was quite proud of myself when I built this, and of course, then, you know, being the person that likes evolvable architectures, I started looking at all these different AWS components around it, you know, to reprioritize the radiologist's worklist, and then something came to mind. When I visited that hospital in Dublin, the neurologist actually said that he'd rather get woken up at night at 3:00 AM for a false positive than not be woken up at all, because every minute counts. So, he doesn't want to wait until the radiologist gets to his reprioritized worklist. He wants to get an SMS at night, at the moment that my model detects that there is a brain hemorrhage. Evolvable architecture: I added SNS, which can actually send an SMS
to the neurologist, and he can immediately, on his phone, take a look at the image and decide whether or not he should immediately jump in his car and drive to the hospital.
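That notification step is a small piece of code; a sketch with Amazon SNS via boto3, where the phone number, threshold, and message are made up for the example:

```python
# Sketch of the SMS notification with Amazon SNS via boto3 (the phone number,
# threshold, and message are made up for this example).
import boto3

sns = boto3.client("sns")

def notify_if_positive(score: float, threshold: float = 0.9) -> None:
    if score >= threshold:
        sns.publish(
            PhoneNumber="+353000000000",   # hypothetical on-call neurologist
            Message=f"Possible brain hemorrhage detected (score {score:.2f}). Please review the scan.",
        )
```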
Now, it is that simple. It is not that hard to build these models. You know, anybody who is telling you about AI and ML and how hard building these models is: it is not that hard, you know. And actually, by the way, this is in no way a production system. This is my hacking on a Friday afternoon, five weeks in a row, and of course, you can start augmenting it, because, you know, as always, when you've done one thing, you want to do all the other things. You could, for example, use class activation maps, which indicate what it actually is in this image that the model was looking at.
an interactive cycle. Now, as I said, this is in no way
a production system, but however, I'm very, very happy
that a number of people on the AWS team
actually did pick it up, and I'd like to thank Priya, Wale,
and Ekta for actually taking my horrible code
and turning into something that might be a learning experience
for you guys as well. So, all of this is available for you
on GitHub to experiment with and to see how easy it is to actually
start building these ML models. This is really what I want you
to walk away with. If I can do it, you can do it. You know? [applause] Important in all of this,
and we've talked about large models as well as small models; what is important
in this particular case is that the model is small, fast,
and inexpensive. Why is that? Because the hospital
will want to run this for every brain scan
that is happening, yeah, and that is many,
many of them a day, and they would really like
to run the model locally, not require some massive
compute power behind it. So, for them, small, fast,
and inexpensive is crucial to make this technology
work for them. After all, if you're using ML, you should still be
a frugal architect. Now, talking to radiologists
about sort of the future and what kind of
the newer types of AI have there to offer to them, and if you think about that,
in talking to them, they would really like
to have more of these what are called
conversational interfaces, because you have to keep in mind
that a CAT or an MRI scan or nuclear imagery
is not just one image. It's literally hundreds of thousands
of slices from your body in digital format. We just happen to make
imagery out of it, because that's the sense
that we have. They will really, and often,
you know, patients come to a hospital not with a clear symptom
that really leads to one diagnosis. Now, often,
these are sort of unknown, and the radiologist's job
is much more like an explorer, trying, more than just
being an image reader, and they really look forward
to a world where there are AI based
radiologist assistants that could allow them
to explore the data that they have at their fingertips,
not just looking at images, and it also could include
other clinical data. So, suddenly, radiologists get
a 360-degree view of sort of the state
of the patient, yeah, and it really helps driving diagnosis much faster than
they could do before. They're really looking forward
to the next generation of AI that helps them to build these
conversational assistant interfaces. Those interfaces, though,
will still make use of these small, fast, and inexpensive models
on the side as agents to dive into specific problems
that a patient may be experiencing. Again, I want to hammer this down. I think Swami did yesterday as well. AI makes predictions. Professionals decide. They're assistants. They don't make
the decisions for you. We, as humans, are the ones
that make the decisions. Now, think about us as builders. You know, think about all the news
that we've heard in the past days. What's the impact of that
on our profession, on our jobs? Yeah, and it always has been
my passion to help builders be successful, and I hope that some of the tools
that you've seen this week actually are going to change the way
that you built your systems, and I think there's two ways
that we see sort of generative AI impacting our world. Yeah? On one hand, how to incorporate
generative AI into the application that you're building,
yeah, and I actually, given that I'm a big fan of the CDK, the Cloud Development Kit,
you know, this is a project on GitHub that actually has the CDK
constructs for generative AI, meaning that, if you're building
an application with the CDK, you can build data
ingestion pipelines. You can build question
answering systems, document summarization,
Lambda layers, all these kind of things,
straight out of the CDK. Check it out. The other way is, and I think
that's probably going to have the biggest impact on all of us,
is the collaboration between us and these coding assistants
that are arriving now. Many of you have already been
familiar with CodeWhisperer, yeah, and actually,
I don't know if you saw this, but when I was developing
the radiology application, I actually used Python, yeah, but CodeWhisperer helped me
actually implement that. Yeah? I didn't have to think about that. It actually helped me
sort of navigate these APIs that are unfamiliar and a language that is not that familiar to me. You might also have noticed
that I wasn't using a notebook. Now, Jupyter Notebooks
are the common way in which you sort of describe
the machine learning project
that you're working on. I'd rather work in VS Code than in
a new development environment. So, I'm happy, I don't know
if you noticed that, it was sort of an Easter Egg
in that part, but you can now fire up a code editor
from inside SageMaker Studio. Yeah? It's based… [applause] It's based on the open-source version
of VS Code and allows you to actually,
within SageMaker Studio, work in the environment
that is already familiar for you, which is VS Code, and you can launch
this full-blown code editor with all the additional AI powered,
tools like CodeWhisperer, directly in the IDE from SageMaker, but I think the biggest part,
and I think I hinted at that before, we have a lifelong learning
ahead of us. Now, technology has changed
rapidly in the past, what is it, a year. Imagine what's going to happen
in the coming year, or in the coming five years. We need to be able to stay
ahead of that, and for you, it's important
to learn and be curious. Yeah? It helps you. It'll help you to actually
accept these tools that allow you to explore
new problem spaces and languages. It's becoming a creative tool
to a sounding board for your ideas and approaches, yeah,
and in Adam's keynote on Tuesday, you heard about Amazon Q,
and I think this is absolutely set to transform many aspects
of software development. Yeah? You suddenly get an expert
assistant in building on AWS sitting next to you. That will reduce busy work
and free you up to do more higher-value work, and especially, if you think about
sort of the cloud space with all technologies in
and around it, it is mind blowing. This is too much for a single person
to be an expert in. However, these AI assistants
can be an expert in all of that and can help you work through that. Then it can go from cloud
computing solutions, machine learning algorithms. Options are endless. You know, this is not a bad thing. This is a very exciting time
to be a developer. It really is too much for a single
person to keep in their head, and that's where Q comes in. Now, I could take an example
from earlier in the keynotes. It could say, what AWS services do I need to start
building machine learning models, and Q will give me a starting point, or, you know, and it's not
just a one-shot thing. It's not just a copy-and-paste
kind of thing. You ask, and you iterate back and forth, so it understands all of these different things that you want to achieve, yeah, and it helps you make a plan for how to attack the problem space that you're looking at. It's more than just questions. Yeah? Q actually is integrated
in quite a few other pieces, yeah? For example, in CodeCatalyst. It can help you start a project
immediately inside CodeCatalyst. So, it can generate an entire
new feature for you, or a new approach,
and I find it a really great way to actually start learning
about technology, even though, you know,
you may even kill the pull request that it created for you, it's just a great learning tool
to understand the complete code base, to understand not just
at a file level, but at an overall system level. And of course, you know,
Q sits in your IDE. So, it can help you there. You can ask it to explain
or create code for you, and, you know, through conversations, it can adjust and iterate,
and in the fullness of time, we will see Q operate
on each of the different pieces of our development pipeline,
yeah, and one example, for example, is the use of Q
and Application Composer. I announced that last year,
and I'm happy, actually, that Application Composer
now is available within VS Code. This means that you can have
your YAML file, now, your CloudFormation file, and a visual representation
of that file in VS Code. Making changes in the code, you'll see them represented visually, and if you make changes visually, they are immediately reflected in the code on the other side. Yeah? [applause] It is cool. Okay, fine, yeah, but Q is also
in there, remember? Q sits inside VS Code as well,
and you can ask Q questions about CloudFormation
and then insert a response in it, and, you know, you can
change your YAML file, and the cloud formation file
and see this, and this multimodal way
of question answer and diving deeper
is something really, it's a really interesting
new paradigm, I think, that will help us, as builders,
have a lot more fun in our world. Now, with all of that, you know, you have one more day
of learning ahead of you. I hope that I gave you
a few hints today about being a good frugal architect, both in terms of cost,
as well as sustainability, and that if I can build ML models, you certainly can. Yeah? I think there's never been
a better time to be a builder. So, you have one more day ahead
of learning tomorrow as well, a few more sessions, but tonight,
tonight, we're going to party. [cheering/applause] Yeah? Major Lazer on the main stage,
Portrait of a Man on the live stage. Yeah? With all of that, now, go build. [cheering/applause] You're going to have
to make a choice. Hit or stay. Damn! What? That is so predictable. So, what did I miss? Nice keynote Werner, but just one thing. You said
I could scan my container images straight from my CI/CD pipeline. Yeah, I looked into it. Doesn't exist. Maybe
we just haven't released it yet. I knew it! Ready, Werner? Do it. Three, two, one. [cheering/applause] Now, go build. [music playing]