[sound effects] It's for you. Werner. Hello, Werner? - Who are you?
- I am the architect. I've brought you here to see this, - my system.
- What, that? Is that a Perl script? Listen, Werner, there's a problem with the system, ergo, I've brought you here. It's glorious
in almost every single way. Every single one of them is one of my babies. I've even named them. However,
every time I add a new service, I have to build a new rack. Ergo, it's becoming very expensive. Also, it's very hot in here. Have you considered cloud migration? Cloud migration? I think we need a montage. [music playing] No servers. More! Do it again. One more time. Come on. Wait. So, you're saying
I can scan my container images straight from my CI/CD pipeline? [music playing] Yes! Look, you need to build
with cost in mind from the outset. You need to be The Frugal Architect. The Frugal Architect. [phone ringing] - Werner, wait.
- What? - Can I get some free credits?
- No. [sound effects] Please welcome the Vice President
and CTO of amazon.com, Dr. Werner Vogels. [music playing] Wow! I'm absolutely blown away. This is day four. You're supposed to be in bed, yeah,
and I really, really absolutely, I'm totally humbled by the fact that,
I mean, I arrived here at 6:00 AM this morning, and you guys
were already standing in line. That is, I don't deserve that, absolutely, but who does deserve applause, by the way,
is the quartet that just played. So, can I get your hands again? [applause] Someone else, some other people that actually need you
to give some applause, can I get the heroes
to stand up on the front line? Yeah. [cheering] These are invaluable members of our community. They are what makes this whole community special, and hopefully, also,
why you're all in this room today, and especially,
I want to give a shout out to Luke, who got a Now Go Build award
this time around. [cheering] Okay. Thank you. I know you probably all cringed
a little bit when I said cloud migration. Now, after all, I actually noticed that, for quite a few of the people in the room, you've grown your career in the cloud. For you, there was no pre-cloud,
no hardware, no constraints like that, but, you know,
the great thing about sort of moving out of that whole
hardware environment into the cloud was that we suddenly could build
these architectures that we always wanted to build. Now, we no longer
were constrained by, you know, the physicality
of all those servers, and also, I could actually, as a distributed systems guy, actually start talking to my customers about how to really build reliable, big scale systems, but, you know, there was something about all of that throughout. If you get as old as me,
then you start seeing the past with a little bit
of rose-colored glasses. There were things in the past,
actually, in this world, that we had these
hardware constraints that actually drove
a lot of creativity. Now, I'm going to talk a bit to you about the cost of all of those systems that we're building, and I'm going to drive that
by actually my experiences as the CTO of Amazon
for the past 20 years almost, yeah,
and if I think back about, being in the pre-cloud days
of Amazon, the retailer, we were really good
at predicting sort of how much capacity we needed. We could sort of maintain the discipline to make sure we had 15% hardware over the expected peak for that year, but still, unexpected things could happen to us. Now, I remember that one day, in the days that Wiis and PS4s
were very scarce, someone posted a message
somewhere saying, tomorrow, Amazon will have 1,000 Wiis for sale
at 11 o'clock in the morning. Well, you know what happens
at five minutes to 11:00, eh? F5, F5, F5, assuming you're
a Windows user, yeah, and so, traffic explodes,
and absolutely, we really had to… we worked our way around it, and we were very creative
in trying to solve the problem to make sure that all the other
customers still could be served, but still, you know,
it'd be quite a lot of work and a lot of handholding,
but more importantly, there were also restrictions
on business innovation because of that. A few weeks before Black Friday, a team would come to me and say, oh, we have this brilliant idea to do X, Y, or Z, and it will give us, what is it,
additional revenue at this much. You would scratch your head
and think, like, how are we going to do this,
but we always made it work. There was sort of an art in building these systems and living within
the constraints that you had. Now, cloud, of course,
removed all of those constraints. Now, suddenly, you were
no longer constrained. You could do all these things. I didn't have to have long
conversations with the business about sort of
reducing the footprint of things. Instead, you could do everything,
and as always, when constraints get removed,
when we throw off the shackles of something that keeps us down, we have a tendency
to swing this pendulum all the way to the other side. Suddenly, what is
the most important thing is actually to move fast,
to get new products out, to start thinking about all
the things you could do now that you couldn't do before,
yeah, and that's amazing, and we have seen amazing innovations
in the past 15 years happening on top of AWS, but as speed of execution
becomes more important, we kind of lost this art, this art of architecting for cost
and keeping cost in mind, and if you've,
this is your 12th re:Invent, just like it is for me, you may go back to this first
re:Invent in 2011, and I put up these sort of,
I think there were 12 sort of tenets that I thought you should think about
when you were building for the cloud, yeah, and one of them
was to architect with cost in mind, because suddenly, you could, yeah? Remember, we were making cost explicit for all the resources that were being used, and so, it was very, very easy
for you to start thinking about how, what is actually
the cost of this system that I'm building right now, compared to the other one
that I built yesterday, and so, as you no longer
have these constraints, it drove amazing innovation, but the macroeconomic climate
sometimes changes, and more noticeable
in the last few years is that companies, more and more, have become interested in sort of
what is this all costing me, and so, I hope that today, you know, you're going to listen
a little bit to me about my experiences
of the past 20 years of building cost aware architectures, yeah, and by the way, you've already seen
amazing innovations and announcements this week. This is not going to be
one of those keynotes. So, sit back, take out your notepad,
and start making notes today. And so, many of us don't have to live within these constraints anymore, but there's quite a few
companies that actually do, and a great example of that
is the Public Broadcasting Service. You all know that, you know, they make all these programs
for their affiliates, and their famous tagline
is, of course, supported by viewers like you and me, but they have to live
within a strict budget, and so, it's not only that
they provide all these programs for the affiliates,
they also stream all content, and at the 40th anniversary
of Sesame Street, they completely broke down, because they were streaming out
of their own datacenters. It was 2009, and they knew that they
couldn't continue to do this, because they just couldn't
afford the hardware to do this at massive scale. So, they migrated over to AWS,
and, you know, as always, if you just do lift and shift of something that wasn't scalable and efficient in your own datacenter, it certainly isn't scalable and efficient in the cloud either. But they wanted to start off, at least, with the existing software, and so, they made use of OpsWorks and time-based scaling. That basically meant that they were
still not very resource efficient, which is crucial for them,
because the money they can save, they can do a lot of other things
with, yeah, and so, while they actually moved
to the cloud, they also started to realize
that they had to re-architect, and they re-architected making every, using every possible AWS
service that they could, and so, they were streaming directly
out of S3 and out of CloudFront. They moved over to ECS and Fargate, really absolutely driving
all of that cost down. They actually reduced
their streaming cost by 80%, and if you were a fan
of the Ken Burns documentaries, just like I am, they recently had this documentary
called The American Buffalo, extremely popular,
and a great documentary, and it went off without a hitch, with 80% cost savings at the same time, not only because
they moved to the cloud, but because they rearchitected
for the cloud with cost in mind. Now, next to cost is something else
that is really on my mind these days, and it should be
on your mind as well. This is a freight train
that is coming your way, and you cannot escape it. I think, in the absence of us providing you with sort of the information about the milligrams of CO2
used by your services, cost is a pretty good
approximation for sustainability, and we've got quite a few companies
that are asking us to really help them
build more sustainable architectures, and I think we as a society, as a tech society, as technologists, have a major role to play in making sure that our systems are as sustainable as they can be, and remember, in AWS, you know, you pay for each
individual resource used, which means that cost
is a pretty good approximation for the resources that you've used, and as such, your contribution
to sustainability. Now, throughout this talk,
when I say cost, I hope you also keep your mind
sustainability at the same time. Now, a company that is actually…
it is actually the lighting house. This is one of our oldest
European customers: WeTransfer, and I don't know
if you know them. They actually support having
these very large files that you can upload
and then distribute. They reorganized themselves
as what's called a certified B-corp, a company that has the highest
standards in environmental, social, and fiscal transparency,
and they are still able to innovate while actually lowering their
emissions at the same time, and their server usage was
their biggest energy dependency, and they re-architected
in such a way, so that they could
start to forecast, track, and measure carbon emissions, while serving 80 million
people a month, and they've done this with some very unique strategies, and you'll see
some of these strategies coming back into
my experiences as well. Now, if you look at the video
before that, I presented the architect
with The Frugal Architect, and basically,
this is sort of a book, where I've sort of encoded
my experiences of the past 20 years
of building cost aware, sustainable architectures,
and I think, as builders, we really need to start
thinking about this, not only because we want to be frugal
in the way that we use our resources, but also, as sustainable as possible. Now, these are not hard rules,
and I call them laws, but they're not like legal laws. They're more like biological and physical laws, where you have lots of observations, and then you codify those in a framework. Of course, you know, nature doesn't care about those laws. The laws are for us, so that we have a framework to think in, and they're not hard rules, yeah, but to actually give it
a little bit of structure, I've put them
in three different categories: design, measure, and optimize. And I want to start off with probably
what's the most important thing, yeah,
and the most important thing here is that cost needs to be
a non-functional requirement, and if you think about
non-functional requirements, now, there's all these
sort of classical ones. Security, compliance,
performance, availability, all of these are actually things
that are not the specific features and functions of the application
that you're building, but the ones that you have
to keep in mind at all times. Now, I think security, compliance
and accessibility are non-negotiable, and the other ones you can make all sorts of trade-offs on. There are actually two other ones that I believe should be in this list
as well: cost and sustainability. Both of them should be treated
at equal weight when it comes to non-functional
requirements for your business. Now, it is easier these days
to measure cost. Now, if I go back to my early days
as CTO of Amazon, you know, I basically had
to write a big check upfront to this database company before I could start
using their architecture, and it wasn't just a little check. It was a big check, because I needed to think five years ahead: how much capacity do I think I will need five years ahead? Because it was the only way
to drive costs down, and so, it was very hard on day one, or on the second year
or the third year, to think about how much of that cost is actually going into the systems
that I'm building at this moment. In AWS, of course,
that's radically different, yeah? You pay as you go for the resources you use, and actually,
when we started building S3, as being the first really big AWS service, we had to think about
what kind of resources are we using, what is our cost that we need
to expose to our customers, because we want our pricing model
to be cost following. Cost following means
that we expose our costs to you. Now, we sat around the table,
and we were thinking, well, what are the two biggest costs that we are going to have in this that we need to put into the pricing model? Transfer, of course, bytes on the wire, and storage. But when we started onboarding our very first customers, we started to realize that there were more resources being used by them, and we actually needed to add a third dimension to how we would price this service, and the third dimension was the number of requests. Yeah.
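To make that concrete, here is a minimal sketch of what a cost-following price with those three dimensions looks like; the rates and workload numbers are invented placeholders, not actual S3 pricing.

```python
# A minimal sketch of "cost following" pricing with three dimensions.
# The rates below are purely illustrative placeholders, not real S3 prices.

ILLUSTRATIVE_RATES = {
    "storage_per_gb_month": 0.023,  # $ per GB stored per month (placeholder)
    "transfer_per_gb": 0.09,        # $ per GB transferred out (placeholder)
    "per_1000_requests": 0.005,     # $ per 1,000 requests (placeholder)
}

def monthly_cost(storage_gb: float, transfer_gb: float, requests: int) -> float:
    """Estimate a month's bill across the three cost dimensions."""
    return (
        storage_gb * ILLUSTRATIVE_RATES["storage_per_gb_month"]
        + transfer_gb * ILLUSTRATIVE_RATES["transfer_per_gb"]
        + (requests / 1_000) * ILLUSTRATIVE_RATES["per_1000_requests"]
    )

# A workload that stores little but makes many small requests is dominated
# by the request dimension -- the usage pattern the very first customers exposed.
print(f"${monthly_cost(storage_gb=100, transfer_gb=50, requests=50_000_000):,.2f}")
```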
What I want you to take away from this is that, you know, especially if you build something radically new, you may not have an idea about exactly how your customers are going to use your system, and how many resources they're going
and immediately react to it, so that you understand
exactly the kind of resources that you are using
to serve your customers, but by the time we built DynamoDB, we had this completely
down to an art, and if you remember,
when we launched DynamoDB, we launched with two types of reads. One read was eventually consistent. Basically, DynamoDB runs a quorum underneath; let's say there are three nodes in the quorum. An eventually consistent read does one read to a single node, and you may get the latest update back or not. We also launched with a strongly consistent read. The strongly consistent read basically did two reads under the covers, to reach the quorum and make sure that you get the latest update back. You have to do two reads for that. So, we made sure that eventually consistent was half the price of strongly consistent, because we had to do twice as much work for a strongly consistent read as for the single read of an eventually consistent one.
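As a minimal sketch of that idea, assuming a hypothetical three-replica group rather than DynamoDB's actual internals:

```python
# Illustrative only: a toy three-replica group, not DynamoDB's real implementation.
import random

REPLICAS = ["node-a", "node-b", "node-c"]  # hypothetical replica group

def read_from(node: str, key: str) -> dict:
    """Placeholder for a network read against one replica."""
    return {"node": node, "key": key, "version": random.randint(1, 10)}

def eventually_consistent_read(key: str) -> dict:
    # One read against a single replica: half the work, so half the price,
    # but you may see a stale version.
    return read_from(random.choice(REPLICAS), key)

def strongly_consistent_read(key: str) -> dict:
    # Read a majority (two of three) and return the newest version seen:
    # roughly twice the work, hence roughly twice the price.
    results = [read_from(n, key) for n in random.sample(REPLICAS, 2)]
    return max(results, key=lambda r: r["version"])
```

The resource cost of each operation maps directly onto its price, which is the same cost-following idea.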
So, what I want you to take away from that is that you have to consider cost at every step of your design, to really keep that in mind, and then if you think
about sort of your business, because after all, we are not
just building technology for technology's sake. We're building technology
to support our business, and I hope that all of you
are in an organization that probably has some sort
of agile development strategy, where you are close
with your business partners. We are continuously talking about
the functionality of things. How reliable does it need to be? How scalable does it need to be, and most importantly,
how much will this cost? And that's a conversation
we haven't always had, but we need to have
this conversation continuously with our business partners. Now, at one moment, especially when I started advising
more and more startups, I tried to hammer
this down on day one when a startup is thinking
about that product. What is the revenue model
that you think you're going to have? How are you going to make your money? And then make sure that you build
architectures that follow this money. That's important,
because if your costs rise over a completely different
dimension, you're going to be toast, eventually. Yeah? So, align cost with revenue. Now, if you think about a company
like amazon.com, probably the best measurement for the success of our operations is sort of orders per minute. Right? That's basically sort of
the dimension where we are making revenue over, but we then need to make sure
that our infrastructure scales in such a way that, actually, cost doesn't grow in a completely
different dimension, but also, that we can use
economies of scale, eventually, to drive that cost down further,
and as you can see, sort of, if the difference between cost and revenue is profit, you know, profit should increase over time, if your cost rises along the same dimension as your revenue does.
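Here is a hedged, back-of-the-envelope sketch of that check, with invented monthly figures: as long as cost per order stays flat or falls while orders grow, cost rises along the same dimension as revenue and margin improves.

```python
# Invented figures, purely to illustrate tracking cost along the revenue dimension.
months = [
    # (orders, infrastructure cost in $, revenue in $)
    (1_000_000,  80_000, 2_000_000),
    (1_500_000, 110_000, 3_000_000),
    (2_200_000, 140_000, 4_400_000),
]

for orders, cost, revenue in months:
    cost_per_order = cost / orders
    margin = (revenue - cost) / revenue
    # Healthy: cost per order flat or falling (economies of scale),
    # so profit grows along the same dimension as revenue.
    print(f"orders={orders:>9,}  cost/order=${cost_per_order:.3f}  margin={margin:.1%}")
```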
I've also worked with younger businesses in which it didn't go that well. This particular one was actually
one of the first ones that were building
these MiFi devices. This was before ubiquitous
mobile communication, and so, yeah, basically,
you had to buy a ten-gig data package every time you ran out,
and it's a good model. Now, they basically were running over
one of the bigger telco providers, who they had to pay. So, every time, basically,
if you look at the one on the left, basically, the green boxes are actually unused capacity that they just had as part of their revenue, and then customers came to them
and said, you know what, all this buying
every time ten gig, so bothersome. Can't we have an unlimited plan? And without really thinking, probably, the company said, yes, of course, we can, and then, you know, it put a decently high price against it, and what happens then, if you remove constraints, is that customers start behaving in ways that you didn't anticipate. They started watching Netflix over their mobile devices, yeah, and pretty quickly, actually, their usage ramped up tremendously, in ways that customers were no longer paying for. The company went out of business
because of this, yeah? Be really smart and make sure that the dimensions over which you make revenue are aligned with where your costs are coming from. Now, it's always good to think
about flywheels, and you probably have seen
this napkin drawn by Jeff Bezos many times. Now, flywheels are things that you sort of continue to put energy into. The more energy you put into it, the better it works. So, it starts off with selection, and selection means the number of products in the catalog. The higher the number of products in the catalog, the higher the likelihood that customers can find what they're looking for; that gives a great customer experience, drives more traffic to the site, and makes more sellers want to sell on the site, because there's more traffic, which means that the catalog gets bigger, more products are in the catalog, and selection grows. So, you get this continued cycle
that suddenly starts to accelerate and drive growth,
and then if you make use of that, the economies of scale of that,
to lower your cost structure, and then lowering pricing
for your customers, you have another flywheel
that drives into that. Suddenly, two things happen when you go into
that customer experience, and it really accelerates
the way your business grows. So, really, make sure that
your business decisions and your technology decisions
are in harmony with each other. Now, you know, sometimes, it's easy to think upfront, you know, these are the resources you're going to use. This is sort of what I need
in my pricing model. When we started Lambda, that was
a whole different ball game, yeah? Again, we knew that customers
wanted to have serverless compute, just like serverless storage
and serverless databases. They didn't want to think
about scale and reliability and things like that. They just needed it to work,
and many customers wanted that. We also knew that we wanted
to make the principal decision that we should be charging
over two dimensions, which would be, you know,
milliseconds of CPU used and amount of memory
used over a certain period. Now, as always, you know, if you build something radically new like Lambda, you have no idea how your customers are actually really going to use it. That was what we learned
from S3, yeah? So, we also knew we needed to get insight into this before we could build the right architecture. We didn't have the right architecture underneath that we could build this on ourselves, with fine-grained isolation and hotspot management and things like that. So, there was this tension between
these three different things, yeah, these three different requirements: security and strong isolation, yeah, cost, and getting insight into your customers. So, we made a decision upfront, given that we didn't have the technology to make this very cost effective, to actually sort of sacrifice cost immediately. So, it became two projects. One was basically starting Lambda, and the second was a completely greenfield project underneath, to start to figure out what kind of infrastructure we would need to support it. So, we were willing to take on technical and economic debt on day one. We also knew, immediately on day one, that we had to pay
that off eventually, because just like any other debt, you know, the interest
keeps compounding, and at some moment,
it becomes untenable. So, to build Lambda, we started off with the smallest building blocks we had that could give isolation, and those were the T2s, and you probably all know about virtualization by now. Basically, T2s went on top of
a hypervisor, went on top of real hardware, and these are actually
pretty coarse-grained. If you think about Lambda functions being really small, these T2s, even though they're the smallest instance type that we have, are much bigger than what we needed to actually execute that function, but we needed the isolation,
we needed the security isolation. So, we went ahead and actually
implemented using T2s, and so, basically, we had a whole compute pool full of T2s where we were executing these Lambdas, and we made sure that each Lambda from each account would be executed in isolation, so, inside its own T2. Now, you can already see what is
happening there, is that some of these T2s
are tremendously underutilized. Why? Because there's only two or three
of these Lambda functions executing every second, but also, we saw exactly
the other side happening, because quite a few
of these lambda functions actually happened in synchrony. So, you would get not one execution
of this Lambda function. You would get 1,000 or 10,000
at the same time, and so, where, on the one hand, some were barely utilized, quite a few others were completely overloaded, and we couldn't do
fine-grained resource management, because we didn't have these fine-
grained capabilities underneath them. So, we knew we had
to repay this debt, and we did that by doing
massive innovation, and the innovation became
what we now know as Firecracker, yeah, the idea of building
micro-VMs based on KVM, and we could launch a fully isolated virtual machine in a fraction of the time that it would take to spin up a T2, and KVM exploits hardware virtualization. So that makes it extremely efficient for these very small VMs that we're doing,
and so, also, it allowed us, given that, now, we have
a very small boundary in isolation, to make sure that we can use
multi-tenancy, that it's very easy
to do hotspot management now, because you have these very fine
grained isolation boundaries, and we got to make sure that we could
fully optimize memory and compute by basically hotspot managing over
different types of physical hardware, and so, all of this drove
tremendous innovation. It's not only that we were able to move our customers' Lambdas from the T2 environment, without them noticing,
over to Firecracker. Well, to be honest, they did notice. Everything became a lot faster, and their performance
became a lot more predictable, but we didn't tell them that. They were just happy with that, yeah, but it also gave us an environment
for other innovation. Without Firecracker, we would not
have been able to build Fargate. Fargate allows you to run
serverless containers, no longer having to think
about infrastructure management. So, yeah, and also,
what came with that, and something I want to repeat
from what I said last year, you have to build evolvable architectures, because your architectures
will change over time, and you need to make sure
you can evolve them without impacting your customers, and also, as we saw here with Lambda as well, we were very successful, initially, in getting feedback from our customers about how they were using it when we were running on the T2s, and then building an environment
underneath there that really matches our costs
with the pricing model that we gave our customers. Now, actually, let me
take a step back. This is a fun story that one of
the distinguished engineers of S3, that someone told me, about sort of
the evolution of S3 over time. Yeah? S3 started off
as a single-engine Cessna, yeah, and then it was upgraded
to a small jet, and then to a group of jets,
and then eventually, to a whole fleet of 380s
that are refueling in midair and actually continuously
have our customers moving from one plane
to another plane without them ever noticing it, and that's the power
of an evolvable architecture, but what I want you
to walk away with… this is a fun story, but what I really want you
to walk away with is that when you are creating
technical and economic debt, because you're not taking cost
into account, you have to pay it off. My next observation is
that architecting is always a series of tradeoffs, and it's a series of tradeoffs
between non-functional requirements and the functional requirements
that you have as a designer. Yeah? So, you can look at that sort of cost versus resilience versus security,
all of this, and so, I can tell you stories
about this at Amazon, but I'd rather have someone else
with similar experiences tell you this story as well, and so, my next guest
has a great story to tell how they aligned
their business and technical priorities
to achieve remarkable growth. Please welcome on stage, Cat Swetel, the senior director
of engineering at Nubank. [applause/music playing] Thank you, Werner. I'm so honored to be here
with you all today. With 90 million customers, Nubank is the fourth largest
financial institution in Brazil and the fifth largest
in Latin America. But only ten years ago, we were just a few people
in a little house in Sao Paulo. Back then, the majority
of Brazilian banking institutions were managing mainframes
and legacy systems, but with cloud technology, Nubank was able
to disrupt the market, making banking more
accessible for customers who never had access before. Nubank's journey all started
in this casinha, the little house that I
just told you about, where Nubankers worked on products that were built to be so efficient that we could charge
much more reasonable fees. How did Nubank achieve
such rapid growth in only ten years? We were born on AWS utilizing the new region
that had just opened in Sao Paulo about a year and a half
before our founding, and AWS is still Nubank's
preferred cloud provider. Our first product was
a credit card with no annual fee and an unparalleled
customer experience, but that disruption
was only the beginning. Soon, we had a bank account,
insurance, investments, loans, an in-app marketplace,
the list just keeps growing. In under ten years, our technical environment consisted
of over 40 different AWS services underlying over 1,000
Clojure microservices. We were focused on growth,
and we were succeeding. Then in 2020, the Brazilian
Central Bank approached financial institutions
with a radical new idea for how to transfer money. Before 2020, transfers between
accounts in different Brazilian banks
were slow and expensive. They took up to a full
business day to complete and cost up to $5 U.S. Then, to incentivize financial inclusion, the Brazilian Central Bank proposed
a new protocol called Pix. For those of us in the U.S.,
Pix might be a strange concept. It's truly instant,
real-time liquidation, zero cost to customers,
available 24/7, 365, and all backed by
the Brazilian Central Bank, meaning when someone
transfers you money, it's instantly available
in your account, your regular account, so that you can make
a purchase or pay a bill. So, we spent five months
developing Nubank's Pix flows to meet
the ten-second latency requirement dictated to us by the central bank. When it hit the market,
Pix was a huge success, far outpacing the usage
that Nubank had anticipated. In about a year,
Pix transactions per month had exceeded the combined total
of credit and debit transactions. The scale was massive, and it significantly increased
load on our mobile app and our customer facing flows. Our whole technical environment was under an unprecedented
level of stress. At this point, we were in a bind. We were facing instability
in multiple flows driven by that increased Pix traffic, and we were also facing
increased cost scrutiny as we transitioned as a company
out of startup hypergrowth mode. How would we deal with the tradeoff
between cost and stability? For us, the answer
was to choose both. We suspected that a lot
of our exploding cost was just due to the misguided ways
we were trying to achieve stability. In many cases, we were just
throwing more machines, more memory, whatever, at the problem instead of actually
solving the problem. Our hypothesis was that,
if we stabilized our systems, cost would also stabilize. With AWS, Nubank's Pix team
spearheaded a multi-team effort to test that hypothesis. Of course, we initially addressed
urgent architectural challenges, but we also made three less obvious
but very impactful changes. For the first change, we noticed
that some of Nubank' microservices were experiencing instability as a result of long
garbage collector pauses. So, in our quest for stable
efficiency, we started to experiment with the Z Garbage Collector
for those microservices that were experiencing the long,
stop-the-world GC pauses. Now, ZGC cost us more in RAM
than the G1 garbage collector, and it really made no difference
during steady state operations, but it dramatically decreased
the maximum GC pause length, which saved time and money for
some of our most critical services. After garbage collection
was addressed, we started to look towards
our database's caching strategy. Our canonical database,
Datomic is an append-only database that's backed by Amazon DynamoDB. Datomic makes use of an in-memory cache as well as Amazon ElastiCache
as an external cache. As the amount of data grew for
some of our most critical services, data locality became a challenge, and more and more transactions
had to hit that external cache. At first, we tried to just add more
memory to beef up the local cache, but that proved pretty inefficient. So, instead, we decided to start
experimenting with a new caching strategy
using NVME discs, where we could cache a lot of data
and query with pretty low latency. As just one example of the great
results for one of our critical
microservices, for every $1
that we invested in NVMEs, we avoided spending $3,500
across those flows. So, the stable option ended up
being a net cost savings. Our culture also changed, and that
was a big part of Nubank's success. In order to make important decisions
and good trade-offs in context, leaders at Nubank need to have basic
technical understanding of their products
and our infrastructure, and that movement kind of started
with the Pix leadership team, but the change quickly became
a standard across the company, and today, business units at Nubank are
expected to have an AWS cost champion to help leadership
make informed decisions that balance competing concerns. In the case of Pix,
our hypothesis had been proven true. Stable systems
were efficient systems. Cost stabilized and became
more predictable. Meanwhile, the time we spent
in high-severity incidents decreased by an order of magnitude, and the P99 on our latency SLA decreased by 92%. In fact, with a remarkable 35% efficiency ratio, we stand as one of the most efficient
companies in our sector, and that transformative impact
has saved our 90 million customers $8 billion in fees in 2022. [applause] Nubank's growth is fueled by
our low-cost operating platform and our efficiency, which allows us to charge less
and invest more in our customers. Now, for every two adults in Brazil,
one is a Nubank customer, and we hope to continue
closing the gap and make banking accessible to all.
Thank you. [applause/music playing] Thanks, Cat. One thing that really struck me was those words. The business needs to
understand AWS costs, and I think someone wrote on Twitter, every engineering decision
is a buying decision. Keep that in mind, and also,
I like, actually, the way that they made their metrics available for everyone to see. I know that, when you start making your metrics visible, it can change behavior, right? You have to really figure that out when you think about sort of measurements and observability and things like that, and just like Nubank, I want you to work with your business
to align your priorities, and your only way to do that
is to really understand them. Now, next to those three laws that are considered to be
in the design phase, you will continuously need
to sort of understand where your costs over time
are actually going. Yeah? Unobserved systems lead
to unknown costs, and I have a really great story here. My hometown of Amsterdam,
yeah, beautiful houses, old houses out of the 1600s,
and things like that. In the 70's when I grew up,
there was this oil crisis. I don't know if you remember that. We had carless Sundays,
and of course, at that moment,
everybody started to understand, started to become concerned
about the cost of energy, and there was this great
investigation at that time, because it turned out
that there were houses that were almost identical, but some of those houses
used one third less energy. Why was that? It was mind blowing, because
these houses were the same. By the way, there was no
double glazing and things like that
in those days yet. Yeah? So, these houses are just
radiating heat the whole time, but some of them
are radiating less heat. So, what was the difference
between those houses? The houses that used more energy
had their meter in the basement. It was basically hidden. The houses that used less energy
had their meter in the hallway. The fact that, every time
when you entered your house, you could see how much energy
you have been using completely changed behavior,
and as such, you need to make sure, first of all, that you understand
what you're measuring, of course, and how that measurement
can change behavior. Now, if you're in retail
for like amazon.com, there's a number of costs that you
actually always have to keep in mind. Yeah? On one hand, remember, Amazon is a massively
microservices-driven environment, yeah, where each of the costs
to a particular service, each of the requests to a service
will have a certain cost. Now, of course, it's often hard
to measure that, but you need to. You also need to, if you have
actually one top-level request that goes out
to all these microservices, you need to be able to get
the aggregate of that, and then you need to figure out
what is actually my conversion for each of those requests,
yeah, and actually, so, there's literally dozens of
features on an Amazon homepage, and each of them may go out,
actually, to hundreds of backend services. Yeah? So, you need to actually
sort of decompose this, and all of these features
that you can decompose it come at a certain cost. What's the total cost
of this experience, yeah, and you can actually measure
those individual costs. You actually have to measure at the microservice level; you have to isolate this one piece. For example, the service that can give you an estimate of delivery speed: how much does that cost me to do, yeah? And of course, the easiest way would actually be to just take the total cost over a certain period of time, take the number of requests, and divide the two. That's a little bit simplistic, but it's a good approximation for you to think about, yeah; if your cost follows a normal distribution, then that will probably work.
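As a minimal sketch of that approximation, with invented figures for a hypothetical delivery-estimate service:

```python
# Back-of-the-envelope: average cost per request for one microservice over a
# billing window. The figures are invented for illustration.

def cost_per_request(total_cost_usd: float, request_count: int) -> float:
    """Total spend on the service for the period divided by requests served."""
    return total_cost_usd / request_count

# e.g. a hypothetical delivery-estimate service that cost $4,200 last month
# and served 300 million requests:
print(f"${cost_per_request(4_200, 300_000_000):.8f} per request")
```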
your costs should be going down. If you don't make any changes
to this, and not even maybe
some economies of scale, but also, you should be able
to re-architect to do profiling, to start looking at sort of moving maybe from one architecture
to Gravitons, all these different things
that you can do to actually drive your cost down. So, over time, the cost per request
to this microservice should be getting down,
and then on top of that, you have to figure out what
your transitive costs are, yeah. What's the cost of serving
this application or this webpage for you? What's the total cost, yeah? Can you figure that one out? Now, I'm going to show you
now a slide, and just like when I saw it
for the first time, you're going to scratch your head. This is the number of microservices
in the back end of amazon.com. Yeah? My request to the homepage
originates over there, goes out to all
the other microservices to construct this page for you, and you can dive deep
into them, in each of them. You can figure out what the cost
is of the individual apps, the ones that are actually, again,
making calls to other microservices, but you need to understand
the complete picture, the complete cost picture
that this one page, this homepage actually costs,
and you can, because remember, in AWS, each of the resources
that you've been using comes with a dollar tag associated with it. So, you know exactly the cost of
every single one of these services, and we know the cost
of the whole system, and then of course,
you need to figure out, you know, well, there's actually
the contribution of each one of these features
to my conversion rates, yeah, you need to also understand
the value of new features. Now, if you start actually
spending more money on actually creating
this page for you, you should see
your revenue coming up. You shouldn't see it
actually flattening out, because that means
you're making investments that have no return on investment, and there are, indeed,
diminishing returns at some moment. Now, one of the things that we are very strong at, and probably everyone else that has a web application understands this too, is the common knowledge that improving the latency of your webpages will improve conversion. So, if you build an evolvable
architecture, you probably also make it easy
to actually experiment. So, imagine that the 99th percentile of your webpage latency is 1.7 seconds. If you can engineer that down to 1.6, right, you know how much that's going to cost you, how many more resources, and you can see what the impact is on conversion, and at some moment, bringing down the latency no longer has a return on it. Measure that. Think about how to measure that, and make it explicit upfront. Make sure that everybody understands that. Yeah? You have to know your cost,
and I know it's often complex; as you saw, the application that I just showed you, the backend for amazon.com, is a pretty complex environment, but we have really made this our own, because we need to understand it. Retail margins are razor-thin, and we need to have total control
over our cost at any time. Now, I also know that
quite a few of you are literally running
hundreds of applications, and it's sometimes really difficult
to really understand sort of what are the metrics
that belong to this particular one application, and you've been asking this
for quite a while, and I'm happy to announce, today, you know, myApplications. It basically gives you a new experience in the AWS Console. It gives you visibility into cost, health, security, and performance per application, yeah? What you can do there is basically you have a new application tag. You assign that to the resources that make up your application, and then you get a single view of this observability into many of the standard functional requirements, non-functional requirements, and cost, and with cost, also a proxy for sustainability.
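For teams that want that same cost-per-application view programmatically, here is a minimal sketch using the Cost Explorer API, grouping spend by a cost-allocation tag; the tag key "application" is an assumption for illustration, so substitute whatever tag you actually assign to the resources that make up your application.

```python
# Minimal sketch: monthly cost grouped by an assumed "application" tag.
import boto3

ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-11-01", "End": "2023-12-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "application"}],
)

for result in response["ResultsByTime"]:
    for group in result["Groups"]:
        tag_value = group["Keys"][0]  # e.g. "application$checkout"
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{tag_value}: ${amount:,.2f}")
```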
Now, sometimes, it's hard to instrument your applications, yeah, especially these days, if you start off with Kubernetes,
for example, with EKS, and you're building
this distributed application in many containers and container
types and things like that, and instrumenting them in a way, so that you get a good holistic view
of them is not always easy, and you need to do it
in a consistent way. So, and I know that's a lot of work. So, I'm happy that today
we make available for you what's called CloudWatch
Application Signals, such that it will
automatically instrument the EKS applications
that you're building, yeah, so that you can have
one single dashboard, immediately looking
at all the metrics that are relevant
for your EKS application. With all of that, I want you
to walk away with this one, yeah? Define your meter, because if you can continue
to look at this meter, it will change the behavior and make sure that your meter
includes cost and sustainability. Now, another observation I had is that, you know, if you build cost aware architectures, you need to implement cost controls. Now, you can't just rely
on good intentions, you need to put mechanisms in place, and as such, you need to build
and have at your fingertips, and you have it in the cloud,
tuneable architectures, and remember, in AWS, the knobs always go to 11. Ah, come on, you must
have watched Spinal Tap. [applause] I don't care if you clap for launches or not, whatever, but if I make a joke, I would really like it
if you would laugh, yeah? [laughter] Now, okay, go back. What was I doing? So, software choices, like, you know, database types, APIs, languages, and well-designed systems with NFRs in mind, they allow this tuning, yeah,
and if you look, bringing it back to amazon.com again,
you imagine that, on this homepage, there are all these
different components, yeah, and you need to have controls
to manipulate those components. Imagine what you're seeing
is that either cost or performance or one of the other metrics that you're following is going out of bounds. You need to be able to switch some of those components off, and that's really important, but you need to build the switch, and the switch should be in the hands of the business. It should not only be something where you as a technologist make a decision. Yeah? It's a decision that you make
in concert with the business. Yeah? Because after all,
that's who we are serving. That's who we are working with, yeah, but you need to have these switches
and dials available, yeah, so that you can make these decisions. Crucial in all of that
is to be able to do decomposition of your application that you have. Start to figure out which
are the things in your application that are really, truly important,
medium important, maybe not that much. Yeah? If you think about Amazon retail
again, what is important? What always needs to work
to have the application work? Search, browse, shopping cart,
checkout, yeah? Without that,
we're dead in the water. The system, the application, doesn't work. We call that tier one, and then there's tier two. Tier two are maybe features
personalization, similarities. They are things that really
are important for the customer to actually discover the products
that they're looking for, but they're not part of the true
core of the application. One of the things
that actually moved from tier two into tier one is reviews. It turns out, if reviews are offline,
customers are not buying, because they trust
the opinion of their peers, and so, you have to make decisions then together with the business
to make these trade-offs. How much am I willing to spend
on fault tolerance of tier one? Probably as much as you can,
because that always needs to be on, because without that,
you don't have a business. Replicate over three AZs at minimum. Maybe for tier two, you're willing to actually
dial it down a little bit. Maybe for tier two, replicating over two AZs is sufficient, and for tier three,
best seller list, yeah? Who cares? Yeah? If they're offline for five minutes, it doesn't really have impact
on the customer experience. So, you may have a different type
of resilience there in mind, but make sure that all of
these pieces are controllable, and whether you switch them off,
or whether you throttle them, or whether you maybe
turn off prefetching. You know what we do: when you actually search for something, we look at which products you're most likely looking for and then start prefetching them, to make sure you're faster. Maybe you turn that off,
maybe fewer details, but all of these knobs, all of these controls
are for the business. You have to give control
to your customers. So, before all of that,
my advice there is, you know: establish your tiers. Start thinking about which are the pieces of my system that absolutely need to be up and running with predictable performance all the time.
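As a minimal sketch of what such tier-based, business-owned switches can look like, with illustrative tiers and feature names rather than Amazon's actual configuration:

```python
# Illustrative tiers and features only; not a real production configuration.
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    tier: int          # 1 = must always run, 2 = degradable, 3 = best effort
    enabled: bool = True

FEATURES = [
    Feature("search", tier=1),
    Feature("checkout", tier=1),
    Feature("recommendations", tier=2),
    Feature("best-seller-list", tier=3),
]

def shed_load(max_tier_to_keep: int) -> None:
    """Switch off everything below the given tier when cost or latency
    goes out of bounds -- a knob the business, not just engineering, owns."""
    for f in FEATURES:
        f.enabled = f.tier <= max_tier_to_keep

# e.g. during a cost or latency spike, keep only tier 1 and tier 2 running:
shed_load(max_tier_to_keep=2)
print([f.name for f in FEATURES if f.enabled])
```

The design point is that the knob is explicit and coarse enough for the business to reason about, rather than being buried in engineering-only configuration.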
Now, if you think about optimization, there is another,
what I consider lost art. You know, given that we have been
able to focus on really fast innovation, yeah, and we are actually,
we've been moving really fast, and something that we did when we actually living
within the constraints was really tinkering
at a smaller level, but that tinkering at a smaller level
is becoming more and more important, because it turns out quite a few of
your costs are actually going there. You need to start thinking about
what is sort of the digital waste that is laying around in my system. Yeah? What are the things
that I can just stop? Maybe the business
doesn't like it anymore. PBS said they had one
particular series that maybe someone watched twice
a month or something like that, and they still had it running,
and they managed to turn it off. When you go home at night, do you turn off
your development environment? You should. There's no reason to keep
that running at night, yeah? Or maybe right sizing. Move to a smaller instance
or maybe to a bigger instance, or more importantly,
move over to Graviton, so you really can drive
your costs down. Or maybe, you know,
start thinking about how to reduce the kind of capabilities that you
are presenting to your customers. Is it really necessary
to stream in 8K? Is it really necessary to send
this five-megabyte image over, that the browser then scales down to 600x100 pixels? Is that really necessary? Start becoming smart
with that, you know? Reduce it to actually
the amount of resources that you really need
for your application. Now, when I think about this
lost art, I think about profiling. I don't know how many of you
grew up with this, but this was in my toolbox
when I was in school. Now, you really need to be able
to dive deep to be able to understand exactly, at a functional level,
where your time is going. CodeGuru Profiler actually gives you this as well, next to the language profilers
that you just saw. Yeah, a profiler will, in general, generate something
like this flame graph. This is actually of a real
Amazon service. I'm not going to tell you which one,
but I think you can actually dive deep into this and figure out
where your cost is going. You see 8% is going
into garbage collection. That's your choice of programming
language over there, that impacts that, but there's
a large part left over here, and that's turned out
to be network communication, and that's kind of out of balance. That's not what we would expect
in that particular case, and then diving into the code, you suddenly start to understand
what happened there. Whenever they designed this, they had a common case in mind: maybe 99.9% would be the common case, and for the remaining 0.1% they would throw an exception. It turned out that they didn't really expect it to be the inverse of that: 99% of the packets that came in hit the exception, and so, by simply changing this exception handling into an "if then else" statement, we basically completely removed all the exception processing that was necessary there, and then there, this is the "if then else", and what happened then is that we actually went from 42% down to 27%.
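Here is a minimal, illustrative sketch of that effect in Python, not the actual service code: exception-driven control flow on the hot path gets expensive once the supposedly exceptional case turns out to be the common one.

```python
# Illustrative micro-benchmark; the payload shape and numbers are invented.
import timeit

payloads = [{"checksum": None}] * 1_000  # the "rare" case is actually every payload here

def handle_with_exception(p):
    try:
        return p["checksum"].upper()   # raises AttributeError almost every time
    except AttributeError:
        return "MISSING"

def handle_with_if_else(p):
    if p["checksum"] is None:          # a plain branch: no exception machinery
        return "MISSING"
    return p["checksum"].upper()

print("exceptions:", timeit.timeit(lambda: [handle_with_exception(p) for p in payloads], number=1_000))
print("if/else:   ", timeit.timeit(lambda: [handle_with_if_else(p) for p in payloads], number=1_000))
```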
So, this is actually a process that you continuously need to do. Even if you don't find,
let's say, these big disparities, you still have to understand
exactly where your cycles are going, and this is a continuous process. It doesn't stop after day one. You need to completely understand it, and it wouldn't be the first time that I've heard customers ask: why is the backend service for our iOS app so much more expensive than the one for our Android app? Well, maybe you should
start looking at how they're implemented, right? The power of profiling allows you
to be curious and dive deep. Now, the last one, my observation
may be a little bit more controversial, yeah, and so, please hold onto your egos
at this particular moment. Yeah? The most dangerous phrase
in the English language is: we have always done it this way. Yeah? [applause] The admirable Grace Hopper, you know, the grandmother
of all those developers here was a very wise woman, and, you know,
it wouldn't be the first time that a customer who says, yeah,
but we are a Java shop. Oh, we're really great at Rails,
yeah, we've always done it like that, and, you know, we've done this before
in my previous company. We're going to do exactly the same. You have to keep in mind that,
you know, development is quite often expensive, but the cost to build is dwarfed by the cost of operating your applications. That's something you have to keep in mind, and the way that you build your applications, the platforms you use, the programming language that you're in, should be continuously under scrutiny: are you picking the right one? Now, I mentioned earlier
that I thought that cost was a good approximation
for sustainability. Maybe the other way
around as well. There isn't that terribly
much research into how much certain
programming languages will cost you. However, there is brilliant research
by Rui Pereira at INESC in Portugal about the energy usage
of programming languages, and so, he first launched this paper in 2017, and it shocked the development world. He later released this paper with much deeper insight, showing different types of applications that were being built. It turns out Ruby and Python are more than 50 times as expensive, in energy terms, as C++ and Rust. Now, I know the reasons
why you wouldn't want to use C++ with all security risks
that there are, but there is no reason why you
should not be programming in Rust, if you are considering
cost and sustainability to be high priorities. [applause] We implemented Firecracker in Rust. Large parts of S3 are
really implemented in Rust, and not only because
the energy usage is lower, which we aimed for, but also because the security,
the strong typing, the memory safety that you get
in a fast, efficient language like Rust is very important. Now, with all of this, now,
I don't want you to immediately start sort of throwing away everything
that you know and starting over, but we as technologists live
in a world that is moving so fast that we always need
to continue to learn. We always need to disconfirm our own beliefs. Yeah? Put your ego aside as being
the master Java programmer but start thinking
about the cost, actually, and the complexity of how to deal
with garbage collection. Start thinking about that maybe
this massive platform underneath there, yeah,
maybe that's costing you a lot, even though it allows you
to do very fast prototyping. So, disconfirm your beliefs. Now, so, these have been sort of
my observations around cost and sustainability that I have learned over time
at amazon.com. Yeah? Cost awareness is a lost art. We have to regain this art,
mostly also, because sustainability is a freight
train that is coming your way, that you cannot escape
and should not escape, and cost is a pretty
good approximation for the amount of resources
that you've used. The constraints from the past,
I don't want to go back to them, but we may actually be willing
to have self-imposed constraints. Yeah? Put some constraints
around the systems that you're building in terms
of cost and sustainability. Yeah? That's why I believe
that constraints, even self-imposed,
can breed creativity. Now, you know, with that, I think The Frugal Architect
is live at this moment, not that there is that
terribly much information, but we would love to work
with you, actually, to incorporate your cost awareness
learnings you've gained over time as well. So, the site
is thefrugalarchitect.com. [phone ringing] Sorry, give me one moment. [phone ringing] [music playing] Not what you were expecting? Probably more private islands
and sailboats. Ah, not that Oracle. I am sometimes right about
the future, literally. That's why you're here. I need your insight. Everything fails all the time,
even the simplest of hardware. You see, I have been asked
to use my gift to make some tech predictions
for a La Predicta magazine, and I've heard that you are
something of a tech soothsayer. So, what are your predictions
up until now? Envisage, if you will, a world
where artificial intelligence is represented by an omnipresent,
benevolent attendant. It'll revolutionize industries
like healthcare, freeing medical maestros
from administrative burden. That's not really the future. You know, healthcare is already
deeply ingrained in very advanced analytics
and machine learning. Okay. How about this? Dare to dream of a future where developers are no longer
solitary mavens coding within the confines
of individual experience. No, the builders of the future
will dance side by side with AI in a celestial ballet
of organic/digital pair programming. I'm not feeling that either. Now, CodeWhisperer is already here. You know, that future is now,
no dance necessary. All right, all right. You are going to love this one. La Piece de Resistance, hoverboards. No, no, no, no. No, let me stop you there. Either way,
that's not going to McFly. Making tech predictions is tough. Well, the future
is not science fiction. To be able to make good predictions,
you have to think about the present, because the future is now. [phone ringing] Before you go, here. Have a cookie. Oh, no.
I only accept essential cookies. [music playing] So, I'm not an oracle, but… [applause] But observing the present
actually helps kind of predict the future,
and that is especially true at AWS, yeah? The kind of things that we are doing at AWS often define the technical future, and now, that is actually
really important, but I also think that
historical context is important. Look at the bigger picture. Look a bit back. You know, and I know that
we've all seen amazing innovations being presented to you this week
in the area of GenAI and LLMs and how we're going
to change development and how businesses
are going to change, but where did this
actually come from? What's the history of this, yeah? It goes back to two of my favorite
early Greek philosophers, Plato and Aristotle. Both of them were thinking about
whether machines could actually do
the tasks of humans, and they were thinking about
sort of what is actually… what is it that actually controls humans? Aristotle thought it was the heart,
the soul that actually drove humans, but Plato actually thought
it was the symbolics in your head, and actually, Plato went so far, if you read The Republic, as to create
a city state in that book, where actually, machines,
robots were doing the chores. Now, that was about 20 to 25 centuries ago? Not much happened for about 25 centuries in that sense, until the first computers arrived, and computers could do much
more than just calculations. They were capable
of more complex tasks, and as such, everybody
started thinking that, oh, maybe if the human is indeed,
you know, driven by sort of this symbolic
complexity in their head, maybe we can use computers
for that as well, and of course, one of our more important
philosophers of the last century, Alan Turing,
spent a lot of time on that. He really started to think about,
can machines, computers, think? Yeah, and his famous paper in 1950, Computing, Machinery,
and Intelligence is really sort of the groundbreaking
work that we still live by. We still talk about the Turing Test. Now, unfortunately,
Turing tragically died before he could join this
1956 workshop at Dartmouth. In this workshop, the term
artificial intelligence was coined for the first time, but still, most of
the researchers available there were from
the symbolic AI field. They're really thinking about
sort of, can we implement reasoning. Can we implement
the symbolic reasoning? Can we use mathematics
for those kind of things? Didn't really go anywhere,
not immediately, at least, and these automated reasoning
and things like that have become tools
that are incredibly important, but not necessarily in the field
of AI as we know it now. One of the things that we did start
to build in those days was called expert systems. I built a few of those
using Prolog, and I still don't like
the curtains behind it. You know? So, expert systems actually sort
of incorporated knowledge in rules, and they could execute
queries against it and sort of get answers back,
but they were very laborious, and they weren't, to be honest,
they weren't that terribly smart. The big breakthrough came
when we could see the shift happening from symbolic AI
to embodied AI, and what did that mean. Basically, the groups of researchers
who started to think, if we want, maybe, if we start to have
these basic building blocks that humans have to perform tasks, maybe out of that, we can build
artificial intelligence. This is mostly driven by the idea
of that you have robots. What are the kind of capabilities
that robots need? What are the kind of sensors
that we have that we need
to give robots, you know? Speech recognition,
image recognition, and even maybe sensors
that we, as humans, don't even have, like LIDAR. Now, can we build that ourselves? And that thinking actually has
driven us for the past 10-15 years, and we saw new algorithms arriving. Deep learning became important,
reinforcement learning, all those different types, and what we saw was software
improved, algorithms improved, hardware started to improve,
software improved again, and we saw a really
quick acceleration of all these different
algorithms happening, so that we could do better learning, better build these big models that
they could actually help us do tasks. Now, of course, the next step, the most recent one that has created
this sort of earthquake in the world of AI
has been transformers. The ability to use transformers
to build foundational models and to build these
large language models are actually a revolution
in all of this, yeah, but I'm not going
to talk about that, really. I really want to go back one step
and talk about what I would call
good, old-fashioned AI. Yeah? The cool thing is that now
we have good old-fashioned AI, and there is new AI. The new AI doesn't invalidate
the old AI that we were having, yeah, and so,
if I look at so many of my customers who have built amazing systems
using good old-fashioned AI, that I think you should keep in mind
that not everything needs to be done with these massive
large language models, yeah, and I'll pick a particular area. I've, over the years,
become extremely interested in those businesses that actually
are combining two things. They are trying to solve
really hard human problems and use technology to do that. Now, I could give you lots of
examples of companies that have done amazing things
with AI for now, but I will pick a few
out of my box of these companies that have built things
for AI for good. And so, one of them is one of my most
favorite organizations to work with. It's the International Rice
Research Institute. They sit just outside of Manila, and their mission is
to abolish poverty and hunger among rice dependent communities. You have to remember
that the prediction is that, by 2050, the population
has grown by another 25%. How are we going to feed them? How are we going to make sure
that that happens? So, they're an amazing organization. They have a massive seed bank,
a big freezer, actually, with 200,000 strains of rice in them. They can regrow any type of rice. They can also make improvements
of rice. For example, Golden Rice is
a good example that has much higher degrees
of vitamin A, which is very important
for certain communities. But they had a big backlog, because all these seeds are being sent to them, and humans need to sort them and try to figure out which of those seeds are actually useful and which they should be storing, and they got a backlog, and everything in the backlog starts to deteriorate. So, they basically make use of machine learning and computer vision, yeah, to automate this process. Not that the machine is now actually making the decisions about which seeds should go into the bank. Still, humans do that,
but the automation before that, the efficiency before that,
using vision, allows them to actually
remove the complete backlog. They improve their backlog,
that sorting productivity by something like 30-40%. Cergenx, another very
interesting company that I recently met
in Ireland, apparently, and this was not a problem
I was aware of, that most infants that may be born
with brain injuries that are not immediately visible,
often, those injuries don't actually aren't visible
until months or even years later. They have a very simple test
with a small cap that basically takes
an EEG of the baby and immediately can determine whether a baby actually may have
that particular injury. And as such, you know, you can
immediately start treating them, which actually improved
the quality of life for these babies
for a very long time. So, what they did, they actually
have all these scans. They put it in a sleeve,
ran it through SageMaker, created this model,
which is very unique, because baby EEGs are radically
different from that of adults. So, their goal is to make this brain
testing for infants as commonplace as the hearing test that each baby
is getting now as well. Another company, precision.ai,
you know, we've all seen drones. Drones are basically a box
full of AI capabilities. After all, we are not steering them. We are giving them a task. They need to go away. They need to go somewhere. They need to follow
a particular pattern. They need to avoid birds. They need to avoid washing lines. They need to do all these things
by themselves autonomously. So, they're a big AI box
to start off with. Precision.ai's task is to avoid spraying complete plots of land with chemicals to remove weeds. They fly over this patch of land,
create a map of it, and are then able to
attack individual weed plants, significantly reducing the runoff
of these dangerous chemicals in the creeks and rivers
surrounding the plot of land. Digital Earth Africa is one of my favorite organizations to work with. They make use of an open dataset
of satellite imagery of Africa, and this data is being used
by governments all over the place. You know, in Zanzibar for example, they're monitoring coastal erosion. Yeah? In Ghana, they identify
the impact of illegal mining. Do these illegal roads
being built go to these mines? In South Africa, Kenya, they understand the impact
of forest fires. All of this is driven
by this open data set of Digital Earth Africa,
and many of these organizations are using this to improve the life
of Africans across the continent. Now, all of this, and I think Swami
hammered down on this point as well, you know, without good data,
there is no good AI, and so, in the past, indeed,
we had all the structured data. If you think about patient records,
that was a structured record. These days, patient records
have all sorts of unstructured information
scribbled all over them, and so, here, suddenly, you have
a mountain of unstructured data, where you think this may be a
haystack that has a needle in it. So, how do you find a needle
in a haystack? You use a magnet, and the magnet
is machine learning, yeah? Basically, you use machine learning
to create meaning out of this mayhem, and this AI for now actually
gives you practical solutions for real problems, and it's incredible to see
what customers can build with these services
and the tools that we provide, and how, leveraging AWS,
it solves some really hard problems. A prime example of that is
an organization called Thorn. Now, our next speaker
is going to talk to you about child sexual abuse, and I know, I recognize
this can be difficult… this is a difficult topic. I encourage you to do what you need
to do, take care of your wellbeing. I'd like to welcome Dr. Rebecca
Portnoff on stage to discuss how Thorn
utilizes AI to protect children. [music playing] I am going to tell you a story,
a true story. A child is being sexually abused. We'll call her Maria. Maria's abuser is taking pictures
while he abuses her and then sharing these images and videos onto a content
hosting platform, hiding in plain sight among hundreds of millions
of other images and videos, but this particular platform
doesn't accept this. This platform uses
Thorn's Safer products. It uses Safer's child
sexual abuse material, or CSAM, classifier to find images and videos that could show a child
in an active abuse situation. On one day,
the classifier alerts them. They've got a hit. So, they go to work,
and here's what they find. A user has shared over
2,000 new abuse files. It's clear a child is being abused. So, they flag the case
to law enforcement who launch an investigation, and the child in that content,
she is found. Maria is found. An arrest is made. A recovery is complete, and for
Maria, a brighter future emerges. This isn't just a story. This is reality. This is the gravity of our work. We, as technologists,
we have the power to end a child's real-life nightmare. Our challenge: how do we find them
and stop the cycle of trauma? The answer to these questions
is buried in haystacks of data. So, then, what's that right magnet
to find that needle in a haystack? The answer is complex, and an essential part of it
lies with machine learning. According to the National Center
for Missing and Exploited Children, in 2022, they received over 88 million files
of suspected child sexual abuse reported to them by online platforms. These aren't just files. These are kids, kids
who desperately need help. Think about it. With 88 million files and even just
one second of review per file, that's going to be almost
three years of nonstop review. I don't want any kid to have to wait
that long to get help. We are a nonprofit that
builds technology to combat child
sexual abuse at scale. AWS is our preferred
cloud provider in this work. We leverage AWS services to power
our machine learning tools. I have dedicated my career to
defending children from sexual abuse. So, I can tell you
with confidence here today, machine learning, AI,
it does make a difference. We built Safer. [applause] So, we built Safer,
our all-in-one tool to detect, review, and report
child sexual abuse material. Safer uses hashing and matching
to find known abuse material, content that's already been verified by an analyst, and a classifier
to find new abuse material. My team at Thorn,
we built the classifier to act as that powerful magnet to find new child
sexual abuse material at scale. Now, when we first started building
the classifier back in 2019, there was already active research on the use of convolutional
neural networks for detecting child sexual abuse, but we needed a classifier
that went beyond research, something that worked
at a production scale. When we build a classifier,
we follow the CRISP-DM process. As you can see, it's really
all about the data, but within this broader framework,
we had to overcome a key hurdle. This data is illegal. Child sexual abuse material
is illegal. You can't store it in the same places
or the same way as other content. So, the solution was to collaborate. We invested in hardware
installed on site at organizations with the legal
right to house this data, training the classifier on-prem, and then using Amazon's ECR
to distribute that trained model to end-users. So, we've got this critical
training infrastructure in place, and now, we can build,
starting with data prep. We use techniques like perceptual hashing to de-duplicate the dataset, ensuring that there's no overlap between the training, testing, and validation sets.
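A minimal sketch of the perceptual-hash de-duplication idea, using the open-source imagehash library; this is my own illustration, not necessarily what Safer's pipeline uses:

```python
# Illustrative perceptual-hash de-duplication using the open-source imagehash
# library (a sketch, not necessarily what Safer's pipeline uses).
from pathlib import Path

import imagehash
from PIL import Image

def dedupe(image_dir: str, max_distance: int = 4) -> list[Path]:
    """Keep one representative per group of near-duplicate images."""
    kept_hashes: list[imagehash.ImageHash] = []
    kept_paths: list[Path] = []
    for path in sorted(Path(image_dir).glob("*.jpg")):
        h = imagehash.phash(Image.open(path))   # 64-bit perceptual hash
        # Hamming distance between hashes; a small distance means a near-duplicate.
        if all(h - existing > max_distance for existing in kept_hashes):
            kept_hashes.append(h)
            kept_paths.append(path)
    return kept_paths
```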
We use Amazon S3 to store our non-abuse material, and this data is just as important as the abuse material for training our classifier. Now, model training via remote access to an on-prem solution,
it's got its challenges. It can be slow and opaque. So, we use Amazon EC2 and EKS
to do R&D with benign data first, debugging and fixing any issues
we may find in our training pipeline before training on-prem. In the machine learning
AI lifecycle of develop, deploy, maintain,
that maintain part is key, because models get stale,
and models have bias. So, what this means is monitoring performance and regular retrains. What's our performance holistically? Then, drilling down: is our model better at classifying abuse material of lighter-skin or darker-skin children? What kind of trends are we hearing from users about false positives and false negatives? Because to fix a problem, we have to first know what that problem is. Then we balance our datasets, adjust the model weights, refresh our training dataset with new images, and retrain.
false positives to incorporate
back into our training. These false positives are often
the most valuable negative examples that we have, because they allow us
to do targeted retrains and improve performance
on data in the wild. Now, none of this hard work
that I talked about matters at all if you can't get the classifier
into the hands of the users. So, effective model deployment
is key, working collaboratively
with engineering and product to find the right solution
for the user's needs, and when I say effective deployment, I'm specifically thinking privacy-forward and human-in-the-loop. I am proud and thankful for how
Thorn's engineers deploy the classifier in a way that
the customer has full control over when and how the content
is reviewed and reported, because the classifier,
it acts as that powerful magnet, but it should be a human making
the final call on what gets reported. Now, we are all here today because we want to build
something that has impact. So, what's that impact look like
for Thorn and our partners? It looks like the outcome
of the true story I told you earlier: a child gets her life back. It looks like over 2.8 million
potential files of child sexual abuse
found via Safer. It looks like constant innovation. This month, we launched
Safer Essential, an API-based solution
for quick detection of known child sexual abuse material. This is a future where every child
is free to simply be a kid, and everything we build at Thorn,
it's to get to that future, but that technology by itself
is never going to be enough. Having impact requires all of us
to work together, content hosting platforms, law enforcement,
government, survivor services, and the community,
including you, you in this room, and I want to do
that together with you. I want everyone here to join Thorn
in our mission. I want you to show up
at the nonprofit impact launch today and ask my amazing
colleagues at Thorn how you can use our technology,
and I want you to go to thorn.org and learn, learn about
what these kids are going through, and what technology
can do to help them, and then I want you to pick up
your laptop and build, build what these kids need. There are still countless victims
who are suffering, but we have the power
to help them, if we work together. Thank you. [applause/cheering/music playing] This is the power of technology
and the power that we have, yeah? Make sure that, you know,
the call to action that Rebecca just gave us
doesn't go to waste. We have that power to absolutely
make a difference as technologists. With the right technology
and the right access to data, we can have a really
positive impact on the world, and technology can be
a force for good, yeah? That's actually a lesson
I learned way before I ended up
in computer science, yeah? That's me on the right there. Yes, I had hair at some moment. Before I started studying computer science, I worked in radiology, in both radiodiagnostics and radiotherapy, at the Dutch
National Cancer Research Hospital, and a lot of technology
was already part of that, not just the fundamental
x-ray technology, but, you know, CAT scans, MRI scans, nuclear imagery,
all was sort of starting out. By the way, that is 40 years ago. This is not, by the way,
a fancy gen AI picture, yeah? The person sitting in front of me
is Frank, Frank Delleo, to whom I owe
my computer science career. Frank is a radiologist, and Frank was probably the most
passionate medical doctor I've ever met in my life. Frank wouldn't go home at night
until he had all his work done. That meant, at 10:00,
11:00 o'clock at night, he would still be sitting in front,
looking at his imagery, because he knew that if he wouldn't
finish it that day, the next day would
start with a backlog, and he would have that amount
of work yet again, and so, he worked
really long hours, yeah, and sort of, I really admired him, but he kept hammering on me
that I should leave the profession, because he felt that, given that I had a sort of
an affinity for technology, he felt that my skills
were much better used in actually building technology
that could help people, yeah, and I really
took that to heart. I basically went back to school
again, and it's his fault. It's also his fault
that I'm here now. Yeah? So, started thinking, you know,
and I have a real affinity still for all the work
that we were doing in those days, and it got me thinking, you know,
can I build something useful with ML? Can I do that myself? You know, after all, I've been
telling all of you for years now how easy it is to integrate ML
into your applications, yeah? However, I needed to make sure
and really wanted to do this myself. So, what do you do? Actually, I tried to catch up
on 40 years of radiology work, innovations, talk to doctors,
go to hospitals, understand what the current
problem space is, and try and find
one particular problem, a small problem that maybe
I could try to see how much work it would be to build
a solution for that, yeah. And I actually went and took some courses at the ML University, spoke to AWS ML experts and things
like that just to get me up to speed, not just about the principles,
but about the practicality of it, yeah. And then, in this hospital in Dublin, I talked to one of the stroke specialists, and he hammered into me that every second counts when a person has a stroke. A stroke patient loses
1.9 million neurons a minute if they're not being treated. So, quick treatment is crucial. Okay, that sounds like a reasonable, simple enough problem that I can attack: looking at brain imagery in X-rays. If it looks like this, in case you can't read it, the big white spots there are basically blood hemorrhages in the brain. If they were this big, it would be easy, but often they're not, you know, they're microlesions or really sort of continuous micro strokes and things like that, that you can still detect using CAT and MRI scans, but it's much harder,
and so, I started off thinking, like, you know, can I build an ML pipeline that can decide whether, on an image, there is actually a brain hemorrhage, and then, if you find a positive, prioritize that image in a radiologist's worklist so that it can be immediately evaluated? As always, if you don't have good data, you don't have good AI. So, that was immediately my first stumbling block. How would I get access
to data and imagery that I actually could
train my model on? Well, it actually turned out
that was not that hard. Kaggle has a dataset with 700,000
pre-labeled CT brain scans. That allowed me to actually
immediately get off to a start. So, I downloaded that set and put it in S3, and as always when you do this, you should split the dataset up front into the set you're going to train on, the set you're going to validate with, and the set you later use to test whether the model actually worked, yeah. 70/15/15 is a common practice there.
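As a rough illustration of that 70/15/15 split (my own sketch, not the code from the talk; the CSV name and label column are assumptions):

```python
# Illustrative 70/15/15 split (not the talk's actual code; the CSV name and
# label column "any_hemorrhage" are assumptions about the dataset layout).
import pandas as pd
from sklearn.model_selection import train_test_split

labels = pd.read_csv("labels.csv")  # hypothetical: one row per scan, with a binary label

# 70% for training, then split the remaining 30% evenly into validation and test.
train_df, holdout_df = train_test_split(
    labels, train_size=0.70, random_state=42, stratify=labels["any_hemorrhage"]
)
val_df, test_df = train_test_split(
    holdout_df, test_size=0.50, random_state=42, stratify=holdout_df["any_hemorrhage"]
)

print(len(train_df), len(val_df), len(test_df))  # roughly 70% / 15% / 15%
```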
Next thing to do: fire up SageMaker, push this button, and... now I have to write some code. Pretty common there is that SageMaker uses Python, and as you may have realized, I'm not necessarily that proficient in Python, but the SageMaker Python SDK actually has a whole bunch of built-in algorithms and pre-trained models from popular open-source model hubs that you can immediately use. So, you know, I selected an architecture that I'd researched, and I can fine-tune it with my dataset. So, the only thing to do now is actually to add the code to launch the training job, push a button, the training runs through, and this model gets created. Wow! That's easy.
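To give a sense of what that button push looks like as code, here is a rough sketch with the SageMaker Python SDK; the entry point script, instance type, versions, and S3 paths are assumptions for illustration, not the exact setup from the talk:

```python
# Rough sketch of launching the SageMaker training job from the Python SDK.
# entry_point, instance type, framework/Python versions, and S3 paths are
# assumptions for illustration, not the exact setup from the talk.
import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

estimator = PyTorch(
    entry_point="train.py",           # hypothetical script that fine-tunes the chosen architecture
    role=role,
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    framework_version="2.0.0",
    py_version="py310",
    hyperparameters={"epochs": 10, "lr": 1e-4},
)

# Point the job at the 70/15/15 splits in S3 and push the button.
estimator.fit({
    "train": "s3://my-bucket/ct-scans/train/",       # hypothetical bucket/prefixes
    "validation": "s3://my-bucket/ct-scans/val/",
})
```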
However, in that button push that I did, there was a lot of work happening, yeah, and I don't know if you know much about sort of deep learning and how these models are built with multiple layers and things like that. Basically, you know, you push an image through, it goes forward through the layers, and a loss score is calculated. The loss score needs to be as low as possible, right? The lower the loss score, the higher the accuracy of the model that you've built. Okay, if the number is high, you basically go back, you backtrack, and you adjust the weights and the biases in the model, and then you go forward again to see what the outcome is, and you go backwards again, and you do that for 700,000 images, or, no, 70% of the 700,000 images, yeah. So you end up with forward pass, backward pass, forward pass again, and all of this happens under the covers when I push that one particular button. Pretty amazing.
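Written out, that forward/backward loop looks roughly like this; a generic PyTorch sketch of the idea, not SageMaker's internals:

```python
# Generic PyTorch sketch of the forward/backward loop described above
# (illustrative; not SageMaker's internal implementation).
from torch import nn

def train_one_epoch(model, loader, optimizer, device="cuda"):
    criterion = nn.BCEWithLogitsLoss()             # binary label: hemorrhage or not
    model.train()
    for images, labels in loader:                  # roughly 70% of the 700,000 images
        images, labels = images.to(device), labels.to(device)

        logits = model(images)                                     # forward pass
        loss = criterion(logits, labels.float().view_as(logits))   # loss score: lower is better

        optimizer.zero_grad()
        loss.backward()                            # backward pass: gradients for weights and biases
        optimizer.step()                           # adjust the weights and biases
```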
And so, out of that, you get a model, and again, with one click of a button, I actually get an API endpoint for this model, where I can start putting my imagery through and get a prediction score: the likelihood that this CT scan brain image actually has a hemorrhage in it, and it actually turns out, it works pretty well.
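That "one click" deployment, continuing the earlier estimator sketch, would look roughly like this (the instance type and payload handling are assumptions):

```python
# Continuing the estimator sketch above: deploy the trained model behind an
# endpoint and score one image (instance type and payload handling are assumptions).
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",     # small, CPU-only endpoint
)

score = predictor.predict(image_array)  # image_array: a pre-processed CT slice (hypothetical)
print("hemorrhage likelihood:", score)
```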
I was quite proud of myself when I built this, and of course, then, you know, being the person that likes evolvable architectures, I started looking at all these different AWS components around it, you know, to reprioritize the radiologist's worklist, and then something came to mind. When I visited that hospital in Dublin, the neurologist actually said that he'd rather get woken up at night at 3:00 AM for a false positive than not be woken up at all, because every minute counts. So, he doesn't want to wait until the radiologist gets to his reprioritized worklist. He wants to get an SMS at night, at the moment that my model detects that there is a brain hemorrhage. Evolvable architecture: I added SNS, which can actually send an SMS
to the neurologist, and he can immediately, on his phone, take a look at the image and decide whether or not he should immediately jump in his car and drive to the hospital.
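That notification step is a small piece of code; a sketch with Amazon SNS via boto3, where the phone number, threshold, and message are made up for the example:

```python
# Sketch of the SMS notification with Amazon SNS via boto3 (the phone number,
# threshold, and message are made up for this example).
import boto3

sns = boto3.client("sns")

def notify_if_positive(score: float, threshold: float = 0.9) -> None:
    if score >= threshold:
        sns.publish(
            PhoneNumber="+353000000000",   # hypothetical on-call neurologist
            Message=f"Possible brain hemorrhage detected (score {score:.2f}). Please review the scan.",
        )
```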
Now, it is that simple. It is not that hard to build these models. You know, anybody who is telling you about AI and ML and how hard building these models is: it is not that hard, you know. And actually, by the way, this is in no way a production system. This is my hacking on a Friday afternoon, five weeks in a row, and of course, you can start augmenting it, because, you know, as always, when you've done one thing, you want to do all the other things. You could, for example, use class activation maps, which indicate what it actually is in this image that the model was looking at.
an interactive cycle. Now, as I said, this is in no way
a production system, but however, I'm very, very happy
that a number of people on the AWS team
actually did pick it up, and I'd like to thank Priya, Wale,
and Ekta for actually taking my horrible code
and turning into something that might be a learning experience
for you guys as well. So, all of this is available for you
on GitHub to experiment with and to see how easy it is to actually
start building these ML models. This is really what I want you
to walk away with. If I can do it, you can do it. You know? [applause] Important in all of this,
and we've talked about large models as well as small models; what is important
in this particular case is that the model is small, fast,
and inexpensive. Why is that? Because the hospital
will want to run this for every brain scan
that is happening, yeah, and that is many,
many of them a day, and they would really like
to run the model locally, not require some massive
compute power behind it. So, for them, small, fast,
and inexpensive is crucial to make this technology
work for them. After all, if you're using ML, you should still be
a frugal architect. Now, talking to radiologists
about sort of the future and what kind of
the newer types of AI have there to offer to them, and if you think about that,
in talking to them, they would really like
to have more of these what are called
conversational interfaces, because you have to keep in mind
that a CAT or an MRI scan or nuclear imagery
is not just one image. It's literally hundreds of thousands
of slices from your body in digital format. We just happen to make
imagery out of it, because that's the sense
that we have. They will really, and often,
you know, patients come to a hospital not with a clear symptom
that really leads to one diagnosis. Now, often,
these are sort of unknown, and the radiologist's job
is much more like an explorer, trying, more than just
being an image reader, and they really look forward
to a world where there are AI based
radiologist assistants that could allow them
to explore the data that they have at their fingertips,
not just looking at images, and it also could include
other clinical data. So, suddenly, radiologists get
a 360-degree view of sort of the state
of the patient, yeah, and it really helps driving diagnosis much faster than
they could do before. They're really looking forward
to the next generation of AI that helps them to build these
conversational assistant interfaces. Those interfaces, though,
will still make use of these small, fast, and inexpensive models
on the side as agents to dive into specific problems
that a patient may be experiencing. Again, I want to hammer this down. I think Swami did yesterday as well. AI makes predictions. Professionals decide. They're assistants. They don't make
the decisions for you. We, as humans, are the ones
that make the decisions. Now, think about us as builders. You know, think about all the news
that we've heard in the past days. What's the impact of that
on our profession, on our jobs? Yeah, and it always has been
my passion to help builders be successful, and I hope that some of the tools
that you've seen this week actually are going to change the way
that you built your systems, and I think there's two ways
that we see sort of generative AI impacting our world. Yeah? On one hand, how to incorporate
generative AI into the application that you're building,
yeah, and I actually, given that I'm a big fan of the CDK, the Cloud Development Kit,
you know, this is a project on GitHub that actually has the CDK
constructs for generative AI, meaning that, if you're building
an application with the CDK, you can build data
ingestion pipelines. You can build question
answering systems, document summarization,
Lambda layers, all these kind of things,
straight out of the CDK. Check it out. The other way is, and I think
that's probably going to have the biggest impact on all of us,
is the collaboration between us and these coding assistants
that are arriving now. Many of you have already been
familiar with CodeWhisperer, yeah, and actually,
I don't know if you saw this, but when I was developing
the radiology application, I actually used Python, yeah, but CodeWhisperer helped me
actually implement that. Yeah? I didn't have to think about that. It actually helped me
sort of navigate these APIs that are unfamiliar and a language that is not that familiar to me. You might also have noticed
that I wasn't using a notebook. Now, Jupyter Notebooks
are the common way in which you sort of describe
the machine learning project
that you're working on. I'd rather work in VS Code than in
a new development environment. So, I'm happy, I don't know
if you noticed that, it was sort of an Easter Egg
in that part, but you can now fire up a code editor
from inside SageMaker Studio. Yeah? It's based… [applause] It's based on the open-source version
of VS Code and allows you to actually,
within SageMaker Studio, work in the environment
that is already familiar for you, which is VS Code, and you can launch
this full-blown code editor with all the additional AI powered,
tools like CodeWhisperer, directly in the IDE from SageMaker, but I think the biggest part,
and I think I hinted at that before, we have a lifelong learning
ahead of us. Now, technology has changed
rapidly in the past, what is it, a year. Imagine what's going to happen
in the coming year, or in the coming five years. We need to be able to stay
ahead of that, and for you, it's important
to learn and be curious. Yeah? It helps you. It'll help you to actually
accept these tools that allow you to explore
new problem spaces and languages. It's becoming a creative tool
to a sounding board for your ideas and approaches, yeah,
and in Adam's keynote on Tuesday, you heard about Amazon Q,
and I think this is absolutely set to transform many aspects
of software development. Yeah? You suddenly get an expert
assistant in building on AWS sitting next to you. That will reduce busy work
and free you up to do more higher-value work, and especially, if you think about
sort of the cloud space with all technologies in
and around it, it is mind blowing. This is too much for a single person
to be an expert in. However, these AI assistants
can be an expert in all of that and can help you work through that. Then it can go from cloud
computing solutions, machine learning algorithms. Options are endless. You know, this is not a bad thing. This is a very exciting time
to be a developer. It really is too much for a single
person to keep in their head, and that's where Q comes in. Now, I could take an example
from earlier in the keynotes. It could say, what AWS services do I need to start
building machine learning models, and Q will give me a starting point, or, you know, and it's not
just a one-shot thing. It's not just a copy-and-paste
kind of thing. You ask, and you iterate back and forth, so it understands all of these different things that you want to achieve, yeah, and it helps you make a plan for how to attack the problem space that you're looking at. It's more than just questions. Yeah? Q actually is integrated
in quite a few other pieces, yeah? For example, in CodeCatalyst. It can help you start a project
immediately inside CodeCatalyst. So, it can generate an entire
new feature for you, or a new approach,
and I find it a really great way to actually start learning
about technology, even though, you know,
you may even kill the pull request that it created for you, it's just a great learning tool
to understand the complete code base, to understand not just
at a file level, but at an overall system level. And of course, you know,
Q sits in your IDE. So, it can help you there. You can ask it to explain
or create code for you, and, you know, through conversations, it can adjust and iterate,
and in the fullness of time, we will see Q operate
on each of the different pieces of our development pipeline,
yeah, and one example, for example, is the use of Q
and Application Composer. I announced that last year,
and I'm happy, actually, that Application Composer
now is available within VS Code. This means that you can have
your YAML file, now, your CloudFormation file, and a visual representation
of that file in VS Code. Making changes in the code, you'll see them represented visually, and if you make changes visually, they are immediately reflected in the code on the other side. Yeah? [applause] It is cool. Okay, fine, yeah, but Q is also
in there, remember? Q sits inside VS Code as well,
and you can ask Q questions about CloudFormation
and then insert a response in it, and, you know, you can
change your YAML file, and the cloud formation file
and see this, and this multimodal way
of question answer and diving deeper
is something really, it's a really interesting
new paradigm, I think, that will help us, as builders,
have a lot more fun in our world. Now, with all of that, you know, you have one more day
of learning ahead of you. I hope that I gave you
a few hints today about being a good frugal architect, both in terms of cost,
as well as sustainability, and that if I can build ML models, you certainly can. Yeah? I think there's never been
a better time to be a builder. So, you have one more day ahead
of learning tomorrow as well, a few more sessions, but tonight,
tonight, we're going to party. [cheering/applause] Yeah? Major Lazer on the main stage,
Portrait of a Man on the live stage. Yeah? With all of that, now, go build. [cheering/applause] You're going to have
to make a choice. Hit or stay. Damn! What? That is so predictable. So, what did I miss? Nice keynote Werner, but just one thing. You said
I could scan my container images straight from my CI/CD pipeline. Yeah, I looked into it. Doesn't exist. Maybe
we just haven't released it yet. I knew it! Ready, Werner? Do it. Three, two, one. [cheering/applause] Now, go build. [music playing]