[music playing] Hello, and welcome to the AWS
Machine Learning Summit. Today you will learn about
the compelling work happening within our
scientific community in Amazon, including Amazon Scholars
and distinguished engineers. You will hear directly
from our customers, including 3M, Bundesliga, Vanguard, and many more about how they are
addressing business challenges and opening up new opportunities
with machine learning. And you will also learn about
how our services can be applied
to your own use cases. Machine learning is one of
the most transformative technologies we will encounter
in our generation. ML is improving
customer experience, creating more efficiencies in operations, and spurring new innovations
and discoveries, like helping researchers
discover new vaccines, and enhancing agriculture output
with better crop monitoring. But we are just scratching the surface of what is possible, and there is so much
invention yet to be done. Accelerating adoption of ML requires
bright minds to come together and share learnings,
advances and best practices. This is why I am so excited
to bring all of you together today for a day dedicated
to machine learning. In fact, the scientific work
in machine learning is exploding. Scientific paper publication
has grown exponentially over the past years,
and today it has been estimated that more than 100 papers
are published a day. Advancements
in machine learning, fueled by scientific research, an abundance of compute resources, and access to data, have also meant that machine
learning is going mainstream. We see this in our
customers' adoption of the machine
learning technology. More than 100,000 customers use AWS
for machine learning, and machine learning
is enabling companies to reinvent entire tranches
of their business. For example, Roche, the second largest pharmaceutical
company in the world, uses Amazon SageMaker
to accelerate the delivery of treatments
and tailor medical experiences. The Discovery+ streaming service
is using Amazon Personalize to help its customers
cure choice paralysis by offering tailored
content suggestions. The New York Times uses Contact Lens'
newly introduced real-time analysis features to respond
to customer issues in the moment. The BMW group is using
Amazon SageMaker to process, analyze
and enrich petabytes of data in order to forecast
the demand for both model mixes and individual equipment
on a worldwide scale. We can see right before our eyes
that machine-learning science is translating
to real customer adoption and that's not by accident. At Amazon, everything we do
starts with the customer and we work backwards
from there. We are well known
for our customer obsession, and that is also the case with how
we approach scientific innovation. We call it customer-obsessed science. So what does customer-obsessed
science mean? This means a few things. First, rather than doing science
for science's sake, we work backwards
from the customer problem and invent to meet those needs. While our researchers do
publish papers and contribute
to the overall industry, their focus is on bringing
new experiences and products to our customers. At AWS 90% of what we do
and what we build is driven by what customers
tell us matters. and the other 10% are things
we hear from customers. Well, they may not exactly
articulate what they want, but we try to read between the lines
and invent on their behalf. Second, our teams
of researchers are usually embedded
in the business. So science goes directly
to the customer. Researchers that come to AWS
and Amazon want to help create
powerful innovations that impact millions of people
and make machine learning accessible
to every organization. For instance, Yoelle Maarek,
VP of Science for Alexa Shopping, whom you will hear from later in this talk, sits within the Alexa Shopping team, so that the incredible innovations her team invents can be applied directly towards enhancing
the Alexa user experience. Third, once we have invented
these amazing technologies, we get the results of that work into the hands
of our customers at scale. And then we learn and iterate
from their feedback, and start the process
of working back from the customer
needs all over again. Now, let's look at a few examples where we have worked backwards
from the customer to develop science, and then brought it directly
to our services at AWS. First, let's take a look
at one of the prerequisites to all machine learning,
training data, and the need to learn
with less of it. Humans are incredibly good
at learning from a few data samples, but machine learning
still requires a lot of data. For instance, a few years ago,
when my daughter was two, she could easily learn the difference
between an apple and an orange
with just a few examples. On the other hand, a machine-learning model might have needed hundreds of labeled pieces of data to reliably distinguish between an apple and an orange. Moreover, data labeling is a time-consuming and labor-intensive process altogether. And as machine learning
has become more mainstream, and more and more companies
want to use it, accessing vast amounts of data and annotating that data is too tedious and expensive to scale. For instance, the NFL wanted
to use computer vision to more easily and quickly search
through thousands of media assets. But the manpower to tag all these
assets at scale was time and cost prohibitive. Or take the example of Dafgards, a family business
that has been making frozen foods for consumers in Sweden
and around the world for 80 years. They had to ensure that every pizza coming off the line has the perfect amount of cheese in order to meet the needs of their discerning customers. Dafgards wanted to use a more intelligent method for quality control of its pizza making, in order to increase
quality and efficiency. Their IT team of 12 had limited
expertise in machine learning. So they partnered with us to build
an automated machine learning system to do
visual quality inspection. Now, to solve this problem
for our customers, our team of scientists invested in a technique called Few-Shot Learning, and they wanted to bring Few-Shot Learning to our services. Few-Shot Learning tries to replicate
the human ability to learn a specific task
from just a few examples by incorporating previous knowledge. For instance, if you know how to add,
you will learn how to multiply faster. It's a different operation,
but the same underlying framework. Now we have taken this cutting-edge technique, made optimizations, and incorporated it into our services so that our customers can create models custom to their own use cases
with very little data. We use Few-Shot Learning today in the Custom Labels feature of Amazon Rekognition, our service for image and video analysis, which allows users to identify objects and scenes in images that are specific to their needs. It asks for as few as 10 images per label.
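To make that concrete, here is a minimal sketch of what using Custom Labels looks like with the AWS SDK for Python (boto3); the project, bucket, and manifest names are hypothetical, and the ARNs are elided:

```python
import boto3

# Hypothetical project, bucket, and manifest names, for illustration only.
rekognition = boto3.client("rekognition")

# Create a Custom Labels project and train a model version from a small
# labeled dataset (as few as 10 images per label).
rekognition.create_project(ProjectName="pizza-toppings-demo")
rekognition.create_project_version(
    ProjectArn="arn:aws:rekognition:...:project/pizza-toppings-demo/1",
    VersionName="v1",
    OutputConfig={"S3Bucket": "my-demo-bucket", "S3KeyPrefix": "output/"},
    TrainingData={"Assets": [{"GroundTruthManifest": {"S3Object": {
        "Bucket": "my-demo-bucket", "Name": "train.manifest"}}}]},
    TestingData={"AutoCreate": True},
)

# Once the model version is trained and running, detect the custom labels
# in a new image.
response = rekognition.detect_custom_labels(
    ProjectVersionArn="arn:aws:rekognition:...:version/pizza-toppings-demo.v1/1",
    Image={"S3Object": {"Bucket": "my-demo-bucket", "Name": "new-image.jpg"}},
    MinConfidence=80,
)
for label in response["CustomLabels"]:
    print(label["Name"], label["Confidence"])
```

The NFL uses the Custom Labels feature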
to apply detailed tags for players, teams, objects, actions, jerseys, locations and more to their entire photo collection in a fraction of the time it took them previously. In the industrial domain, we use it in Amazon Lookout for Vision, our service for automated quality inspection, so that customers like Dafgards can start to identify quality defects in industrial processes with as few as 30 images to train the model: 10 images of defects or anomalies plus 20 normal images.
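As an illustration, a minimal boto3 sketch of checking a single image against a trained Lookout for Vision model might look like this; the project name and image file are hypothetical:

```python
import boto3

# Hypothetical project name and image path, shown only as a sketch.
lookout = boto3.client("lookoutvision")

# After a model has been trained on ~30 images (10 anomalous, 20 normal)
# and started, each new image from the line can be checked for defects.
with open("pizza-123.jpg", "rb") as image:
    response = lookout.detect_anomalies(
        ProjectName="pizza-quality-demo",
        ModelVersion="1",
        Body=image.read(),
        ContentType="image/jpeg",
    )

result = response["DetectAnomalyResult"]
print("Anomalous:", result["IsAnomalous"], "confidence:", result["Confidence"])
```

But in fact, while building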
Lookout for Vision we actually had
an interesting scientific challenge. Because modern manufacturing systems
are so finely tuned, the defect rates
are often 1% or less, and they are typically
very slight defects. As a result, the data we needed to train the algorithms that power Lookout for Vision had to reflect, as much as possible, the reality of having a small percentage of defects, and not just obvious defects but slight or nuanced ones. So the scientists and engineers
working on this project realized early on that the sample defects
that they were training models on didn't match
the shop floor reality. So we did something really creative. We actually built a mock factory. The team procured conveyor belts
and cameras and objects of various types to simulate
various manufacturing environments. The goal was to create data sets that
included normal images and objects, and then draw or create
synthetic anomalies, such as missing components, scratches, discoloration,
and other defects. Few-Shot Learning allowed the team to occasionally work with no images of defects at all. That real-life, trial
and error iterative process eventually led to the development
of Lookout for Vision, which today is being used
by customers like Dafgards to inspect
and verify product quality. Let's move to another example, which also happens to be
in computer vision, but related to text extraction. Many of our customers use machine
learning to extract meaning from documents or images
in order to save time and costs associated with manual
processing of these documents. Traditional text extraction technology is really good at understanding regular text when it is clearly laid out, well written, and horizontal. However, it's not as good at understanding irregular text, which is blurred or where the characters aren't aligned in a horizontal manner. Results from recent state-of-the-art methods on public academic datasets show only 70 to 80% accuracy. Of course, we live
in the real world, where customers have faded receipts,
blurry images, and doctor's
handwritten notes to analyze. So we knew we needed to solve
this problem for our customers. When addressing irregular text, models today can implicitly learn language by encoding contextual information in order to infer what the word is even when only a few letters are properly visible. For instance, if a three-letter word begins with T-H, the model will likely predict that the word is "the." However, this capability can also lead to errors when models misinterpret contextual information, or when the model struggles to understand text that isn't an actual word, like the number 100 or Social Security numbers, because in those cases contextual information simply isn't helpful. So ML models must be able to learn
when to use visual information and when to use
contextual information. And this hasn't been
done well to date. To address this, our team invented
a new method called selective context attentional
scene text recognizer, or SCATTER. Here is how SCATTER works. Let's take this word from a doctor's note. With SCATTER, the image passes through an architecture that is composed of a series of stacked blocks which model the contextual information. In each block, alongside the contextual features, there is also a decoder that helps the model learn whether to rely on contextual information or visual information, depending on the image itself. As the image passes through each block, the model improves the encoding of contextual dependencies and thus increasingly refines the predictions. The final prediction is taken from the final block. In this case, the word is "Aortic."
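As a rough illustration of the "selective" idea only, and not the published SCATTER architecture, here is a small PyTorch sketch in which a learned gate decides, per character position, how much to trust visual versus contextual features:

```python
import torch
import torch.nn as nn

class SelectiveDecoderBlock(nn.Module):
    """Illustrative simplification: a learned gate decides, per character
    position, how much to trust visual features versus contextual features.
    This is a sketch of the idea, not the published SCATTER model."""

    def __init__(self, dim, num_classes):
        super().__init__()
        self.context_encoder = nn.LSTM(dim, dim // 2, bidirectional=True,
                                       batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, visual, context):
        # Refine the contextual encoding within this block.
        context, _ = self.context_encoder(context)
        # Gate in [0, 1]: closer to 1 means "trust the visual evidence".
        g = self.gate(torch.cat([visual, context], dim=-1))
        fused = g * visual + (1 - g) * context
        return self.classifier(fused), context

# Toy usage: batch of 2 images, 25 character positions, 256-dim features.
visual = torch.randn(2, 25, 256)
context = visual.clone()  # the first block starts from the visual features
block1 = SelectiveDecoderBlock(256, num_classes=97)
block2 = SelectiveDecoderBlock(256, num_classes=97)
logits, context = block1(visual, context)
logits, context = block2(visual, context)  # the final block's prediction is used
print(logits.shape)  # torch.Size([2, 25, 97])
```

This method surpasses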
the state-of-the-art performance on irregular text recognition benchmarks by 3.7% on average. 3.7% doesn't sound like a lot, but when you think about it, that's millions of words each day that get the right prediction for our customers. SCATTER is available today in Amazon Textract, a service that automatically extracts text, handwriting, and data from scanned documents, and also in Amazon Rekognition.
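For illustration, a minimal boto3 sketch of extracting printed and handwritten text with Textract might look like this; the bucket and document names are hypothetical:

```python
import boto3

# Hypothetical bucket and document names, shown for illustration.
textract = boto3.client("textract")

response = textract.detect_document_text(
    Document={"S3Object": {"Bucket": "my-demo-bucket", "Name": "doctors-note.png"}}
)

# LINE blocks contain the recognized text, including handwriting.
for block in response["Blocks"]:
    if block["BlockType"] == "LINE":
        print(block["Text"], f"({block['Confidence']:.1f}%)")
```

Few-Shot Learning and SCATTER are just a few examples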
of how we apply customer-obsessed science
to our products at AWS. This is how science
works across Amazon. Yoelle Maarek, Vice President
of Research and Science at Alexa Shopping,
works with a team of scientists to constantly create state
of the art machine learning in order
to make the experience of interacting
with Alexa even better. One example of making Alexa more personable is asking whether we can give Alexa a sense of humor. To share more about how our team is giving Alexa a sense of humor using machine learning, I'd like to introduce Yoelle Maarek. Hello everyone, and thanks,
Swami, for inviting me here to talk about one of my favorite
topics, computational humor. I'm delighted to host everyone here virtually; as you can see, there's quite a nice view of the beach. So let's talk a little bit about computational humor. You might say that computational humor is not really a serious topic for serious scientists to research. But actually, even if it doesn't look that important in our very task-oriented world, we are addressing a very hard AI challenge. If you think about it, it goes back
to the early days of AI, when Alan Turing really showed us the way with his seminal paper on computing machinery and intelligence in the '50s. In that paper, Turing, being a true visionary, argued against all the possible opponents of the future of computers, who might say something like "a computer will never do x," with x being something. He was a mathematician, so he loved the variable x. And the x in question would be, in his paper, for instance, making mistakes. And we all know that, of course, computers make mistakes. Another example was being the subject of its own thought. And if you think of debuggers, debuggers actually are a counterexample of this. Another example would be diversity of behavior. And Turing really argued and explained his vision that with sufficient computational resources, the diversity of behavior would be huge. And another topic, beside enjoying strawberries and cream, which is kind of hilarious for me, that he didn't really address but did mention, is having a sense of humor. So Turing was already looking at having a sense of humor as a really, really hard challenge. And for us, it's really
a beautiful opportunity for what we call
customer-obsessed science. So you heard Swami introduce that concept before; it's something pretty unique to Amazon. We do everything in a customer-obsessed way; we work backwards from the customers in our thinking. And in our method, we don't start from technology, we really start from the customer needs or pain points. And here again, if we want
to tackle computational humor, we want to do it
from a customer backward manner. What does that mean in the context
of computational humor? Instead of having the robot, the machine, be the funny one, we want to look, customer backwards, at whether customers are funny, and how the machine should react to it. In dialogue management systems, this is called mixed initiative: who is taking the initiative, the machine or the user? In our case here, we don't want to look right away at the machine taking the initiative with the customer. So that's what we did. We went basically to these channels
of detecting humor when customers are the ones being funny. And before going into the hard challenge of Alexa, we went first to something a little bit simpler. We went to our amazon.com website, and we looked at our customers' questions and answers where they refer to products. And you know what? We actually discovered tons
of pretty funny questions. Let's start with this first example, a Nintendo Switch Joy-Con, and you see the type of questions that appear there. Basically, "Can you hack into this machine and into the matrix and save humanity?" And here, for those who are familiar with sci-fi, it's a reference to The Matrix, a cult movie that I personally love. And that's really something that is part of humor. In order for this joke to work, you need to get the cultural reference. If you never heard about The Matrix, you will not laugh, right? You'd say, what is he talking about? That's already one of the pieces of information we will use in our research. Common cultural references are key to detecting humor. Another example is sarcasm. Again, here you have
a product which is probably a little bit too expensive for the customer. So they ask whether the luxury cooler will make them fly. Sarcasm, funny. Here now is a third example with the Echo Show, where we start to see, "Does it cook breakfast?" That's actually another type of humor that we are going to detect; it belongs to what's called the superiority theory of humor, where you refer to a robot as if it were a human being. Again, it's pretty interesting. And then here is my favorite one, actually: when the product itself is funny, it's going to attract even more humor. The last one goes in the other direction, you know. This one I really, really find funny, personally; that's my type of humor. But this product made us think that while standard products attract some humor, some other products are themselves funny, and they will attract way more humor. Take this other example. It's a joke, right? Unicorn meat,
unfortunately it's a joke. And this type of playful product attracts basically tons of playful questions, because people are already in the mood to be playful. That actually gave us some insight when we wanted to build our model, our deep-learning model to detect humor: we had to be super careful with what is called domain bias, and make sure that we would be able to differentiate between products that attract funny questions. Like here is this very weird giant Swiss Army knife: "Does it come with a cell phone, because it has everything?" I'm explaining the joke, and that usually kills the humor. But in any case, we built this deep-learning model. And we made sure to verify some elements of humor that we detected and learned from theory, like incongruity and subjectivity. We did the regular embeddings;
we built a model. And the good news: when we took into account this domain bias to make sure we didn't overfit our models, we were pretty good at detecting whether a question is going to be funny or not. You see our results, between 84 and 91% accuracy. Not bad; we were pretty happy about that. And we had a publication at the SIGIR conference last year about exactly this topic. So we had a proof of concept that it's possible to detect humor in text. So now we turn to the second,
bigger challenge, basically humorous utterances
when customers are joking with Alexa. So that's where we started. And we had a conjecture here, because we know from previous research on the topic that when a robot is trying to be funny, people appreciate it; it keeps engagement. But the key question here: if you are trying to be funny, and you're being silly not with Alexa but at Alexa, are you going to appreciate it if Alexa gets your humor or not? Maybe you want this feeling of superiority, and you actually want to be the one making fun of Alexa. So, as in any good research, we had a conjecture, and we needed to demonstrate it. I'm going to come back later to how we are
going to demonstrate this conjecture. But before that, let's look at some
example utterances, true utterances from our customers. First one, "Alexa, can you buy me
a Lamborghini?" You definitely don't want Alexa to put a Lamborghini
in your shopping list. Another example,
so unfortunately or fortunately, I don't know, it depends on your type of humor, you will see a ton of toilet humor, because people enjoy it. And it actually relates to a very serious theory of humor, which is called relief theory. You can actually try this one if you want; Alexa will answer. Another example, which is even
for me more interesting, because it's something
totally different, which is referring to Alexa
as if Alexa was a human being. "Alexa, what is your blood type?"
And you see here, actually, that it's not really funny,
but it's just playful. Customers are not expecting Alexa
to take this request seriously. If we go back to the theory of humor, it relates to what's called
personification and superiority because you're a human being,
you're superior, and you're not really expecting
Alexa to answer that question. So we came up with a definition, because funniness and humor are difficult and very subjective. It's very hard to define them formally, but we needed a formal definition. We defined playfulness, and by that we mean that a customer is being playful when the customer doesn't expect Alexa to take the request literally. And that means that we should
not act on it, not add anything
to your shopping list, for instance. So we have this definition. We went back to theory, and here I personally had a blast looking at very old papers, dating back to Aristotle and Plato, to sharpen our work with the theory of humor. And you have these three main theories of humor. Relief, which we mentioned before with toilet humor. Incongruity, which is my favorite, where something is really out of context and that's what makes you laugh. You know, a dog eating pizza is weird; that's incongruous; it might make you smile, maybe not laugh. Superiority theory, the example I gave before with this robot thing; it's actually the same superiority theory that makes people laugh when someone slips on a banana peel. I don't think it's that funny,
but it makes some people laugh. And so we have this theory,
we looked at that. It helped us think about how to organize our models. If we go back to the conjecture we really wanted to validate, it is whether people will enjoy it. And remember, we are a customer-obsessed company; we do customer-obsessed science. If they are not going to enjoy Alexa understanding their humor, we should not investigate it; it's not important. So we started with that conjecture, with personification. And you remember, personification is when you refer to Alexa as a human being, or as a robot. And the reason we did that is that in Alexa, that's really the majority of the traffic. Which makes sense, because people
are so excited to have a robot to play with. So we started with that, and we
wanted to verify our conjecture. So here is the idea we had
to verify this conjecture. We started with a kind of semi-assisted Wizard of Oz experiment. If you're familiar with Wizard of Oz settings, someone behind the scenes is just pretending to do the real job, to be the wizard. So we recruited 100 students, a little bit more actually. And we asked them to ask personification questions to an agent. They didn't know it was Alexa behind the scenes. Actually, they thought that it was a new research agent called Shirley. Again, maybe poor humor; it's a cultural reference to Airplane!, if you're familiar with the movie: "Surely you can't be serious," or something like that, but you can go back to it. And then we asked them to do that, to ask these questions, and then we would react. But because they would ask too many questions, it wouldn't be feasible for the people behind the scenes to be funny all the time and answer all the time. So we decided to do it semi-assisted and build a model that would verify that these questions
are really personification questions. So that's what we did. We built a deep-learning model on pre-trained BERT. You're familiar with BERT. On top of it, we fine-tuned BERT and added a fully connected layer, where we injected families of features coming from humor theory: simplicity, so that people understand your joke; emotion, because humor is subjective, so you need sentiment analysis, polarity, et cetera.
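As a rough sketch of this kind of architecture, where the feature set, sizes, and names are illustrative and not the production Alexa model, one could combine a pre-trained BERT encoder with extra hand-crafted humor features like this:

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class PlayfulnessClassifier(nn.Module):
    """Sketch of a BERT classifier with extra hand-crafted inputs
    (e.g., sentiment polarity or subjectivity scores); the exact
    features and sizes here are made up for illustration."""

    def __init__(self, num_humor_features=4):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        hidden = self.bert.config.hidden_size
        self.head = nn.Sequential(
            nn.Linear(hidden + num_humor_features, 128),
            nn.ReLU(),
            nn.Linear(128, 2),  # playful vs. not playful
        )

    def forward(self, input_ids, attention_mask, humor_features):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.head(torch.cat([cls, humor_features], dim=-1))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["Alexa, what is your blood type?"], return_tensors="pt")
features = torch.tensor([[0.9, 0.1, 0.7, 0.0]])  # made-up feature values
model = PlayfulnessClassifier()
logits = model(batch["input_ids"], batch["attention_mask"], features)
```

We built this model, and we had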
a pretty good model, actually. We got something around… we wanted to have something with high precision and recall to detect these funny personification utterances on the fly. We needed to train this model, right? Otherwise, we would not reach these precision and recall thresholds. So the team had this really cool idea where they went to a speed-dating site. They said, "When are you really the most personal? When do you ask really personal questions? In speed dating, these are the types of questions you ask." That was for positive examples, and for the negative examples we just sampled from Alexa traffic. So we went with that. And when the questions that were asked were personification questions, as validated by our model, they went to Alexa, or to us, right? If Alexa could answer, for instance, "What do you do for fun?", Alexa answered; and when Alexa could not answer, our team answered these types of questions. They have some humor; some are funny, some less so. And the good news, we did
a qualitative survey afterwards. And the good news is that really all the users enjoyed it. Even though they didn't feel that Alexa was supposed to try to understand their joke, they really enjoyed it. So that really convinced us. We want to have fun not at Alexa but with Alexa. And to conclude, I really want to thank the brilliant team of scientists behind this research. Big thanks to them. I think it's a new era that we are
starting here in computational humor. Thank you. I look forward to having more
humorous conversations with Alexa in the future. One of the great things
about working at Amazon is the fact that we can deploy new
machine-learning services at scale, learn and then improve upon them
through customer feedback. In fact, many of the services we at AWS offer to customers come straight
from our experience at Amazon, where we have been investing
in machine learning for 20 plus years and delivering it
to millions of consumers. For example, Amazon Lex,
our chatbot technology at AWS is powered by the same deep learning
technology that goes into Alexa. Amazon Personalize, our service for real-time personalized recommendations, is based on the Amazon recommendation system, which was launched in the early days of Amazon and refined over decades. Another example where we were able to iterate and improve on a product by implementing at scale within Amazon is Amazon Monitron, a new end-to-end system
that uses machine learning to detect abnormal behavior
in industrial machinery. Each of the amazon.com fulfillment centers has miles of conveyor belts weaving throughout the facility, and they deploy
sophisticated equipment to assist employees to pick, pack and ship thousands
of customer orders every day. If equipment fails or requires
unplanned maintenance, it can have a huge impact
on our operations. So it is the perfect environment for
us to pressure-test Amazon Monitron. We used a fulfillment center
in Germany as a testbed, installing 800 sensors
on equipment, in order to catch instances
of abnormal vibrations on multiple conveyors and alert
technicians of potential issues. Through this process,
we learned a lot and iterated together on a variety of capabilities, including how to reduce false alerts, improving the sensor commissioning user experience, and developing a better understanding of the optimal range from a sensor to a gateway. In the next 12 months,
Amazon will be installing tens of thousands
of Monotron sensors and thousands of Monotron
gateways across dozens of Amazon fulfillment
centers worldwide. This helps enable our FCs to
reduce unplanned equipment downtime and improve
the customer experience. So I have given you a few examples
of how customer-obsessed science makes it into the products
and services we deliver to our customers. We take this approach to customer obsessed science
across our entire ML stack, whether we are inventing
for data scientists, developers, and increasingly
even business users. For developers, and
increasingly business users, we are building AI services
to address common horizontal, and industry-specific use cases
to easily add intelligence to any application without
needing machine learning skills. I've spoken about several
of these already, Amazon Textract,
Amazon Rekognition, Amazon Lookout for Vision
and Amazon Monitron. We embed AutoML in these AI services so that customers don't need
to worry about data preparation, feature engineering,
algorithm selection, training and tuning, inference
and model monitoring. And instead, they can remain
focused on their business outcomes. These services help customers
do things like personalize the customer experience, identify and triage anomalies in business metrics, recognize images, automatically extract meaning from documents, and more. We have also built a suite
of solutions for the industrial sector
that use visual data to improve processes, and services that use data from machines for predictive maintenance. In healthcare, we have
purpose-built solutions for transcription, medical text comprehension, and Amazon HealthLake, a new HIPAA-eligible service to store, transform, query
and analyze petabytes of health data in the cloud. For those data scientists
and ML developers who are building
their own ML models, we are also invested
in making machine learning faster and easier
to do with Amazon SageMaker. We built Amazon SageMaker from the ground up to provide every developer
and data scientist with the ability to build,
train and deploy ML models quickly and at a lower cost
by providing the tools required for every step
of the ML development lifecycle in one integrated,
fully managed service. For expert machine learning
practitioners, researchers and data scientists, we focus on giving them choice and flexibility with optimized versions of the most popular deep-learning frameworks, including PyTorch, MXNet and TensorFlow, which set records throughout the year for the fastest training times and lowest inference latency. And AWS provides the broadest
and deepest portfolio of compute, networking and storage
infrastructure services, with the choice of processors
and accelerators to meet our customers'
unique performance and budget needs
for machine learning. Now, going back to our third pillar
of customer-obsessed science, we must be able to learn and iterate at scale in order to deliver good science
to our customers. And at AWS, we are focused
on helping our customers do this with machine learning, but it can be a challenge
to deploy machine learning at scale. To talk more about the work we are
doing there to help our customers learn and iterate at scale
with machine learning, I'd like to introduce Bratin Saha, VP of Machine
Learning Services at AWS. Thank you, Swami. When we launched our machine
learning services a little over three years ago, most customers would deploy
a few models, maybe a dozen models
for different use cases. Today, our customers
deploy thousands, and even millions of models
across the lines of business. Machine learning is becoming
an integral part of how AWS customers
do business, and many of these customers have
standardized on Amazon SageMaker. In fact, today SageMaker supports
hundreds of billions of predictions per month. And customers have reported
training models with billions of parameters, which is orders of magnitude
more than just two years back. From a dozen models to millions
of models and billions of parameters and hundreds of billions
of predictions in just a couple of years. So what I want to talk about
is how we convert the customer-obsessed science into customer-obsessed products and enable customers to use the science at scale and in the real world. There are tens of thousands of customers
from virtually every industry, including financial services,
health care, media, sports, retail, automotive,
and manufacturing. These customers are seeing
significant results from standardizing their
ML workloads on SageMaker. Let's take a look at some of them. Lyft's autonomous vehicle division, Level 5, reduced model training time from days to just hours. T-Mobile saved data scientists significant time in labeling thousands upon thousands of messages by using SageMaker Ground Truth.
food delivery in Latin America, uses Amazon SageMaker
to optimize delivery routes to decrease the distance traveled by
delivery partners by almost 12%. And finally, ADP reduced time to deploy machine-learning models
from two weeks to just one day. And it's not just our customers, we're making groundbreaking
transformations with SageMaker. Many of us are also using
SageMaker in our daily lives. For example,
when you order on amazon.com, you are using SageMaker. In order to achieve
its current scale, Amazon had to overcome
several challenges along the way, including provisioning and managing
expensive infrastructure like GPUs, and integrating and managing
a variety of tools. Amazon fulfillment technologies
must monitor millions of global shipments annually
to deliver on Amazon's promise that items will be
readily available, and they will arrive on time. Therefore, an internal team
built up a proprietary computer vision-based
software solution that scanned millions of images
across fulfillment centers to identify misplaced
inventory worldwide. However, the solution did not
support piloting new models. That is, it did not enable new models to handle requests
alongside the old models. And so the team could not
test new models using live production data
without risking service disruptions. As a result, teams had to develop
ML models offline and then validate
and test them manually offline and then bring them online,
which often took three to six months. To address these challenges,
the team turned to Amazon SageMaker, and they were able to reduce
the deployment time to just two weeks from
the three to six months before. Moreover, they reduced
their prediction latency by 50% by using GPUs, and SageMaker also relieved
the development team of having to manage
their ML infrastructure. In fact, the team got an extra
month of engineering time that they could devote
to building models rather than maintaining their
infrastructure and operational tasks. In other words, a full month
of engineering time that they could focus
on the differentiated work, rather than on
the undifferentiated heavy lifting of managing
the infrastructure. The results from Amazon
fulfillment centers are a perfect example
of why we built SageMaker. We are building SageMaker
along three vectors, infrastructure that is purpose
built for machine learning, tools that are customized
for machine learning, and ML industrialization. Now, I would like to dive deep
into each of these three vectors, starting with
purpose-built infrastructure. SageMaker provides the broadest
set of instances for your machine-learning needs. And for inference, we launched
Inferentia-based instances. Inferentia provides the lowest
cost of inference in the cloud, up to 70% lower cost, and 130% higher throughput than
current GPU-based instances. After migrating the vast majority of inferences to Inferentia, the Amazon Alexa team saw 25% lower end-to-end latency for text-to-speech workloads. And customers such as Snap,
Autodesk and Conde Nast also found that Inferentia gives them
higher performance and lower cost. Condé Nast, for instance, observed a 72% reduction in cost compared to the previously deployed GPU instances. Truly game-changing results from Inferentia for our customers. For training,
we have two major efforts. The first is Habana Gaudi
accelerators from Intel, which will offer 40%
better price performance over current GPU-based EC2 instances
for training deep-learning models. They will be available
to customers in 2021. The second is AWS Trainium,
a machine-learning chip custom designed by AWS for the most
cost-effective training in the cloud, and is coming later this year. We are building Trainium specifically
to provide the best price performance for training
machine-learning models in the cloud. Now, our infrastructure innovations
must also span to software, because it helps our customers
better utilize their hardware. Many of the most common
use cases for machine learning, such as personalization,
require to manage anywhere from a few hundred
to hundreds of thousands of models. For example, taxi services
train custom models based on each city's traffic patterns
to predict rider wait times. While this approach leads
to higher prediction accuracy, the downside is that the cost to deploy the models
increases significantly. In a traditional ML system, a customer would have to deploy
one model per instance, which means the customer
would have to deploy hundreds or thousands of instances, and the cost would go up
significantly. Therefore, we invented Amazon SageMaker multi-model endpoints, which allow a customer to host up to 1,000 models on a single instance, reducing the cost by orders of magnitude, with SageMaker doing all the traffic management and model management on behalf of the customer.
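As an illustration, a minimal sketch with the SageMaker Python SDK's multi-model support might look like the following; the endpoint name, container image, role, S3 prefix, and payload are placeholders:

```python
from sagemaker.multidatamodel import MultiDataModel

# Placeholder names, container image, role, and S3 prefix, as a sketch only.
mdm = MultiDataModel(
    name="city-wait-time-models",
    model_data_prefix="s3://my-demo-bucket/models/",  # one .tar.gz per city
    image_uri="<inference-container-image>",
    role="<execution-role-arn>",
)
predictor = mdm.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")

# Route each request to the right city's model; the endpoint loads and
# caches models in memory as they are requested.
prediction = predictor.predict(data='{"features": [1, 2, 3]}',
                               target_model="seattle.tar.gz")
```

For example, SageMaker uses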
sophisticated caching algorithms to understand which models should be resident
in memory at a particular instant, so your prediction latency
and throughput can be optimized. Next, let's look at how
we build tools that are tailored
for machine learning. Because ML models and predictions are only as good as the data that they act upon, it's important for our customers to analyze the data at each step of the machine-learning workflow. Therefore, we built SageMaker Clarify, which allows you to do statistical analysis of your data and your models across each step of the machine-learning workflow. We also built Clarify so that you can understand why your models are making certain predictions. Many customers such as Varo,
Bundesliga and Zopa are using SageMaker Clarify to increase
confidence in their ML models and provide greater transparency
to stakeholders. I would like to dive deeper into
the science behind SageMaker Clarify. It's based on the work of Lloyd Shapley, who won the Nobel Prize in Economics in 2012, but we had to customize it for machine learning. In economic game theory, you use Shapley values to understand which actions have the most impact on winning a game. Similarly, SageMaker Clarify runs a number of experiments on your model and utilizes Shapley values to understand which inputs contribute the most to a model's predictions. This allows Clarify to provide more actionable insights and allows you to make more informed business decisions. For example, if Clarify says your model is predicting higher customer churn because of hold times, then you can work on improving the SLA in your call center. But we didn't stop there. We also improved the algorithm, so it runs 10 times faster than open-source implementations.
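To make the idea concrete, here is a small, self-contained Python sketch that computes exact Shapley attributions for a toy churn model with three features; the model and numbers are made up for illustration, and Clarify itself approximates these values at scale:

```python
from itertools import combinations
from math import factorial

# Toy attribution in the spirit of Shapley values: average each feature's
# marginal contribution over every possible coalition of the other features.
features = {"hold_time": 12.0, "plan_price": 80.0, "tenure": 3.0}
baseline = {"hold_time": 2.0, "plan_price": 50.0, "tenure": 24.0}

def churn_score(x):
    # A stand-in model: longer holds, pricier plans, shorter tenure -> churn.
    return 0.04 * x["hold_time"] + 0.005 * x["plan_price"] - 0.01 * x["tenure"]

def value(subset):
    # Model output when only the features in `subset` take their real values.
    x = {f: (features[f] if f in subset else baseline[f]) for f in features}
    return churn_score(x)

names = list(features)
n = len(names)
for f in names:
    others = [g for g in names if g != f]
    shap = 0.0
    for k in range(n):
        for s in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            shap += weight * (value(set(s) | {f}) - value(set(s)))
    print(f"{f}: {shap:+.3f}")  # e.g., hold_time gets the largest share
```

With the innovations in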
infrastructure and tooling we have discussed so far, you might start to see a theme emerging: how do we convert machine learning into a systematic engineering discipline that will help customers cross the chasm between research results and production deployments? This leads to our third vector,
ML industrialization. For this, we asked ourselves
how did software go from a niche endeavor
to an industry? Now, ML deals not just with code, but with data as well. But we realized that many of the same tooling concepts carry over to the ML world. And just as IDEs and debuggers and profilers and CI/CD tools made software development robust, we are building custom tools for machine learning, such as SageMaker Studio, which is the world's first IDE for machine learning, SageMaker Debugger, SageMaker Profiler and SageMaker Pipelines, to make machine learning robust. Let me dive deeper into one of them. In software engineering, continuous
integration and continuous deployment pipelines are critical to ensuring
automation and robustness. But in machine learning, CI/CD-style tools are rarely available. And when they do exist, they're super hard to set up, configure and manage. And in the spirit of making analogous
tools available to ML developers, we built Amazon SageMaker Pipelines. SageMaker Pipelines is the first purpose-built ML CI/CD service accessible to every developer and data scientist. SageMaker Pipelines has been tremendously helpful in supporting governance and audit requirements, because Pipelines automatically tracks code, datasets and artifacts at every step of the ML workflow. So just like with software, you can roll back, replace steps, troubleshoot your problems, and reliably track the lineage of models at scale across thousands of models in production.
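As an illustration, a minimal one-step pipeline sketch with the SageMaker Python SDK might look like this; the role, script, and names are placeholders:

```python
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.steps import ProcessingStep
from sagemaker.workflow.pipeline import Pipeline

# Placeholder role and script names; a minimal one-step pipeline sketch.
processor = SKLearnProcessor(
    framework_version="0.23-1",
    role="<execution-role-arn>",
    instance_type="ml.m5.xlarge",
    instance_count=1,
)
prep_step = ProcessingStep(
    name="PrepareData",
    processor=processor,
    code="preprocess.py",  # your preprocessing script
)
pipeline = Pipeline(name="demo-pipeline", steps=[prep_step])

# Register and start the pipeline; each execution's code, data, and
# artifacts are tracked automatically for lineage and audit.
pipeline.upsert(role_arn="<execution-role-arn>")
execution = pipeline.start()
```

Many customers such as iFood,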
care.com, INVISTA and 3M, have been able
to scale using SageMaker Pipelines because with just a few clicks
in SageMaker pipelines, you can create
an entirely automated ML workflow that reduces months
of coding to just a few hours. I've discussed how SageMaker innovation in infrastructure, tools, and ML industrialization has made machine learning scale. Another critical factor
to success with ML is making sure
we grow the talent pool and help more people
become ML practitioners. At Amazon, our goal is to train
every developer we hire on machine learning. In fact, machine-learning courses
are now mandatory for any engineer joining Amazon, and we want to make training
accessible to even more developers. Therefore, I'm very excited
to announce a new MOOC focused on practical applications
of data science. The MOOC is now available
on Coursera. And we built it in partnership
with deeplearning.ai, which is an education technology
company founded by Dr. Andrew Ng. Starting now, you can take the new course; it's ideal for those who are ready to practically implement models in their organizations. Because this marks the first time
we have collaborated with deeplearning.ai, we decided
to host a fireside chat later today where you can hear
from both Swami and Andrew. They will talk about
the future of ML, how to accelerate model
building and deployment and how to build
a business case for your project. I think it would be
a fascinating discussion between two luminaries in ML today. See you there. And with that,
I'll pass it back to Swami. Thank you, Bratin. At AWS we invent on
behalf of the customers so they can create better
experiences for their customers. One customer we have been
working with since the very early days
of machine learning at AWS is Intuit, and they have been rapidly
increasing their adoption of machine learning throughout their business. To talk more about what they
are doing with AWS to accelerate
the deployment of machine learning at scale
to meet the customer needs, I'd like to introduce
Ashok Srivastava, Chief Data Officer and Senior
Vice President at Intuit. Thanks Swami, it's great to be here. I'm really excited to be able
to tell you about the great work in AI and machine learning
that we're pursuing Intuit to help drive
great customer benefits. At Intuit, we build products such
as TurboTax, QuickBooks, and Mint to help people
make better financial decisions and to help them save more money,
reduce their workload, and all the while have confidence that they're making great
financial decisions. Our mission is to power prosperity
around the world. And if you think about it,
now more than ever, people need to make the best
financial decisions for themselves
and for their families. Small businesses, consumers,
the self-employed, are all being pushed to the limit
because of COVID and other issues. So check out some of these
statistics. Every time I look at them,
I'm amazed and inspired by the value that we bring to our customers. For example, our Intuit Aid
Assist Program helped small businesses
secure more than $1.2 billion through the Paycheck
Protection Program. TurboTax powered over 48 million
tax returns last year. We connected our customers to
over 20,000 financial institutions. QuickBooks Capital delivered over
$750 million in cumulative loans. And we have over 25 million
active Mint users, and those people use those products
to understand their finances. Now, what makes this fantastic
is that many of the game-changing
customer experiences I mentioned are powered by AI
and Machine Learning at scale. Back in 2013, we began our journey
with AWS into the cloud. That transition helped to start
this epic journey to drive innovation in AI
and machine learning. At Intuit, we've been able to put
over 250 AI assets into production, we have over 2000 AI
tasks in production, which essentially counts
the number of customer tasks that are powered by AI. One AI system can power
multiple customer experiences. And as a testament to our innovation, we filed over 600 AI and Machine
Learning patents in the last few years. So now the question is how did
we accelerate this process? We looked at all of the work
that it takes for our AI scientists and engineers
to put a model into production. And we thought about the AI
hierarchy of needs, as you can see in this pyramid. Now at the bottom of the pyramid,
we have data infrastructure and we have machine-learning
infrastructure, and at the top, we have the actual
AI model development and deployment. We want our AI scientists and
engineers to focus on the top part. And we want to use great
infrastructure capabilities to eliminate unnecessary
workload at the bottom. And that's where 70% of a person's
time can be spent if you're not careful. This is where our great
collaboration with AWS became critical to our AI journey. As we modernized our infrastructure,
we moved into the cloud to help us have the scale, the speed
and the elasticity that we need. We treated data as a product to enable our AI scientists
and engineers to get the data that they need
to quickly and efficiently. And we built critical data
infrastructure to help our team really get access to clean data
as rapidly as possible with great tools
for implementing data pipelines. So check out what happens
when you make the investment to modernize with AWS. We have a 30% decrease
in downtime. We have tripled the speed of
delivery, and we've seen a 60% increase
in mobile app deployments. And when you make those investments
in core infrastructure, the benefits to your AI teams
are immense. Remember the top of that pyramid? Take a look at where we are now. We have increased the number
of deployed models by 50%. We've saved over 25,000 hours
for our customers, and we've reduced expert
review time by 50%. We've had a fantastic
collaboration with AWS, and it's been foundational
to our strategy to become an AI-driven
expert platform. This platform helps us
deliver more money, no work and complete confidence
to small businesses, to the self-employed
and the consumers around the world. We've connected people to our experts by building
a virtual expert platform. Let me tell you a little bit
about how this works. So in the next few slides,
I'm going to show you how we constructed
the virtual expert platform. And we connected it with our
machine-learning platform to deliver great experiences
for our customers. The foundational layer
has many AWS components, including Amazon SageMaker,
Connect, Lex, Polly, and EKS. Now on top of that foundational
AWS layer, we built the machine-learning
platform which has many features necessary
to build and deploy models. These include data
exploration capabilities, feature extraction
and feature management, model training and evaluation, making
predictions and model execution. We have a significant
MLOps capability to really ensure
operational excellence. Now to finally build that
virtual expert platform, we needed to create
additional capabilities. Many of them are powered by AI,
such as expert routing and matchmaking, collaboration
and expert management to ultimately help get a fast answer
to a question from a real customer. So for example, suppose a customer
calls up and has a question about whether
they can deduct a home renovation. We can assess in real time
the needs of the customer and the context and route them
to the right expert. This is done with capabilities such
as interactive voice assistant, digital assistant,
and natural language processing. The virtual expert platform helps
connect customers with experts so that they can get the best
personalized advice in the business. Let me take you a little bit closer
and show you what TurboTax live does. So TurboTax live is powered
by the virtual expert platform. Just as we discussed, we ask you a few simple questions
about your work, your family and other relevant
information about your taxes. We're connected with over
20,000 financial institutions, so importing your bank information
and W-2 information is really easy. And if you have a question,
you can get answers from our AI
powered digital assistant. But if you want to talk
to a human expert and get even more
personalized advice, you can call and through the magic
of the virtual expert platform, you can get connected
to the right expert and get your questions answered. Great technologies such as
our routing and matchmaking algorithms
helped make this possible. We've had a fantastic
collaboration with AWS and look forward
to so many more. We're always focusing
on our customers, learning about their needs, and building solutions that help them
make great financial decisions. We're building the next generation
of our machine-learning platform to give developers
the deep connections that they need to specific
modules in the platform. We're really excited
about continuing to augment human intelligence
with machine intelligence so that our customers and experts
can have the best experiences, and most importantly,
have the very best outcomes. I'd like to thank
the entire team at AWS for such a great collaboration. Thank you so much, Swami. Thank you, Ashok. It's clear to see throughout
this presentation that machine learning is really
transforming everything, from the way we do business,
to the way we entertain ourselves, to the way we get things done
in our personal lives. In fact, entire business processes are being made easier
with machine learning. Marketers can more easily
tailor their message, supply chain analysts can have faster
and more accurate forecasts. And manufacturers
can easily spot defects in products. Over the past few years, machine learning
has come an incredibly long way. The barriers to entry
have been significantly lowered, enabling builders
and their biggest opportunities. And I'm excited to see
what you all will build next. And with that, I will end by saying
that I hope that you are as excited as I am about the work
happening in machine learning across AWS, Amazon
and the broader industry. We have a powerhouse lineup
of speakers today, with something for everyone
interested in machine learning. Enjoy the rest of the day. Thank you. [music playing]