[music playing] Please welcome the Vice
President of Data and AI at AWS, Dr. Swami Sivasubramanian. [music playing] [applause] Hello, everyone.
Welcome to re:Invent 2023. This year, I'm especially excited
to stand on this stage and share our vision with all of you. That's because, this year, we are standing at the edge
of another technological era, an era in which
a powerful relationship between humans and technology
is unfolding right before us. Generative AI is augmenting
human productivity in many unexpected ways, while also fueling
human intelligence and creativity. This relationship,
in which both humans and AI form new innovations,
is rife with so many possibilities. This partnership is similar
to many symbiotic relationships we observe in nature, where different species
not only co-exist together, but also, benefit from one another, like we see with the whale sharks
and the remora fish. Whale sharks offer the fish
protection from its predators, while the fish keeps
the shark clean and healthy. There are even symbiotic stars
that share gases and heat, occasionally producing
a beautiful explosion of energy called a supernova. Today, the world is captivated
by generative AI's potential to reinvent the way we work
and form new innovations, our own supernovae, if you will, but the exploration
of this symbiotic relationship with humans and technology
is not really new, and in fact, over the last 200 years,
mathematicians, computer scientists, and many visionaries
have dedicated their entire lives to inventing new technologies
that have really reduced manual labor and automated many
complex human tasks, from new machines
for mathematical computation and big data processing
to new architectures and algorithms to recognize patterns and make predictions
to new programming languages that made it significantly easier
for people to work with data. One of those visionaries
was Ada Lovelace, who's also known as the world's
first computer programmer. Ada is often recognized for her work with Charles Babbage
on the Analytical Engine, one of the earliest versions
of a computer in the 1830s. Like Babbage, Ada believed
that computers could remove the heavy lifting from many
computational and analytical tasks, but what set Ada apart from many
of her contemporaries at the time was her ability to recognize
the potential of computers, that they go way beyond
just glorified number crunching. She discovered that they
could be programmed to read symbols and perform a logical
sequence of operations. This discovery marked a shift towards using computers
for complex tasks. Ada also speculated
that these computers could understand musical notations, and they could even
create music in the future. This could be called an early ode
to the potential of generative AI. But despite her belief
in their potential, she made one thing really,
really clear. These machines could not
really originate anything. They could only generate
outputs or perform tasks that humans were capable
of ordering them to do. True creativity and intelligence,
she argued, only originates from humans. In this way, humans and computers would have a mutually
beneficial relationship, with each contributing
their own unique strengths. Personally, Ada's analysis
of computing is really inspiring to me, not only because her work
stood the test of time, but also, because I have
a young daughter in STEM. In fact, Ada is one of my daughter's
favorite people from the book called
Goodnight Stories for Rebel Girls. So, if you ask my daughter,
she will tell you that I have a career right now
because of Ada, which I think is really cool. Now, today, it's clear that Ada's
contributions are gaining even more relevance as the world
of generative AI unfolds. While ML and AI have spurred
rapid innovation over the last couple of decades, I believe this new era
of human-led creativity with GenAI will shift our relationship
with technology towards one of
intelligence augmentation. I find this co-evolution of humans
and technology incredibly exciting, because there is also another
symbiotic relationship happening under the hood that is key to driving
our progress forward here: the relationship between data
and GenAI. The massive explosion of data
has enabled these foundational models to exist in the first place, the large language models
that power generative AI to finally form and accelerate
new innovations everywhere. With these new advancements,
our customers can harness that data to build and customize these apps
with models faster than ever before, while also using GenAI
to make it easier to work with and organize that data. So, while we typically think
of symbiosis as a relationship
between two different things, I like to think
of what's happening today as a beneficial relationship
among three things, where data, generative AI, and humans are all unleashing
their creativity together. Let's explore this relationship
further, starting with developing GenAI apps. To build a GenAI app,
you need a few essentials. You need access to a variety of LLMs
and foundational models to meet your needs. You need a secure and private
environment to customize these models
with your data. Then you need the right tools
to build and deploy new applications
on top of these models, and you need a really powerful
infrastructure to run these data intensive
and ML intensive applications. AWS has a long history
of providing our customers with a comprehensive set
of AI/ML, data, and compute capabilities just for this,
and now, according to HG Insights, more machine learning workloads run
on AWS than on any other cloud provider. We like to think of our AI/ML
offerings as a three-layer stack. At the lowest layer is
the infrastructure tier: the infrastructure you need to cost-effectively
train these foundational models
and deploy them at scale, including our hardware chips
and GPUs. This layer also includes
Amazon SageMaker, our service which enables
ML practitioners to easily build, train, and deploy ML models,
including these LLMs. At the middle layer,
we have Amazon Bedrock, which provides access
to the leading LLMs and other FMs to build and scale
generative AI applications. And at the top layer
are the applications that help you take advantage
of GenAI, without the need for
any specialized knowledge or having to write any code. This includes services like Amazon Q, our new GenAI-powered assistant
that is tailored to your business. Each of these layers
builds on the other, and you may need some
or even all of those capabilities at different points
in your GenAI journey. Let's take a closer look at our tools for building GenAI applications,
starting with Amazon Bedrock. Bedrock has a broad set of
capabilities for building and scaling GenAI applications with the highest levels
of privacy and security. One of the main reasons
customers gravitate towards Bedrock is the ability to select
from a wide range of leading foundational models
that support their unique needs. This customer choice is paramount, because we believe
no one model will rule the world. We are still in the early days
with GenAI, and these models will continue
to evolve at unprecedented speeds. That's why customers need
the flexibility to use different models at different
points for different use cases. For this, Bedrock has you covered. We offer customers a broad choice
of the latest foundation models from leading providers like AI21 Labs, Anthropic, Cohere,
Stability AI, and Meta. One of our most popular models
is Anthropic's Claude model, which is used for tasks like
summarization and complex reasoning. Customers also use Stability AI's
Stable Diffusion model to generate images, arts, and design, and recently, we announced
support for Cohere Command, which can be used for tasks
like copywriting and dialogue, and Cohere Embed for search
and personalization. But there is more. We recently added support
for three new Cohere models: Command Light, Embed English,
and Embed Multilingual. We also introduced Meta's Llama 2
on Bedrock, which customers are rapidly adopting
for high performance and relatively low cost. And just a couple of weeks ago,
we added the Llama 2 13B model, which is optimized for a variety
of smaller-scale use cases, and finally, we added support
for Stable Diffusion SDXL 1.0, Stability AI's advanced
text-to-image model, which is now generally available. But as I told you, in this space it's super early,
and choice is paramount. So, today, we are continuing
our commitment to offering the latest innovation
from many of our model providers, starting with Anthropic. I'm excited to announce
Bedrock support for Claude 2.1. [applause] Claude 2.1 delivers advancements
in key capabilities for enterprises, including an industry
leading 200K context window, which also improves your accuracy
in many remarkable ways. According to Anthropic, this update has 50%
fewer hallucinations, even under adversarial prompt attacks, and it also has a 2X reduction
in false statements in open-ended conversations. Both of these are super important
for enterprise use cases. And 2.1 also has improved
system prompts, which are model instructions that provide a better experience
for end users, while also reducing
the cost of these prompts and completions by 25%.
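To give you a feel for what this looks like in practice, here is a minimal sketch of calling Claude 2.1 on Bedrock with a system prompt. It assumes the boto3 SDK, the anthropic.claude-v2:1 model ID, and Anthropic's text-completions request format; treat the exact identifiers and parameters as illustrative rather than definitive.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# A system prompt (model instructions) goes before the first Human turn
# in Anthropic's text-completions format.
prompt = (
    "You are a support assistant for an enterprise travel team. "
    "Answer concisely and cite the policy section you used."
    "\n\nHuman: Can I book a refundable fare for an internal conference?"
    "\n\nAssistant:"
)

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2:1",   # Claude 2.1 on Bedrock (illustrative ID)
    body=json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": 300,
        "temperature": 0.2,
    }),
)

print(json.loads(response["body"].read())["completion"])
```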
And we are not just stopping there. We are also excited to announce
new updates for our customers who want to experiment
with publicly available models. That's why, today, I'm pleased
to announce support for Llama 2 70B in Bedrock. [applause] This model is suitable for many
large-scale text-processing tasks, such as language modeling,
text generation, and dialogue systems. In addition to our partners, AWS is heavily building new innovations
in this area for our customers. We have a longstanding history
in AI and ML. Amazon has invested in AI/ML
technologies for over 25 years, and many of the capabilities
our customers use today are driven by ML models,
including these foundational models. These models power virtually
everything we do, from our customer-facing
e-commerce applications to various facets of our
enterprise business and supply chain. Because this ML innovation
has benefited our business so much, we wanted to share our key learnings
with our customers too. One of the ways we wanted
to enable this is by enhancing search
and personalization experience for our customers through a specific data
type called vector embeddings. Vector embeddings are produced
by these foundation models, which translate text inputs,
like words, phrases, or large units of text
into numerical representations. While we as humans understand text
and the meaning and the context of these words,
machines only understand numbers. So, we have to translate them
into a format that is suitable
for machine learning. Vectors allow your models
to more easily find the relationships
between similar words. For instance, a cat is closer
to a kitten, or a dog is closer to a puppy. This means your foundational models
can now produce more relevant responses
to your customers. Vectors are ideal for
supercharging your applications, like rich media search
and product recommendation. In these scenarios, the use
of vector embeddings greatly enhances the accuracy
of a query like "bright-colored golf shoes."
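To make that concrete, here is a minimal sketch of generating and comparing embeddings. It assumes the boto3 SDK and the amazon.titan-embed-text-v1 model ID, and the response field name is an assumption, so treat the identifiers as illustrative.

```python
import json
import math
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text):
    # Titan Text Embeddings turns a piece of text into a numeric vector.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",   # illustrative model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat, kitten, tractor = embed("cat"), embed("kitten"), embed("tractor")
# Related words land close together in vector space, so the first score is higher.
print(cosine_similarity(cat, kitten))
print(cosine_similarity(cat, tractor))
```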
We have used embeddings to support
many aspects of our business, like amazon.com. That's why we offer
Titan Text Embeddings, which enables customers,
like the AI company Griptape, to easily translate their text data
into vector embeddings for a variety of use cases. But as our customers are continuing
to build more and more applications, they want applications that combine image and text
and support both modalities. For example, imagine a furniture
retail company with thousands of images. They want to enable their customers
to search for furniture using a phrase, an image, or even both.
show me what works well with my sofa. Now, to build such kind of
experience, developers need to spend time
piecing together multiple models. Not only does this increase
the complexity of your GenAI stack, but it also decreases the efficiency,
and it impacts customer experience. We wanted to make these applications
even easier to build. That's why, today, I'm excited
to announce the general availability
of Titan Multimodal Embeddings. [applause] This model enables you to create
richer multimodal search and recommendation options. Now, you can quickly generate, store,
and retrieve embeddings to build more accurate and contextually
relevant multimodal search.
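As a rough sketch of how that furniture search query could be built, the example below embeds a phrase and an image into the same vector space. The amazon.titan-embed-image-v1 model ID and the inputText/inputImage request fields are assumptions, so check the current documentation before relying on them.

```python
import base64
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def multimodal_embed(text=None, image_path=None):
    # Text, an image, or both are mapped into one shared vector space,
    # so "phrase plus a photo of my sofa" can be a single query.
    body = {}
    if text:
        body["inputText"] = text
    if image_path:
        with open(image_path, "rb") as f:
            body["inputImage"] = base64.b64encode(f.read()).decode("utf-8")
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",   # assumed model ID
        body=json.dumps(body),
    )
    return json.loads(response["body"].read())["embedding"]

query_vector = multimodal_embed(
    text="coffee table that works well with this sofa",
    image_path="my_sofa.jpg",   # hypothetical local file
)
# query_vector is then sent to whatever vector index holds the catalog embeddings.
```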
Companies like OfferUp are
using Titan Multimodal Embeddings, as well as Alamy, which
is using this model to revolutionize their stock image search experience
for their customers. In addition to our embeddings models, we also offer models that support
text generation use cases. This includes Titan Text Lite
and Titan Text Express, which are now generally available. [applause] These text models help you
optimize for accuracy, performance, and cost depending on your use cases. Text Lite is a really,
really small model, but it is an extremely
cost-effective model that supports use cases
like chatbots, Q&A, and text summarization. It is lightweight and ideal
for fine tuning, offering you a highly customizable
model for your use case. Text Express can be used
for a wider range of tasks, such as open-ended text generation
and conversational chat. This model provides a sweet spot
for cost and performance, compared to these really big models. Now, finally, GenAI image generation is growing in popularity
for industries such as advertising and retail, where customers need high quality
visuals at a lower cost. We wanted to help our customers do this easily, accurately,
and responsibly. That's why, today, I'm very excited
to announce Titan Image Generator, which is now available
in preview today. [applause] This model enables customers to produce high quality realistic
images or enhance existing images using simple natural
language prompts. You can customize these images
using your own data to create content that better
reflects your industry or your brand. Titan Image Generator is
trained on a diverse set of datasets to enable you to create
more accurate outputs. It also includes built-in mitigations
for toxicity and bias. Through human evaluation,
we found that Titan Image Generator has higher scores
than many other leading models. More importantly, to build
on our commitments we made at the White House
earlier this year to promote the responsible
development of AI technology, all Titan-generated images
come with an invisible watermark designed to help reduce
the spread of misinformation by providing a discreet mechanism
to identify AI generated images. AWS is among the first model
providers to widely release
built-in invisible watermarks that are integrated
into image outputs and are designed to be
tamper resistant. Now, let's take a look at the model's
editing features in action. First, I will use the image generator
and submit a text prompt, such as image of a green iguana, to get an image quickly
to kind of see what I want. Now, I can use the model to easily
swap out an existing background to a background of a rainforest. This process is known
as out-painting. You can use the model
to seamlessly swap out backgrounds to generate lifestyle images, all while retaining
the main subject of the image, and to create a few more options,
I can use the image playground to generate variations
of my original iguana subject, as well as variations
of the original rainforest background. Or I can completely
change the orientation of the picture
from left-facing to right-facing, using a prompt like "orange iguana
facing right in a rainforest."
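For a sense of the request shape behind a prompt like that, here is a minimal text-to-image sketch. The amazon.titan-image-generator-v1 model ID and the taskType, textToImageParams, and response fields are my assumptions about the API, so verify them against the documentation.

```python
import base64
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="amazon.titan-image-generator-v1",   # illustrative model ID
    body=json.dumps({
        "taskType": "TEXT_IMAGE",                 # assumed task type for plain generation
        "textToImageParams": {
            "text": "orange iguana facing right in a rainforest",
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "height": 1024,
            "width": 1024,
        },
    }),
)

# The generated image comes back base64-encoded (assumed response shape).
payload = json.loads(response["body"].read())
with open("iguana.png", "wb") as f:
    f.write(base64.b64decode(payload["images"][0]))
```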
This is really cool, right, and there are so many more incredible Image Generator features, such as in-painting
and image customization, that I have not showcased here today. Because this model is trained
for a broad range of domains, customers across
a variety of industries will be excited to take advantage
of Titan Image Generator. As you can see, each Titan model
has its own unique strengths across capabilities,
price, and performance, and as Adam shared yesterday,
we have carefully chosen how we train our models
and the data we use to do so. We will indemnify customers
against claims that our models or their output
infringe on anyone's copyright. With these investments, our customers
will have the flexibility to select the best models
for their requirements, even as their needs grow and change, and our Bedrock customers
have quickly taken advantage of different models to build
all types of customer experiences. In fact, since we launched Bedrock, more than 10,000 customers
are rapidly developing GenAI-powered applications
for use cases like self-service, customer support, text analysis,
and forecasting trends. This includes customers like SAP, the world's leading provider
of enterprise software, who is using Bedrock to automate the creation of trip
requests within SAP Concur, saving employees hours of time,
including our own Amazon employees. Georgia Pacific, one of the world's leading manufacturers
of paper and pulp, uses Bedrock to power
a chatbot system that helps employees quickly retrieve
critical factory data and answer questions
about their machines. And United Airlines,
they use Bedrock to help employees access up-to-date
information and summaries about delays using natural language, helping them
resolve operational issues and customer issues faster. It is so inspiring to see
our customers build with these models on Bedrock
and adapt them for their needs, and now, to show you how to put
GenAI into action for your business, please welcome Nhung Ho,
VP of AI from Intuit. [music playing] Good morning, everyone. Great to see you all bright
and early today. How are you all feeling? Good? Come on, that's right. So, as I've been listening
over the past few days, it's obvious to me why Swami
invited us onto this stage. Over the past decade, Intuit has built on AWS,
first moving our applications onto the cloud, then to AI/ML
with SageMaker, and now, in the era
of generative AI, with Bedrock. At Intuit, everything we do centers
around our mission to power prosperity around the world
for 100 million consumers and small business customers, and for me,
this mission really hits home. I'm one of ten siblings. Can you imagine being in
a family with ten siblings? There's so many of us, we don't
even all fit in the same photo. This is the most of us that's ever
fit in one photo at any one time, and the reason why this hits
really close to home for me is that half of my siblings
are small business owners, and so, I deeply understand
the everyday challenges that small businesses face, from managing inventory
to dealing with cashflow to understanding taxes
throughout the year, and so, it's really great
for me that, in my everyday work, I get to make
their lives easier with AI. I get to build game
changing applications that solve problems
for myself, my siblings, and probably a lot of you
in the audience here today. At Intuit, we're all about leveling
the playing field for our customers, and so, to do that, we've been on
an incredible transformation journey over the past five years. In 2019, we declared
that we were going to be an AI-driven expert platform, and by combining cutting edge AI
with tax and human expertise, we're delivering unparalleled
experiences for our customers. Today, we've been able
to achieve incredible scale with AWS running all of our data capabilities
as well as our data lake on AWS, and our machine learning platform, of which SageMaker is
a foundational capability, allows our entire community of machine
learning developers to build, deploy, and ship new AI
experiences with speed, and to give you an idea of what this
really means more concretely, this means that we're able to make 65 billion
machine learning predictions per day, utilizing over half
a million datapoints for our small business customers,
60,000 for our consumers, and then driving over
810 million AI-driven customer interactions per year. Now, in the era of generative AI, we're well positioned
to change the game because of our multi-year investment. We've been really focused on
making sure that our data is clean, that there's strong data governance, and that we're building out
responsible AI principles, and that's really allowed us
to quickly unlock these new opportunities. To enable our technologists
to design, build, and quickly ship out AI applications
in this GenAI world, we built a proprietary
GenAI operating system called GenOS, and this is on AWS, and what GenOS has
is four primary components. The first is Gen Studio. This is where anybody at the company
can build and prototype and test out new-to-the-world
GenAI experiences. When they're ready to ship,
then they can use Gen Runtime, which has connectivity
to a multitude of LLMs and has access to the right
underlying data, so that you can build those
personalized and accurate experiences but have the comfort
that you can scale it out when needed to customers. The next piece is that, when you do deploy and ship
these GenAI experiences, you want that consistency
across your products. You don't want a Franken experience. So, we built a design system
called GenUX, so that you get that transparency, and when a customer
interacts with GenAI, they know that they're
getting that experience, and I would say the most
important component to GenOS is the series of financial
large language models, and this is a set of third-party LLMs
as well as custom-trained LLMs that are specialized
in our domain of tax, accounting, marketing,
and personal finance, and you may ask, why the heck
do I need to do this myself, why not use what's out there, right? I'll tell you something
that we learned during this journey: three things remain constant
in the GenAI world: accuracy, latency, and cost. Any experience you build will anchor
on those three things, and so, data is the key
to unlocking accuracy. We all know that, but the ability
to use these smaller, faster models allows us to realize
significant latency gains, and the great thing is that
we're able to host these models on SageMaker,
and so, we're able to finally manage cost as well as scale
according to our needs, but I also mentioned earlier,
we use third party LLMs, because at the end of the day, the thing that you
really optimize for is to build the best customer
experience possible. To do that, you need to be able
to use best-in-class solutions, and that's what Bedrock
gives us the ability to do. It gives you optionality. With the wide library of models
available on Bedrock that Swami just showed,
we're able to fully encompass all of the needs
that our customer has, and the other thing that Swami
mentioned is that, within Bedrock,
we're able to easily scale our underlying inference
infrastructure, and all of that is done
within our AWS VPC, and so, that gives us the confidence
to be able to fully leverage our data and our knowledge base to build
these personalized, responsible, and also, relevant experiences
for our customers, but also, knowing
that the safety, security, and privacy of the data
is maintained for our customers. So, in September we launched
Intuit Assist, our generative AI assistant that is embedded across
all of our product offerings. If you go to TurboTax,
you go to QuickBooks, you go to MailChimp,
you are going to see Assist, and it's all backed by GenOS. With Assist, what we're
really trying to do is help you feel confident in every single financial
decision that you make. It's there with you,
and so, I'm going to show you what this looks like
in TurboTax. This is live in production
for our customers, and we also gather
significant customer feedback. So, if you can imagine, when you get
to the end of your tax filing experience, you get a number. What does that number mean, right? Like, I have a PhD
in astrophysics. I barely know what those numbers
mean, and so, if you can imagine
for the standard everyday user, it's incredibly challenging. By marrying the power
of our knowledge engine that ensures accuracy
with the power of an LLM, we're able to help unpack this
outcome for our customers, so that they truly understand and can feel confident
to take that next step, whatever it may be, and so, for those who are
just beginning their AI journeys, we offer two learnings. The first is take
a holistic approach. Really invest in your
underlying data, because that's going to be
the differentiator for every single experience
that you build, but also, build in
horizontal solutions from day one. At some point, demos need to become
production experiences, right, and you don't want to get caught
by surprise when you're ready to go. The second is that there's no
one-size-fits all LLM solution. Optionality is
so incredibly important, and that's what is offered
on Bedrock and on AWS, and our collaboration with AWS over the past years
has really helped us grow to become a global
financial technology platform, and these are just
some of the services that have gotten us there. So, I agree with Swami,
but I have to agree with him, I'm on the stage,
that the massive explosion of data has enabled these
foundational models, and you saw, Assist is
one of the experiences that is an outcome of our ability
to leverage data in an LLM. At Intuit, Assist is just going to be
one of the many GenAI experiences that we built,
and over our 40-year history, we've gone through
many transformations. GenAI is just one of
those transformations, and we're going to continue
to transform and reinvent over the next 40 years. Thank you so much. [music playing] Thank you, Nhung. Intuit is an excellent example of how you can reimagine
your customer experience with easy-to-use tools and access to a variety
of these foundational models, and as Nhung demonstrated,
there is another component that is critical for creating
these GenAI apps that are unique to your business. That is your data. When you want to build
GenAI applications that are unique to your business,
your data is the differentiator. Data is what takes you
from a generic AI application to a GenAI application that understands your customers
and your business. So, how do you go about customizing
these models with your data? A common technique for customizing
these foundational models is called fine tuning. With fine tuning, it's pretty simple. You provide a labeled dataset, which is annotated
with additional context to train the model on specific tasks. You can then adapt the model
parameters to your business, extending its knowledge
with lexicon and terminology that are unique to your industry
and your customers.
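As a sketch of what that looks like in practice, a labeled fine-tuning dataset is typically just prompt/completion pairs in a JSON Lines file, and the job can be started through Bedrock's model-customization API. The field names, base model identifier, and parameters below are assumptions about that API, so verify them against the current documentation.

```python
import boto3

# One record of a labeled training file (JSON Lines), written in your own
# industry's lexicon -- the prompt/completion field names are an assumption:
#   {"prompt": "Write a tagline for a trail-running shoe launch.",
#    "completion": "Built for the ridge line, priced for the trailhead."}

bedrock = boto3.client("bedrock")  # control-plane client, distinct from bedrock-runtime

bedrock.create_model_customization_job(
    jobName="shoe-campaign-finetune",                 # hypothetical names
    customModelName="marketing-copy-model",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="meta.llama2-13b-v1",         # illustrative base model ID
    customizationType="FINE_TUNING",                  # continued pre-training uses unlabeled data
    trainingDataConfig={"s3Uri": "s3://my-bucket/campaigns/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/campaigns/output/"},
    hyperParameters={"epochCount": "2"},
)
```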
Amazon Bedrock removes the heavy lifting from the fine-tuning process, but you can also leverage
unlabeled datasets or the raw data
to maintain the accuracy of your foundational model
for your domain, through a continued
pre-training process. For example, a healthcare company
can continue to pre-train the model
using medical journals, articles, or research papers
to make it more knowledgeable on the evolving industry terminology. Today, you can leverage both
of these techniques with Amazon Titan Lite
and Titan Express. [applause] These models complement each other and will enable your model
to understand your business over time, but no matter which method you use, the output model is accessible
only to you, and it never goes back
to the base model. We announced some of these
capabilities early on. This week, we also added fine
tuning in Bedrock for Cohere Command and Llama 2, with fine-tuning for Anthropic Claude
coming soon. Now, let me just show you a quick
example of how this works. Imagine a content marketing manager that needs to come up
with a fresh ad campaign for a new line of shoes. To do this, they select the Llama
2 model and provide Bedrock with a few examples
of their best performing campaigns. Bedrock makes a separate
copy of the base model that is accessible only
to the customer, and after training, Bedrock generates relevant
social media content, display ads, and web copy
for the new shoes. With fine tuning, you can
build applications that are specific to your business, but what if some of your data
changes frequently, like inventory or pricing? It is simply not practical
to be constantly fine tuning and updating this model while it is
also serving user queries. That's why, to enable a model with the up-to-date information
from your data sources, you need a different technique called
Retrieval Augmented Generation, also known as RAG. With RAG, you can augment
the prompt that is sent to your foundational model
with contextual information, such as product details, which it draws from
your private data sources. This added context in the prompt
helps the model provide more accurate and relevant response
to the user's query. However, implementing
these RAG-based systems is extremely complex. Developers must first convert
their data into vector embeddings. Then they need to store
these embeddings in a vector database that can handle vector
queries efficiently. Finally, they build custom
integrations with the vector database to perform semantic searches,
retrieve relevant text, and then augment the prompt. All of this can take
weeks, if not months, to build.
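Strung together by hand, the workflow looks roughly like the sketch below. The embedding and model calls assume the boto3 SDK and illustrative model IDs, and the vector_store object is a hypothetical stand-in for the vector database and custom integration code you would otherwise have to build and maintain yourself.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text):
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",   # illustrative model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def answer_with_rag(question, vector_store):
    # vector_store is a hypothetical stand-in for your vector database and
    # the custom semantic-search integration you would have to write.
    matches = vector_store.search(vector=embed(question), top_k=3)
    context = "\n".join(doc.text for doc in matches)

    # Augment the prompt with the retrieved context before calling the model.
    prompt = (
        "\n\nHuman: Use only the product details below to answer.\n\n"
        f"{context}\n\nQuestion: {question}\n\nAssistant:"
    )
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2:1",        # illustrative model ID
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 300}),
    )
    return json.loads(response["body"].read())["completion"]
```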
To make this process easier, yesterday we announced Knowledge Bases for Amazon Bedrock, which supports the entire RAG
workflow, right from ingestion to retrieval to prompt augmentation. Here, you simply point
to the location of your data, like an S3 bucket, and Bedrock fetches the relevant
text documents, converts them into embeddings, and stores them
in your vector database, and during inference time,
it augments the prompts sent to your foundation model
with the right context. Knowledge Bases work with
popular vector databases, like our vector engine
for OpenSearch Serverless, Redis Enterprise Cloud,
and Pinecone, and coming soon, we will also add support
for Amazon Aurora as well as MongoDB, with more and more databases
being added over time.
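With a Knowledge Base in place, retrieval and prompt augmentation collapse into a single call. This sketch assumes the bedrock-agent-runtime client and its retrieve_and_generate operation, with a hypothetical knowledge base ID and an illustrative model ARN, so treat the exact shapes as approximate.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "Which of our hiking boots are rated for sub-zero temperatures?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",   # hypothetical Knowledge Base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1",
        },
    },
)

# The service retrieves relevant chunks from the S3-backed data source,
# augments the prompt, and returns a generated answer.
print(response["output"]["text"])
```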
Now, the ability to customize
these models with your data is incredibly useful, but you can also extend
the power of these models to execute business tasks, like booking travel
or processing insurance claims. To do this, developers perform
several resource-intensive steps to fulfill a user request,
like defining the instructions and orchestrating
or configuring the models to access your data sources. That's why yesterday, Adam announced
the GA of Agents for Amazon Bedrock, a capability that enables
GenAI applications to execute complex tasks
by dynamically invoking these APIs. Bedrock makes it super easy
to create these fully managed agents that connect your internal systems and APIs on your behalf
in just a few steps. Now that we have talked
through different ways to customize your model and remove
the heavy lifting with agents, let me walk through
a hypothetical scenario on how to leverage GenAI capabilities for a task that many of us
are familiar with: DIY. How many of you have a home
improvement project on your to-do list? I see a few hands. My wife and I did a lot of work
in our basement this summer, and believe me, getting that work
done was a full-time job. Any new DIY project requires
multiple complex steps. Oftentimes, one of the hardest
things about the project is just figuring out
how to get started. To help with these challenges, we have built a hypothetical
DIY business, called Rad DIY, and this is powered
by a GenAI-powered assistant with Claude 2 on Amazon Bedrock. This assistant is designed to remove
the complexities of a DIY project and provide customers with accurate
and easy-to-follow steps. Let's see how it works. Nina is an ambitious DIYer, who wants to replace
her bathroom vanity and decides to use
the app for her project. She can use natural language
to ask the assistant about any type of project and receive
a list of detailed steps, materials, and tools,
along with any necessary permits. The app also leverages customer inputs to generate images
of their project using the Titan Image Generator model. So, after a short interaction,
the app provides Nina with a few images for inspiration that she can further refine
through conversation and feedback. Once Nina selects the design
she likes, our app uses Multimodal Embeddings
to search its extensive inventory and retrieve all of
the products she will need. No multiple trips to the store
will be necessary. In addition, now, the app
uses the Cohere Command model to provide a summary of user reviews
for each reviewed product. This summarization helps
Nina decide if the products and the tools meet her
requirements and skill level. Finally, if Nina wants to find
a specific item for her vanity, like nautical bronze drawer handles, our app uses the Knowledge
Base feature in Bedrock to search inventory for products that meet her budget
and skill level and timeframe. Now that Nina has everything
she needs, all that is left for her to do
is start her project. I hope this hypothetical scenario
sparks some ideas for you on how you can use Bedrock to build GenAI applications
with your data to create new customer experiences. We make it easy
to get started on Bedrock, but some of our customers
also want hands-on support to get started with GenAI. That's why we offer the AWS
Generative AI Innovation Center, a program that pairs your team
AI/ML scientists and strategy experts to accelerate your GenAI journey. Since we announced it, this program
has been gaining incredible momentum. Many of our customers also told us
they want dedicated support to customize these foundational
models for their needs, which is why we are introducing
even more offerings through our Innovation Center. Today, I'm excited to announce
a new Innovation Center Custom Model Program
for Anthropic Claude. [applause] This program, available
early next year, will be incredibly powerful, because it will enable you to work
with our team of experts to customize these highly powerful
Claude models for your business
needs with your data. This includes everything
from scoping requirements to defining evaluation criteria to working with your
proprietary data for fine tuning. You can then securely access
and deploy your private models on Bedrock, which will be available
only in your VPC. However, customizing these
foundation models isn't the only way to build
innovative AI applications. As I mentioned earlier,
there may still be a need for certain companies
to build their own, and these customers need powerful
machine learning infrastructure. For instance, AWS has partnered
with NVIDIA for 13 years to deliver large scale,
high performance GPU solutions that are widely used
for deep learning workloads, and this week, we announced an expansion
of a strategic collaboration to deliver next generation
infrastructure, software, and services for generative AI. And to provide more choice
for our customers, we have invested in our own ML chips,
AWS Trainium and AWS Inferentia, to push the boundaries
on cost efficiency and performance. We also enable our customers
with best-in-class software tools in the software layer of the stack
with Amazon SageMaker. SageMaker makes it easy
for customers to build, train, and deploy ML models,
including these LLMs, with tools and workflows
for the entire ML lifecycle, right from data preparation
to model deployment. We have also invested
in providing efficient model training with distributed
training libraries and built-in tools
to improve model performance, and today, leading organizations like Stability.AI, AI21 Labs,
Hugging Face, and TII are training their foundational
models on Amazon SageMaker. But with all of these
investments in this area, training a foundation model
can still be incredibly challenging. First, customers need to acquire
large amounts of data, create and maintain a large
cluster of accelerators, write code to distribute model
training across a cluster, frequently inspect
and optimize the model, and manually remediate
any hardware issues. And all of these steps
require deep ML expertise. Let me dive into some of
these challenges to understand why it is so complex. Now, because of the massive size
of these foundation models and the datasets used for training, developers need to split
that data into chunks and load them into the individual
chips in a training cluster, a distributed cluster with hundreds
or even thousands of accelerators. This is a lot of work, because in order
to make efficient use of these compute
and network resources, the distribution needs to be tailored to the characteristics of the data,
your model architecture, as well as the underlying
hardware configurations. That means you have to write a lot
of code and optimize it frequently. In addition, customers need
to frequently pause and inspect the model performance, optimize the code if something
is not working right. To do this, they have to manually
take checkpoints of the model state, so that the training is able to restart
without any loss in progress. Finally, when any of these
thousands of accelerators in the cluster fails, the entire training process
is halted. To resolve this issue,
customers had to identify, isolate, repair,
and recover the faulty instance or change the configuration
of the entire cluster, further delaying the progress. We wanted to make it easier
for our customers to train these LLMs
without interruption or delays. That's why, today, I am thrilled
to announce the general availability
of SageMaker HyperPod. [applause] This one is a big deal, because it's a new distributed
training capability that can reduce model
training time by up to 40%. HyperPod is pre-configured with SageMaker's distributed
training libraries. This enables your data
and models to efficiently distribute across thousands of chips
in the cluster and process them in parallel. HyperPod helps customers
iteratively pause, inspect, and optimize these models, because it automatically
takes checkpoints frequently, and if a hardware failure occurs,
it detects the failure. It replaces the faulty instance and resumes the training
from the last saved checkpoint. With this new capability, customers will see
dramatic improvements, training models for weeks, if not months,
without any disruption.
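For reference, standing up a HyperPod cluster goes through SageMaker's CreateCluster API; the sketch below shows the general shape, but the instance types, counts, lifecycle script location, and role are hypothetical, and the parameter names should be checked against the current API reference.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# A HyperPod cluster is defined as groups of accelerator instances; SageMaker's
# distributed training libraries and automatic checkpoint/replace behavior then
# handle sharding the data and recovering from hardware failures.
sagemaker.create_cluster(
    ClusterName="llm-training-cluster",                    # hypothetical name
    InstanceGroups=[
        {
            "InstanceGroupName": "accelerator-workers",
            "InstanceType": "ml.p4d.24xlarge",             # illustrative instance type
            "InstanceCount": 16,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
        },
    ],
)
```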
But this is just one of the many innovations we announced for SageMaker this week. Today, we are announcing
a slew of new SageMaker features across inference,
training, and ML ops. [applause] I'd have to extend by an hour
if I had to cover all of it, but I'll just do a quick hit list. The new SageMaker Inference capabilities reduce model
deployment costs by 50% on average and improve latency by 20%. We also introduced new capabilities
in SageMaker Studio, like a new user experience. And all of these updates
help customers build, train, and deploy these new large
language models even easier. I encourage you to check out
Bratin Saha's Innovation session later today to learn more about
these innovations, and now, I'd like to introduce
one of the customers who is leveraging some of
these latest SageMaker innovations, training
and deploying their own models. Please join me in
welcoming Aravind Srinivas, CEO and Co-Founder
of Perplexity, to the stage. [music playing] At Perplexity, we strive to be the world's leading
conversational answer engine that directly answers
your questions with references provided to you
in the form of citations. Our company is re-imagining
the future of search by trying to take us
from ten blue links to personalized answers
that cut through the noise and get to exactly what you want. Perplexity's Copilot is
an interactive search companion. As you see, it starts with
a general question that you had in your mind,
digs deeper to clarify your needs, and after a few interactions
with you, gives you a great answer. Ours is the first global
publicly deployed example of generative user interfaces that reduces the need
for prompt engineering. This is such a complex product
to run and a hard problem to solve. Hence, why we decided
to go all in on AWS. We started off by testing
frontier models, like Anthropic's Claude 2
on AWS Bedrock. Bedrock provides cutting edge
inference for these frontier models. This helped us to quickly test
and deploy Claude 2 to improve our general question
answering capabilities by providing more
natural-sounding answers. Claude 2 has also helped
inject new capabilities into Perplexity's product, like the ability to upload
multiple large files and ask questions
about their contents, helping us to be the leading research
assistant there is in the market, but Perplexity is not just a wrapper on top of closed proprietary
large language model APIs. Instead, we orchestrate
several different models in one single product, including those that
we've trained ourselves. We built on top of open-source models
like Llama 2 and Mistral and fine-tuned them
to be accurate and live, with no knowledge cutoff, by grounding them with web
search data using cutting edge RAG. This is when we started working
with the AWS Startups team on an Amazon SageMaker HyperPod POC. SageMaker HyperPod makes it
easier to debug large model training and handle distributed
capacity efficiently. We obtained Amazon EC2
p4de capacity for training. This enabled us to fine tune
state-of-the-art open-source models, like Llama 2 and Mistral, and once we moved to HyperPod and
enabled AWS Elastic Fabric Adapter, we observed a significant increase
in the training throughput, by a factor of 2X, but it's not just training
where we've benefited from AWS. AWS has also helped us
with customized service to support our inferencing needs,
especially on p4d and p5 instances, and this helped us to build
top of the market APIs for our open-source models
and our in-house models that have been fine-tuned
for helpfulness and accuracy. So, today, we are excited to announce
the general availability of all these models
in the form of APIs, including the first of its kind live LLM APIs,
that have no knowledge cutoff and are plugged
into our search index, all fully hosted on AWS. [applause/cheering] Thank you. [applause] Generative AI is still
in its nascent stages, and we still think
we are at the beginning of what's going to be a glorious
revolution for all of us, where the biggest winners
are going to be you all, the consumers of the technology,
where you get plenty of choices, great new product experiences,
and competitive pricing. Perplexity is closing the research to decision
to action loop even further, and we plan to get
all our users to a point where you all take this
for granted in the years to come. This is disruption
and innovation at its prime. Perplexity strives to be the earth's
most knowledge-centric company, and we are glad here
to be working with AWS, so that no one here
ever needs to go back to the ten blue link search engine.
Thank you. [applause] [music playing] Wow! Thanks to Aravind for sharing how Perplexity is re-imagining search
with new model innovations on AWS. As you saw from Perplexity
and other examples so far, it is critical that you are able
to store, organize, and access high-quality data
to fuel your GenAI apps, whether you're customizing
your foundation model or building your own. To get high quality data for GenAI, you will need
a strong data foundation, but developing a strong
data strategy is not new. In fact, many of you already have
made strategic investments in this area, from databases that deliver data
to your applications to BI tools that support fast
data-driven decision making. GenAI makes this data
foundation even more critical. So, what should your data
foundation include, and how does it evolve to meet
the needs of generative AI? Across all types of use cases, we have found that
a strong data foundation includes a comprehensive,
integrated set of services, as well as tools to govern your data
across the end-to-end data workflow. First, you will need access
to a comprehensive set of services that account for the scale, volume, and the type of use cases
that you deal with in data. This is where AWS offers
a broad set of tools that enable you to store, organize,
and access various types of data via the broadest selection
of database services, including relational databases,
like Amazon Aurora and Amazon RDS. We also offer eight
non-relational databases, including Amazon DynamoDB, and places to store and query data
for analytics, including AI and ML, on top of
S3-based data lakes, and Amazon Redshift,
our data warehouse that provides up to six times better price performance
than any other cloud data warehouse. You also need tools to act
on your data. We have already discussed tools
for ML and GenAI, but you also need services
to deliver insights from your data, like Amazon QuickSight,
our unified BI service, and you need to catalog
and govern your data with services that help you centralize
access controls. Across all of these areas, AWS provides you
the right tool for the job, so you don't have to compromise
on performance, cost, or results, and we have carried this philosophy
to your GenAI needs as well, including the tools you use
for storing, retrieving, indexing, and searching
these vector embeddings. As our customers use vectors
for GenAI applications, they told us they want to use them
in their existing databases, so that they can eliminate
the learning curve associated with picking up
a new programming paradigm: new tools, APIs, and SDKs. They also feel more confident
that their existing databases, where they already know how they work,
how they scale, and how available they are, can evolve to meet the needs
of vector workloads, and more importantly,
when your vectors and business data are stored in the same place,
your applications will run faster, and there is no data sync
or data movement to worry about. For all of these reasons,
we have heavily invested in adding vector capabilities to some
of our most popular data sources, including Amazon Aurora,
Amazon RDS, and OpenSearch Service. And earlier this year, we announced vector engine support
for OpenSearch Serverless, and since we announced in preview, this has been rapidly gaining
in popularity with our customers. They're loving this truly
serverless option, because it removes the need
to manage servers for ingestion of your data
and querying of your data, and today, I'm pleased to announce
our Vector Engine for OpenSearch Serverless
is generally available. [applause] Now, we are just
getting started there. Not only have we added vector
support for these services, but we are also invested
in accelerating the performance of existing ones. For example, Aurora Optimized Reads
can now support billions of vectors with 20X improvement in queries
per second performance with single-digit
millisecond latency, and we are continuing to invest
in ongoing performance improvements in these areas.
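To give a sense of what keeping vectors next to your business data looks like, here is a small sketch using the pgvector extension available on Aurora PostgreSQL and RDS for PostgreSQL; the table, column names, and connection details are hypothetical.

```python
import psycopg2  # standard PostgreSQL driver; connection details below are hypothetical

conn = psycopg2.connect(
    host="my-aurora-cluster.cluster-xyz.us-east-1.rds.amazonaws.com",
    dbname="catalog", user="app", password="example-password",
)
cur = conn.cursor()

# pgvector keeps embeddings right next to the business data they describe.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id        bigserial PRIMARY KEY,
        name      text,
        price     numeric,
        embedding vector(1536)   -- dimension matches your embeddings model
    );
""")
conn.commit()

# Nearest-neighbor search: products whose embeddings are closest to a query vector.
query_vector = [0.0] * 1536      # stand-in; normally produced by an embeddings model
query_text = "[" + ",".join(str(x) for x in query_vector) + "]"
cur.execute(
    "SELECT name, price FROM products ORDER BY embedding <-> %s::vector LIMIT 5;",
    (query_text,),
)
print(cur.fetchall())
```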
in document databases or in key value stores,
like DynamoDB. That's why today, I'm pleased
to announce vector capabilities in two
of our most popular databases: DocumentDB and DynamoDB. [applause] For use cases that need
high schema flexibility or JSON data, DocumentDB customers can now
store their source data and their vector data
together in the same databases, and DynamoDB customers
can access vector capabilities through a zero-ETL
integration with Amazon OpenSearch, but we didn't want to stop there. In 2021, we added one more
purpose-built data store, Amazon MemoryDB for Redis,
our Redis-compatible, durable, in-memory database service
for ultrafast performance. Our customers asked
for an in-memory vector database that provides
millisecond response time, even at the highest recall
and the highest throughput. This is really difficult
to accomplish, because there is an inherent tradeoff between speed versus relevance
of query results and throughput. So, today, I'm excited to announce
Vector search is now available in preview for MemoryDB for Redis. [applause] MemoryDB customers get
ultrafast vector search with high
throughput and concurrency, and they can store
millions of vectors and provide single-digit
millisecond response time, even with tens of thousands
of queries per second, at greater than 99% recall. This kind of throughput and latency is really critical for use cases
like fraud detection and real-time chat bots,
where every second counts. For example, a bank needs to detect
in real-time to mitigate losses, and customers want immediate
response from chatbots. And finally, we also know
that many of our customers are leveraging graphs
for storing, traversing, and analyzing interconnected data. For example, a financial company
uses graph data to correlate historical account
transactions for fraud detection. Since both graph analytics
and vectors are all about uncovering the hidden relationships
across our data, we thought to ourselves,
what if we combined vector search with the ability to analyze
massive amounts of graph data in just seconds. And today, we are doing just that. I'm very, very excited to announce the general availability
of Neptune Analytics. [applause] An analytics database engine
for Neptune, which makes it easier and faster
for data scientists and app developers to quickly analyze
large amounts of graph data. Customers can perform graph analytics
to find insights in graphs up to 80X faster by analyzing
their existing Neptune graph data or their data lakes on S3. Neptune Analytics makes it
easier for you to discover relationships
in your graph with vector search by storing
your graph and vector data together. In addition to using this
relationship information directly, you can also use it to augment your foundation model
prompts through RAG. Snap, an instant messaging app with more than 750 million
monthly active users is using Neptune Analytics
to perform graph analytics on billions of connections
in just seconds to enable friend recommendations
in near-real-time. We are thrilled to add
all these vector capabilities across our portfolio
to give our customers even more flexibility as they build
their GenAI applications. We expect our innovation
velocity in this area to rapidly accelerate,
as new use cases emerge and flourish. Now, let's look at the second pillar
of the strong data foundation. You will want to make sure
that your data is integrated across data silos, so that you get a complete
view of your business. You will want to ensure your data
is readily accessible for your GenAI apps. When you break down the data
silos across your databases, data lakes, data warehouses,
and third-party data sources, you will be able to create better
experiences for your customers. We know that building and managing
these ETL data pipelines has been a traditional pain
point for our customers, and one of the ways
we are helping our customers create a more integrated
data foundation is our ongoing commitment
to a zero-ETL future. Since we announced
this vision last year, we have heavily invested
in building seamless integrations across our data stores, like our fully managed
zero-ETL integration between Aurora MySQL and Redshift,
which we announced earlier this year. This integration makes it easy
to take advantage of near-real-time analytics, even when millions of transactions
happen in a minute in Aurora,
you can analyze them in Redshift, and yesterday, we announced even more
zero-ETL integrations to Redshift, including Aurora Postgres,
RDS for MySQL, and DynamoDB. We also announced a new zero-ETL
integration with DynamoDB and OpenSearch Service, enabling you to search
and query large amounts of operational data
in near-real-time. Today, tens of thousands of customers use Amazon OpenSearch
to power real-time search, monitoring, and analysis of business
and operational data, but if you look at it, most customers do not store
all of their data in OpenSearch and often prefer to pull in data from a variety of logs
in data sources like S3, which provides a low-cost
and flexible option for storing their data. As a result, they have to create
ETL pipelines to OpenSearch to query that data and get actionable results
as fast as possible. For example, your operations team
might want to query the last 90 minutes of data
to troubleshoot an ongoing performance
issue in your application. However, if your ETL jobs
run nightly, the data your teams need
right now won't be available. This slows down your ability to act
and creates risk that a smaller issue might end up becoming
a really big one. That's why, today, I am delighted
to announce a new zero-ETL integration between
Amazon OpenSearch and S3. [applause] This one is a big one
for observability, because it enables you
to seamlessly search, analyze, and visualize all your log data
in a single place without creating any ETL pipelines. To get started, you just use
OpenSearch Console to set up a data connection
and run your queries. It's really that simple. With new indexing capabilities
and materialized views, you can accelerate your queries
and dashboarding capabilities. Teams can use a single dashboard
to investigate observability and security incidents,
eliminating the overhead from managing multiple tools. You can also perform complex
queries for forensic analysis and correlate data
across multiple sources, helping you protect against service
downtime and security events. With all of these zero-ETL
integrations, we are making it even easier for you to get relevant data
for your applications, including GenAI applications. Finally, your data foundation
needs to be secure and governed to ensure
that the data you use throughout the development
of your application stays high quality and compliant. To help with this, we announced
Amazon DataZone last year. DataZone is a data management service that helps catalog, discover, share, and govern your data
across your organization. It makes it really easy
for employees to discover your data and collaborate on it
to drive insights for your business. DataZone is used by companies
like Guardant Health to help the developers
focus more on their mission to building cancer solutions instead of worrying about building
these governance platforms. Now, while DataZone helps
you share data in a governed way
within your organizations, many customers want to securely share
that data with their select partners. For that, we have AWS Clean Rooms, which makes it easier
for customers to analyze data with their business partners without having to share
their whole dataset. This capability enables companies
to safely analyze collective data and generate insights that couldn't
be produced on their own, but customers told us that they want
to do more than just run analytics. They want to be able
to run machine learning to get predictive insights
in their Clean Rooms. One of the ways they want
to accomplish this is through what is known
as lookalike models, which take a small sample
of customer records to generate an expanded set
of similar records with a partner, all the while protecting your data. For example, imagine an airline that can take signals
about its loyal customers, collaborate with an online
booking service, and offer promotions to new
but very similar users. So far, this process has been
extremely difficult to accomplish without one party
sharing data with the other. That's why, today, I'm introducing
the preview of AWS Clean Rooms ML. [applause] This is a first-of-its-kind
capability to apply ML models with your partners
without sharing the underlying data. With Clean Rooms ML, you can train
a private lookalike model across your collective data,
keep control of your models, and delete them when you're done. These models can be easily applied
in just a few steps, saving you months of development
time and resources. Clean Rooms ML also offers
intuitive controls to tune these model outputs based
on your business needs. Lookalike modeling is available now, but this is just the first of many
models we will be introducing soon. We also plan to introduce modeling
for healthcare in the coming months too. Now, I know that our next speaker
has a lot of experience using their data to build new
innovations for our customers, including applications
that are powered by GenAI. Please join me in welcoming Rob
Francis, SVP and CTO for Booking.com. [music playing] Thank you, Swami. It's great to be here
with all of you. When I first started thinking
about what I would want to share with this group today,
I was reflecting a little bit on the arc of technology
over my own career, and I was thinking about
my first attempts to have a kind of human-like
interaction with technology. It reminded me of Eliza. If anybody remembers this one, this was developed by Joseph Weizenbaum
in the mid-60s. For the Emacs users in the group, I thought to myself, why not just fire up M-x doctor and ask Eliza what I should talk about today? So, as you can see,
it's fairly frustrating. For those of you who are not
familiar with Eliza, it basically just put
the question back on you, and you really got nowhere at all, but I'm very excited
about what's possible today for our customers at booking.com
with the emergence of generative AI, but first, let me tell you
a little something about booking.com. If you're familiar with us,
you probably think of us from a travel perspective,
and it's true. We have accommodations, flights,
rental cars, attractions all over the globe,
but we're a two-sided marketplace, and we have partners
all over the globe that help make
our connected trip possible, but let me give you a sense
of some of our scale. In the accommodation space alone, we have over 28 million listings
of places to stay. If you want to book a flight, you can
choose flights in 54 countries. You might need a rental car at your destination. You can choose from over 52,000
rental car locations across the globe, and if you'd like
to find something to do, you can book an attraction
before your trip or right in the app. As you can imagine, that presents
a lot of data challenges for us: we manage over 150 petabytes of data, and several years ago, we recognized that we needed some help. So, we partnered with Amazon
and AWS to help us tackle some of these challenges,
and I'm happy to say, it worked. Our data scientists are thrilled
to see that the number of jobs they can train
concurrently has gone up by 3X. They have seen a 2X decrease in the number of failed jobs, which were largely due to limitations of our own infrastructure, but their favorite is certainly the 5X reduction in the time it takes to train their jobs,
and I should note that some of these jobs are trained
on over three billion examples, but we didn't partner with AWS
just for our data strategy. We wanted a partner
who was going to help us with the emergence of new technology. So, when generative AI
really hit this year, we set out to build
the AI trip planner, making it much easier
to book a trip with Booking.com in a conversational manner. Let me show you how it works. First, I'd like to point out
that we are big fans of open source at Booking.com and we felt that the Llama 2 model
was the perfect one for us to implement our intent detection model. So, we started there, hosting Llama 2 as a SageMaker endpoint.
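Booking.com did not walk through their exact setup, but as a rough illustration, here is a minimal sketch of hosting an open-weights Llama 2 model as a SageMaker real-time endpoint with the SageMaker Python SDK's JumpStart interface and querying it for a simple intent check. The model ID, instance type, and payload format are assumptions.

    # A minimal sketch (not Booking.com's actual implementation) of hosting Llama 2
    # as a SageMaker endpoint and querying it for intent detection. The JumpStart
    # model ID, instance type, and payload format below are assumptions.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.2xlarge",
        accept_eula=True,  # Llama 2 requires accepting Meta's license at deploy time
    )

    prompt = (
        "Classify the user's intent as TRAVEL or NOT_TRAVEL.\n"
        "User: I'm going to a conference in Las Vegas. Where should I stay?\n"
        "Intent:"
    )
    response = predictor.predict({"inputs": prompt, "parameters": {"max_new_tokens": 5}})
    print(response)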
Now, if you notice on the left-hand side, I entered into our AI trip planner what I would just have
as a conversation, I'm going to a conference
in Las Vegas. I don't really like to gamble,
but I do like good food. Where should I stay? Well, the first thing you have
to realize is that we have to do a little bit of moderation. First, we have to make sure
that the conversation is even related to travel. That isn't always the case,
but more importantly, one of the things that we learned
is that our customers tend to put their personally
identifiable information in the trip planner itself, and we always want to protect
our customers' privacy. So, we want to make sure that we're stripping those sorts of things out first, but there's more to it than that.
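Booking.com didn't name the service behind this scrubbing step, but purely as an illustration, one common way to do it on AWS is Amazon Comprehend's PII detection; the choice of Comprehend and the redaction style below are assumptions, not their stated implementation.

    # One illustrative way to strip PII before the text reaches the model; using
    # Amazon Comprehend here is an assumption, not Booking.com's stated approach.
    import boto3

    comprehend = boto3.client("comprehend", region_name="us-east-1")

    def redact_pii(text: str) -> str:
        """Replace each detected PII span with its entity type, e.g. [EMAIL]."""
        entities = comprehend.detect_pii_entities(Text=text, LanguageCode="en")["Entities"]
        # Work from the end of the string so earlier offsets stay valid as we edit.
        for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
            text = text[: ent["BeginOffset"]] + f"[{ent['Type']}]" + text[ent["EndOffset"]:]
        return text

    print(redact_pii("I'm Jane Doe, jane@example.com, flying to Las Vegas on the 28th."))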
One of the things that I tend to talk about when I talk about Booking.com with our customers is
our great selection, our flexibility, and great price, but they always
tell us that they love our reviews. There's just years of data there that really helps them
make a good decision. So, our RAG implementation,
leveraging AWS technology, pulls in our review data and helps make it easier
for our travelers to make a decision. Lastly, we ask the LLM to then
populate a JSON object that speaks directly
to our recommendation engine, also powered by machine learning, and give them the best choices possible in a nice little carousel that they can swipe through and book right in the app.
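As a rough sketch of the pipeline just described, retrieve relevant review snippets, ground the prompt in them, and ask the model for a JSON object the recommendation engine can consume. The canned retrieval, prompt wording, and JSON schema below are illustrative assumptions rather than Booking.com's actual design.

    # Illustrative sketch of the RAG + structured-output step; the canned retrieval,
    # prompt wording, and JSON schema are assumptions, not Booking.com's actual design.
    import json

    def retrieve_reviews(query: str, top_k: int = 3) -> list[str]:
        # A real system would run a vector or keyword search over years of review data;
        # canned snippets stand in for that here.
        return [
            "Great food hall nearby, very walkable, no casino floor.",
            "Quiet hotel off the Strip with an excellent restaurant.",
            "Loved the rooftop dining; easy ride to the convention center.",
        ][:top_k]

    def build_prompt(question: str) -> str:
        context = "\n".join(f"- {r}" for r in retrieve_reviews(question))
        return (
            "Using only the review snippets below, answer as JSON shaped like "
            '{"destination": str, "recommendations": [{"hotel": str, "reason": str}]}.\n'
            f"Reviews:\n{context}\n\nQuestion: {question}\nJSON:"
        )

    def call_llm(prompt: str) -> str:
        # Stand-in for the SageMaker-hosted model shown earlier.
        return '{"destination": "Las Vegas", "recommendations": []}'

    payload = json.loads(call_llm(build_prompt("I don't gamble but love good food. Where should I stay?")))
    # `payload` is what gets handed to the downstream recommendation engine.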
And we're really just getting started. As you can see, we're already
making use of SageMaker. We're working closely with
the Bedrock teams on a couple of exciting new things
coming up, and we're also working
with the Titan teams, but I thought to myself for today,
why not cover the whole arc, and maybe I should ask Titan
what I should have said today. So, let's see what Titan had to say. Good morning, everyone. Thank you, Swami,
for the introduction. One, travel booking is a large market
with lots of data, but until now, it has been
very difficult to use that data to personalize
the booking experience. Very true. Two, generative AI is
a new technology that can take all of that
booking data, learn from it, and then generate new content that is
personalized to each individual user. You saw that. Three, at Booking.com, we use GenAI to create personalized
hotel recommendations that are tailored to each user's
unique needs and preferences. Want to book a trip? Visit Booking.com. Wow, what a difference! How about that
for an arc of technology? Thank you very much. [music playing] Thanks, Rob. This is a great example
of how you can leverage data to build GenAI apps that provide
a truly customized user experience. So, it's clear that data is the fuel
for GenAI, but for this to be a true symbiosis, GenAI must also benefit
our data foundation. While we typically think about ML
and AI as an outcome of data, we can also use it to transform
the way we manage data. This means AI can actually enhance
the data foundation that fuels it. AWS has infused AI and ML
across many of our data services to remove the heavy lifting
from data management and analytics. However, with all the strides
we have made, managing and getting value out
of data can still be challenging. Some of our customers are even asking if they can leapfrog
their data strategy with GenAI. The truth is: while GenAI
still needs a strong foundation, we can also use this technology to address some of the big
challenges in data management, like making data easier to use, making it more intuitive
to work with, and making it more accessible. One area where we can apply AI
is optimizing the performance of the places where you store
and query your data, like your data warehouses. Data warehouse administrators
need to manage multiple dimensions, like data variability,
the number of concurrent users, and the query complexity,
all the while, having to balance price
and performance. These multivariate optimizations are extremely hard for humans, but machine learning algorithms are really incredible, and they excel at them. That's why, earlier this week,
we launched AI-driven scaling and optimizations
for Redshift Serverless, enabling you to proactively scale on multiple dimensions
at the same time, all the while,
managing price and performance. The end result is that queries
just run faster with the optimal
price/performance tradeoff, and customers
are seeing big benefits. These AI-driven scaling
and optimizations will help Honda deliver
better price performance and get actionable insights
from millions of vehicle datapoints that are loaded
into the data warehouse without manual intervention. In addition to optimizing
the data warehouse, we know there are other ways
we can support data management with AI, like with Amazon Q. As I mentioned earlier, Amazon Q is a new type
of GenAI-powered assistant that is tailored to your business. Q supports virtually every area
of your business by connecting to your data
for context on your role, internal processes,
and governance policies. You can ask Q in natural language
to receive actionable information that removes the heavy lifting
from many repetitive tasks. For instance, when we ran
an internal poll within Amazon, asking developers,
how they spend their time, they told us that a large portion
of the day is focused on things like
looking up documentation, building and testing new features, maintenance and upgrades,
and troubleshooting. We know Q could make these tasks
easier, which is why it's now in all the places you build software
and work with AWS. And because Q knows you
and your business, it also helps you manage your data and take the heavy lifting
out of common data-related tasks, like running data queries. We know that GenAI is very good at
translating natural language to code. So, we thought to ourselves,
why not leverage Amazon Q to support the language our customers use to query their data warehouses every day: SQL? To run a data query, you typically
have to write SQL code to load and analyze petabytes of
structured and semi-structured data. This is where we offer tools
like Redshift Query Editor for builders who write SQL, including detailed error detection during table creation, autocomplete suggestions, and syntax validation. But querying workloads can still present challenges. What seems like a really
simple request, such as identifying which venues
sold the most tickets in a given timeframe or location, actually involves a complex SQL query
that takes a lot of trial and error. To remove the heavy lifting
from data querying, I'm excited to announce
the preview of Amazon Q in Redshift. [applause] If you need help creating custom SQL, Q can turn your natural language
prompts into customized recommendations. It's available natively as part
of the Redshift Query Editor. With Q, you can ask in plain
English something like, "Which three venues sold the most tickets?" The underlying model then
analyzes the schema and produces a SQL
query recommendation in just seconds. We can then add it to our notebook and run the query to test it out.
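To make this concrete, the SQL behind a question like that could look roughly like the statement below, shown here being run through the Redshift Data API. The table and column names follow the familiar ticket-sales sample schema and, like the query itself, are assumptions rather than Q's literal output.

    # Illustrative only: roughly the SQL a prompt like "which three venues sold the
    # most tickets?" might yield, executed via the Redshift Data API. Table and
    # column names (and the query itself) are assumptions, not Q's literal output.
    import boto3

    sql = """
        SELECT v.venuename, SUM(s.qtysold) AS tickets_sold
        FROM sales s
        JOIN event e ON s.eventid = e.eventid
        JOIN venue v ON e.venueid = v.venueid
        GROUP BY v.venuename
        ORDER BY tickets_sold DESC
        LIMIT 3;
    """

    client = boto3.client("redshift-data", region_name="us-east-1")
    resp = client.execute_statement(
        WorkgroupName="demo-workgroup",  # assumed Redshift Serverless workgroup name
        Database="dev",
        Sql=sql,
    )
    print(resp["Id"])  # poll get_statement_result with this ID to fetch the rows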
Here, it knows that the information on ticket sales can be found in the sales table, but it also knows to search for
the event table to find the venue. We can then ask Q to find out
which event types were the most popular
at those venues. However, we can see that
these results are based on the total number of tickets,
which is not what we want here. So, we can quickly use Q
to course correct and ask it to retrieve the data
based on the number of total events, and for even more accurate
and relevant recommendations, you can enable query
history access for specific users without compromising
your data privacy. In addition to data querying, you can also solve some of the more
painful data management jobs, like building your data pipelines. While our zero-ETL integrations
can eliminate data pipelines between many of your data sources, we recognize that
many of our customers still have to write custom ETL jobs that they need to create
and maintain constantly. Q can simplify this process
for our customers too. Today, I'm pleased to announce that, coming soon,
you will be able to use Q for creating data integration
pipelines using natural language. [applause] With this new integration, you can build data
integration jobs faster. You can troubleshoot them
with a simple chat interface and get instant data
integration help. Now, let's take a quick example. You can ask Q something like:
read this data from S3, drop the null records,
and write the results to Redshift, and it will return an end-to-end data
integration job to perform this action.
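For a sense of what such a generated job might contain, here is a hypothetical AWS Glue (PySpark) script for that same prompt; the bucket, connection, and table names are placeholders, and the job Q actually produces may be structured differently.

    # Hypothetical sketch of the kind of AWS Glue (PySpark) job such a prompt could
    # generate; bucket names, the Glue connection, and table names are placeholders.
    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from pyspark.context import SparkContext

    glue_ctx = GlueContext(SparkContext.getOrCreate())

    # Read the source data from S3.
    source = glue_ctx.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": ["s3://example-bucket/input/"]},
        format="json",
    )

    # Drop records that contain null values.
    cleaned = DynamicFrame.fromDF(source.toDF().dropna(), glue_ctx, "cleaned")

    # Write the results to a Redshift table through a pre-created Glue connection.
    glue_ctx.write_dynamic_frame.from_jdbc_conf(
        frame=cleaned,
        catalog_connection="redshift-connection",
        connection_options={"dbtable": "public.cleaned_records", "database": "dev"},
        redshift_tmp_dir="s3://example-bucket/tmp/",
    )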
Think about how powerful this is. Under the hood, this integration uses Agents for Amazon Bedrock to break down the prompt
into a specific set of tasks and then combine the results
to build these integration jobs. Q delivers an intuitive interface
for integrating your data without you having any
prior knowledge of AWS Glue. Now that we have covered
all the elements of a strong data foundation, let's see how they can come together
to spur net new innovation. For this, please welcome
Shannon Kalisky, Senior Product Manager at AWS,
to the stage. [music playing] Thank you, Swami. As humans, we are wonderfully
curious creatures, and that desire to learn
and share has given rise to more data than we've ever had before, and every 'aha' moment, every experience
becomes a story. So, how ironic is it that two
of the toughest challenges we face in our day-to-day jobs are just getting the data we need
where and how we need it, and then using that data
to tell a story? It should be simple. Most of us took flights
to be here today, and some of you may have
encountered a delay or a cancellation along the way, and when that happens,
ugh, it is miserable. So, what if we could change that? Let's imagine we want to create
a new feature, something that knows who you are
and the situation that you're in and can allow you to easily rebook
a flight without stress, but first,
we have to wrangle the data, and there is a lot of it
in a lot of formats, and it is all over the place. In S3, we have data on baggage
information, where every time the status of a bag
changes, a new file's created. In Redshift, we have data
on passenger details, flight schedules,
and aircraft availability. Then in Aurora, we have information
on payment methods and customer preferences,
and then we have real-time data, like weather,
coming in as Amazon Kinesis streams. To bring all of that data together
with a traditional ETL pipeline could take weeks, if not months. So, instead, we'll use the zero-ETL
integrations across AWS to bring all of that data together
without building a pipeline and without writing code. The first thing we'll do is to create
a Redshift Serverless data warehouse, and this will give all of our data
a place to land. Through zero-ETL, our data on S3 is automatically replicated
into Redshift. So, the next time we get one of those
baggage updates, we will see it in Redshift instantly. Now, some of our data was already
in Redshift, and for that, we can use data sharing
to share across the warehouses. There's also a zero-ETL integration
between Aurora and Redshift, and that will allow us to get
the data from Aurora into Redshift the minute it is written, and finally, we can use
Redshift Streaming Ingestion to bring in the real-time data streams from Kinesis, and with that, we have all the data we need, and our new feature can take flight.
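Of the four integrations just described, the streaming piece is the most DDL-like, so here is a rough sketch of the Redshift Streaming Ingestion setup for a Kinesis stream, submitted through the Redshift Data API. The schema, stream, role ARN, and workgroup names are placeholders, not the demo's actual configuration.

    # Rough sketch of Redshift Streaming Ingestion from Kinesis; all names and the
    # IAM role ARN are placeholders. Statements are submitted asynchronously, so in
    # practice you would wait on each one (describe_statement) before the next.
    import boto3

    statements = [
        # Map the Kinesis stream into Redshift through an external schema.
        """
        CREATE EXTERNAL SCHEMA kinesis_weather
        FROM KINESIS
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftStreamingRole';
        """,
        # Materialize the stream; each refresh pulls in newly arrived records.
        """
        CREATE MATERIALIZED VIEW weather_updates AUTO REFRESH YES AS
        SELECT approximate_arrival_timestamp,
               JSON_PARSE(FROM_VARBYTE(kinesis_data, 'utf-8')) AS payload
        FROM kinesis_weather."weather-events";
        """,
    ]

    client = boto3.client("redshift-data", region_name="us-east-1")
    for sql in statements:
        client.execute_statement(WorkgroupName="rebooking-demo", Database="dev", Sql=sql)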
Now, we need to measure success, and to do that, we will use Amazon Q in QuickSight, where key metrics
and critical data come together. Using Q, I can create an executive
summary of this dashboard, which will show me
the most important insights. For example, I see that, since we
have created our new feature, the time to rebook a flight
has decreased dramatically. It's pretty amazing, and I want
leadership to see the impact that that feature is having. So, I'll use Q
to create a data story. All I do is select the format
and then tell Q what I want to cover, select the visuals,
and then build, and in seconds, I have a beautiful story
based on my actual business data. It covers the problem and the impact
to customers, just like I asked, and the best part is that
it is completely customizable. For example, I can take
this paragraph on recommendations, and I can use Q
to transform it into bullets, so that those key takeaways
are more obvious, and when I'm ready, I can securely share this story
to others throughout my organization, so that we can all use it to drive
towards better decision making, and just like that, we've taken down
two of the toughest challenges we face in our day-to-day jobs. I hope that you all stay curious, and that these tools help you
the next time you are knee-deep in data
or struggling with writer's block. Thank you. [applause/music playing] Thank you, Shannon. With a strong foundation,
you can quickly connect to your data, make strategic decisions,
and build new experiences that improve customer loyalty
and satisfaction. So, we have covered how data supports
GenAI, and how GenAI supports data, but how do we as humans
fit into all of this? Recently, I had the honor of
presenting with the CEO of Hurone AI at the UN General Assembly event
in New York, where we shared how GenAI can make
an enormous impact in the world. Dr. Kingsley built this company
with one mission-critical belief: that where you live
should not determine if you live or die of cancer, whether you are in Latin
America or in Africa. Across Sub-Saharan Africa, there is roughly one oncologist
for every 3,200 cancer patients. By comparison, the U.S. ratio
is about one oncologist for every 300 patients, and in particular,
the country of Rwanda, with a population of about 13 and a half million people, has fewer than 15 oncologists. That means Rwandan patients make
arduous commutes to medical facilities and often wait until symptoms
are dangerously severe before even reporting them. To combat this problem,
Hurone AI created applications that help make the best
possible cancer care accessible to everyone regardless
of their location, and with innovations they are
building on Bedrock, Hurone AI is revolutionizing cancer detection and diagnosis predictions to improve treatment access
for patients in Kenya, Nigeria, and Rwanda
that need it the most. They're also advancing equitable
biopharma research and development by filling a critical cancer data gap
for underrepresented populations. Now, I'd like to pause and take
a brief moment to honor Dr. Kingsley as he's here today with us. [applause/cheering] Amazing. Thank you, Dr. Kingsley, for all of the amazing work you do. I love this story,
because it showcases how GenAI can augment our human abilities
to solve many, many critical problems. GenAI will undoubtedly
accelerate our productivity. That's why AWS is continuing
to invest in tools that will completely reinvent the way you work with
new applications and services with GenAI capabilities built inside. This includes tools
like Amazon CodeWhisperer, our AI-powered coding companion. CodeWhisperer is trained
on billions of lines of code that help you build applications
faster with code recommendations in real-time, in your IDE. And earlier this year, we announced
CodeWhisperer customization, which uses your internal
code base to generate recommendations that are more relevant
to your business. You can securely connect
to your private repositories, and with just a few clicks,
you're good to go. Your customizations are isolated
to protect your valuable IP, and only the developers
with the designated access will be able to use them. While customers are rapidly
adopting CodeWhisperer to improve their productivity,
it's not the only way we are helping to get more done
with the power of GenAI. As I mentioned earlier,
Q can help you remove the heavy lifting from common tasks, irrespective of what
your job function is. This means Q can
accelerate productivity, whether you are
building applications, creating financial reports,
presenting data to your exec team, or working in a call center. And just like how we are building Q
to support our customers, our customers are also infusing GenAI
capabilities into their own products to power assistive experiences
for their customers. Now, let's hear from one
such customer, Toyota, who is doing just that,
with a short video. At Toyota, safety is embedded
into everything we do. I often like to tell our engineers that we may not be
doctors, nurses, or firefighters, but we have the opportunity
to help save lives. People don't tend to think
of data and safety together, but ultimately, having the right data
at the right time can help us determine if a vehicle
has been in a collision, and it allows us to help
get emergency responders to a customer's vehicle
in the most efficient way possible. To achieve this, we pull the data from hundreds of sensors
in the vehicle, from millions of vehicles
globally on our platform, which equates to petabytes of data. So, the challenge was:
how do we process all of that data in real-time
when every second counts? We realized, wow, this is a really
interesting engineering problem to solve, and one of the things
that we love about AWS is there are so many options. It allows us to be so creative. For our cloud migration, we were able to seamlessly
transport our data into AWS really easily by
just flipping a switch. So, the moment that a vehicle
is in a collision, there is a trigger event that goes from the communications
module up to our AWS cloud. And then somebody from our
call center will be talking to your vehicle
within three seconds. At the end of the day,
it's just super exciting to think about the future
for Toyota vehicles. For instance, we are also using newer
technologies like generative AI, and with a managed service,
like Amazon Bedrock, we were able to ingest
the owner's manual and develop a generative
AI powered assistant, and it's going to be able
to tell you anything about it by using some very simple voice
commands, just saying, hey, Toyota,
tell me about this icon. That icon means the traction
control system has been enabled due to slippery road conditions. It's almost like saying your car is
now going to be the expert on itself. Sometimes, when you get involved
in the tech, you just get into the weeds
of the code and fixing things, and you kind of forget the actual
impact that you have on the world. There's a lot of pressure, but there's also a lot of pride
in being able to wake up every day and know that that is what
I'm going to be working on. [music playing] [applause] What a great story! I love seeing how builders
are using the power of data and GenAI
to solve real-world problems. While harnessing all of
this technology is important to driving innovation, we also need to harness
one of the most essential inputs, the power of human intellect, to create customer experiences
that make a bigger impact, and when we look at
how we can benefit from and strengthen the relationship
between data and GenAI, we can think of ourselves
as the facilitators that create a powerful cycle
of reinforcement over time. Let me share a quick example
to show you what I mean. This example comes from deep
within the Panama Rainforest, where there are more
than 1,500 species of trees. But there is one tree in particular,
the Virola tree, that lies at the heart of our story. Growing to more than 130 feet tall,
the Virola produces a small red fruit that is very popular
with the local wildlife, including the toucan, as well as a small mammal
called the agouti. As the fruit matures, toucans perch themselves
in the treetops for a quick snack before dropping the leftover
seeds to the forest floor, and since the agoutis can't climb, they welcome these little
red seeds from the toucan. Then, to prepare for the dry season, when food is in short supply, they also collect
as many seeds as possible and bury them
in the soil for safekeeping. Many of these will later grow
when conditions are ripe, sprouting new trees
that continue the cycle for decades
or even hundreds of years. So, the toucan helps both
the forest and the agouti, and the tree supplies food
for both animal species, and the agouti plants
seeds in the soil, allowing new trees to grow. This story reminds me of
the relationship between humans, data, and generative AI
for a couple of reasons. For one, the relationship creates
longevity, more trees, more food, and longer lives,
and two, this cycle wouldn't exist without the collaboration
and facilitation along the way, ultimately strengthening
each element over time. As humans, we are responsible
for creating and facilitating a flywheel
of success with data and GenAI. That's because we provide
a unique set of benefits that create
more efficient, responsible, and differentiated GenAI apps. We created the innovation
that generates data, making GenAI technology
possible in the first place, as well as the data foundations
that support it. We identify use cases and lead the development of GenAI apps that
support our unique business needs, and at different points
along the way, we provide valuable feedback
to maximize the efficiency of these GenAI apps
and the output that's generated. This is exactly what Ada Lovelace
was talking about. One of the most common ways
you can integrate human feedback into your GenAI strategy is the model evaluation
and selection process. When you pick the best model
for your use case, you can optimize accuracy
and performance, while better aligning
to your brand, style, and voice. However, model evaluation requires a deep level of expertise
in data science, and it can be a tedious,
time-consuming process. Customers will first need to create the right
benchmarking datasets and metrics, along with the algorithms to calculate those metrics. Next, they need to set up
a human evaluation workflow for subjective criteria,
like friendliness and brand style, which can be difficult
to build and operate, and finally, they often need
to build benchmarking tools to pick the most appropriate model. This entire process will also need
to be repeated periodically as new models are released or existing models are fine-tuned. We wanted to make it easier
for our customers to evaluate models
for their specific needs. That's why, today, I'm very excited
to announce the preview of Model Evaluation
on Amazon Bedrock. [applause] This new capability allows you
to quickly evaluate, compare, and select the best foundation model for your use case. With this feature, now in preview, you can perform automatic and human-based evaluations, depending on your needs. For automatic evaluations,
you can leverage curated data sets or use your own to evaluate
quantitative criteria like robustness,
toxicity, and accuracy. To evaluate subjective criteria,
like brand voice, Bedrock offers fully managed
human review workflows with support from AWS
experts as well. And then we provide
you comprehensive reports to easily review metrics
on model performance. This capability is also now available in SageMaker Studio.
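To make the automatic side of this concrete, here is a deliberately simple, hand-rolled comparison of two candidate models on a tiny prompt set using the Bedrock runtime API; the model IDs, prompts, and the crude length-based score are assumptions, and the managed evaluation feature replaces exactly this kind of scaffolding with curated datasets, real metrics, and reports.

    # Illustrative only: a hand-rolled comparison of candidate models, to show the kind
    # of scaffolding Model Evaluation on Bedrock automates. Model IDs, prompts, and the
    # crude length-based "score" are assumptions for the sake of the sketch.
    import boto3

    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    candidates = ["amazon.titan-text-express-v1", "anthropic.claude-v2"]
    prompts = [
        "Summarize why zero-ETL integrations matter, in two sentences.",
        "Write a friendly one-line product update for a travel app.",
    ]

    for model_id in candidates:
        outputs = []
        for p in prompts:
            resp = bedrock.converse(
                modelId=model_id,
                messages=[{"role": "user", "content": [{"text": p}]}],
            )
            outputs.append(resp["output"]["message"]["content"][0]["text"])
        # A real evaluation would score accuracy, robustness, toxicity, and so on;
        # average response length is only a stand-in to keep the sketch short.
        print(model_id, sum(len(o) for o in outputs) / len(outputs))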
Human input will continue to be a critical component of the development process
with GenAI, and as our relationship
with this technology evolves, so will the skillsets
we need to unlock its potential. According to the World
Economic Forum, nearly 75% of companies will
adopt AI technologies by 2027. While some tasks
will become obsolete, GenAI will also create
entirely new roles and even new products and services, and employers will need to support
them with the right people. Hard skills in ML and AI
will continue to be important, but soft skills
like creativity, ethics, and adaptability will grow
increasingly critical with GenAI. Many are calling this
monumental shift the reskilling revolution. We are helping our customers
prepare for it in a variety of ways. To support the workforce of tomorrow, we recently launched the AWS
GenAI Scholarship with Udacity. This program provides more than
$12 million in value in scholarships to over 50,000 high school
and university students. We've also been investing in our own AWS AI/ML Scholarship Program for years, which is making a profound
impact on students globally. As someone who grew up with
limited access to technology, I'm deeply committed
to our investments in this area. Students and professionals
who are joining this reskilling revolution are critical to the future
of this industry. In addition to these opportunities, we offer more than 100 AI and ML
courses and low-cost trainings. These tools will enable you
to build new skills and get started with GenAI. We are also introducing
new ways to experiment and have fun learning GenAI. That's why we recently
announced PartyRock, an Amazon Bedrock playground. [applause] Some PartyRock fans out there. It's an easy and accessible way
to learn about GenAI with a hands-on
code-free app builder. Since its release, users have created tens of thousands
of PartyRock applications, using powerful foundation models
from leading AI companies. And starting today, we will also
include the Titan Text Express and Light models as well. Now, to get started with PartyRock,
all you need is a social login. No AWS account required. So, let's see, in action, how easy
it is to build a PartyRock app
in a few steps. We start on the homepage by pressing
the build your own app button and providing a brief description
of what our app should do. Our new app contains a prompt,
where we can experiment with different prompt
engineering techniques and review the generated responses. We can also add additional
LLM-based widgets, like a chatbot, to make our application
more useful and fun to use, and we can even see
and select different models to see what works best
for our use case. We can then use our chatbot
to discuss the results and engage in further conversation. And once we are happy
with the results, we can publish our application
to the world and invite other people to use it or remix it to make
their own version. You can recommend your apps
on social media or get featured
on the new Discover page, which features community apps curated by the PartyRock team. You might even find
my own personal favorite, the app my daughter created
over our Thanksgiving weekend, that helps you create
your own chocolate factory. I encourage you to check out
PartyRock today. With all of this data and GenAI
innovation happening around us, it is important to remember
that each of you will continue to bring your own unique
inputs and ideas to the table. Like Albert Einstein once said,
"Creativity is seeing what others see and thinking what
no one else thought." Today, the powerful symbiotic
relationship between data, GenAI, and humans
is accelerating our ability to create new innovations
and differentiated experiences, and we are just getting started. From a secure place to customize
your foundation models with your data, GenAI-powered services
to strengthen your data foundation, tools with GenAI built in
to augment employee productivity, and mechanisms for
implementing human feedback, AWS has everything you need to unlock
this transformative technology. And with tools like PartyRock to help you kickstart
your GenAI journey, I look forward to seeing
what you create next with AWS. Thank you. [applause/music playing]